-
Improving Parametric Neural Networks for High-Energy Physics (and Beyond)
Authors:
Luca Anzalone,
Tommaso Diotalevi,
Daniele Bonacorsi
Abstract:
Signal-background classification is a central problem in High-Energy Physics (HEP), that plays a major role for the discovery of new fundamental particles. A recent method -- the Parametric Neural Network (pNN) -- leverages multiple signal mass hypotheses as an additional input feature to effectively replace a whole set of individual classifiers, each providing (in principle) the best response for…
▽ More
Signal-background classification is a central problem in High-Energy Physics (HEP), that plays a major role for the discovery of new fundamental particles. A recent method -- the Parametric Neural Network (pNN) -- leverages multiple signal mass hypotheses as an additional input feature to effectively replace a whole set of individual classifiers, each providing (in principle) the best response for the corresponding mass hypothesis. In this work we aim at deepening the understanding of pNNs in light of real-world usage. We discovered several peculiarities of parametric networks, providing intuition, metrics, and guidelines to them. We further propose an alternative parametrization scheme, resulting in a new parametrized neural network architecture: the AffinePNN; along with many other generally applicable improvements, like the balanced training procedure. Finally, we extensively and empirically evaluate our models on the HEPMASS dataset, along its imbalanced version (called HEPMASS-IMB) we provide here for the first time, to further validate our approach. Provided results are in terms of the impact of the proposed design decisions, classification performance, and interpolation capability, as well.
△ Less
Submitted 14 November, 2022; v1 submitted 1 February, 2022;
originally announced February 2022.
-
Collection and harmonization of system logs and prototypal Analytics services with the Elastic (ELK) suite at the INFN-CNAF computing centre
Authors:
Tommaso Diotalevi,
Antonio Falabella,
Barbara Martelli,
Diego Michelotto,
Lucia Morganti,
Daniele Bonacorsi,
Luca Giommi,
Simone Rossi Tisbeni
Abstract:
The distributed Grid infrastructure for High Energy Physics experiments at the Large Hadron Collider (LHC) in Geneva comprises a set of computing centres, spread all over the world, as part of the Worldwide LHC Computing Grid (WLCG). In Italy, the Tier-1 functionalities are served by the INFN-CNAF data center, which provides also computing and storage resources to more than twenty non-LHC experime…
▽ More
The distributed Grid infrastructure for High Energy Physics experiments at the Large Hadron Collider (LHC) in Geneva comprises a set of computing centres, spread all over the world, as part of the Worldwide LHC Computing Grid (WLCG). In Italy, the Tier-1 functionalities are served by the INFN-CNAF data center, which provides also computing and storage resources to more than twenty non-LHC experiments. For this reason, a high amount of logs are collected each day from various sources, which are highly heterogeneous and difficult to harmonize. In this contribution, a working implementation of a system that collects, parses and displays the log information from CNAF data sources and the investigation of a Machine Learning based predictive maintenance system, is presented.
△ Less
Submitted 13 May, 2021;
originally announced June 2021.
-
Comparison of Evolving Granular Classifiers applied to Anomaly Detection for Predictive Maintenance in Computing Centers
Authors:
Leticia Decker,
Daniel Leite,
Fabio Viola,
Daniele Bonacorsi
Abstract:
Log-based predictive maintenance of computing centers is a main concern regarding the worldwide computing grid that supports the CERN (European Organization for Nuclear Research) physics experiments. A log, as event-oriented adhoc information, is quite often given as unstructured big data. Log data processing is a time-consuming computational task. The goal is to grab essential information from a…
▽ More
Log-based predictive maintenance of computing centers is a main concern regarding the worldwide computing grid that supports the CERN (European Organization for Nuclear Research) physics experiments. A log, as event-oriented adhoc information, is quite often given as unstructured big data. Log data processing is a time-consuming computational task. The goal is to grab essential information from a continuously changeable grid environment to construct a classification model. Evolving granular classifiers are suited to learn from time-varying log streams and, therefore, perform online classification of the severity of anomalies. We formulated a 4-class online anomaly classification problem, and employed time windows between landmarks and two granular computing methods, namely, Fuzzy-set-Based evolving Modeling (FBeM) and evolving Granular Neural Network (eGNN), to model and monitor logging activity rate. The results of classification are of utmost importance for predictive maintenance because priority can be given to specific time intervals in which the classifier indicates the existence of high or medium severity anomalies.
△ Less
Submitted 8 April, 2020;
originally announced May 2020.
-
Real-Time Anomaly Detection in Data Centers for Log-based Predictive Maintenance using an Evolving Fuzzy-Rule-Based Approach
Authors:
Leticia Decker,
Daniel Leite,
Luca Giommi,
Daniele Bonacorsi
Abstract:
Detection of anomalous behaviors in data centers is crucial to predictive maintenance and data safety. With data centers, we mean any computer network that allows users to transmit and exchange data and information. In particular, we focus on the Tier-1 data center of the Italian Institute for Nuclear Physics (INFN), which supports the high-energy physics experiments at the Large Hadron Collider (…
▽ More
Detection of anomalous behaviors in data centers is crucial to predictive maintenance and data safety. With data centers, we mean any computer network that allows users to transmit and exchange data and information. In particular, we focus on the Tier-1 data center of the Italian Institute for Nuclear Physics (INFN), which supports the high-energy physics experiments at the Large Hadron Collider (LHC) in Geneva. The center provides resources and services needed for data processing, storage, analysis, and distribution. Log records in the data center is a stochastic and non-stationary phenomenon in nature. We propose a real-time approach to monitor and classify log records based on sliding time windows, and a time-varying evolving fuzzy-rule-based classification model. The most frequent log pattern according to a control chart is taken as the normal system status. We extract attributes from time windows to gradually develop and update an evolving Gaussian Fuzzy Classifier (eGFC) on the fly. The real-time anomaly monitoring system has to provide encouraging results in terms of accuracy, compactness, and real-time operation.
△ Less
Submitted 25 April, 2020;
originally announced April 2020.
-
Machine Learning in High Energy Physics Community White Paper
Authors:
Kim Albertsson,
Piero Altoe,
Dustin Anderson,
John Anderson,
Michael Andrews,
Juan Pedro Araque Espinosa,
Adam Aurisano,
Laurent Basara,
Adrian Bevan,
Wahid Bhimji,
Daniele Bonacorsi,
Bjorn Burkle,
Paolo Calafiura,
Mario Campanelli,
Louis Capps,
Federico Carminati,
Stefano Carrazza,
Yi-fan Chen,
Taylor Childers,
Yann Coadou,
Elias Coniavitis,
Kyle Cranmer,
Claire David,
Douglas Davis,
Andrea De Simone
, et al. (103 additional authors not shown)
Abstract:
Machine learning has been applied to several problems in particle physics research, beginning with applications to high-level physics analysis in the 1990s and 2000s, followed by an explosion of applications in particle and event identification and reconstruction in the 2010s. In this document we discuss promising future research and development areas for machine learning in particle physics. We d…
▽ More
Machine learning has been applied to several problems in particle physics research, beginning with applications to high-level physics analysis in the 1990s and 2000s, followed by an explosion of applications in particle and event identification and reconstruction in the 2010s. In this document we discuss promising future research and development areas for machine learning in particle physics. We detail a roadmap for their implementation, software and hardware resource requirements, collaborative initiatives with the data science community, academia and industry, and training the particle physics community in data science. The main objective of the document is to connect and motivate these areas of research and development with the physics drivers of the High-Luminosity Large Hadron Collider and future neutrino experiments and identify the resource needs for their implementation. Additionally we identify areas where collaboration with external communities will be of great benefit.
△ Less
Submitted 16 May, 2019; v1 submitted 8 July, 2018;
originally announced July 2018.