-
Majorana Demonstrator Data Release for AI/ML Applications
Authors:
I. J. Arnquist,
F. T. Avignone III,
A. S. Barabash,
C. J. Barton,
K. H. Bhimani,
E. Blalock,
B. Bos,
M. Busch,
M. Buuck,
T. S. Caldwell,
Y. -D. Chan,
C. D. Christofferson,
P. -H. Chu,
M. L. Clark,
C. Cuesta,
J. A. Detwiler,
Yu. Efremenko,
H. Ejiri,
S. R. Elliott,
N. Fuad,
G. K. Giovanetti,
M. P. Green,
J. Gruszko,
I. S. Guinn,
V. E. Guiseppe,
et al. (35 additional authors not shown)
Abstract:
The enclosed data release consists of a subset of the calibration data from the Majorana Demonstrator experiment. Each Majorana event is accompanied by raw germanium detector waveforms, pulse shape discrimination cuts, and calibrated final energies, all shared in an HDF5 file format along with relevant metadata. This release is specifically designed to support the training and testing of Artificial Intelligence (AI) and Machine Learning (ML) algorithms on our data. This document is structured as follows. Section I provides an overview of the dataset's content and format; Section II outlines the location of this dataset and the method for accessing it; Section III presents the NPML Machine Learning Challenge associated with this dataset; Section IV contains a disclaimer from the Majorana collaboration regarding the use of this dataset; Appendix A contains technical details of this data release. Please direct questions about the material provided within this release to [email protected] (A. Li).
Submitted 14 September, 2023; v1 submitted 21 August, 2023;
originally announced August 2023.
-
Interpretable Boosted Decision Tree Analysis for the Majorana Demonstrator
Authors:
I. J. Arnquist,
F. T. Avignone III,
A. S. Barabash,
C. J. Barton,
K. H. Bhimani,
E. Blalock,
B. Bos,
M. Busch,
M. Buuck,
T. S. Caldwell,
Y. -D. Chan,
C. D. Christofferson,
P. -H. Chu,
M. L. Clark,
C. Cuesta,
J. A. Detwiler,
Yu. Efremenko,
S. R. Elliott,
G. K. Giovanetti,
M. P. Green,
J. Gruszko,
I. S. Guinn,
V. E. Guiseppe,
C. R. Haufe,
R. Henning,
et al. (30 additional authors not shown)
Abstract:
The Majorana Demonstrator is a leading experiment searching for neutrinoless double-beta decay with high-purity germanium (HPGe) detectors. Machine learning provides a new way to maximize the amount of information extracted from these detectors, but its data-driven nature makes it less interpretable than traditional analysis. An interpretability study reveals the machine's decision-making logic, allowing us to learn from the machine and feed those insights back into the traditional analysis. In this work, we present the first machine learning analysis of the data from the Majorana Demonstrator; this is also the first interpretable machine learning analysis of any germanium detector experiment. Two gradient boosted decision tree models are trained to learn from the data, and a game-theory-based model interpretability study is conducted to understand the origin of the classification power. By learning from data, this analysis recognizes the correlations among reconstruction parameters to further enhance the background rejection performance. By learning from the machine, this analysis reveals the importance of new background categories to reciprocally benefit the standard Majorana analysis. This model is highly compatible with next-generation germanium detector experiments like LEGEND since it can be simultaneously trained on a large number of detectors.
Submitted 21 August, 2024; v1 submitted 21 July, 2022;
originally announced July 2022.
-
VirtualIdentity: Privacy-Preserving User Profiling
Authors:
Sisi Wang,
Wing-Sea Poon,
Golnoosh Farnadi,
Caleb Horst,
Kebra Thompson,
Michael Nickels,
Rafael Dowsley,
Anderson C. A. Nascimento,
Martine De Cock
Abstract:
User profiling from user-generated content (UGC) is a common practice that supports the business models of many social media companies. Existing systems require that the UGC be fully exposed to the module that constructs the user profiles. In this paper we show that it is possible to build user profiles without ever accessing the user's original data, and without exposing the trained machine learning models for user profiling -- which are the intellectual property of the company -- to the users of the social media site. We present VirtualIdentity, an application that uses secure multi-party cryptographic protocols to detect the age, gender, and personality traits of users by classifying their user-generated text and personal pictures with trained support vector machine models in a privacy-preserving manner.
Submitted 30 August, 2018;
originally announced August 2018.