Search | arXiv e-print repository

arXiv:2407.20643 [pdf]

Generalizing AI-driven Assessment of Immunohistochemistry across Immunostains and Cancer Types: A Universal Immunohistochemistry Analyzer

Authors: Biagio Brattoli, Mohammad Mostafavi, Taebum Lee, Wonkyung Jung, Jeongun Ryu, Seonwook Park, Jongchan Park, Sergio Pereira, Seunghwan Shin, Sangjoon Choi, Hyojin Kim, Donggeun Yoo, Siraj M. Ali, Kyunghyun Paeng, Chan-Young Ock, Soo Ick Cho, Seokhwi Kim

Abstract: Despite advancements in methodologies, immunohistochemistry (IHC) remains the most utilized ancillary test for histopathologic and companion diagnostics in targeted therapies. However, objective IHC assessment poses challenges. Artificial intelligence (AI) has emerged as a potential solution, yet its development requires extensive training for each cancer and IHC type, limiting versatility. We developed a Universal IHC (UIHC) analyzer, an AI model for interpreting IHC images regardless of tumor or IHC types, using training datasets from various cancers stained for PD-L1 and/or HER2. This multi-cohort trained model outperforms conventional single-cohort models in interpreting unseen IHCs (Kappa score 0.578 vs. up to 0.509) and consistently shows superior performance across different positive staining cutoff values. Qualitative analysis reveals that UIHC effectively clusters patches based on expression levels. The UIHC model also quantitatively assesses c-MET expression with MET mutations, representing a significant advancement in AI application in the era of personalized medicine and accumulating novel biomarkers.

Submitted 30 July, 2024; originally announced July 2024.

arXiv:2407.16427 [pdf, other]

Stochastic weight matrix dynamics during learning and Dyson Brownian motion

Authors: Gert Aarts, Biagio Lucini, Chanju Park

Abstract: We demonstrate that the update of weight matrices in learning algorithms can be described in the framework of Dyson Brownian motion, thereby inheriting many features of random matrix theory. We relate the level of stochasticity to the ratio of the learning rate and the mini-batch size, providing more robust evidence to a previously conjectured scaling relationship. We discuss universal and non-universal features in the resulting Coulomb gas distribution and identify the Wigner surmise and Wigner semicircle explicitly in a teacher-student model and in the (near-)solvable case of the Gaussian restricted Boltzmann machine.

Submitted 23 July, 2024; originally announced July 2024.

Comments: 17 pages, 16 figures

arXiv:2407.12243 [pdf, other]

Explaining Deep Neural Networks by Leveraging Intrinsic Methods

Authors: Biagio La Rosa

Abstract: Despite their impact on the society, deep neural networks are often regarded as black-box models due to their intricate structures and the absence of explanations for their decisions. This opacity poses a significant challenge to AI systems wider adoption and trustworthiness. This thesis addresses this issue by contributing to the field of eXplainable AI, focusing on enhancing the interpretability of deep neural networks. The core contributions lie in introducing novel techniques aimed at making these networks more interpretable by leveraging an analysis of their inner workings. Specifically, the contributions are threefold. Firstly, the thesis introduces designs for self-explanatory deep neural networks, such as the integration of external memory for interpretability purposes and the usage of prototype and constraint-based layers across several domains. Secondly, this research delves into novel investigations on neurons within trained deep neural networks, shedding light on overlooked phenomena related to their activation values. Lastly, the thesis conducts an analysis of the application of explanatory techniques in the field of visual analytics, exploring the maturity of their adoption and the potential of these systems to convey explanations to users effectively.

Submitted 16 July, 2024; originally announced July 2024.

Comments: PhD Thesis

arXiv:2407.03700 [pdf, other]

Deep learning architectures for data-driven damage detection in nonlinear dynamic systems

Authors: Harrish Joseph, Giuseppe Quaranta, Biagio Carboni, Walter Lacarbonara

Abstract: The primary goal of structural health monitoring is to detect damage at its onset before it reaches a critical level. The in-depth investigation in the present work addresses deep learning applied to data-driven damage detection in nonlinear dynamic systems. In particular, autoencoders (AEs) and generative adversarial networks (GANs) are implemented leveraging on 1D convolutional neural networks. The onset of damage is detected in the investigated nonlinear dynamic systems by exciting random vibrations of varying intensity, without prior knowledge of the system or the excitation and in unsupervised manner. The comprehensive numerical study is conducted on dynamic systems exhibiting different types of nonlinear behavior. An experimental application related to a magneto-elastic nonlinear system is also presented to corroborate the conclusions.

Submitted 4 July, 2024; originally announced July 2024.

Comments: 34 pages, 17 figures, 4 tables

arXiv:2406.13547 [pdf, other]

ModSec-Learn: Boosting ModSecurity with Machine Learning

Authors: Christian Scano, Giuseppe Floris, Biagio Montaruli, Luca Demetrio, Andrea Valenza, Luca Compagna, Davide Ariu, Luca Piras, Davide Balzarotti, Battista Biggio

Abstract: ModSecurity is widely recognized as the standard open-source Web Application Firewall (WAF), maintained by the OWASP Foundation. It detects malicious requests by matching them against the Core Rule Set (CRS), identifying well-known attack patterns. Each rule is manually assigned a weight based on the severity of the corresponding attack, and a request is blocked if the sum of the weights of matched rules exceeds a given threshold. However, we argue that this strategy is largely ineffective against web attacks, as detection is only based on heuristics and not customized on the application to protect. In this work, we overcome this issue by proposing a machine-learning model that uses the CRS rules as input features. Through training, ModSec-Learn is able to tune the contribution of each CRS rule to predictions, thus adapting the severity level to the web applications to protect. Our experiments show that ModSec-Learn achieves a significantly better trade-off between detection and false positive rates. Finally, we analyze how sparse regularization can reduce the number of rules that are relevant at inference time, by discarding more than 30% of the CRS rules. We release our open-source code and the dataset at https://github.com/pralab/modsec-learn and https://github.com/pralab/http-traffic-dataset, respectively.

Submitted 19 June, 2024; originally announced June 2024.

Comments: arXiv admin note: text overlap with arXiv:2308.04964
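
Editor's illustration: the abstract above describes a scoring scheme in which each matched rule contributes a weight and a request is blocked when the summed weights exceed a threshold, and then proposes using the rules as input features for a learned model. The following is a minimal sketch of those two ideas only. The rule names, weights, and threshold are hypothetical placeholders, not actual CRS values, and the functions are not taken from ModSecurity or from the authors' code.

```python
from typing import Dict, Iterable, List

# Hypothetical rule identifiers and severity weights (illustrative only).
RULE_WEIGHTS: Dict[str, int] = {
    "rule_sqli_a": 5,
    "rule_xss_a": 4,
    "rule_protocol_a": 2,
}


def anomaly_score(matched_rules: Iterable[str],
                  weights: Dict[str, int] = RULE_WEIGHTS) -> int:
    """Sum the weights of the rules that matched a request."""
    return sum(weights.get(rule, 0) for rule in matched_rules)


def is_blocked(matched_rules: Iterable[str],
               threshold: int = 5,
               weights: Dict[str, int] = RULE_WEIGHTS) -> bool:
    """Block when the summed weight exceeds the threshold, as the abstract words it."""
    return anomaly_score(matched_rules, weights) > threshold


def rule_features(matched_rules: Iterable[str],
                  rule_order: List[str]) -> List[int]:
    """Encode a request as a binary vector: 1 if the rule matched, else 0.

    A vector like this could be fed to any classifier, which would then learn
    per-rule contributions instead of relying on hand-assigned weights.
    """
    matched = set(matched_rules)
    return [1 if rule in matched else 0 for rule in rule_order]
```

For example, a request matching `rule_sqli_a` and `rule_protocol_a` scores 7 under these placeholder weights and is blocked at a threshold of 5, while the same request encodes to the feature vector `[1, 0, 1]` for the rule order `["rule_sqli_a", "rule_xss_a", "rule_protocol_a"]`.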

arXiv:2403.14695 [pdf, other]

Chain-structured neural architecture search for financial time series forecasting

Authors: Denis Levchenko, Efstratios Rappos, Shabnam Ataee, Biagio Nigro, Stephan Robert

Abstract: We compare three popular neural architecture search strategies on chain-structured search spaces: Bayesian optimization, the hyperband method, and reinforcement learning in the context of financial time series forecasting.

Submitted 15 March, 2024; originally announced March 2024.

Comments: 17 pages, 3 figures

arXiv:2312.08232 [pdf, other]

Green Operations of SWIPT Networks: The Role of End-User Devices

Authors: Gianluca Rizzo, Marco Ajmone Marsan, Christian Esposito, Biagio Boi

Abstract: Internet of Things (IoT) devices often come with batteries of limited capacity that are not easily replaceable or rechargeable, and that constrain significantly the sensing, computing, and communication tasks that they can perform. The Simultaneous Wireless Information and Power Transfer (SWIPT) paradigm addresses this issue by delivering power wirelessly to energy-harvesting IoT devices with the same signal used for information transfer. For their peculiarity, these networks require specific energy-efficient planning and management approaches. However, to date, it is not clear what are the most effective strategies for managing a SWIPT network for energy efficiency. In this paper, we address this issue by developing an analytical model based on stochastic geometry, accounting for the statistics of user-perceived performance and base station scheduling. We formulate an optimization problem for deriving the energy optimal configuration as a function of the main system parameters, and we propose a genetic algorithm approach to solve it. Our results enable a first-order evaluation of the most effective strategies for energy-efficient provisioning of power and communications in a SWIPT network. We show that the service capacity brought about by users brings energy-efficient dynamic network provisioning strategies that radically differ from those of networks with no wireless power transfer.

Submitted 11 July, 2024; v1 submitted 13 December, 2023; originally announced December 2023.

Comments: The manuscript has already been submitted to Journal on 7-12-2023

arXiv:2310.18443 [pdf, other]

Towards a fuller understanding of neurons with Clustered Compositional Explanations

Authors: Biagio La Rosa, Leilani H. Gilpin, Roberto Capobianco

Abstract: Compositional Explanations is a method for identifying logical formulas of concepts that approximate the neurons' behavior. However, these explanations are linked to the small spectrum of neuron activations (i.e., the highest ones) used to check the alignment, thus lacking completeness. In this paper, we propose a generalization, called Clustered Compositional Explanations, that combines Compositional Explanations with clustering and a novel search heuristic to approximate a broader spectrum of the neurons' behavior. We define and address the problems connected to the application of these methods to multiple ranges of activations, analyze the insights retrievable by using our algorithm, and propose desiderata qualities that can be used to study the explanations returned by different algorithms.

Submitted 27 October, 2023; originally announced October 2023.

Comments: Accepted at NeurIPS 2023

arXiv:2310.03166 [pdf, other]

doi 10.1145/3605764.3623920

Raze to the Ground: Query-Efficient Adversarial HTML Attacks on Machine-Learning Phishing Webpage Detectors

Authors: Biagio Montaruli, Luca Demetrio, Maura Pintor, Luca Compagna, Davide Balzarotti, Battista Biggio

Abstract: Machine-learning phishing webpage detectors (ML-PWD) have been shown to suffer from adversarial manipulations of the HTML code of the input webpage. Nevertheless, the attacks recently proposed have demonstrated limited effectiveness due to their lack of optimizing the usage of the adopted manipulations, and they focus solely on specific elements of the HTML code. In this work, we overcome these limitations by first designing a novel set of fine-grained manipulations which allow to modify the HTML code of the input phishing webpage without compromising its maliciousness and visual appearance, i.e., the manipulations are functionality- and rendering-preserving by design. We then select which manipulations should be applied to bypass the target detector by a query-efficient black-box optimization algorithm. Our experiments show that our attacks are able to raze to the ground the performance of current state-of-the-art ML-PWD using just 30 queries, thus overcoming the weaker attacks developed in previous work, and enabling a much fairer robustness evaluation of ML-PWD.

Submitted 13 October, 2023; v1 submitted 4 October, 2023; originally announced October 2023.

Comments: Proceedings of the 16th ACM Workshop on Artificial Intelligence and Security (AISec '23), November 30, 2023, Copenhagen, Denmark

arXiv:2308.04964 [pdf, other]

Adversarial ModSecurity: Countering Adversarial SQL Injections with Robust Machine Learning

Authors: Biagio Montaruli, Luca Demetrio, Andrea Valenza, Luca Compagna, Davide Ariu, Luca Piras, Davide Balzarotti, Battista Biggio

Abstract: ModSecurity is widely recognized as the standard open-source Web Application Firewall (WAF), maintained by the OWASP Foundation. It detects malicious requests by matching them against the Core Rule Set, identifying well-known attack patterns. Each rule in the CRS is manually assigned a weight, based on the severity of the corresponding attack, and a request is detected as malicious if the sum of the weights of the firing rules exceeds a given threshold. In this work, we show that this simple strategy is largely ineffective for detecting SQL injection (SQLi) attacks, as it tends to block many legitimate requests, while also being vulnerable to adversarial SQLi attacks, i.e., attacks intentionally manipulated to evade detection. To overcome these issues, we design a robust machine learning model, named AdvModSec, which uses the CRS rules as input features, and it is trained to detect adversarial SQLi attacks. Our experiments show that AdvModSec, being trained on the traffic directed towards the protected web services, achieves a better trade-off between detection and false positive rates, improving the detection rate of the vanilla version of ModSecurity with CRS by 21%. Moreover, our approach is able to improve its adversarial robustness against adversarial SQLi attacks by 42%, thereby taking a step forward towards building more robust and trustworthy WAFs.

Submitted 17 August, 2023; v1 submitted 9 August, 2023; originally announced August 2023.

arXiv:2304.09953 [pdf, other]

doi 10.1145/3587135.3592172

Tunable and Portable Extreme-Scale Drug Discovery Platform at Exascale: the LIGATE Approach

Authors: Gianluca Palermo, Gianmarco Accordi, Davide Gadioli, Emanuele Vitali, Cristina Silvano, Bruno Guindani, Danilo Ardagna, Andrea R. Beccari, Domenico Bonanni, Carmine Talarico, Filippo Lunghini, Jan Martinovic, Paulo Silva, Ada Bohm, Jakub Beranek, Jan Krenek, Branislav Jansik, Luigi Crisci, Biagio Cosenza, Peter Thoman, Philip Salzmann, Thomas Fahringer, Leila Alexander, Gerardo Tauriello, et al. (10 additional authors not shown)

Abstract: Today digital revolution is having a dramatic impact on the pharmaceutical industry and the entire healthcare system. The implementation of machine learning, extreme-scale computer simulations, and big data analytics in the drug design and development process offers an excellent opportunity to lower the risk of investment and reduce the time to the patient. Within the LIGATE project, we aim to integrate, extend, and co-design best-in-class European components to design Computer-Aided Drug Design (CADD) solutions exploiting today's high-end supercomputers and tomorrow's Exascale resources, fostering European competitiveness in the field. The proposed LIGATE solution is a fully integrated workflow that enables to deliver the result of a virtual screening campaign for drug discovery with the highest speed along with the highest accuracy. The full automation of the solution and the possibility to run it on multiple supercomputing centers at once permit to run an extreme scale in silico drug discovery campaign in few days to respond promptly for example to a worldwide pandemic crisis.

Submitted 19 April, 2023; originally announced April 2023.

Comments: Paper Accepted to the 20th ACM International Conference on Computing Frontiers (CF'23)

arXiv:2303.13110 [pdf, other]

OCELOT: Overlapped Cell on Tissue Dataset for Histopathology

Authors: Jeongun Ryu, Aaron Valero Puche, JaeWoong Shin, Seonwook Park, Biagio Brattoli, Jinhee Lee, Wonkyung Jung, Soo Ick Cho, Kyunghyun Paeng, Chan-Young Ock, Donggeun Yoo, Sérgio Pereira

Abstract: Cell detection is a fundamental task in computational pathology that can be used for extracting high-level medical information from whole-slide images. For accurate cell detection, pathologists often zoom out to understand the tissue-level structures and zoom in to classify cells based on their morphology and the surrounding context. However, there is a lack of efforts to reflect such behaviors by pathologists in the cell detection models, mainly due to the lack of datasets containing both cell and tissue annotations with overlapping regions. To overcome this limitation, we propose and publicly release OCELOT, a dataset purposely dedicated to the study of cell-tissue relationships for cell detection in histopathology. OCELOT provides overlapping cell and tissue annotations on images acquired from multiple organs. Within this setting, we also propose multi-task learning approaches that benefit from learning both cell and tissue tasks simultaneously. When compared against a model trained only for the cell detection task, our proposed approaches improve cell detection performance on 3 datasets: proposed OCELOT, public TIGER, and internal CARP datasets. On the OCELOT test set in particular, we show up to 6.79 improvement in F1-score. We believe the contributions of this paper, including the release of the OCELOT dataset at https://lunit-io.github.io/research/publications/ocelot are a crucial starting point toward the important research direction of incorporating cell-tissue relationships in computation pathology.

Submitted 23 March, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

Comments: Accepted for publication at CVPR'23

arXiv:2303.06150 [pdf, other]

Improving computation efficiency using input and architecture features for a virtual screening application

Authors: Gianmarco Accordi, Emanuele Vitali, Davide Gadioli, Luigi Crisci, Biagio Cosenza, Mauro Bisson, Massimiliano Fatica, Andrea Beccari, Gianluca Palermo

Abstract: Virtual screening is an early stage of the drug discovery process that selects the most promising candidates. In the urgent computing scenario it is critical to find a solution in a short time frame. In this paper, we focus on a real-world virtual screening application to evaluate out-of-kernel optimizations, that consider input and architecture features to improve the computation efficiency on GPU. Experiment results on a modern supercomputer node show that we can almost double the performance. Moreover, we implemented the optimization using SYCL and it provides a consistent benefit with the CUDA optimization. A virtual screening campaign can use this gain in performance to increase the number of evaluated candidates, improving the probability of finding a drug.

Submitted 9 March, 2023; originally announced March 2023.

arXiv:2205.11710 [pdf, other]

SCVRL: Shuffled Contrastive Video Representation Learning

Authors: Michael Dorkenwald, Fanyi Xiao, Biagio Brattoli, Joseph Tighe, Davide Modolo

Abstract: We propose SCVRL, a novel contrastive-based framework for self-supervised learning for videos. Differently from previous contrast learning based methods that mostly focus on learning visual semantics (e.g., CVRL), SCVRL is capable of learning both semantic and motion patterns. For that, we reformulate the popular shuffling pretext task within a modern contrastive learning paradigm. We show that our transformer-based network has a natural capacity to learn motion in self-supervised settings and achieves strong performance, outperforming CVRL on four benchmarks.

Submitted 23 May, 2022; originally announced May 2022.

Comments: CVPR 2022 - L3DIVU workshop

arXiv:2202.05838 [pdf, ps, other]

Applications of Machine Learning to Lattice Quantum Field Theory

Authors: Denis Boyda, Salvatore Calì, Sam Foreman, Lena Funcke, Daniel C. Hackett, Yin Lin, Gert Aarts, Andrei Alexandru, Xiao-Yong Jin, Biagio Lucini, Phiala E. Shanahan

Abstract: There is great potential to apply machine learning in the area of numerical lattice quantum field theory, but full exploitation of that potential will require new strategies. In this white paper for the Snowmass community planning process, we discuss the unique requirements of machine learning for lattice quantum field theory research and outline what is needed to enable exploration and deployment of this approach in the future.

Submitted 10 February, 2022; originally announced February 2022.

Comments: 10 pages, contribution to Snowmass 2022

Report number: MIT-CTP/5405

arXiv:2110.10928 [pdf, other]

doi 10.1088/1742-6596/2207/1/012056

Quantum field theories, Markov random fields and machine learning

Authors: Dimitrios Bachtis, Gert Aarts, Biagio Lucini

Abstract: The transition to Euclidean space and the discretization of quantum field theories on spatial or space-time lattices opens up the opportunity to investigate probabilistic machine learning within quantum field theory. Here, we will discuss how discretized Euclidean field theories, such as the $φ^{4}$ lattice field theory on a square lattice, are mathematically equivalent to Markov fields, a notable class of probabilistic graphical models with applications in a variety of research areas, including machine learning. The results are established based on the Hammersley-Clifford theorem. We will then derive neural networks from quantum field theories and discuss applications pertinent to the minimization of the Kullback-Leibler divergence for the probability distribution of the $φ^{4}$ machine learning algorithms and other probability distributions.

Submitted 29 March, 2022; v1 submitted 21 October, 2021; originally announced October 2021.

arXiv:2109.10960 [pdf, other]

doi 10.1103/PhysRevE.105.024121

Quantitative analysis of phase transitions in two-dimensional XY models using persistent homology

Authors: Nicholas Sale, Jeffrey Giansiracusa, Biagio Lucini

Abstract: We use persistent homology and persistence images as an observable of three different variants of the two-dimensional XY model in order to identify and study their phase transitions. We examine models with the classical XY action, a topological lattice action, and an action with an additional nematic term. In particular, we introduce a new way of computing the persistent homology of lattice spin model configurations and, by considering the fluctuations in the output of logistic regression and k-nearest neighbours models trained on persistence images, we develop a methodology to extract estimates of the critical temperature and the critical exponent of the correlation length. We put particular emphasis on finite-size scaling behaviour and producing estimates with quantifiable error. For each model we successfully identify its phase transition(s) and are able to get an accurate determination of the critical temperatures and critical exponents of the correlation length.

Submitted 2 February, 2022; v1 submitted 22 September, 2021; originally announced September 2021.

Comments: 20 pages, 29 figures

arXiv:2109.07804 [pdf, other]

Detection Accuracy for Evaluating Compositional Explanations of Units

Authors: Sayo M. Makinwa, Biagio La Rosa, Roberto Capobianco

Abstract: The recent success of deep learning models in solving complex problems and in different domains has increased interest in understanding what they learn. Therefore, different approaches have been employed to explain these models, one of which uses human-understandable concepts as explanations. Two examples of methods that use this approach are Network Dissection and Compositional explanations. The former explains units using atomic concepts, while the latter makes explanations more expressive, replacing atomic concepts with logical forms. While intuitively, logical forms are more informative than atomic concepts, it is not clear how to quantify this improvement, and their evaluation is often based on the same metric that is optimized during the search-process and on the usage of hyper-parameters to be tuned. In this paper, we propose to use as evaluation metric the Detection Accuracy, which measures units' consistency of detection of their assigned explanations. We show that this metric (1) evaluates explanations of different lengths effectively, (2) can be used as a stopping criterion for the compositional explanation search, eliminating the explanation length hyper-parameter, and (3) exposes new specialized units whose length 1 explanations are the perceptual abstractions of their longer explanations.

Submitted 28 October, 2021; v1 submitted 16 September, 2021; originally announced September 2021.

Comments: To appear in AIxIA 2021 conference (10 pages, 7 figures)

arXiv:2109.07730 [pdf, other]

doi 10.22323/1.396.0201

Machine learning with quantum field theories

Authors: Dimitrios Bachtis, Gert Aarts, Biagio Lucini

Abstract: The precise equivalence between discretized Euclidean field theories and a certain class of probabilistic graphical models, namely the mathematical framework of Markov random fields, opens up the opportunity to investigate machine learning from the perspective of quantum field theory. In this contribution we will demonstrate, through the Hammersley-Clifford theorem, that the $φ^{4}$ scalar field theory on a square lattice satisfies the local Markov property and can therefore be recast as a Markov random field. We will then derive from the $φ^{4}$ theory machine learning algorithms and neural networks which can be viewed as generalizations of conventional neural network architectures. Finally, we will conclude by presenting applications based on the minimization of an asymmetric distance between the probability distribution of the $φ^{4}$ machine learning algorithms and target probability distributions.

Submitted 16 September, 2021; originally announced September 2021.

Comments: Presentation at the 38th International Symposium on Lattice Field Theory, 26th-30th July 2021, Massachusetts Institute of Technology, USA

arXiv:2107.14574 [pdf, other]

Surrogate Modelling for Injection Molding Processes using Machine Learning

Authors: Arsenii Uglov, Sergei Nikolaev, Sergei Belov, Daniil Padalitsa, Tatiana Greenkina, Marco San Biagio, Fabio Cacciatori

Abstract: Injection molding is one of the most popular manufacturing methods for the modeling of complex plastic objects. Faster numerical simulation of the technological process would allow for faster and cheaper design cycles of new products. In this work, we propose a baseline for a data processing pipeline that includes the extraction of data from Moldflow simulation projects and the prediction of the fill time and deflection distributions over 3-dimensional surfaces using machine learning models. We propose algorithms for engineering of features, including information of injector gates parameters that will mostly affect the time for plastic to reach the particular point of the form for fill time prediction, and geometrical features for deflection prediction. We propose and evaluate baseline machine learning models for fill time and deflection distribution prediction and provide baseline values of MSE and RMSE metrics. Finally, we measure the execution time of our solution and show that it significantly exceeds the time of simulation with Moldflow software: approximately 17 times and 14 times faster for mean and median total times respectively, comparing the times of all analysis stages for deflection prediction. Our solution has been implemented in a prototype web application that was approved by the management board of Fiat Chrysler Automobiles and Illogic SRL.
As one of the promising applications of this surrogate modelling approach, we envision the use of trained models as a fast objective function in the task of optimization of technological parameters of the injection molding process (meaning optimal placement of gates), which could significantly aid engineers in this task, or even automate it.

Submitted 30 July, 2021; originally announced July 2021.

arXiv:2106.01440 [pdf, other]

doi 10.1007/s10489-022-03886-6

Memory Wrap: a Data-Efficient and Interpretable Extension to Image Classification Models

Authors: Biagio La Rosa, Roberto Capobianco, Daniele Nardi

Abstract: Due to their black-box and data-hungry nature, deep learning techniques are not yet widely adopted for real-world applications in critical domains, like healthcare and justice. This paper presents Memory Wrap, a plug-and-play extension to any image classification model. Memory Wrap improves both data-efficiency and model interpretability, adopting a content-attention mechanism between the input and some memories of past training samples. We show that Memory Wrap outperforms standard classifiers when it learns from a limited set of data, and it reaches comparable performance when it learns from the full dataset. We discuss how its structure and content-attention mechanisms make predictions interpretable, compared to standard classifiers. To this end, we both show a method to build explanations by examples and counterfactuals, based on the memory content, and how to exploit them to get insights about its decision process. We test our approach on image classification tasks using several architectures on three different datasets, namely CIFAR10, SVHN, and CINIC10.

Submitted 4 June, 2021; v1 submitted 1 June, 2021; originally announced June 2021.

Comments: 22 pages

arXiv:2104.11746 [pdf, other]

VidTr: Video Transformer Without Convolutions

Authors: Yanyi Zhang, Xinyu Li, Chunhui Liu, Bing Shuai, Yi Zhu, Biagio Brattoli, Hao Chen, Ivan Marsic, Joseph Tighe

Abstract: We introduce Video Transformer (VidTr) with separable-attention for video classification. Comparing with commonly used 3D networks, VidTr is able to aggregate spatio-temporal information via stacked attentions and provide better performance with higher efficiency. We first introduce the vanilla video transformer and show that transformer module is able to perform spatio-temporal modeling from raw pixels, but with heavy memory usage. We then present VidTr which reduces the memory cost by 3.3$\times$ while keeping the same performance. To further optimize the model, we propose the standard deviation based topK pooling for attention ($pool_{topK\_std}$), which reduces the computation by dropping non-informative features along temporal dimension. VidTr achieves state-of-the-art performance on five commonly used datasets with lower computational requirement, showing both the efficiency and effectiveness of our design. Finally, error analysis and visualization show that VidTr is especially good at predicting actions that require long-term temporal reasoning.

Submitted 15 October, 2021; v1 submitted 23 April, 2021; originally announced April 2021.

Comments: ICCV 2021 Accepted

arXiv:2102.09449 [pdf, other]

doi 10.1103/PhysRevD.103.074510

Quantum field-theoretic machine learning

Authors: Dimitrios Bachtis, Gert Aarts, Biagio Lucini

Abstract: We derive machine learning algorithms from discretized Euclidean field theories, making inference and learning possible within dynamics described by quantum field theory. Specifically, we demonstrate that the $φ^{4}$ scalar field theory satisfies the Hammersley-Clifford theorem, therefore recasting it as a machine learning algorithm within the mathematically rigorous framework of Markov random fields. We illustrate the concepts by minimizing an asymmetric distance between the probability distribution of the $φ^{4}$ theory and that of target distributions, by quantifying the overlap of statistical ensembles between probability distributions and through reweighting to complex-valued actions with longer-range interactions. Neural network architectures are additionally derived from the $φ^{4}$ theory which can be viewed as generalizations of conventional neural networks and applications are presented. We conclude by discussing how the proposal opens up a new research avenue, that of developing a mathematical and computational framework of machine learning within quantum field theory.

Submitted 23 April, 2021; v1 submitted 18 February, 2021; originally announced February 2021.

Journal ref: Phys. Rev. D 103, 074510 (2021)

arXiv:2012.09237 [pdf, other]

Unsupervised Behaviour Analysis and Magnification (uBAM) using Deep Learning

Authors: Biagio Brattoli, Uta Buechler, Michael Dorkenwald, Philipp Reiser, Linard Filli, Fritjof Helmchen, Anna-Sophia Wahl, Bjoern Ommer

Abstract: Motor behaviour analysis is essential to biomedical research and clinical diagnostics as it provides a non-invasive strategy for identifying motor impairment and its change caused by interventions. State-of-the-art instrumented movement analysis is time- and cost-intensive, since it requires placing physical or virtual markers. Besides the effort required for marking keypoints or annotations necessary for training or finetuning a detector, users need to know the interesting behaviour beforehand to provide meaningful keypoints. We introduce unsupervised behaviour analysis and magnification (uBAM), an automatic deep learning algorithm for analysing behaviour by discovering and magnifying deviations. A central aspect is unsupervised learning of posture and behaviour representations to enable an objective comparison of movement. Besides discovering and quantifying deviations in behaviour, we also propose a generative model for visually magnifying subtle behaviour differences directly in a video without requiring a detour via keypoints or annotations. Essential for this magnification of deviations even across different individuals is a disentangling of appearance and behaviour. Evaluations on rodents and human patients with neurological diseases demonstrate the wide applicability of our approach. Moreover, combining optogenetic stimulation with our unsupervised behaviour analysis shows its suitability as a non-invasive diagnostic tool correlating function to brain plasticity.

Submitted 6 April, 2021; v1 submitted 16 December, 2020; originally announced December 2020.

Comments: Published in Nature Machine Intelligence (2021), https://rdcu.be/ch6pL

arXiv:2010.00054 [pdf, other]

doi 10.1103/PhysRevResearch.3.013134

Adding machine learning within Hamiltonians: Renormalization group transformations, symmetry breaking and restoration

Authors: Dimitrios Bachtis, Gert Aarts, Biagio Lucini

Abstract: We present a physical interpretation of machine learning functions, opening up the possibility to control properties of statistical systems via the inclusion of these functions in Hamiltonians. In particular, we include the predictive function of a neural network, designed for phase classification, as a conjugate variable coupled to an external field within the Hamiltonian of a system. Results in the two-dimensional Ising model evidence that the field can induce an order-disorder phase transition by breaking or restoring the symmetry, in contrast with the field of the conventional order parameter which causes explicit symmetry breaking. The critical behavior is then studied by proposing a Hamiltonian-agnostic reweighting approach and forming a renormalization group mapping on quantities derived from the neural network. Accurate estimates of the critical point and of the critical exponents related to the operators that govern the divergence of the correlation length are provided. We conclude by discussing how the method provides an essential step toward bridging machine learning and physics.

Submitted 10 February, 2021; v1 submitted 30 September, 2020; originally announced October 2020.

Journal ref: Phys. Rev. Research 3, 013134 (2021)

arXiv:2005.13076 [pdf, other]

Using PHAST to port Caffe library: First experiences and lessons learned

Authors: Eduardo José Gómez-Hernández, Pablo Antonio Martínez, Biagio Peccerillo, Sandro Bartolini, José Manuel García, Gregorio Bernabé

Abstract: Performance has always been a hot topic in computing. However, the viable ways to achieve it have taken many forms in the different moments of computing history. Today, technological limits have pushed the adoption of increasingly parallel multi-core and many-core architectures and even the use of highly specific hardware (aka Domain-Specific Architectures, or DSAs) to solve very specific problems. In this new context, one major problem is how to develop software once, and be able to run it on multiple accelerator architectures, seamlessly. Ideally aiming at a single programming model that can automatically target the code to different kinds of parallel architectures, allowing specific tuning with minimal, if any, changes to the source-code in order to seek performance portability. A comprehensive solution to this is still lacking. In this work, we present the use of the PHAST Library, which allows users to code once, at a high level of abstraction and thus with high productivity, and automatically targeting different parallel devices by changing the compilation process. As a case study, we have worked on the porting of the well-known deep-learning Caffe framework. The framework has been split into different parts and some of them have been ported, obtaining a working straightforward implementation that can be run on both CPUs and GPUs. We conclude discussing the lessons learned during the porting process, and analyzing the obtained performance in the perspective of completing the porting and expanding it to future consequent works.

Submitted 28 May, 2020; v1 submitted 26 May, 2020; originally announced May 2020.

Comments: Presented at the 13th International Workshop on Programmability and Architectures for Heterogeneous Multicores, 2020 (arXiv:2005.07619)

Report number: MULTIPROG/2020/3

arXiv:2004.14341 [pdf, other]

doi 10.1103/PhysRevE.102.033303

Extending machine learning classification capabilities with histogram reweighting

Authors: Dimitrios Bachtis, Gert Aarts, Biagio Lucini

Abstract: We propose the use of Monte Carlo histogram reweighting to extrapolate predictions of machine learning methods. In our approach, we treat the output from a convolutional neural network as an observable in a statistical system, enabling its extrapolation over continuous ranges in parameter space. We demonstrate our proposal using the phase transition in the two-dimensional Ising model. By interpreting the output of the neural network as an order parameter, we explore connections with known observables in the system and investigate its scaling behaviour. A finite size scaling analysis is conducted based on quantities derived from the neural network that yields accurate estimates for the critical exponents and the critical temperature. The method improves the prospects of acquiring precision measurements from machine learning in physical systems without an order parameter and those where direct sampling in regions of parameter space might not be possible.

Submitted 22 November, 2020; v1 submitted 29 April, 2020; originally announced April 2020.

Journal ref: Phys. Rev. E 102, 033303 (2020)

arXiv:2004.05582 [pdf, other]

doi 10.1109/TPAMI.2020.3009620

Sharing Matters for Generalization in Deep Metric Learning

Authors: Timo Milbich, Karsten Roth, Biagio Brattoli, Björn Ommer

Abstract: Learning the similarity between images constitutes the foundation for numerous vision tasks. The common paradigm is discriminative metric learning, which seeks an embedding that separates different training classes. However, the main challenge is to learn a metric that not only generalizes from training to novel, but related, test samples. It should also transfer to different object classes. So what complementary information is missed by the discriminative paradigm? Besides finding characteristics that separate between classes, we also need them to likely occur in novel categories, which is indicated if they are shared across training classes. This work investigates how to learn such characteristics without the need for extra annotations or training data. By formulating our approach as a novel triplet sampling strategy, it can be easily applied on top of recent ranking loss frameworks. Experiments show that, independent of the underlying network architecture and the specific ranking loss, our approach significantly improves performance in deep metric learning, leading to new state-of-the-art results on various standard benchmark datasets. Preliminary early access page can be found here: https://ieeexplore.ieee.org/document/9141449

Submitted 9 September, 2021; v1 submitted 12 April, 2020; originally announced April 2020.

Comments: IEEE Transactions on Pattern Analysis and Machine Intelligence

arXiv:2003.01455 [pdf, other]

Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications

Authors: Biagio Brattoli, Joseph Tighe, Fedor Zhdanov, Pietro Perona, Krzysztof Chalupka

Abstract: Trained on large datasets, deep learning (DL) can accurately classify videos into hundreds of diverse classes. However, video data is expensive to annotate. Zero-shot learning (ZSL) proposes one solution to this problem. ZSL trains a model once, and generalizes to new tasks whose classes are not present in the training dataset. We propose the first end-to-end algorithm for ZSL in video classification. Our training procedure builds on insights from recent video classification literature and uses a trainable 3D CNN to learn the visual features. This is in contrast to previous video ZSL methods, which use pretrained feature extractors. We also extend the current benchmarking paradigm: Previous techniques aim to make the test task unknown at training time but fall short of this goal. We encourage domain shift across training and test data and disallow tailoring a ZSL model to a specific test dataset. We outperform the state-of-the-art by a wide margin. Our code, evaluation procedure and model weights are available at github.com/bbrattoli/ZeroShotVideoClassification.

Submitted 20 June, 2020; v1 submitted 3 March, 2020; originally announced March 2020.

Comments: Accepted for publication at CVPR 2020

arXiv:1909.11574 [pdf, other]

MIC: Mining Interclass Characteristics for Improved Metric Learning

Authors: Karsten Roth, Biagio Brattoli, Björn Ommer

Abstract: Metric learning seeks to embed images of objects such that class-defined relations are captured by the embedding space. However, variability in images is not just due to different depicted object classes, but also depends on other latent characteristics such as viewpoint or illumination. In addition to these structured properties, random noise further obstructs the visual relations of interest. The common approach to metric learning is to enforce a representation that is invariant under all factors but the ones of interest. In contrast, we propose to explicitly learn the latent characteristics that are shared by and go across object classes. We can then directly explain away structured visual variability, rather than assuming it to be unknown random noise. We propose a novel surrogate task to learn visual characteristics shared across classes with a separate encoder. This encoder is trained jointly with the encoder for class information by reducing their mutual information. On five standard image retrieval benchmarks the approach significantly improves upon the state-of-the-art.

Submitted 25 September, 2019; originally announced September 2019.

Comments: 8 pages, 10 figures, accepted to ICCV 2019

arXiv:1903.06259 [pdf, other]

Conditional GANs For Painting Generation

Authors: Adeel Mufti, Biagio Antonelli, Julius Monello

Abstract: We examined the use of modern Generative Adversarial Nets to generate novel images of oil paintings using the Painter By Numbers dataset. We implemented Spectral Normalization GAN (SN-GAN) and Spectral Normalization GAN with Gradient Penalty, and compared their outputs to a Deep Convolutional GAN. Visually, and quantitatively according to the Sliced Wasserstein Distance metric, we determined that the SN-GAN produced paintings that were most comparable to our training dataset. We then performed a series of experiments to add supervised conditioning to SN-GAN, the culmination of which is what we believe to be a novel architecture that can generate face paintings with user-specified characteristics.

Submitted 6 March, 2019; originally announced March 2019.

arXiv:1811.03879 [pdf, ps, other]

Cross and Learn: Cross-Modal Self-Supervision

Authors: Nawid Sayed, Biagio Brattoli, Björn Ommer

Abstract: In this paper we present a self-supervised method for representation learning utilizing two different modalities. Based on the observation that cross-modal information has a high semantic meaning we propose a method to effectively exploit this signal. For our approach we utilize video data since it is available on a large scale and provides easily accessible modalities given by RGB and optical flow. We demonstrate state-of-the-art performance on highly contested action recognition datasets in the context of self-supervised learning. We show that our feature representation also transfers to other tasks and conduct extensive ablation studies to validate our core contributions. Code and model can be found at https://github.com/nawidsayed/Cross-and-Learn.

Submitted 29 April, 2019; v1 submitted 9 November, 2018; originally announced November 2018.

Comments: GCPR 2018

arXiv:1807.11293 [pdf, other]

Improving Spatiotemporal Self-Supervision by Deep Reinforcement Learning

Authors: Uta Büchler, Biagio Brattoli, Björn Ommer

Abstract: Self-supervised learning of convolutional neural networks can harness large amounts of cheap unlabeled data to train powerful feature representations. As surrogate task, we jointly address ordering of visual data in the spatial and temporal domain. The permutations of training samples, which are at the core of self-supervision by ordering, have so far been sampled randomly from a fixed preselected set. Based on deep reinforcement learning we propose a sampling policy that adapts to the state of the network, which is being trained. Therefore, new permutations are sampled according to their expected utility for updating the convolutional feature representation. Experimental evaluation on unsupervised and transfer learning tasks demonstrates competitive performance on standard benchmarks for image and video classification and nearest neighbor retrieval.

Submitted 30 July, 2018; originally announced July 2018.

Comments: Accepted for publication at ECCV 2018

arXiv:1609.09251 [pdf, other]

Kernel Methods on Approximate Infinite-Dimensional Covariance Operators for Image Classification

Authors: Hà Quang Minh, Marco San Biagio, Loris Bazzani, Vittorio Murino

Abstract: This paper presents a novel framework for visual object recognition using infinite-dimensional covariance operators of input features in the paradigm of kernel methods on infinite-dimensional Riemannian manifolds. Our formulation provides in particular a rich representation of image features by exploiting their non-linear correlations. Theoretically, we provide a finite-dimensional approximation of the Log-Hilbert-Schmidt (Log-HS) distance between covariance operators that is scalable to large datasets, while maintaining an effective discriminating capability. This allows us to efficiently approximate any continuous shift-invariant kernel defined using the Log-HS distance. At the same time, we prove that the Log-HS inner product between covariance operators is only approximable by its finite-dimensional counterpart in a very limited scenario. Consequently, kernels defined using the Log-HS inner product, such as polynomial kernels, are not scalable in the same way as shift-invariant kernels. Computationally, we apply the approximate Log-HS distance formulation to covariance operators of both handcrafted and convolutional features, exploiting both the expressiveness of these features and the power of the covariance representation. Empirically, we tested our framework on the task of image classification on twelve challenging datasets. In almost all cases, the results obtained outperform other state of the art methods, demonstrating the competitiveness and potential of our framework.

Submitted 29 September, 2016; originally announced September 2016.

Comments: 18 double-column pages

arXiv:1604.06582 [pdf, other]

Kernelized Covariance for Action Recognition

Authors: Jacopo Cavazza, Andrea Zunino, Marco San Biagio, Vittorio Murino

Abstract: In this paper we aim at increasing the descriptive power of the covariance matrix, limited in capturing linear mutual dependencies between variables only. We present a rigorous and principled mathematical pipeline to recover the kernel trick for computing the covariance matrix, enhancing it to model more complex, non-linear relationships conveyed by the raw data. To this end, we propose Kernelized-COV, which generalizes the original covariance representation without compromising the efficiency of the computation. In the experiments, we validate the proposed framework against many previous approaches in the literature, scoring on par or superior with respect to the state of the art on benchmark datasets for 3D action recognition.

Submitted 2 September, 2016; v1 submitted 22 April, 2016; originally announced April 2016.

Comments: Accepted paper at the 23rd International Conference on Pattern Recognition (ICPR), Cancun, Mexico, 2016

arXiv:1502.03845 [pdf, ps, other]

doi 10.1016/j.jda.2014.12.007

Adaptive Search over Sorted Sets

Authors: Biagio Bonasera, Emilio Ferrara, Giacomo Fiumara, Francesco Pagano, Alessandro Provetti

Abstract: We revisit the classical algorithms for searching over sorted sets to introduce an algorithm refinement, called Adaptive Search, that combines the good features of Interpolation search and those of Binary search. W.r.t. Interpolation search, only a constant number of extra comparisons is introduced. Yet, under diverse input data distributions our algorithm shows costs comparable to that of Interpolation search, i.e., O(log log n) while the worst-case cost is always in O(log n), as with Binary search. On benchmarks drawn from large datasets, both synthetic and real-life, Adaptive search scores better times and lesser memory accesses even than Santoro and Sidney's Interpolation-Binary search.

Submitted 12 February, 2015; originally announced February 2015.

Comments: 9 pages

ACM Class: F.2.2

Journal ref: Journal of Discrete Algorithms, Volume 30, 2015, pp. 128--133

arXiv:1401.3733 [pdf, other]

doi 10.1109/HPCSim.2016.7568421

BSMBench: a flexible and scalable supercomputer benchmark from computational particle physics

Authors: Ed Bennett, Luigi Del Debbio, Kirk Jordan, Biagio Lucini, Agostino Patella, Claudio Pica, Antonio Rago

Abstract: Lattice Quantum ChromoDynamics (QCD), and by extension its parent field, Lattice Gauge Theory (LGT), make up a significant fraction of supercomputing cycles worldwide. As such, it would be irresponsible not to evaluate machines' suitability for such applications. To this end, a benchmark has been developed to assess the performance of LGT applications on modern HPC platforms. Distinct from previous QCD-based benchmarks, this allows probing the behaviour of a variety of theories, which allows varying the ratio of demands between on-node computations and inter-node communications. The results of testing this benchmark on various recent HPC platforms are presented, and directions for future development are discussed.

Submitted 22 September, 2016; v1 submitted 15 January, 2014; originally announced January 2014.

Comments: 6 pages, 5 figures; version as presented at High Performance Computing and Simulation, HPCS 2016

Report number: CP3-Origins-2014-001 DNRF90 & DIAS-2014-1

Journal ref: 2016 International Conference on High Performance Computing & Simulation (HPCS), Innsbruck, Austria, 2016, pp. 834-839

Showing 1–37 of 37 results for author: Biagio