-
A Reproducible Analysis of Sequential Recommender Systems
Authors:
Filippo Betello,
Antonio Purificato,
Federico Siciliano,
Giovanni Trappolini,
Andrea Bacciu,
Nicola Tonellotto,
Fabrizio Silvestri
Abstract:
Sequential Recommender Systems (SRSs) have emerged as a highly efficient approach to recommendation systems. By leveraging sequential data, SRSs can identify temporal patterns in user behaviour, significantly improving recommendation accuracy and relevance. Ensuring the reproducibility of these models is paramount for advancing research and facilitating comparisons between them. Existing works exhibit shortcomings in reproducibility and replicability of results, leading to inconsistent statements across papers. Our work fills these gaps by standardising data pre-processing and model implementations, providing a comprehensive code resource, including a framework for developing SRSs and establishing a foundation for consistent and reproducible experimentation. We conduct extensive experiments on several benchmark datasets, comparing various SRSs implemented in our resource. We challenge prevailing performance benchmarks, offering new insights into the SR domain. For instance, SASRec does not consistently outperform GRU4Rec. On the contrary, when the number of model parameters becomes substantial, SASRec starts to clearly dominate all the other SRSs. This discrepancy underscores the significant impact that experimental configuration has on the outcomes and the importance of setting it up to ensure precise and comprehensive results. Failure to do so can lead to significantly flawed conclusions, highlighting the need for rigorous experimental design and analysis in SRS research. Our code is available at https://github.com/antoniopurificato/recsys_repro_conf.
Submitted 7 August, 2024;
originally announced August 2024.
-
A Tale of Trust and Accuracy: Base vs. Instruct LLMs in RAG Systems
Authors:
Florin Cuconasu,
Giovanni Trappolini,
Nicola Tonellotto,
Fabrizio Silvestri
Abstract:
Retrieval Augmented Generation (RAG) represents a significant advancement in artificial intelligence combining a retrieval phase with a generative phase, with the latter typically being powered by large language models (LLMs). The current common practices in RAG involve using "instructed" LLMs, which are fine-tuned with supervised training to enhance their ability to follow instructions and are aligned with human preferences using state-of-the-art techniques. Contrary to popular belief, our study demonstrates that base models outperform their instructed counterparts in RAG tasks by 20% on average under our experimental settings. This finding challenges the prevailing assumptions about the superiority of instructed LLMs in RAG applications. Further investigations reveal a more nuanced situation, questioning fundamental aspects of RAG and suggesting the need for broader discussions on the topic; or, as Fromm would have it, "Seldom is a glance at the statistics enough to understand the meaning of the figures".
Submitted 21 June, 2024;
originally announced June 2024.
-
Predicting Award Winning Research Papers at Publication Time
Authors:
Riccardo Vella,
Andrea Vitaletti,
Fabrizio Silvestri
Abstract:
In recent years, many studies have focused on predicting the scientific impact of research papers. Most of these predictions are based on citation counts or rely on features obtainable only from already published papers. In this study, we predict the likelihood of a research paper winning an award relying only on information available at publication time. For each paper, we build the citation subgraph induced from its bibliography. We initially consider some features of this subgraph, such as the density and the global clustering coefficient, to make our prediction. Then, we mix this information with textual features, extracted from the abstract and the title, to obtain a more accurate final prediction. We ran our experiments on the ArnetMiner citation graph, while the ground truth on award-winning papers was obtained from a collection of best paper awards from 32 computer science conferences. In our experiment, we obtained an encouraging F1 score of 0.694. Remarkably, the high recall and the low false negative rate show that the model performs very well at identifying papers that will not win an award. This behavior can help researchers get a first evaluation of their work at publication time. Lastly, we ran some initial experiments on interpretability. Our results highlight some interesting patterns in both topological and textual features.
Submitted 18 June, 2024;
originally announced June 2024.
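The two topological features named in the abstract above, density and the global clustering coefficient, can be illustrated on a toy graph. This is a minimal sketch, not the authors' pipeline: the graph is a hypothetical four-paper bibliography, citations are treated as undirected edges for simplicity, and the function names are illustrative.

```python
from itertools import combinations


def density(nodes, edges):
    """Density of an undirected graph: 2|E| / (|V| (|V| - 1))."""
    n = len(nodes)
    if n < 2:
        return 0.0
    return 2 * len(edges) / (n * (n - 1))


def global_clustering(nodes, edges):
    """Global clustering coefficient (transitivity):
    closed connected triples divided by all connected triples."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    triples = 0
    closed = 0
    for v in nodes:
        # Every pair of neighbours of v forms a connected triple centred on v.
        for a, b in combinations(sorted(adj[v]), 2):
            triples += 1
            if b in adj[a]:
                closed += 1
    return closed / triples if triples else 0.0


# Hypothetical bibliography subgraph: A, B, C cite each other; D is linked only to C.
nodes = ["A", "B", "C", "D"]
edges = {("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")}

print(density(nodes, edges))
print(global_clustering(nodes, edges))
```

A real citation graph is directed, so a faithful feature extractor would have to decide how to handle edge direction before computing either quantity.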
-
Graph Neural Re-Ranking via Corpus Graph
Authors:
Andrea Giuseppe Di Francesco,
Christian Giannetti,
Nicola Tonellotto,
Fabrizio Silvestri
Abstract:
Re-ranking systems aim to reorder an initial list of documents to better satisfy the information needs associated with a user-provided query. Modern re-rankers predominantly rely on neural network models, which have proven highly effective in representing samples from various modalities. However, these models typically evaluate query-document pairs in isolation, neglecting the underlying document distribution that could enhance the quality of the re-ranked list. To address this limitation, we propose Graph Neural Re-Ranking (GNRR), a pipeline based on Graph Neural Networks (GNNs) that enables each query to consider the document distribution during inference. Our approach models document relationships through corpus subgraphs and encodes their representations using GNNs. Through extensive experiments, we demonstrate that GNNs effectively capture cross-document interactions, improving performance on popular ranking metrics. In TREC-DL19, we observe a relative improvement of 5.8% in Average Precision compared to our baseline. These findings suggest that integrating the GNN segment offers significant advantages, especially in scenarios where understanding the broader context of documents is crucial.
Submitted 17 June, 2024;
originally announced June 2024.
-
Generating Query Recommendations via LLMs
Authors:
Andrea Bacciu,
Enrico Palumbo,
Andreas Damianou,
Nicola Tonellotto,
Fabrizio Silvestri
Abstract:
Query recommendation systems are ubiquitous in modern search engines, assisting users in producing effective queries to meet their information needs. However, these systems require a large amount of data to produce good recommendations, such as a large collection of documents to index and query logs. In particular, query logs and user data are not available in cold start scenarios. Query logs are expensive to collect and maintain and require complex and time-consuming cascading pipelines for creating, combining, and ranking recommendations. To address these issues, we frame the query recommendation problem as a generative task, proposing a novel approach called Generative Query Recommendation (GQR). GQR uses an LLM as its foundation and does not need to be trained or fine-tuned to tackle the query recommendation problem. We design a prompt that enables the LLM to understand the specific recommendation task, even using a single example. We then improved our system by proposing a version that exploits query logs called Retriever-Augmented GQR (RA-GQR). RA-GQR dynamically composes its prompt by retrieving similar queries from query logs. GQR approaches reuse a pre-existing neural architecture, resulting in a simpler and more ready-to-market approach, even in a cold start scenario. Our proposed GQR obtains state-of-the-art performance in terms of NDCG@10 and clarity score against two commercial search engines and the previous state-of-the-art approach on the Robust04 and ClueWeb09B collections, improving on average the NDCG@10 performance up to ~4% on Robust04 and ClueWeb09B w.r.t. the previous best competitor. RA-GQR further improves the NDCG@10, obtaining an increase of ~11% and ~6% on Robust04 and ClueWeb09B, respectively, w.r.t. the best competitor. Furthermore, our system obtained ~59% of user preferences in a blind user study, proving that our method produces the most engaging queries.
Submitted 4 June, 2024; v1 submitted 30 May, 2024;
originally announced May 2024.
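The abstract above reports results in NDCG@10. For readers unfamiliar with the metric, here is a minimal sketch of one common formulation. It assumes linear gain (some evaluations use 2^rel - 1 instead) and assumes the input list contains every judged item, so the ideal ranking is just the same list sorted by relevance. The paper's exact evaluation setup is not stated in the abstract.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k items (linear gain)."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """DCG normalised by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Hypothetical graded relevance labels for a ranked list of recommended queries.
ranked_relevances = [3, 2, 3, 0, 1, 2]
print(ndcg_at_k(ranked_relevances, k=10))
```

A perfectly ordered list scores 1.0, and moving a relevant item further down the list lowers the score because of the logarithmic position discount.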
-
Debiasing Machine Unlearning with Counterfactual Examples
Authors:
Ziheng Chen,
Jia Wang,
Jun Zhuang,
Abbavaram Gowtham Reddy,
Fabrizio Silvestri,
Jin Huang,
Kaushiki Nag,
Kun Kuang,
Xin Ning,
Gabriele Tolomei
Abstract:
The right to be forgotten (RTBF) seeks to safeguard individuals from the enduring effects of their historical actions by implementing machine-learning techniques. These techniques facilitate the deletion of previously acquired knowledge without requiring extensive model retraining. However, they often overlook a critical issue: bias in the unlearning process. This bias emerges from two main sources: (1) data-level bias, characterized by uneven data removal, and (2) algorithm-level bias, which leads to the contamination of the remaining dataset, thereby degrading model accuracy. In this work, we analyze the causal factors behind the unlearning process and mitigate biases at both data and algorithmic levels. Specifically, we introduce an intervention-based approach, where knowledge to forget is erased with a debiased dataset. In addition, we guide the forgetting procedure by leveraging counterfactual examples, as they maintain semantic data consistency without hurting performance on the remaining dataset. Experimental results demonstrate that our method outperforms existing machine unlearning baselines on evaluation metrics.
Submitted 24 April, 2024;
originally announced April 2024.
-
$\nabla τ$: Gradient-based and Task-Agnostic machine Unlearning
Authors:
Daniel Trippa,
Cesare Campagnano,
Maria Sofia Bucarelli,
Gabriele Tolomei,
Fabrizio Silvestri
Abstract:
Machine Unlearning, the process of selectively eliminating the influence of certain data examples used during a model's training, has gained significant attention as a means for practitioners to comply with recent data protection regulations. However, existing unlearning methods face critical drawbacks, including their prohibitively high cost, often associated with a large number of hyperparameters, and the limitation of forgetting only relatively small data portions. This often makes retraining the model from scratch a quicker and more effective solution. In this study, we introduce Gradient-based and Task-Agnostic machine Unlearning ($\nabla τ$), an optimization framework designed to remove the influence of a subset of training data efficiently. It applies adaptive gradient ascent to the data to be forgotten while using standard gradient descent for the remaining data. $\nabla τ$ offers multiple benefits over existing approaches. It enables the unlearning of large sections of the training dataset (up to 30%). It is versatile, supporting various unlearning tasks (such as subset forgetting or class removal) and applicable across different domains (images, text, etc.). Importantly, $\nabla τ$ requires no hyperparameter adjustments, making it a more appealing option than retraining the model from scratch. We evaluate our framework's effectiveness using a set of well-established Membership Inference Attack metrics, demonstrating up to 10% enhancements in performance compared to state-of-the-art methods without compromising the original model's accuracy.
Submitted 21 March, 2024;
originally announced March 2024.
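The core idea in the abstract above, gradient ascent on the data to be forgotten combined with gradient descent on the remaining data, can be sketched on a one-parameter regression model. This is only an illustration under stated assumptions: the fixed weight `alpha` stands in for the adaptive scheme, whose details the abstract does not give, and the model, data, and function names are all hypothetical.

```python
def sq_loss(w, data):
    """Mean squared error of the 1-D model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def grad_sq_loss(w, data):
    """Gradient of the mean squared error with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in data) / len(data)


def unlearn(w, retain, forget, alpha=0.1, lr=0.05, steps=100):
    """Descend on the retain loss while ascending on the forget loss.

    alpha is a fixed ascent weight (an assumption made for this sketch);
    it is kept small so the combined objective stays bounded below.
    """
    for _ in range(steps):
        step = grad_sq_loss(w, retain) - alpha * grad_sq_loss(w, forget)
        w -= lr * step
    return w


retain = [(1.0, 2.0), (2.0, 4.0)]  # points consistent with y = 2x
forget = [(1.0, -1.0)]             # point whose influence should be removed
w_before = 1.5                     # least-squares fit on retain + forget together
w_after = unlearn(w_before, retain, forget)

print(sq_loss(w_before, forget), sq_loss(w_after, forget))
print(sq_loss(w_before, retain), sq_loss(w_after, retain))
```

In this toy setting the loss on the forgotten point rises while the loss on the retained points falls, which is the qualitative behaviour the abstract describes.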
-
Variational Inference of Parameters in Opinion Dynamics Models
Authors:
Jacopo Lenti,
Fabrizio Silvestri,
Gianmarco De Francisci Morales
Abstract:
Despite the frequent use of agent-based models (ABMs) for studying social phenomena, parameter estimation remains a challenge, often relying on costly simulation-based heuristics. This work uses variational inference to estimate the parameters of an opinion dynamics ABM, by transforming the estimation problem into an optimization task that can be solved directly.
Our proposal relies on probabilistic generative ABMs (PGABMs): we start by synthesizing a probabilistic generative model from the ABM rules. Then, we transform the inference process into an optimization problem suitable for automatic differentiation. In particular, we use the Gumbel-Softmax reparameterization for categorical agent attributes and stochastic variational inference for parameter estimation. Furthermore, we explore the trade-offs of using variational distributions with different complexity: normal distributions and normalizing flows.
We validate our method on a bounded confidence model with agent roles (leaders and followers). Our approach estimates both macroscopic (bounded confidence intervals and backfire thresholds) and microscopic ($200$ categorical, agent-level roles) parameters more accurately than simulation-based and MCMC methods. Consequently, our technique enables experts to tune and validate their ABMs against real-world observations, thus providing insights into human behavior in social systems via data-driven analysis.
Submitted 8 March, 2024;
originally announced March 2024.
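The abstract above mentions the Gumbel-Softmax reparameterization for categorical agent attributes. Below is a minimal sketch of the forward sampling step in plain Python. The logits and temperature are hypothetical, and plain Python has no automatic differentiation, so this shows only how the relaxed sample is drawn, not how gradients flow through it.

```python
import math
import random


def gumbel_softmax(logits, temperature=0.5, rng=random):
    """Draw one relaxed categorical sample.

    Adds independent Gumbel(0, 1) noise to each logit, then applies a
    temperature-scaled softmax. Lower temperatures give samples closer
    to one-hot vectors.
    """
    # Clamp the uniform draw away from 0 so log() stays finite.
    gumbels = [-math.log(-math.log(max(rng.random(), 1e-12))) for _ in logits]
    scores = [(l + g) / temperature for l, g in zip(logits, gumbels)]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


random.seed(0)
# Hypothetical logits for one agent's role: follower versus leader.
sample = gumbel_softmax([0.0, 1.0])
print(sample)
```

Because the noise is drawn independently of the logits, the same computation is differentiable with respect to the logits when written in an autodiff framework.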
-
Personalized Audiobook Recommendations at Spotify Through Graph Neural Networks
Authors:
Marco De Nadai,
Francesco Fabbri,
Paul Gigioli,
Alice Wang,
Ang Li,
Fabrizio Silvestri,
Laura Kim,
Shawn Lin,
Vladan Radosavljevic,
Sandeep Ghael,
David Nyhan,
Hugues Bouchard,
Mounia Lalmas-Roelleke,
Andreas Damianou
Abstract:
In the ever-evolving digital audio landscape, Spotify, well-known for its music and talk content, has recently introduced audiobooks to its vast user base. While promising, this move presents significant challenges for personalized recommendations. Unlike music and podcasts, audiobooks, initially available for a fee, cannot be easily skimmed before purchase, posing higher stakes for the relevance of recommendations. Furthermore, introducing a new content type into an existing platform confronts extreme data sparsity, as most users are unfamiliar with this new content type. Lastly, recommending content to millions of users requires the model to react fast and be scalable. To address these challenges, we leverage podcast and music user preferences and introduce 2T-HGNN, a scalable recommendation system comprising Heterogeneous Graph Neural Networks (HGNNs) and a Two Tower (2T) model. This novel approach uncovers nuanced item relationships while ensuring low latency and complexity. We decouple users from the HGNN graph and propose an innovative multi-link neighbor sampler. These choices, together with the 2T component, significantly reduce the complexity of the HGNN model. Empirical evaluations involving millions of users show significant improvement in the quality of personalized recommendations, resulting in a +46% increase in new audiobooks start rate and a +23% boost in streaming rates. Intriguingly, our model's impact extends beyond audiobooks, benefiting established products like podcasts.
Submitted 8 March, 2024;
originally announced March 2024.
-
Link Prediction under Heterophily: A Physics-Inspired Graph Neural Network Approach
Authors:
Andrea Giuseppe Di Francesco,
Francesco Caso,
Maria Sofia Bucarelli,
Fabrizio Silvestri
Abstract:
In the past years, Graph Neural Networks (GNNs) have become the de facto standard in various deep learning domains, thanks to their flexibility in modeling real-world phenomena represented as graphs. However, the message-passing mechanism of GNNs faces challenges in learnability and expressivity, hindering high performance on heterophilic graphs, where adjacent nodes frequently have different labels. Most existing solutions addressing these challenges are primarily confined to specific benchmarks focused on node classification tasks. This narrow focus restricts the potential impact that link prediction under heterophily could offer in several applications, including recommender systems. For example, in social networks, two users may be connected for some latent reason, making it challenging to predict such connections in advance. Physics-Inspired GNNs such as GRAFF have contributed significantly to enhancing node classification performance under heterophily, thanks to the adoption of physics biases in the message-passing. Drawing inspiration from these findings, we advocate that the methodology employed by GRAFF can improve link prediction performance as well. To further explore this hypothesis, we introduce GRAFF-LP, an extension of GRAFF to link prediction. We evaluate its efficacy within a recent collection of heterophilic graphs, establishing a new benchmark for link prediction under heterophily. Our approach surpasses previous methods on most of the datasets, showcasing strong flexibility in different contexts and achieving relative AUROC improvements of up to 26.7%.
Submitted 22 February, 2024;
originally announced February 2024.
-
The Power of Noise: Redefining Retrieval for RAG Systems
Authors:
Florin Cuconasu,
Giovanni Trappolini,
Federico Siciliano,
Simone Filice,
Cesare Campagnano,
Yoelle Maarek,
Nicola Tonellotto,
Fabrizio Silvestri
Abstract:
Retrieval-Augmented Generation (RAG) has recently emerged as a method to extend beyond the pre-trained knowledge of Large Language Models by augmenting the original prompt with relevant passages or documents retrieved by an Information Retrieval (IR) system. RAG has become increasingly important for Generative AI solutions, especially in enterprise settings or in any domain in which knowledge is constantly refreshed and cannot be memorized in the LLM. We argue here that the retrieval component of RAG systems, be it dense or sparse, deserves increased attention from the research community, and accordingly, we conduct the first comprehensive and systematic examination of the retrieval strategy of RAG systems. We focus, in particular, on the type of passages IR systems within a RAG solution should retrieve. Our analysis considers multiple factors, such as the relevance of the passages included in the prompt context, their position, and their number. One counter-intuitive finding of this work is that the retriever's highest-scoring documents that are not directly relevant to the query (e.g., do not contain the answer) negatively impact the effectiveness of the LLM. Even more surprising, we discovered that adding random documents in the prompt improves the LLM accuracy by up to 35%. These results highlight the need to investigate the appropriate strategies when integrating retrieval with LLMs, thereby laying the groundwork for future research in this area.
Submitted 1 May, 2024; v1 submitted 26 January, 2024;
originally announced January 2024.
-
A topological description of loss surfaces based on Betti Numbers
Authors:
Maria Sofia Bucarelli,
Giuseppe Alessio D'Inverno,
Monica Bianchini,
Franco Scarselli,
Fabrizio Silvestri
Abstract:
In the context of deep learning models, attention has recently been paid to studying the surface of the loss function in order to better understand training with methods based on gradient descent. This search for an appropriate description, both analytical and topological, has led to numerous efforts to identify spurious minima and characterize gradient dynamics. Our work aims to contribute to this field by providing a topological measure to evaluate loss complexity in the case of multilayer neural networks. We compare deep and shallow architectures with common sigmoidal activation functions by deriving upper and lower bounds on the complexity of their loss function and revealing how that complexity is influenced by the number of hidden units, training models, and the activation function used. Additionally, we found that certain variations in the loss function or model architecture, such as adding an $\ell_2$ regularization term or implementing skip connections in a feedforward network, do not affect loss topology in specific cases.
Submitted 8 January, 2024;
originally announced January 2024.
-
A graph neural network-based model with Out-of-Distribution Robustness for enhancing Antiretroviral Therapy Outcome Prediction for HIV-1
Authors:
Giulia Di Teodoro,
Federico Siciliano,
Valerio Guarrasi,
Anne-Mieke Vandamme,
Valeria Ghisetti,
Anders Sönnerborg,
Maurizio Zazzi,
Fabrizio Silvestri,
Laura Palagi
Abstract:
Predicting the outcome of antiretroviral therapies for HIV-1 is a pressing clinical challenge, especially when the treatment regimen includes drugs for which limited effectiveness data is available. This scarcity of data can arise either due to the introduction of a new drug to the market or due to limited use in clinical settings. To tackle this issue, we introduce a novel joint fusion model, which combines features from a Fully Connected (FC) Neural Network and a Graph Neural Network (GNN). The FC network employs tabular data with a feature vector made up of viral mutations identified in the most recent genotypic resistance test, along with the drugs used in therapy. Conversely, the GNN leverages knowledge derived from Stanford drug-resistance mutation tables, which serve as benchmark references for deducing in-vivo treatment efficacy based on the viral genetic sequence, to build informative graphs. We evaluated these models' robustness against Out-of-Distribution drugs in the test set, with a specific focus on the GNN's role in handling such scenarios. Our comprehensive analysis demonstrates that the proposed model consistently outperforms the FC model, especially when considering Out-of-Distribution drugs. These results underscore the advantage of integrating Stanford scores in the model, thereby enhancing its generalizability and robustness, but also extending its utility in real-world applications with limited data availability. This research highlights the potential of our approach to inform antiretroviral therapy outcome prediction and contribute to more informed clinical decisions.
Submitted 29 December, 2023;
originally announced December 2023.
-
Adversarial Data Poisoning for Fake News Detection: How to Make a Model Misclassify a Target News without Modifying It
Authors:
Federico Siciliano,
Luca Maiano,
Lorenzo Papa,
Federica Baccini,
Irene Amerini,
Fabrizio Silvestri
Abstract:
Fake news detection models are critical to countering disinformation but can be manipulated through adversarial attacks. In this position paper, we analyze how an attacker can compromise the performance of an online learning detector on specific news content without being able to manipulate the original target news. In some contexts, such as social networks, where the attacker cannot exert complete control over all the information, this scenario can indeed be quite plausible. Therefore, we show how an attacker could potentially introduce poisoning data into the training data to manipulate the behavior of an online learning method. Our initial findings reveal varying susceptibility of logistic regression models based on complexity and attack type.
Submitted 4 January, 2024; v1 submitted 23 December, 2023;
originally announced December 2023.
-
Harmonizing Global Voices: Culturally-Aware Models for Enhanced Content Moderation
Authors:
Alex J. Chan,
José Luis Redondo García,
Fabrizio Silvestri,
Colm O'Donnel,
Konstantina Palla
Abstract:
Content moderation at scale faces the challenge of considering local cultural distinctions when assessing content. While global policies aim to maintain decision-making consistency and prevent arbitrary rule enforcement, they often overlook regional variations in interpreting natural language as expressed in content. In this study, we are looking into how moderation systems can tackle this issue by adapting to local comprehension nuances. We train large language models on extensive datasets of media news and articles to create culturally attuned models. The latter aim to capture the nuances of communication across geographies with the goal of recognizing cultural and societal variations in what is considered offensive content. We further explore the capability of these models to generate explanations for instances of content violation, aiming to shed light on how policy guidelines are perceived when cultural and societal contexts change. We find that training on extensive media datasets successfully induced cultural awareness and resulted in improvements in handling content violations on a regional basis. Additionally, these advancements include the ability to provide explanations that align with the specific local norms and nuances as evidenced by the annotators' preference in our conducted study. This multifaceted success reinforces the critical role of an adaptable content moderation approach in keeping pace with the ever-evolving nature of the content it oversees.
Submitted 4 December, 2023;
originally announced December 2023.
-
A Survey on Design Methodologies for Accelerating Deep Learning on Heterogeneous Architectures
Authors:
Fabrizio Ferrandi,
Serena Curzel,
Leandro Fiorin,
Daniele Ielmini,
Cristina Silvano,
Francesco Conti,
Alessio Burrello,
Francesco Barchi,
Luca Benini,
Luciano Lavagno,
Teodoro Urso,
Enrico Calore,
Sebastiano Fabio Schifano,
Cristian Zambelli,
Maurizio Palesi,
Giuseppe Ascia,
Enrico Russo,
Nicola Petra,
Davide De Caro,
Gennaro Di Meo,
Valeria Cardellini,
Salvatore Filippone,
Francesco Lo Presti,
Francesco Silvestri,
Paolo Palazzari, et al. (1 additional author not shown)
Abstract:
In recent years, the field of Deep Learning has seen many disruptive and impactful advancements. Given the increasing complexity of deep neural networks, the need for efficient hardware accelerators has become more and more pressing to design heterogeneous HPC platforms. The design of Deep Learning accelerators requires a multidisciplinary approach, combining expertise from several areas, spanning from computer architecture to approximate computing, computational models, and machine learning algorithms. Several methodologies and tools have been proposed to design accelerators for Deep Learning, including hardware-software co-design approaches, high-level synthesis methods, specific customized compilers, and methodologies for design space exploration, modeling, and simulation. These methodologies aim to maximize the exploitable parallelism and minimize data movement to achieve high performance and energy efficiency. This survey provides a holistic review of the most influential design methodologies and EDA tools proposed in recent years to implement Deep Learning accelerators, offering the reader a wide perspective in this rapidly evolving field. In particular, this work complements the previous survey proposed by the same authors in [203], which focuses on Deep Learning hardware accelerators for heterogeneous HPC platforms.
Submitted 29 November, 2023;
originally announced November 2023.
-
Evading Community Detection via Counterfactual Neighborhood Search
Authors:
Andrea Bernini,
Fabrizio Silvestri,
Gabriele Tolomei
Abstract:
Community detection techniques are useful for social media platforms to discover tightly connected groups of users who share common interests. However, this functionality often comes at the expense of potentially exposing individuals to privacy breaches by inadvertently revealing their tastes or preferences. Therefore, some users may wish to preserve their anonymity and opt out of community detection for various reasons, such as affiliation with political or religious organizations, without leaving the platform. In this study, we address the challenge of community membership hiding, which involves strategically altering the structural properties of a network graph to prevent one or more nodes from being identified by a given community detection algorithm. We tackle this problem by formulating it as a constrained counterfactual graph objective, and we solve it via deep reinforcement learning. Extensive experiments demonstrate that our method outperforms existing baselines, striking the best balance between accuracy and cost.
Submitted 7 June, 2024; v1 submitted 13 October, 2023;
originally announced October 2023.
-
Prompt-to-OS (P2OS): Revolutionizing Operating Systems and Human-Computer Interaction with Integrated AI Generative Models
Authors:
Gabriele Tolomei,
Cesare Campagnano,
Fabrizio Silvestri,
Giovanni Trappolini
Abstract:
In this paper, we present a groundbreaking paradigm for human-computer interaction that revolutionizes the traditional notion of an operating system.
Within this innovative framework, user requests issued to the machine are handled by an interconnected ecosystem of generative AI models that seamlessly integrate with or even replace traditional software applications. At the core of this paradigm shift are large generative models, such as language and diffusion models, which serve as the central interface between users and computers. This pioneering approach leverages the abilities of advanced language models, empowering users to engage in natural language conversations with their computing devices. Users can articulate their intentions, tasks, and inquiries directly to the system, eliminating the need for explicit commands or complex navigation. The language model comprehends and interprets the user's prompts, generating and displaying contextual and meaningful responses that facilitate seamless and intuitive interactions.
This paradigm shift not only streamlines user interactions but also opens up new possibilities for personalized experiences. Generative models can adapt to individual preferences, learning from user input and continuously improving their understanding and response generation. Furthermore, it enables enhanced accessibility, as users can interact with the system using speech or text, accommodating diverse communication preferences.
However, this visionary concept raises significant challenges, including privacy, security, trustability, and the ethical use of generative models. Robust safeguards must be in place to protect user data and prevent potential misuse or manipulation of the language model.
While the full realization of this paradigm is still far from being achieved, this paper serves as a starting point for envisioning this transformative potential.
Submitted 7 October, 2023;
originally announced October 2023.
-
Sheaf Hypergraph Networks
Authors:
Iulia Duta,
Giulia Cassarà,
Fabrizio Silvestri,
Pietro Liò
Abstract:
Higher-order relations are widespread in nature, with numerous phenomena involving complex interactions that extend beyond simple pairwise connections. As a result, advancements in higher-order processing can accelerate the growth of various fields requiring structured data. Current approaches typically represent these interactions using hypergraphs. We enhance this representation by introducing cellular sheaves for hypergraphs, a mathematical construction that adds extra structure to the conventional hypergraph while maintaining its local, higher-order connectivity. Drawing inspiration from existing Laplacians in the literature, we develop two unique formulations of sheaf hypergraph Laplacians: linear and non-linear. Our theoretical analysis demonstrates that incorporating sheaves into the hypergraph Laplacian provides a more expressive inductive bias than standard hypergraph diffusion, creating a powerful instrument for effectively modelling complex data structures. We employ these sheaf hypergraph Laplacians to design two categories of models: Sheaf Hypergraph Neural Networks and Sheaf Hypergraph Convolutional Networks. These models generalize classical Hypergraph Networks often found in the literature. Through extensive experimentation, we show that this generalization significantly improves performance, achieving top results on multiple benchmark datasets for hypergraph node classification.
Submitted 29 September, 2023;
originally announced September 2023.
-
Investigating the Robustness of Sequential Recommender Systems Against Training Data Perturbations
Authors:
Filippo Betello,
Federico Siciliano,
Pushkar Mishra,
Fabrizio Silvestri
Abstract:
Sequential Recommender Systems (SRSs) are widely employed to model user behavior over time. However, their robustness in the face of perturbations in training data remains a largely understudied yet critical issue. A fundamental challenge emerges in previous studies aimed at assessing the robustness of SRSs: the Rank-Biased Overlap (RBO) similarity is not particularly suited for this task as it is designed for infinite rankings of items and thus shows limitations in real-world scenarios. For instance, it fails to achieve a perfect score of 1 for two identical finite-length rankings. To address this challenge, we introduce a novel contribution: Finite Rank-Biased Overlap (FRBO), an enhanced similarity tailored explicitly for finite rankings. This innovation facilitates a more intuitive evaluation in practical settings. In pursuit of our goal, we empirically investigate the impact of removing items at different positions within a temporally ordered sequence. We evaluate two distinct SRS models across multiple datasets, measuring their performance using metrics such as Normalized Discounted Cumulative Gain (NDCG) and Rank List Sensitivity. Our results demonstrate that removing items at the end of the sequence has a statistically significant impact on performance, with NDCG decreasing by up to 60%. Conversely, removing items from the beginning or middle has no significant effect. These findings underscore the criticality of the position of perturbed items in the training data. As we spotlight the vulnerabilities inherent in current SRSs, we fervently advocate for intensified research efforts to fortify their robustness against adversarial perturbations.
Submitted 27 December, 2023; v1 submitted 24 July, 2023;
originally announced July 2023.
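The RBO limitation mentioned in the abstract above can be checked numerically. The sketch below is only an illustration: it uses the standard RBO weighting with persistence parameter p, truncated to the observed depth k, and shows that two identical rankings of length k score 1 - p^k rather than 1. The function `normalized_rbo`, which divides by 1 - p^k, is a hypothetical normalization introduced here for comparison; it is not claimed to be the paper's FRBO definition. Rankings are assumed to contain no duplicate items.

```python
def truncated_rbo(ranking_a, ranking_b, p=0.9):
    """RBO weights summed only over the observed depth k (no extrapolation)."""
    k = min(len(ranking_a), len(ranking_b))
    seen_a, seen_b = set(), set()
    score = 0.0
    for depth in range(1, k + 1):
        seen_a.add(ranking_a[depth - 1])
        seen_b.add(ranking_b[depth - 1])
        # Agreement at this depth: fraction of items shared by both prefixes.
        agreement = len(seen_a & seen_b) / depth
        score += p ** (depth - 1) * agreement
    return (1 - p) * score


def normalized_rbo(ranking_a, ranking_b, p=0.9):
    """Hypothetical fix (an assumption): rescale by the largest value
    reachable at depth k, so identical finite rankings score exactly 1."""
    k = min(len(ranking_a), len(ranking_b))
    if k == 0:
        return 0.0
    return truncated_rbo(ranking_a, ranking_b, p) / (1 - p ** k)


items = list(range(10))
identical_score = truncated_rbo(items, items)       # equals 1 - 0.9**10
rescaled_score = normalized_rbo(items, items)       # equals 1 up to rounding
disjoint_score = truncated_rbo(items, list(range(10, 20)))  # no overlap
print(identical_score, rescaled_score, disjoint_score)
```

With p = 0.9 and k = 10, the identical-ranking score is about 0.65, which is why a similarity designed for infinite rankings is awkward to interpret on short recommendation lists.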
-
RRAML: Reinforced Retrieval Augmented Machine Learning
Authors:
Andrea Bacciu,
Florin Cuconasu,
Federico Siciliano,
Fabrizio Silvestri,
Nicola Tonellotto,
Giovanni Trappolini
Abstract:
The emergence of large language models (LLMs) has revolutionized machine learning and related fields, showcasing remarkable abilities in comprehending, generating, and manipulating human language. However, their conventional usage through API-based text prompt submissions imposes certain limitations in terms of context constraints and external source availability. To address these challenges, we propose a novel framework called Reinforced Retrieval Augmented Machine Learning (RRAML). RRAML integrates the reasoning capabilities of LLMs with supporting information retrieved by a purpose-built retriever from a vast user-provided database. By leveraging recent advancements in reinforcement learning, our method effectively addresses several critical challenges. Firstly, it circumvents the need for accessing LLM gradients. Secondly, our method alleviates the burden of retraining LLMs for specific tasks, as it is often impractical or impossible due to restricted access to the model and the computational intensity involved. Additionally, we seamlessly link the retriever's task with the reasoner, mitigating hallucinations and reducing irrelevant and potentially damaging retrieved documents. We believe that the research agenda outlined in this paper has the potential to profoundly impact the field of AI, democratizing access to and utilization of LLMs for a wide range of entities.
Submitted 27 July, 2023; v1 submitted 24 July, 2023;
originally announced July 2023.
-
Fauno: The Italian Large Language Model that will leave you senza parole!
Authors:
Andrea Bacciu,
Giovanni Trappolini,
Andrea Santilli,
Emanuele Rodolà,
Fabrizio Silvestri
Abstract:
This paper presents Fauno, the first and largest open-source Italian conversational Large Language Model (LLM). Our goal with Fauno is to democratize the study of LLMs in Italian, demonstrating that obtaining a fine-tuned conversational bot with a single GPU is possible. In addition, we release a collection of datasets for conversational AI in Italian. The datasets on which we fine-tuned Fauno include various topics such as general question answering, computer science, and medical questions. We release our code and datasets at https://github.com/RSTLess-research/Fauno-Italian-LLM
Submitted 26 June, 2023;
originally announced June 2023.
-
Renormalized Graph Neural Networks
Authors:
Francesco Caso,
Giovanni Trappolini,
Andrea Bacciu,
Pietro Liò,
Fabrizio Silvestri
Abstract:
Graph Neural Networks (GNNs) have become essential for studying complex data, particularly when represented as graphs. Their value is underpinned by their ability to reflect the intricacies of numerous areas, ranging from social to biological networks. GNNs can grapple with non-linear behaviors, emerging patterns, and complex connections; these are also typical characteristics of complex systems. The renormalization group (RG) theory has emerged as the language for studying complex systems. It is recognized as the preferred lens through which to study complex systems, offering a framework that can untangle their intricate dynamics. Despite the clear benefits of integrating RG theory with GNNs, no existing methods have ventured into this promising territory. This paper proposes a new approach that applies RG theory to devise a novel graph rewiring to improve GNNs' performance on graph-related tasks. We support our proposal with extensive experiments on standard benchmarks and baselines. The results demonstrate the effectiveness of our method and its potential to remedy the current limitations of GNNs. Finally, this paper marks the beginning of a new research direction. This path combines the theoretical foundations of RG, the magnifying glass of complex systems, with the structural capabilities of GNNs. By doing so, we aim to enhance the potential of GNNs in modeling and unraveling the complexities inherent in diverse systems.
Submitted 1 June, 2023;
originally announced June 2023.
-
Integrating Item Relevance in Training Loss for Sequential Recommender Systems
Authors:
Andrea Bacciu,
Federico Siciliano,
Nicola Tonellotto,
Fabrizio Silvestri
Abstract:
Sequential Recommender Systems (SRSs) are a popular type of recommender system that learns from a user's history to predict the next item they are likely to interact with. However, user interactions can be affected by noise stemming from account sharing, inconsistent preferences, or accidental clicks. To address this issue, we (i) propose a new evaluation protocol that takes multiple future items into account and (ii) introduce a novel relevance-aware loss function to train an SRS with multiple future items to make it more robust to noise. Our relevance-aware models obtain an improvement of ~1.2% of NDCG@10 and 0.88% in the traditional evaluation protocol, while in the new evaluation protocol, the improvement is ~1.63% of NDCG@10 and ~1.5% of HR w.r.t. the best-performing models.
Submitted 10 June, 2023; v1 submitted 18 May, 2023;
originally announced May 2023.
-
Multimodal Neural Databases
Authors:
Giovanni Trappolini,
Andrea Santilli,
Emanuele Rodolà,
Alon Halevy,
Fabrizio Silvestri
Abstract:
The rise in loosely-structured data available through text, images, and other modalities has called for new ways of querying them. Multimedia Information Retrieval has filled this gap and has witnessed exciting progress in recent years. Tasks such as search and retrieval of extensive multimedia archives have undergone massive performance improvements, driven to a large extent by recent developments in multimodal deep learning. However, methods in this field remain limited in the kinds of queries they support and, in particular, their inability to answer database-like queries. For this reason, inspired by recent work on neural databases, we propose a new framework, which we name Multimodal Neural Databases (MMNDBs). MMNDBs can answer complex database-like queries that involve reasoning over different input modalities, such as text and images, at scale. In this paper, we present the first architecture able to fulfill this set of requirements and test it with several baselines, showing the limitations of currently available models. The results show the potential of these new techniques to process unstructured data coming from different modalities, paving the way for future research in the area. Code to replicate the experiments will be released at https://github.com/GiovanniTRA/MultimodalNeuralDatabases
Submitted 2 May, 2023;
originally announced May 2023.
-
The Dark Side of Explanations: Poisoning Recommender Systems with Counterfactual Examples
Authors:
Ziheng Chen,
Fabrizio Silvestri,
Jia Wang,
Yongfeng Zhang,
Gabriele Tolomei
Abstract:
Deep learning-based recommender systems have become an integral part of several online platforms. However, their black-box nature emphasizes the need for explainable artificial intelligence (XAI) approaches to provide human-understandable reasons why a specific item gets recommended to a given user. One such method is counterfactual explanation (CF). While CFs can be highly beneficial for users and system designers, malicious actors may also exploit these explanations to undermine the system's security. In this work, we propose H-CARS, a novel strategy to poison recommender systems via CFs. Specifically, we first train a logical-reasoning-based surrogate model on training data derived from counterfactual explanations. By reversing the learning process of the recommendation model, we thus develop a proficient greedy algorithm to generate fabricated user profiles and their associated interaction records for the aforementioned surrogate model. Our experiments, which employ a well-known CF generation method and are conducted on two distinct datasets, show that H-CARS yields significant and successful attack performance.
Submitted 30 April, 2023;
originally announced May 2023.
-
Sheaf4Rec: Sheaf Neural Networks for Graph-based Recommender Systems
Authors:
Antonio Purificato,
Giulia Cassarà,
Federico Siciliano,
Pietro Liò,
Fabrizio Silvestri
Abstract:
Recent advancements in Graph Neural Networks (GNNs) have facilitated their widespread adoption in various applications, including recommendation systems. GNNs have proven to be effective in addressing the challenges posed by recommendation systems by efficiently modeling graphs in which nodes represent users or items and edges denote preference relationships. However, current GNN techniques represent nodes by means of a single static vector, which may inadequately capture the intricate complexities of users and items. To overcome these limitations, we propose a solution integrating a cutting-edge model inspired by category theory: Sheaf4Rec. Unlike single vector representations, Sheaf Neural Networks and their corresponding Laplacians represent each node (and edge) using a vector space. Our approach takes advantage of this theory and results in a more comprehensive representation that can be effectively exploited during inference, providing a versatile method applicable to a wide range of graph-related tasks and demonstrating unparalleled performance. Our proposed model exhibits a noteworthy relative improvement of up to 8.53% on F1-Score@10 and an impressive increase of up to 11.29% on NDCG@10, outperforming existing state-of-the-art models such as Neural Graph Collaborative Filtering (NGCF), KGTORe and other recently developed GNN-based models. In addition to its superior predictive capabilities, Sheaf4Rec shows remarkable improvements in terms of efficiency: we observe substantial runtime improvements ranging from 2.5% up to 37% when compared to other GNN-based competitor models, indicating a more efficient way of handling information while achieving better performance. Code is available at https://github.com/antoniopurificato/Sheaf4Rec.
Submitted 16 March, 2024; v1 submitted 7 April, 2023;
originally announced April 2023.
-
Learning with Noisy Labels through Learnable Weighting and Centroid Similarity
Authors:
Farooq Ahmad Wani,
Maria Sofia Bucarelli,
Fabrizio Silvestri
Abstract:
We introduce a novel method for training machine learning models in the presence of noisy labels, which are prevalent in domains such as medical diagnosis and autonomous driving and have the potential to degrade a model's generalization performance. Inspired by established literature that highlights how deep learning models are prone to overfitting to noisy samples in the later epochs of training, we propose a strategic approach. This strategy leverages the distance to class centroids in the latent space and incorporates a discounting mechanism, aiming to diminish the influence of samples that lie distant from all class centroids. By doing so, we effectively counteract the adverse effects of noisy labels. The foundational premise of our approach is the assumption that samples situated further from their respective class centroid in the initial stages of training are more likely to be associated with noise. Our methodology is grounded in robust theoretical principles and has been validated empirically through extensive experiments on several benchmark datasets. Our results show that our method consistently outperforms the existing state-of-the-art techniques, achieving significant improvements in classification accuracy in the presence of noisy labels. The code for our proposed loss function and supplementary materials is available at https://github.com/wanifarooq/NCOD
Submitted 25 June, 2024; v1 submitted 16 March, 2023;
originally announced March 2023.
-
Attention-likelihood relationship in transformers
Authors:
Valeria Ruscio,
Valentino Maiorca,
Fabrizio Silvestri
Abstract:
We analyze how large language models (LLMs) represent out-of-context words, investigating their reliance on the given context to capture their semantics. Our likelihood-guided text perturbations reveal a correlation between token likelihood and attention values in transformer-based language models. Extensive experiments reveal that unexpected tokens cause the model to attend less to the information coming from themselves to compute their representations, particularly at higher layers. These findings have valuable implications for assessing the robustness of LLMs in real-world scenarios. Fully reproducible codebase at https://github.com/Flegyas/AttentionLikelihood.
Submitted 14 March, 2023;
originally announced March 2023.
-
Dimensionality reduction on complex vector spaces for Euclidean distance with dynamic weights
Authors:
Paolo Pellizzoni,
Simone Moretti,
Francesco Silvestri
Abstract:
The weighted Euclidean norm $\|x\|_w$ of a vector $x\in \mathbb{R}^d$ with weights $w\in \mathbb{R}^d$ is the Euclidean norm where the contribution of each dimension is scaled by a given weight. Approaches to dimensionality reduction that satisfy the Johnson-Lindenstrauss (JL) lemma can be easily adapted to the weighted Euclidean distance if weights are fixed: it suffices to scale each dimension of the input vectors according to the weights, and then apply any standard approach. However, this is not the case when weights are unknown during the dimensionality reduction or might dynamically change. We address this issue by providing an approach that maps vectors into a smaller complex vector space, but still allows a JL-like property for the weighted Euclidean distance to be satisfied when weights are revealed.
Specifically, let $Δ\geq 1, ε\in (0,1)$ be arbitrary values, and let $S\subset \mathbb{R}^d$ be a set of $n$ vectors. We provide a weight-oblivious linear map $g:\mathbb{R}^d \rightarrow \mathbb{C}^k$, with $k=Θ(ε^{-2}Δ^4 \ln{n})$, to reduce vectors in $S$, and an estimator $ρ: \mathbb{C}^k \times \mathbb{R}^d \rightarrow \mathbb R$ with the following property. For any $x\in S$, the value $ρ(g(x), w)$ is an unbiased estimate of $\|x\|^2_w$, and $ρ$ is computed from the reduced vector $g(x)$ and the weights $w$. Moreover, the error of the estimate $ρ(g(x), w)$ depends on the norm distortion due to weights and parameter $Δ$: for any $x\in S$, the estimate has a multiplicative error $ε$ if $\|x\|_2\|w\|_2/\|x\|_w\leq Δ$, otherwise the estimate has an additive $ε\|x\|^2_2\|w\|^2_2/Δ^2$ error.
Finally, we consider the estimation of weighted Euclidean norms in streaming settings: we show how to estimate the weighted norm when the weights are provided either after or concurrently with the input vector.
Submitted 14 December, 2023; v1 submitted 13 December, 2022;
originally announced December 2022.
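The fixed-weights case described in the abstract above (scale each dimension by its weight, then apply any standard JL approach) can be sketched as follows. This is a minimal illustration under stated assumptions: the weighted norm is taken to be $\sqrt{\sum_i (w_i x_i)^2}$, which is one reading of "each dimension is scaled by a given weight", and the reduction is a plain real-valued Gaussian random projection. It is not the paper's weight-oblivious complex map $g$ or its estimator $ρ$; all function names and the sample vectors are hypothetical.

```python
import math
import random


def weighted_norm(x, w):
    """Weighted Euclidean norm, assuming the convention sqrt(sum_i (w_i*x_i)^2)."""
    return math.sqrt(sum((wi * xi) ** 2 for xi, wi in zip(x, w)))


def scale_by_weights(x, w):
    """Fold known, fixed weights into the vector before reducing it."""
    return [wi * xi for xi, wi in zip(x, w)]


def gaussian_projection(d, k, rng):
    """k x d matrix with i.i.d. N(0, 1/k) entries, a standard JL construction."""
    sigma = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, sigma) for _ in range(d)] for _ in range(k)]


def project(matrix, x):
    """Matrix-vector product: maps a d-dimensional x to k dimensions."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in matrix]


rng = random.Random(0)  # fixed seed so the run is repeatable
x = [1.0, -2.0, 0.5, 3.0, 0.0, 1.5, -1.0, 2.0]
w = [0.5, 1.0, 2.0, 0.1, 3.0, 1.0, 0.2, 0.7]

exact = weighted_norm(x, w)
scaled = scale_by_weights(x, w)  # requires w to be known at reduction time
reduced = project(gaussian_projection(len(x), 400, rng), scaled)
estimate = math.sqrt(sum(v * v for v in reduced))
print(exact, estimate)
```

The weights must be available before the projection is applied, so a later change to `w` would force every vector to be reduced again. That dependence is the limitation the abstract sets out to remove.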
-
Sparse Vicious Attacks on Graph Neural Networks
Authors:
Giovanni Trappolini,
Valentino Maiorca,
Silvio Severino,
Emanuele Rodolà,
Fabrizio Silvestri,
Gabriele Tolomei
Abstract:
Graph Neural Networks (GNNs) have proven to be successful in several predictive modeling tasks for graph-structured data.
Amongst those tasks, link prediction is one of the fundamental problems for many real-world applications, such as recommender systems.
However, GNNs are not immune to adversarial attacks, i.e., carefully crafted malicious examples that are designed to fool the predictive model.
In this work, we focus on a specific white-box attack on GNN-based link prediction models, where a malicious node aims to appear in the list of recommended nodes for a given target victim.
To achieve this goal, the attacker node may also count on the cooperation of other existing peers that it directly controls, namely on the ability to inject a number of "vicious" nodes into the network.
Specifically, all these malicious nodes can add new edges or remove existing ones, thereby perturbing the original graph.
Thus, we propose SAVAGE, a novel framework and a method to mount this type of link prediction attack.
SAVAGE formulates the adversary's goal as an optimization task, striking the balance between the effectiveness of the attack and the sparsity of malicious resources required.
Extensive experiments conducted on real-world and synthetic datasets demonstrate that adversarial attacks implemented through SAVAGE indeed achieve a high attack success rate while using a small number of vicious nodes.
Finally, although those attacks require full knowledge of the target model, we show that they are successfully transferable to other black-box methods for link prediction.
Submitted 20 September, 2022;
originally announced September 2022.
-
GREASE: Generate Factual and Counterfactual Explanations for GNN-based Recommendations
Authors:
Ziheng Chen,
Fabrizio Silvestri,
Jia Wang,
Yongfeng Zhang,
Zhenhua Huang,
Hongshik Ahn,
Gabriele Tolomei
Abstract:
Recently, graph neural networks (GNNs) have been widely used to develop successful recommender systems. Although powerful, it is very difficult for a GNN-based recommender system to attach tangible explanations of why a specific item ends up in the list of suggestions for a given user. Indeed, explaining GNN-based recommendations is unique, and existing GNN explanation methods are inappropriate for two reasons. First, traditional GNN explanation methods are designed for node, edge, or graph classification tasks rather than ranking, as in recommender systems. Second, standard machine learning explanations are usually intended to support skilled decision-makers. Instead, recommendations are designed for any end-user, and thus their explanations should be provided in user-understandable ways. In this work, we propose GREASE, a novel method for explaining the suggestions provided by any black-box GNN-based recommender system. Specifically, GREASE first trains a surrogate model on a target user-item pair and its $l$-hop neighborhood. Then, it generates both factual and counterfactual explanations by finding optimal adjacency matrix perturbations to capture the sufficient and necessary conditions for an item to be recommended, respectively. Experimental results conducted on real-world datasets demonstrate that GREASE can generate concise and effective explanations for popular GNN-based recommender models.
Submitted 4 August, 2022;
originally announced August 2022.
-
Encoding Concepts in Graph Neural Networks
Authors:
Lucie Charlotte Magister,
Pietro Barbiero,
Dmitry Kazhdan,
Federico Siciliano,
Gabriele Ciravegna,
Fabrizio Silvestri,
Mateja Jamnik,
Pietro Lio
Abstract:
The opaque reasoning of Graph Neural Networks induces a lack of human trust. Existing graph network explainers attempt to address this issue by providing post-hoc explanations; however, they fail to make the model itself more interpretable. To fill this gap, we introduce the Concept Encoder Module, the first differentiable concept-discovery approach for graph networks. The proposed approach makes graph networks explainable by design by first discovering graph concepts and then using these to solve the task. Our results demonstrate that this approach allows graph networks to: (i) attain model accuracy comparable with their equivalent vanilla versions, (ii) discover meaningful concepts that achieve high concept completeness and purity scores, (iii) provide high-quality concept-based logic explanations for their prediction, and (iv) support effective interventions at test time: these can increase human trust as well as significantly improve model performance.
Submitted 7 August, 2022; v1 submitted 27 July, 2022;
originally announced July 2022.
-
Detecting and Understanding Harmful Memes: A Survey
Authors:
Shivam Sharma,
Firoj Alam,
Md. Shad Akhtar,
Dimitar Dimitrov,
Giovanni Da San Martino,
Hamed Firooz,
Alon Halevy,
Fabrizio Silvestri,
Preslav Nakov,
Tanmoy Chakraborty
Abstract:
The automatic identification of harmful content online is of major concern for social media platforms, policymakers, and society. Researchers have studied textual, visual, and audio content, but typically in isolation. Yet, harmful content often combines multiple modalities, as in the case of memes, which are of particular interest due to their viral nature. With this in mind, here we offer a comprehensive survey with a focus on harmful memes. Based on a systematic analysis of recent literature, we first propose a new typology of harmful memes, and then we highlight and summarize the relevant state of the art. One interesting finding is that many types of harmful memes are not really studied, e.g., those featuring self-harm and extremism, partly due to the lack of suitable datasets. We further find that existing datasets mostly capture multi-class scenarios, which are not inclusive of the affective spectrum that memes can represent. Another observation is that memes can propagate globally through repackaging in different languages and that they can also be multilingual, blending different cultures. We conclude by highlighting several challenges related to multimodal semiotics, technological constraints, and non-trivial social engagement, and we present several open-ended aspects such as delineating online harm and empirically examining related frameworks and assistive interventions, which we believe will motivate and drive future research.
Submitted 29 May, 2022; v1 submitted 9 May, 2022;
originally announced May 2022.
-
Blocking Techniques for Sparse Matrix Multiplication on Tensor Accelerators
Authors:
Paolo Sylos Labini,
Massimo Bernaschi,
Francesco Silvestri,
Flavio Vella
Abstract:
Tensor accelerators have gained popularity because they provide a cheap and efficient solution for speeding up computationally expensive tasks in Deep Learning and, more recently, in other Scientific Computing applications. However, since their features are specifically designed for tensor algebra (typically dense matrix-product), it is commonly assumed that they are not suitable for applications with sparse data. To challenge this viewpoint, we discuss methods and present solutions for accelerating sparse matrix multiplication on such architectures. In particular, we present a 1-dimensional blocking algorithm with theoretical guarantees on the density, which builds dense blocks from arbitrary sparse matrices. Experimental results show that, even for unstructured and highly sparse matrices, our block-based solution, which exploits Nvidia Tensor Cores, is faster than its sparse counterpart. We observed significant speed-ups of up to two orders of magnitude on real-world sparse matrices.
Submitted 11 February, 2022;
originally announced February 2022.
-
ReLAX: Reinforcement Learning Agent eXplainer for Arbitrary Predictive Models
Authors:
Ziheng Chen,
Fabrizio Silvestri,
Jia Wang,
He Zhu,
Hongshik Ahn,
Gabriele Tolomei
Abstract:
Counterfactual examples (CFs) are one of the most popular methods for attaching post-hoc explanations to machine learning (ML) models. However, existing CF generation methods either exploit the internals of specific models or depend on each sample's neighborhood, thus they are hard to generalize for complex models and inefficient for large datasets. This work aims to overcome these limitations and introduces ReLAX, a model-agnostic algorithm to generate optimal counterfactual explanations. Specifically, we formulate the problem of crafting CFs as a sequential decision-making task and then find the optimal CFs via deep reinforcement learning (DRL) with discrete-continuous hybrid action space. Extensive experiments conducted on several tabular datasets have shown that ReLAX outperforms existing CF generation baselines, as it produces sparser counterfactuals, is more scalable to complex target models to explain, and generalizes to both classification and regression tasks. Finally, to demonstrate the usefulness of our method in a real-world use case, we leverage CFs generated by ReLAX to suggest actions that a country should take to reduce the risk of mortality due to COVID-19. Interestingly enough, the actions recommended by our method correspond to the strategies that many countries have actually implemented to counter the COVID-19 pandemic.
Submitted 8 August, 2022; v1 submitted 22 October, 2021;
originally announced October 2021.
-
NEWRON: A New Generalization of the Artificial Neuron to Enhance the Interpretability of Neural Networks
Authors:
Federico Siciliano,
Maria Sofia Bucarelli,
Gabriele Tolomei,
Fabrizio Silvestri
Abstract:
In this work, we formulate NEWRON: a generalization of the McCulloch-Pitts neuron structure. This new framework aims to explore additional desirable properties of artificial neurons. We show that some specializations of NEWRON allow the network to be interpretable with no change in their expressiveness. By just inspecting the models produced by our NEWRON-based networks, we can understand the rules governing the task. Extensive experiments show that the quality of the generated models is better than that of traditional interpretable models and in line with or better than that of standard neural networks.
Submitted 5 October, 2021;
originally announced October 2021.
-
Detecting Propaganda Techniques in Memes
Authors:
Dimitar Dimitrov,
Bishr Bin Ali,
Shaden Shaar,
Firoj Alam,
Fabrizio Silvestri,
Hamed Firooz,
Preslav Nakov,
Giovanni Da San Martino
Abstract:
Propaganda can be defined as a form of communication that aims to influence the opinions or the actions of people towards a specific goal; this is achieved by means of well-defined rhetorical and psychological devices. Propaganda, in the form we know it today, can be dated back to the beginning of the 17th century. However, it is with the advent of the Internet and social media that it has started to spread on a much larger scale than before, thus becoming a major societal and political issue. Nowadays, a large fraction of propaganda in social media is multimodal, mixing textual with visual content. With this in mind, here we propose a new multi-label multimodal task: detecting the type of propaganda techniques used in memes. We further create and release a new corpus of 950 memes, carefully annotated with 22 propaganda techniques, which can appear in the text, in the image, or in both. Our analysis of the corpus shows that understanding both modalities together is essential for detecting these techniques. This is further confirmed in our experiments with several state-of-the-art multimodal models.
Submitted 7 August, 2021;
originally announced September 2021.
-
On the Bike Spreading Problem
Authors:
Elia Costa,
Francesco Silvestri
Abstract:
A free-floating bike-sharing system (FFBSS) is a dockless rental system where an individual can borrow a bike and return it anywhere within the service area. To improve the rental service, available bikes should be distributed over the entire service area: a customer leaving from any position is then more likely to find a nearby bike and then to use the service. Moreover, spreading bikes across the entire service area increases urban spatial equity since the benefits of FFBSS are not a prerogative of just a few zones. For guaranteeing such distribution, the FFBSS operator can use vans to manually relocate bikes, but this incurs high economic and environmental costs. We propose a novel approach that exploits the existing bike flows generated by customers to distribute bikes. More specifically, by envisioning the problem as an Influence Maximization problem, we show that it is possible to position batches of bikes on a small number of zones, and then the daily use of FFBSS will efficiently spread these bikes on a large area. We show that detecting these zones is NP-complete, but there exists a simple and efficient $1-1/e$ approximation algorithm; our approach is then evaluated on a dataset of rides from the free-floating bike-sharing system of the city of Padova.
Submitted 18 August, 2021; v1 submitted 1 July, 2021;
originally announced July 2021.
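Illustrative note on the $1-1/e$ guarantee mentioned in the abstract above: the sketch below shows the classic greedy heuristic for a monotone submodular coverage objective, which is known to achieve a $1-1/e$ approximation for maximum coverage. It is not the authors' algorithm. The `reach` mapping (each zone mapped to the set of zones its bikes would eventually reach), the function name `greedy_seed_zones`, and the toy data are all hypothetical, and real bike flows are stochastic rather than fixed sets.

```python
def greedy_seed_zones(reach, k):
    """Pick up to k seed zones greedily by marginal coverage gain.

    reach: dict mapping a zone to the set of zones covered if bikes
           are placed there (a hypothetical, deterministic stand-in
           for an influence model).
    """
    covered = set()
    seeds = []
    for _ in range(k):
        best_zone, best_gain = None, 0
        for zone in sorted(reach):      # sorted() keeps tie-breaking deterministic
            if zone in seeds:
                continue
            gain = len(reach[zone] - covered)   # newly covered zones only
            if gain > best_gain:
                best_zone, best_gain = zone, gain
        if best_zone is None:           # no remaining zone adds coverage
            break
        seeds.append(best_zone)
        covered |= reach[best_zone]
    return seeds, covered


# Toy, made-up data: zone -> zones that bikes placed there eventually reach.
toy_reach = {
    "A": {"A", "B", "C"},
    "B": {"B", "C"},
    "C": {"C", "D", "E", "F"},
    "D": {"D"},
}
seeds, covered = greedy_seed_zones(toy_reach, k=2)
print(seeds, sorted(covered))
```

On the toy data, the first pick is the zone with the largest reach, and the second pick is chosen by what it adds beyond that, not by its raw reach.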
-
Database Reasoning Over Text
Authors:
James Thorne,
Majid Yazdani,
Marzieh Saeidi,
Fabrizio Silvestri,
Sebastian Riedel,
Alon Halevy
Abstract:
Neural models have shown impressive performance gains in answering queries from natural language text. However, existing works are unable to support database queries, such as "List/Count all female athletes who were born in 20th century", which require reasoning over sets of relevant facts with operations such as join, filtering and aggregation. We show that while state-of-the-art transformer models perform very well for small databases, they exhibit limitations in processing noisy data, numerical operations, and queries that aggregate facts. We propose a modular architecture to answer these database-style queries over multiple spans from text and aggregate these at scale. We evaluate the architecture using WikiNLDB, a novel dataset for exploring such queries. Our architecture scales to databases containing thousands of facts whereas contemporary models are limited by how many facts can be encoded. In direct comparison on small databases, our approach increases overall answer accuracy from 85% to 90%. On larger databases, our approach retains its accuracy whereas transformer baselines could not encode the context.
Submitted 2 June, 2021;
originally announced June 2021.
-
SemEval-2021 Task 6: Detection of Persuasion Techniques in Texts and Images
Authors:
Dimitar Dimitrov,
Bishr Bin Ali,
Shaden Shaar,
Firoj Alam,
Fabrizio Silvestri,
Hamed Firooz,
Preslav Nakov,
Giovanni Da San Martino
Abstract:
We describe SemEval-2021 task 6 on Detection of Persuasion Techniques in Texts and Images: the data, the annotation guidelines, the evaluation setup, the results, and the participating systems. The task focused on memes and had three subtasks: (i) detecting the techniques in the text, (ii) detecting the text spans where the techniques are used, and (iii) detecting techniques in the entire meme, i.e., both in the text and in the image. It was a popular task, attracting 71 registrations, and 22 teams that eventually made an official submission on the test set. The evaluation results for the third subtask confirmed the importance of both modalities, the text and the image. Moreover, some teams reported benefits when not just combining the two modalities, e.g., by using early or late fusion, but rather modeling the interaction between them in a joint model.
Submitted 25 April, 2021;
originally announced May 2021.
-
CycleDRUMS: Automatic Drum Arrangement For Bass Lines Using CycleGAN
Authors:
Giorgio Barnabò,
Giovanni Trappolini,
Lorenzo Lastilla,
Cesare Campagnano,
Angela Fan,
Fabio Petroni,
Fabrizio Silvestri
Abstract:
The two main research threads in computer-based music generation are: the construction of autonomous music-making systems, and the design of computer-based environments to assist musicians. In the symbolic domain, the key problem of automatically arranging a piece of music was extensively studied, while relatively fewer systems tackled this challenge in the audio domain. In this contribution, we propose CycleDRUMS, a novel method for generating drums given a bass line. After converting the waveform of the bass into a mel-spectrogram, we are able to automatically generate original drums that follow the beat, sound credible and can be directly mixed with the input bass. We formulated this task as an unpaired image-to-image translation problem, and we addressed it with CycleGAN, a well-established unsupervised style transfer framework, originally designed for treating images. The choice to deploy raw audio and mel-spectrograms enabled us to better represent how humans perceive music, and to potentially draw sounds for new arrangements from the vast collection of music recordings accumulated in the last century. In the absence of an objective way of evaluating the output of both generative adversarial networks and music generative systems, we further defined a possible metric for the proposed task, partially based on human (and expert) judgement. Finally, as a comparison, we replicated our results with Pix2Pix, a paired image-to-image translation network, and we showed that our approach outperforms it.
Submitted 9 April, 2021; v1 submitted 1 April, 2021;
originally announced April 2021.
-
A Survey on Multimodal Disinformation Detection
Authors:
Firoj Alam,
Stefano Cresci,
Tanmoy Chakraborty,
Fabrizio Silvestri,
Dimiter Dimitrov,
Giovanni Da San Martino,
Shaden Shaar,
Hamed Firooz,
Preslav Nakov
Abstract:
Recent years have witnessed the proliferation of offensive content online such as fake news, propaganda, misinformation, and disinformation. While initially this was mostly about textual content, over time images and videos gained popularity, as they are much easier to consume, attract more attention, and spread further than text. As a result, researchers started leveraging different modalities and combinations thereof to tackle online multimodal offensive content. In this study, we offer a survey on the state of the art in multimodal disinformation detection covering various combinations of modalities: text, images, speech, video, social media network structure, and temporal information. Moreover, while some studies focused on factuality, others investigated how harmful the content is. While these two components in the definition of disinformation, (i) factuality and (ii) harmfulness, are equally important, they are typically studied in isolation. Thus, we argue for the need to tackle disinformation detection by taking into account multiple modalities as well as both factuality and harmfulness, in the same framework. Finally, we discuss current challenges and future research directions.
Submitted 28 September, 2022; v1 submitted 13 March, 2021;
originally announced March 2021.
-
CF-GNNExplainer: Counterfactual Explanations for Graph Neural Networks
Authors:
Ana Lucic,
Maartje ter Hoeve,
Gabriele Tolomei,
Maarten de Rijke,
Fabrizio Silvestri
Abstract:
Given the increasing promise of graph neural networks (GNNs) in real-world applications, several methods have been developed for explaining their predictions. Existing methods for interpreting predictions from GNNs have primarily focused on generating subgraphs that are especially relevant for a particular prediction. However, such methods are not counterfactual (CF) in nature: given a prediction, we want to understand how the prediction can be changed in order to achieve an alternative outcome. In this work, we propose a method for generating CF explanations for GNNs: the minimal perturbation to the input (graph) data such that the prediction changes. Using only edge deletions, we find that our method, CF-GNNExplainer, can generate CF explanations for the majority of instances across three widely used datasets for GNN explanations, while removing fewer than 3 edges on average, with at least 94% accuracy. This indicates that CF-GNNExplainer primarily removes edges that are crucial for the original predictions, resulting in minimal CF explanations.
Submitted 22 February, 2022; v1 submitted 5 February, 2021;
originally announced February 2021.
-
Sampling a Near Neighbor in High Dimensions -- Who is the Fairest of Them All?
Authors:
Martin Aumüller,
Sariel Har-Peled,
Sepideh Mahabadi,
Rasmus Pagh,
Francesco Silvestri
Abstract:
Similarity search is a fundamental algorithmic primitive, widely used in many computer science disciplines. Given a set of points $S$ and a radius parameter $r>0$, the $r$-near neighbor ($r$-NN) problem asks for a data structure that, given any query point $q$, returns a point $p$ within distance at most $r$ from $q$. In this paper, we study the $r$-NN problem in the light of individual fairness and providing equal opportunities: all points that are within distance $r$ from the query should have the same probability to be returned. In the low-dimensional case, this problem was first studied by Hu, Qiao, and Tao (PODS 2014). Locality sensitive hashing (LSH), the theoretically strongest approach to similarity search in high dimensions, does not provide such a fairness guarantee. In this work, we show that LSH based algorithms can be made fair, without a significant loss in efficiency. We propose several efficient data structures for the exact and approximate variants of the fair NN problem. Our approach works more generally for sampling uniformly from a sub-collection of sets of a given collection and can be used in a few other applications. We also develop a data structure for fair similarity search under inner product that requires nearly-linear space and exploits locality sensitive filters. The paper concludes with an experimental evaluation that highlights the inherent unfairness of NN data structures and shows the performance of our algorithms on real-world datasets.
Submitted 26 January, 2021;
originally announced January 2021.
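Illustrative note on the fairness requirement stated in the abstract above (every point within distance $r$ of the query is returned with the same probability): the sketch below is a naive linear-scan baseline that satisfies this definition exactly. It is not one of the LSH-based data structures the paper proposes, and it takes time linear in the number of points per query. The function name `fair_near_neighbor` and the sample points are hypothetical.

```python
import math
import random


def fair_near_neighbor(points, q, r, rng=random):
    """Return a uniformly random point within distance r of q, or None.

    Linear scan: collect the whole ball around q, then sample uniformly,
    so every near neighbor has probability 1 / |ball| of being returned.
    """
    ball = [p for p in points if math.dist(p, q) <= r]
    return rng.choice(ball) if ball else None


# Made-up 2-D points; three of them lie within distance 1 of the origin.
sample_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
rng = random.Random(0)  # seeded generator for repeatable draws
draws = [fair_near_neighbor(sample_points, (0.0, 0.0), 1.0, rng) for _ in range(3000)]
counts = {p: draws.count(p) for p in sample_points}
print(counts)
```

Over many draws, the three in-range points should appear with roughly equal frequency, and the far point should never be returned.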
-
Neural Databases
Authors:
James Thorne,
Majid Yazdani,
Marzieh Saeidi,
Fabrizio Silvestri,
Sebastian Riedel,
Alon Halevy
Abstract:
In recent years, neural networks have shown impressive performance gains on long-standing AI problems, and in particular, answering queries from natural language text. These advances raise the question of whether they can be extended to a point where we can relax the fundamental assumption of database management, namely, that our data is represented as fields of a pre-defined schema.
This paper presents a first step in answering that question. We describe NeuralDB, a database system with no pre-defined schema, in which updates and queries are given in natural language. We develop query processing techniques that build on the primitives offered by state-of-the-art Natural Language Processing methods.
We begin by demonstrating that at the core, recent NLP transformers, powered by pre-trained language models, can answer select-project-join queries if they are given the exact set of relevant facts. However, they cannot scale to non-trivial databases and cannot perform aggregation queries. Based on these findings, we describe a NeuralDB architecture that runs multiple Neural SPJ operators in parallel, each with a set of database sentences that can produce one of the answers to the query. The result of these operators is fed to an aggregation operator if needed. We describe an algorithm that learns how to create the appropriate sets of facts to be fed into each of the Neural SPJ operators. Importantly, this algorithm can be trained by the Neural SPJ operator itself. We experimentally validate the accuracy of NeuralDB and its components, showing that we can answer queries over thousands of sentences with very high accuracy.
Submitted 14 October, 2020;
originally announced October 2020.
-
Preserving Integrity in Online Social Networks
Authors:
Alon Halevy,
Cristian Canton Ferrer,
Hao Ma,
Umut Ozertem,
Patrick Pantel,
Marzieh Saeidi,
Fabrizio Silvestri,
Ves Stoyanov
Abstract:
Online social networks provide a platform for sharing information and free expression. However, these networks are also used for malicious purposes, such as distributing misinformation and hate speech, selling illegal drugs, and coordinating sex trafficking or child exploitation. This paper surveys the state of the art in keeping online platforms and their users safe from such harm, also known as the problem of preserving integrity. This survey comes from the perspective of having to combat a broad spectrum of integrity violations at Facebook. We highlight the techniques that have been proven useful in practice and that deserve additional attention from the academic community. Instead of discussing the many individual violation types, we identify key aspects of the social-media eco-system, each of which is common to a wide variety of violation types. Furthermore, each of these components represents an area for research and development, and the innovations that are found can be applied widely.
Submitted 25 September, 2020; v1 submitted 22 September, 2020;
originally announced September 2020.
-
Similarity Search with Tensor Core Units
Authors:
Thomas D. Ahle,
Francesco Silvestri
Abstract:
Tensor Core Units (TCUs) are hardware accelerators developed for deep neural networks, which efficiently support the multiplication of two dense $\sqrt{m}\times \sqrt{m}$ matrices, where $m$ is a given hardware parameter. In this paper, we show that TCUs can speed up similarity search problems as well. We propose algorithms for the Johnson-Lindenstrauss dimensionality reduction and for similarity join that, by leveraging TCUs, achieve a $\sqrt{m}$ speedup with respect to traditional approaches.
Submitted 22 June, 2020;
originally announced June 2020.
-
Concept Matching for Low-Resource Classification
Authors:
Federico Errica,
Ludovic Denoyer,
Bora Edizel,
Fabio Petroni,
Vassilis Plachouras,
Fabrizio Silvestri,
Sebastian Riedel
Abstract:
We propose a model to tackle classification tasks in the presence of very little training data. To this aim, we approximate the notion of exact match with a theoretically sound mechanism that computes a probability of matching in the input space. Importantly, the model learns to focus on elements of the input that are relevant for the task at hand; by leveraging highlighted portions of the training data, an error boosting technique guides the learning process. In practice, it increases the error associated with relevant parts of the input by a given factor. Remarkable results on text classification tasks confirm the benefits of the proposed approach in both balanced and unbalanced cases, thus being of practical use when labeling new examples is expensive. In addition, by inspecting its weights, it is often possible to gather insights on what the model has learned.
Submitted 1 June, 2020;
originally announced June 2020.
-
Node Masking: Making Graph Neural Networks Generalize and Scale Better
Authors:
Pushkar Mishra,
Aleksandra Piktus,
Gerard Goossen,
Fabrizio Silvestri
Abstract:
Graph Neural Networks (GNNs) have received a lot of interest in recent times. From the early spectral architectures that could only operate on undirected graphs per a transductive learning paradigm to the current state-of-the-art spatial ones that can apply inductively to arbitrary graphs, GNNs have seen significant contributions from the research community. In this paper, we utilize some theoretical tools to better visualize the operations performed by state-of-the-art spatial GNNs. We analyze the inner workings of these architectures and introduce a simple concept, Node Masking, that allows them to generalize and scale better. To empirically validate the concept, we perform several experiments on some widely-used datasets for node classification in both the transductive and inductive settings, hence laying down strong benchmarks for future research.
Submitted 16 May, 2021; v1 submitted 17 January, 2020;
originally announced January 2020.