-
Topological Embedding of Human Brain Networks with Applications to Dynamics of Temporal Lobe Epilepsy
Authors:
Moo K. Chung,
Ji Bi Che,
Veena A. Nair,
Camille Garcia Ramos,
Jedidiah Ray Mathis,
Vivek Prabhakaran,
Elizabeth Meyerand,
Bruce P. Hermann,
Jeffrey R. Binder,
Aaron F. Struck
Abstract:
We introduce a novel, data-driven topological data analysis (TDA) approach for embedding brain networks into a lower-dimensional space to quantify the dynamics of temporal lobe epilepsy (TLE) obtained from resting-state functional magnetic resonance imaging (rs-fMRI). This embedding facilitates the orthogonal projection of 0D and 1D topological features, allowing for the visualization and modeling of the dynamics of functional human brain networks in a resting state. We then quantify the topological disparities between networks to determine the coordinates for embedding. This framework enables us to conduct a coherent statistical inference within the embedded space. Our results indicate that brain network topology in TLE patients exhibits increased rigidity in 0D topology but more rapid flections compared to that of normal controls in 1D topology.
Submitted 13 May, 2024;
originally announced May 2024.
-
Reliable LLM-based User Simulator for Task-Oriented Dialogue Systems
Authors:
Ivan Sekulić,
Silvia Terragni,
Victor Guimarães,
Nghia Khau,
Bruna Guedes,
Modestas Filipavicius,
André Ferreira Manso,
Roland Mathis
Abstract:
In the realm of dialogue systems, user simulation techniques have emerged as a game-changer, redefining the evaluation and enhancement of task-oriented dialogue (TOD) systems. These methods are crucial for replicating real user interactions, enabling applications like synthetic data augmentation, error detection, and robust evaluation. However, existing approaches often rely on rigid rule-based methods or on annotated data. This paper introduces DAUS, a Domain-Aware User Simulator. Leveraging large language models, we fine-tune DAUS on real examples of task-oriented dialogues. Results on two relevant benchmarks showcase significant improvements in terms of user goal fulfillment. Notably, we have observed that fine-tuning enhances the simulator's coherence with user goals, effectively mitigating hallucinations -- a major source of inconsistencies in simulator responses.
Submitted 20 February, 2024;
originally announced February 2024.
-
In-Context Learning User Simulators for Task-Oriented Dialog Systems
Authors:
Silvia Terragni,
Modestas Filipavicius,
Nghia Khau,
Bruna Guedes,
André Manso,
Roland Mathis
Abstract:
This paper presents a novel application of large language models in user simulation for task-oriented dialog systems, specifically focusing on an in-context learning approach. By harnessing the power of these models, the proposed approach generates diverse utterances based on user goals and limited dialog examples. Unlike traditional simulators, this method eliminates the need for labor-intensive rule definition or extensive annotated data, making it more efficient and accessible. Additionally, an error analysis of the interaction between the user simulator and dialog system uncovers common mistakes, providing valuable insights into areas that require improvement. Our implementation is available at https://github.com/telepathylabsai/prompt-based-user-simulator.
Submitted 1 June, 2023;
originally announced June 2023.
-
Network-based Biased Tree Ensembles (NetBiTE) for Drug Sensitivity Prediction and Drug Sensitivity Biomarker Identification in Cancer
Authors:
Ali Oskooei,
Matteo Manica,
Roland Mathis,
Maria Rodriguez Martinez
Abstract:
We present the Network-based Biased Tree Ensembles (NetBiTE) method for drug sensitivity prediction and drug sensitivity biomarker identification in cancer using a combination of prior knowledge and gene expression data. Our devised method consists of a biased tree ensemble that is built according to a probabilistic bias weight distribution. The bias weight distribution is obtained by assigning high weights to the drug targets and propagating the assigned weights over a protein-protein interaction network such as STRING. The propagation of weights defines neighborhoods of influence around the drug targets and as such simulates the spread of perturbations within the cell following drug administration. Using a synthetic dataset, we showcase how application of biased tree ensembles (BiTE) results in significant accuracy gains at a much lower computational cost compared to the unbiased random forests (RF) algorithm. We then apply NetBiTE to the Genomics of Drug Sensitivity in Cancer (GDSC) dataset and demonstrate that NetBiTE outperforms RF in predicting IC50 drug sensitivity only for drugs that target membrane receptor pathways (MRPs): RTK, EGFR and IGFR signaling pathways. We propose, based on the NetBiTE results, that for drugs that inhibit MRPs, the expression of target genes prior to drug administration is a biomarker for IC50 drug sensitivity following drug administration. We further verify and reinforce this proposition through control studies on PI3K/MTOR signaling pathway inhibitors, a drug category that does not target MRPs, and through assignment of dummy targets to MRP-inhibiting drugs and investigating the variation in NetBiTE accuracy.
Submitted 26 April, 2019; v1 submitted 18 August, 2018;
originally announced August 2018.
-
PIMKL: Pathway Induced Multiple Kernel Learning
Authors:
Matteo Manica,
Joris Cadow,
Roland Mathis,
María Rodríguez Martínez
Abstract:
Reliable identification of molecular biomarkers is essential for accurate patient stratification. While state-of-the-art machine learning approaches for sample classification continue to push boundaries in terms of performance, most of these methods are not able to integrate different data types and lack generalization power, limiting their application in a clinical setting. Furthermore, many methods behave as black boxes, and we have very little understanding about the mechanisms that lead to the prediction. While opaqueness concerning machine behaviour might not be a problem in deterministic domains, in health care, providing explanations about the molecular factors and phenotypes that are driving the classification is crucial to build trust in the performance of the predictive system. We propose Pathway Induced Multiple Kernel Learning (PIMKL), a novel methodology to reliably classify samples that can also help gain insights into the molecular mechanisms that underlie the classification. PIMKL exploits prior knowledge in the form of a molecular interaction network and annotated gene sets, by optimizing a mixture of pathway-induced kernels using a Multiple Kernel Learning (MKL) algorithm, an approach that has demonstrated excellent performance in different machine learning applications. After optimizing the combination of kernels for prediction of a specific phenotype, the model provides a stable molecular signature that can be interpreted in the light of the ingested prior knowledge and that can be used in transfer learning tasks.
Submitted 5 July, 2018; v1 submitted 29 March, 2018;
originally announced March 2018.
-
INtERAcT: Interaction Network Inference from Vector Representations of Words
Authors:
Matteo Manica,
Roland Mathis,
María Rodríguez Martínez
Abstract:
In recent years, the number of biomedical publications has steadily grown, resulting in a rich source of untapped new knowledge. Most biomedical facts are, however, not readily available, but buried in the form of unstructured text, and hence their exploitation requires the time-consuming manual curation of published articles. Here we present INtERAcT, a novel approach to extract protein-protein interactions from a corpus of biomedical articles related to a broad range of scientific domains in a completely unsupervised way. INtERAcT exploits vector representations of words, computed on a corpus of domain-specific knowledge, and implements a new metric that estimates an interaction score between two molecules in the space where the corresponding words are embedded. We demonstrate the power of INtERAcT by reconstructing the molecular pathways associated with 10 different cancer types using a corpus of disease-specific articles for each cancer type. We evaluate INtERAcT using the STRING database as a benchmark, and show that our metric outperforms currently adopted approaches for similarity computation at the task of identifying known molecular interactions in all studied cancer types. Furthermore, our approach does not require text annotation, manual curation or the definition of semantic rules based on expert knowledge, and hence it can be easily and efficiently applied to different scientific domains. Our findings suggest that INtERAcT may increase our capability to summarize the understanding of a specific disease using the published literature in an automated and completely unsupervised fashion.
Submitted 16 April, 2018; v1 submitted 9 January, 2018;
originally announced January 2018.
-
Inferring clonal composition from multiple tumor biopsies
Authors:
Matteo Manica,
Hyunjae Ryan Kim,
Roland Mathis,
Philippe Chouvarine,
Dorothea Rutishauser,
Laura De Vargas Roditi,
Bence Szalai,
Ulrich Wagner,
Kathrin Oehl,
Karim Saba,
Arati Pati,
Julio Saez-Rodriguez,
Angshumoy Roy,
Donald W. Parsons,
Peter J. Wild,
María Rodríguez Martínez,
Pavel Sumazin
Abstract:
Explicit accounting for copy number alterations can dramatically improve mutation frequency estimates, leading to more accurate phylogeny reconstructions and subclone characterizations.
Submitted 22 November, 2019; v1 submitted 26 January, 2017;
originally announced January 2017.
-
Mixed-Precision In-Memory Computing
Authors:
Manuel Le Gallo,
Abu Sebastian,
Roland Mathis,
Matteo Manica,
Heiner Giefers,
Tomas Tuma,
Costas Bekas,
Alessandro Curioni,
Evangelos Eleftheriou
Abstract:
As CMOS scaling reaches its technological limits, a radical departure from traditional von Neumann systems, which involve separate processing and memory units, is needed in order to significantly extend the performance of today's computers. In-memory computing is a promising approach in which nanoscale resistive memory devices, organized in a computational memory unit, are used for both processing and memory. However, to reach the numerical accuracy typically required for data analytics and scientific computing, limitations arising from device variability and non-ideal device characteristics need to be addressed. Here we introduce the concept of mixed-precision in-memory computing, which combines a von Neumann machine with a computational memory unit. In this hybrid system, the computational memory unit performs the bulk of a computational task, while the von Neumann machine implements a backward method to iteratively improve the accuracy of the solution. The system therefore benefits from both the high precision of digital computing and the energy/areal efficiency of in-memory computing. We experimentally demonstrate the efficacy of the approach by accurately solving systems of linear equations, in particular, a system of 5,000 equations using 998,752 phase-change memory devices.
Submitted 4 October, 2018; v1 submitted 16 January, 2017;
originally announced January 2017.
-
Quasi-steady description of modulation effects in wall turbulence
Authors:
Sergei I. Chernyshenko,
Ivan Marusic,
Romain Mathis
Abstract:
A theoretical description of the phenomenon of modulation of near-wall turbulence by large-scale structures is investigated. The description given is simple in that the effect of large-scale structures is limited to a quasi-steady response of the near-wall turbulence to slow large-scale fluctuations of the skin friction. The most natural and compact form of expressing this mechanism is given by the usual Reynolds-number-independent representation of the total skin friction and velocity, scaled in wall variables, where the mean quantities are replaced by large-scale low-pass-filtered fluctuating components. The theory is rewritten in terms of fluctuations via a universal mean velocity and random mean square fluctuation velocity profiles of the small scales and then linearised assuming that the large-scale fluctuations are small as compared to the mean components. This allows us to express the superposition and modulation coefficients of the empirical predictive models of the skin friction and streamwise fluctuating velocity given respectively by Marusic et al. (13th Eur. Turb. Conf., 2011) and Mathis et al. (J. Fluid Mech. 2011, vol. 681, pp. 537-566). It is found that the theoretical quantities agree well with experimentally determined coefficients.
Submitted 16 March, 2012;
originally announced March 2012.