Search | arXiv e-print repository

Process Mining Embeddings: Learning Vector Representations for Petri Nets

Authors: Juan G. Colonna, Ahmed A. Fares, Márcio Duarte, Ricardo Sousa

Abstract: Process Mining offers a powerful framework for uncovering, analyzing, and optimizing real-world business processes. Petri nets provide a versatile means of modeling process behavior. However, traditional methods often struggle to effectively compare complex Petri nets, hindering their potential for process enhancement. To address this challenge, we introduce PetriNet2Vec, an unsupervised methodology inspired by Doc2Vec. This approach converts Petri nets into embedding vectors, facilitating the comparison, clustering, and classification of process models. We validated our approach using the PDC Dataset, comprising 96 diverse Petri net models. The results demonstrate that PetriNet2Vec effectively captures the structural properties of process models, enabling accurate process classification and efficient process retrieval. Specifically, our findings highlight the utility of the learned embeddings in two key downstream tasks: process classification and process retrieval. In process classification, the embeddings allowed for accurate categorization of process models based on their structural properties. In process retrieval, the embeddings enabled efficient retrieval of similar process models using cosine distance. These results demonstrate the potential of PetriNet2Vec to significantly enhance process mining capabilities.
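
The retrieval step mentioned in this abstract, ranking models by cosine distance between embedding vectors, can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the three-dimensional vectors, and the model names are all hypothetical, and real embeddings would come from a trained model.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def retrieve(query, embeddings, k=1):
    """Return the names of the k stored models closest to the query vector."""
    ranked = sorted(
        embeddings,
        key=lambda name: cosine_distance(query, embeddings[name]),
    )
    return ranked[:k]


# Hypothetical 3-dimensional embeddings, for illustration only.
toy_embeddings = {
    "model_a": [1.0, 0.0, 0.0],
    "model_b": [0.9, 0.1, 0.0],
    "model_c": [0.0, 1.0, 0.0],
}
nearest = retrieve([1.0, 0.05, 0.0], toy_embeddings, k=2)
print(nearest)
```

Cosine distance ignores vector length and compares direction only, which is why it is a common choice for comparing learned embeddings.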

Submitted 31 July, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

arXiv:2404.14970 [pdf, other]

Integrating Heterogeneous Gene Expression Data through Knowledge Graphs for Improving Diabetes Prediction

Authors: Rita T. Sousa, Heiko Paulheim

Abstract: Diabetes is a worldwide health issue affecting millions of people. Machine learning methods have shown promising results in improving diabetes prediction, particularly through the analysis of diverse data types, namely gene expression data. While gene expression data can provide valuable insights, challenges arise from the fact that the sample sizes in expression datasets are usually limited, and the data from different datasets with different gene expressions cannot be easily combined. This work proposes a novel approach to address these challenges by integrating multiple gene expression datasets and domain-specific knowledge using knowledge graphs, a unique tool for biomedical data integration. KG embedding methods are then employed to generate vector representations, serving as inputs for a classifier. Experiments demonstrated the efficacy of our approach, revealing improvements in diabetes prediction when integrating multiple gene expression datasets and domain-specific knowledge about protein functions and interactions.

Submitted 23 April, 2024; originally announced April 2024.

Comments: 11 pages, 4 figures, 7th Workshop on Semantic Web Solutions for Large-scale Biomedical Data Analytics at ESWC2024

ACM Class: J.3

arXiv:2310.19250 [pdf, other]

Assessment of Differentially Private Synthetic Data for Utility and Fairness in End-to-End Machine Learning Pipelines for Tabular Data

Authors: Mayana Pereira, Meghana Kshirsagar, Sumit Mukherjee, Rahul Dodhia, Juan Lavista Ferres, Rafael de Sousa

Abstract: Differentially private (DP) synthetic data sets are a solution for sharing data while preserving the privacy of individual data providers. Understanding the effects of utilizing DP synthetic data in end-to-end machine learning pipelines impacts areas such as health care and humanitarian action, where data is scarce and regulated by restrictive privacy laws. In this work, we investigate the extent to which synthetic data can replace real, tabular data in machine learning pipelines and identify the most effective synthetic data generation techniques for training and evaluating machine learning models. We investigate the impacts of differentially private synthetic data on downstream classification tasks from the point of view of utility as well as fairness. Our analysis is comprehensive and includes representatives of the two main types of synthetic data generation algorithms: marginal-based and GAN-based. To the best of our knowledge, our work is the first that: (i) proposes a training and evaluation framework that does not assume that real data is available for testing the utility and fairness of machine learning models trained on synthetic data; (ii) presents the most extensive analysis of synthetic data set generation algorithms in terms of utility and fairness when used for training machine learning models; and (iii) encompasses several different definitions of fairness. Our findings demonstrate that marginal-based synthetic data generators surpass GAN-based ones regarding model training utility for tabular data. Indeed, we show that models trained using data generated by marginal-based algorithms can exhibit similar utility to models trained using real data. Our analysis also reveals that the marginal-based synthetic data generator MWEM PGM can train models that simultaneously achieve utility and fairness characteristics close to those obtained by models trained with real data.

Submitted 29 October, 2023; originally announced October 2023.

Comments: arXiv admin note: text overlap with arXiv:2106.10241

arXiv:2308.06165 [pdf, other]

Task Conditioned BERT for Joint Intent Detection and Slot-filling

Authors: Diogo Tavares, Pedro Azevedo, David Semedo, Ricardo Sousa, João Magalhães

Abstract: Dialogue systems need to deal with the unpredictability of user intents to track dialogue state and the heterogeneity of slots to understand user preferences. In this paper we investigate the hypothesis that solving these challenges as one unified model will allow the transfer of parameter support data across the different tasks. The proposed principled model is based on a Transformer encoder, trained on multiple tasks, and leveraged by a rich input that conditions the model on the target inferences. Conditioning the Transformer encoder on multiple target inferences over the same corpus, i.e., intent and multiple slot types, allows learning richer language interactions than a single-task model would be able to. In fact, experimental results demonstrate that conditioning the model on an increasing number of dialogue inference tasks leads to improved results: on the MultiWOZ dataset, the joint intent and slot detection can be improved by 3.2% by conditioning on intent, 10.8% by conditioning on slot and 14.4% by conditioning on both intent and slots. Moreover, on real conversations with Farfetch customers, the proposed conditioned BERT can achieve high joint-goal and intent detection performance throughout a dialogue.

Submitted 11 August, 2023; originally announced August 2023.

arXiv:2308.03447 [pdf, other]

Biomedical Knowledge Graph Embeddings with Negative Statements

Authors: Rita T. Sousa, Sara Silva, Heiko Paulheim, Catia Pesquita

Abstract: A knowledge graph is a powerful representation of real-world entities and their relations. The vast majority of these relations are defined as positive statements, but the importance of negative statements is increasingly recognized, especially under an Open World Assumption. Explicitly considering negative statements has been shown to improve performance on tasks such as entity summarization and question answering or domain-specific tasks such as protein function prediction. However, no attention has been given to the exploration of negative statements by knowledge graph embedding approaches despite the potential of negative statements to produce more accurate representations of entities in a knowledge graph. We propose a novel approach, TrueWalks, to incorporate negative statements into the knowledge graph representation learning process. In particular, we present a novel walk-generation method that is able to not only differentiate between positive and negative statements but also take into account the semantic implications of negation in ontology-rich knowledge graphs. This is of particular importance for applications in the biomedical domain, where the inadequacy of embedding approaches regarding negative statements at the ontology level has been identified as a crucial limitation. We evaluate TrueWalks in ontology-rich biomedical knowledge graphs in two different predictive tasks based on KG embeddings: protein-protein interaction prediction and gene-disease association prediction. We conduct an extensive analysis over established benchmarks and demonstrate that our method is able to improve the performance of knowledge graph embeddings on all tasks.

Submitted 7 August, 2023; originally announced August 2023.

Comments: 19 pages, 4 figures

arXiv:2307.11719 [pdf, other]

doi 10.24963/kr.2023/62

Benchmark datasets for biomedical knowledge graphs with negative statements

Authors: Rita T. Sousa, Sara Silva, Catia Pesquita

Abstract: Knowledge graphs represent facts about real-world entities. Most of these facts are defined as positive statements. The negative statements are scarce but highly relevant under the open-world assumption. Furthermore, they have been demonstrated to improve the performance of several applications, namely in the biomedical domain. However, no benchmark dataset supports the evaluation of the methods that consider these negative statements. We present a collection of datasets for three relation prediction tasks - protein-protein interaction prediction, gene-disease association prediction and disease prediction - that aim at circumventing the difficulties in building benchmarks for knowledge graphs with negative statements. These datasets include data from two successful biomedical ontologies, Gene Ontology and Human Phenotype Ontology, enriched with negative statements. We also generate knowledge graph embeddings for each dataset with two popular path-based methods and evaluate the performance in each task. The results show that the negative statements can improve the performance of knowledge graph embeddings.

Submitted 21 July, 2023; originally announced July 2023.

Journal ref: International Conference on Principles of Knowledge Representation and Reasoning 2023

arXiv:2306.12687 [pdf, other]

Explainable Representations for Relation Prediction in Knowledge Graphs

Authors: Rita T. Sousa, Sara Silva, Catia Pesquita

Abstract: Knowledge graphs represent real-world entities and their relations in a semantically-rich structure supported by ontologies. Exploring this data with machine learning methods often relies on knowledge graph embeddings, which produce latent representations of entities that preserve structural and local graph neighbourhood properties, but sacrifice explainability. However, in tasks such as link or relation prediction, understanding which specific features better explain a relation is crucial to support complex or critical applications. We propose SEEK, a novel approach for explainable representations to support relation prediction in knowledge graphs. It is based on identifying relevant shared semantic aspects (i.e., subgraphs) between entities and learning representations for each subgraph, producing a multi-faceted and explainable representation. We evaluate SEEK on two real-world highly complex relation prediction tasks: protein-protein interaction prediction and gene-disease association prediction. Our extensive analysis using established benchmarks demonstrates that SEEK achieves significantly better performance than standard learning representation methods while identifying both sufficient and necessary explanations based on shared semantic aspects.

Submitted 22 June, 2023; originally announced June 2023.

Comments: 16 pages, 3 figures

arXiv:2304.03013 [pdf, other]

doi 10.1016/j.jpdc.2022.12.008

Tensor Slicing and Optimization for Multicore NPUs

Authors: Rafael Sousa, Marcio Pereira, Yongin Kwon, Taeho Kim, Namsoon Jung, Chang Soo Kim, Michael Frank, Guido Araujo

Abstract: Although code generation for Convolution Neural Network (CNN) models has been extensively studied, performing efficient data slicing and parallelization for highly-constrained Multicore Neural Processor Units (NPUs) is still a challenging problem. Given the size of convolutions' input/output tensors and the small footprint of NPU on-chip memories, minimizing memory transactions while maximizing parallelism and MAC utilization are central to any effective solution. This paper proposes a TensorFlow XLA/LLVM compiler optimization pass for Multicore NPUs, called Tensor Slicing Optimization (TSO), which: (a) maximizes convolution parallelism and memory usage across NPU cores; and (b) reduces data transfers between host and NPU on-chip memories by using DRAM memory burst time estimates to guide tensor slicing. To evaluate the proposed approach, a set of experiments was performed using the NeuroMorphic Processor (NMP), a multicore NPU containing 32 RISC-V cores extended with novel CNN instructions. Experimental results show that TSO is capable of identifying the best tensor slicing that minimizes execution time for a set of CNN models. Speed-ups of up to 21.7% result when comparing the TSO burst-based technique to a no-burst data slicing approach. To validate the generality of the TSO approach, the algorithm was also ported to the Glow Machine Learning framework. The performance of the models was measured on both Glow and TensorFlow XLA/LLVM compilers, revealing similar results.

Submitted 6 April, 2023; originally announced April 2023.

Journal ref: Journal of Parallel and Distributed Computing, Volume 175, May 2023, Pages 66-79

arXiv:2303.04739 [pdf, other]

Advancing Direct Convolution using Convolution Slicing Optimization and ISA Extensions

Authors: Victor Ferrari, Rafael Sousa, Marcio Pereira, João P. L. de Carvalho, José Nelson Amaral, José Moreira, Guido Araujo

Abstract: Convolution is one of the most computationally intensive operations that must be performed for machine-learning model inference. A traditional approach to compute convolutions is known as the Im2Col + BLAS method. This paper proposes SConv: a direct-convolution algorithm based on an MLIR/LLVM code-generation toolchain that can be integrated into machine-learning compilers. This algorithm introduces: (a) Convolution Slicing Analysis (CSA) - a convolution-specific 3D cache-blocking analysis pass that focuses on tile reuse over the cache hierarchy; (b) Convolution Slicing Optimization (CSO) - a code-generation pass that uses CSA to generate a tiled direct-convolution macro-kernel; and (c) Vector-Based Packing (VBP) - an architecture-specific optimized input-tensor packing solution based on vector-register shift instructions for convolutions with unitary stride. Experiments conducted on 393 convolutions from full ONNX-MLIR machine-learning models indicate that the elimination of the Im2Col transformation and the use of fast packing routines result in a total packing time reduction, on full model inference, of 2.0x - 3.9x on Intel x86 and 3.6x - 7.2x on IBM POWER10. The speed-up over an Im2Col + BLAS method based on current BLAS implementations for end-to-end machine-learning model inference is in the range of 9% - 25% for Intel x86 and 10% - 42% for IBM POWER10 architectures. The total convolution speedup for model inference is 12% - 27% on Intel x86 and 26% - 46% on IBM POWER10. SConv also outperforms BLAS GEMM, when computing pointwise convolutions, in more than 83% of the 219 tested instances.
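
For readers unfamiliar with the baseline named in this abstract, the sketch below contrasts the Im2Col transformation with a direct convolution on a toy single-channel input. It is a plain-Python illustration under simplifying assumptions (stride 1, no padding, one channel, no BLAS call, no tiling or packing) and is not SConv itself; all function names are hypothetical.

```python
def conv2d_direct(image, kernel):
    """Direct 2-D convolution (cross-correlation), stride 1, no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def im2col(image, kh, kw):
    """Copy every kh x kw input window into one row of a patch matrix."""
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [image[r + i][c + j] for i in range(kh) for j in range(kw)]
        for r in range(out_h)
        for c in range(out_w)
    ]


def conv2d_im2col(image, kernel):
    """Convolution as a matrix-vector product over the im2col patch matrix."""
    kh, kw = len(kernel), len(kernel[0])
    out_w = len(image[0]) - kw + 1
    flat_kernel = [w for row in kernel for w in row]
    patches = im2col(image, kh, kw)
    # In the Im2Col + BLAS method this product would be delegated to a GEMM.
    flat_out = [sum(p * w for p, w in zip(patch, flat_kernel)) for patch in patches]
    return [flat_out[i:i + out_w] for i in range(0, len(flat_out), out_w)]


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [[1, 0], [0, -1]]
direct_result = conv2d_direct(image, kernel)
im2col_result = conv2d_im2col(image, kernel)
print(direct_result == im2col_result)
```

The patch matrix duplicates overlapping input values (here 9 inputs become 16 stored values), which is the extra memory traffic a direct-convolution approach avoids.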

Submitted 8 March, 2023; originally announced March 2023.

Comments: 15 pages, 11 figures

arXiv:2210.07332 [pdf, other]

Secure Multiparty Computation for Synthetic Data Generation from Distributed Data

Authors: Mayana Pereira, Sikha Pentyala, Anderson Nascimento, Rafael T. de Sousa Jr., Martine De Cock

Abstract: Legal and ethical restrictions on accessing relevant data inhibit data science research in critical domains such as health, finance, and education. Synthetic data generation algorithms with privacy guarantees are emerging as a paradigm to break this data logjam. Existing approaches, however, assume that the data holders supply their raw data to a trusted curator, who uses it as fuel for synthetic data generation. This severely limits the applicability, as much of the valuable data in the world is locked up in silos, controlled by entities who cannot show their data to each other or a central aggregator without raising privacy concerns. To overcome this roadblock, we propose the first solution in which data holders only share encrypted data for differentially private synthetic data generation. Data holders send shares to servers who perform Secure Multiparty Computation (MPC) computations while the original data stays encrypted. We instantiate this idea in an MPC protocol for the Multiplicative Weights with Exponential Mechanism (MWEM) algorithm to generate synthetic data based on real data originating from many data holders without reliance on a single point of failure.

Submitted 28 October, 2022; v1 submitted 13 October, 2022; originally announced October 2022.

arXiv:2105.04944 [pdf]

Predicting Gene-Disease Associations with Knowledge Graph Embeddings over Multiple Ontologies

Authors: Susana Nunes, Rita T. Sousa, Catia Pesquita

Abstract: Ontology-based approaches for predicting gene-disease associations include the more classical semantic similarity methods and more recently knowledge graph embeddings. While semantic similarity is typically restricted to hierarchical relations within the ontology, knowledge graph embeddings consider their full breadth. However, embeddings are produced over a single graph and complex tasks such as gene-disease association may require additional ontologies. We investigate the impact of employing richer semantic representations that are based on more than one ontology, able to represent both genes and diseases and consider multiple kinds of relations within the ontologies. Our experiments demonstrate the value of employing knowledge graph embeddings based on random-walks and highlight the need for a closer integration of different ontologies.

Submitted 31 May, 2021; v1 submitted 11 May, 2021; originally announced May 2021.

Comments: 4 pages, 1 figure, 2 tables

arXiv:2010.14558 [pdf, other]

Characterizing Human Mobility Patterns During COVID-19 using Cellular Network Data

Authors: Necati A. Ayan, Nilson L. Damasceno, Sushil Chaskar, Peron R. de Sousa, Arti Ramesh, Anand Seetharam, Antonio A. de A. Rocha

Abstract: In this paper, our goal is to analyze and compare cellular network usage data from pre-lockdown, during lockdown, and post-lockdown phases surrounding the COVID-19 pandemic to understand and model human mobility patterns during the pandemic, and evaluate the effect of lockdowns on mobility. To this end, we collaborate with one of the main cellular network providers in Brazil, and collect and analyze cellular network connections from 1400 antennas for all users in the city of Rio de Janeiro and its suburbs from March 1, 2020 to July 1, 2020. Our analysis reveals that the total number of cellular connections decreases to 78% during the lockdown phase and then increases to 85% of the pre-COVID era as the lockdown eases. We observe that as more people work remotely, there is a shift in the antennas incurring the top 10% of the total traffic, with the number of connections made to antennas in downtown Rio reducing drastically and antennas at other locations taking their place. We also observe that while nearly 40-45% of users connected to only 1 antenna each day during the lockdown phase, indicating no mobility, there are around 4% of users (i.e., 80K users) who connected to more than 10 antennas, indicating very high mobility. Finally, we design an interactive tool that showcases mobility patterns in different granularities that can potentially help people and government officials understand the mobility of individuals and the number of COVID cases in a particular neighborhood. Our analysis, inferences, and interactive showcasing of mobility patterns based on large-scale data can be extrapolated to other cities of the world and has the potential to help in designing more effective pandemic management measures in the future.

Submitted 27 October, 2020; originally announced October 2020.

Comments: 12 pages

arXiv:2006.16701 [pdf, other]

Hierarchical Qualitative Clustering: clustering mixed datasets with critical qualitative information

Authors: Diogo Seca, João Mendes-Moreira, Tiago Mendes-Neves, Ricardo Sousa

Abstract: Clustering can be used to extract insights from data or to verify some of the assumptions held by the domain experts, namely data segmentation. In the literature, few methods can be applied in clustering qualitative values using the context associated with other variables present in the data, without losing interpretability. Moreover, the metrics for calculating dissimilarity between qualitative values often scale poorly for high dimensional mixed datasets. In this study, we propose a novel method for clustering qualitative values, based on Hierarchical Clustering (HQC), and using Maximum Mean Discrepancy. HQC maintains the original interpretability of the qualitative information present in the dataset. We apply HQC to two datasets. Using a mixed dataset provided by Spotify, we showcase how our method can be used for clustering music artists based on the quantitative features of thousands of songs. In addition, using financial features of companies, we cluster company industries, and discuss the implications in investment portfolios diversification.

Submitted 6 July, 2020; v1 submitted 30 June, 2020; originally announced June 2020.

Comments: 12 pages, 3 figures, 1 table. For more info see https://github.com/diogoseca/qualitative-clustering

MSC Class: 68T05 (Primary) 62P05 (Secondary) ACM Class: I.2.6; J.1; G.3

arXiv:2003.09291 [pdf, other]

Improving Irregularly Sampled Time Series Learning with Dense Descriptors of Time

Authors: Rafael T. Sousa, Lucas A. Pereira, Anderson S. Soares

Abstract: Supervised learning with irregularly sampled time series has been a challenge for Machine Learning methods because of the difficulty of dealing with irregular time intervals. Some recent papers introduced recurrent neural network models that deal with irregularity, but most of them rely on complex mechanisms to achieve better performance. This work proposes a novel method to represent timestamps (hours or dates) as dense vectors using sinusoidal functions, called Time Embeddings. As a data input method, it can be applied to most machine learning models. The method was evaluated with two predictive tasks from MIMIC III, a dataset of irregularly sampled time series of electronic health records. Our tests showed an improvement for LSTM-based and classical machine learning models, especially with very irregular data.
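
The general idea of turning a timestamp into a dense vector of sinusoids can be sketched as below. The abstract does not give the formula, so this sketch assumes the geometric frequency schedule used by Transformer positional encodings; the function name, `dim`, `max_period`, and the sample timestamps are all hypothetical and may differ from the paper's Time Embeddings.

```python
import math


def time_embedding(t, dim=8, max_period=10000.0):
    """Map a scalar time value t to a dense vector of sines and cosines.

    Assumption: frequencies follow the Transformer positional-encoding
    schedule; the paper's exact formulation is not stated in the abstract.
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    vec = []
    for i in range(dim // 2):
        freq = 1.0 / (max_period ** (2 * i / dim))
        vec.append(math.sin(t * freq))
        vec.append(math.cos(t * freq))
    return vec


# Irregularly spaced timestamps (hypothetical values, in hours).
timestamps = [0.0, 0.5, 7.25, 31.0]
embedded = [time_embedding(t) for t in timestamps]
print(len(embedded), len(embedded[0]))
```

Because each timestamp is embedded independently, uneven gaps between observations need no special handling: the vectors can simply be concatenated with the other input features of whatever model consumes them.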

Submitted 20 March, 2020; originally announced March 2020.

arXiv:2003.05377 [pdf, other]

Brazilian Lyrics-Based Music Genre Classification Using a BLSTM Network

Authors: Raul de Araújo Lima, Rômulo César Costa de Sousa, Simone Diniz Junqueira Barbosa, Hélio Cortês Vieira Lopes

Abstract: Organizing songs, albums, and artists into groups with shared similarity can be done with the help of genre labels. In this paper, we present a novel approach for automatically classifying musical genre in Brazilian music using only the song lyrics. This kind of classification remains a challenge in the field of Natural Language Processing. We construct a dataset of 138,368 Brazilian song lyrics distributed in 14 genres. We apply SVM, Random Forest and a Bidirectional Long Short-Term Memory (BLSTM) network combined with different word embedding techniques to address this classification task. Our experiments show that the BLSTM method outperforms the other models with an F1-score average of 0.48. Some genres like "gospel", "funk-carioca" and "sertanejo", which obtained 0.89, 0.70 and 0.69 of F1-score, respectively, can be defined as the most distinct and easy to classify in the Brazilian musical genres context.

Submitted 6 March, 2020; originally announced March 2020.

Comments: 7 pages, 4 figures, 3 tables

MSC Class: 68T50(Primary); 68T05 (Secondary) ACM Class: I.2.7; I.2.6

arXiv:2001.04798 [pdf, other]

doi 10.1016/j.neucom.2020.01.116

Parametric Probabilistic Quantum Memory

Authors: Rodrigo S. Sousa, Priscila G. M. dos Santos, Tiago M. L. Veras, Wilson R. de Oliveira, Adenilton J. da Silva

Abstract: Probabilistic Quantum Memory (PQM) is a data structure that computes the distance from a binary input to all binary patterns stored in superposition on the memory. This data structure allows the development of heuristics to speed up artificial neural networks architecture selection. In this work, we propose an improved parametric version of the PQM to perform pattern classification, and we also present a PQM quantum circuit suitable for Noisy Intermediate Scale Quantum (NISQ) computers. We present a classical evaluation of a parametric PQM network classifier on public benchmark datasets. We also perform experiments to verify the viability of PQM on a 5-qubit quantum computer.

Submitted 11 January, 2020; originally announced January 2020.

Journal ref: Neurocomputing 416 (2020): 360-369

arXiv:1911.05024 [pdf, other]

Pose Guided Attention for Multi-label Fashion Image Classification

Authors: Beatriz Quintino Ferreira, João P. Costeira, Ricardo G. Sousa, Liang-Yan Gui, João P. Gomes

Abstract: We propose a compact framework with guided attention for multi-label classification in the fashion domain. Our visual semantic attention model (VSAM) is supervised by automatic pose extraction creating a discriminative feature space. VSAM outperforms the state of the art for an in-house dataset and performs on par with previous works on the DeepFashion dataset, even without using any landmark annotations. Additionally, we show that our semantic attention module brings robustness to large quantities of wrong annotations and provides more interpretable results.

Submitted 12 November, 2019; originally announced November 2019.

Comments: Published at ICCV 2019 Workshop on Computer Vision for Fashion, Art and Design

arXiv:1811.09350 [pdf, other]

Predicting Diabetes Disease Evolution Using Financial Records and Recurrent Neural Networks

Authors: Rafael T. Sousa, Lucas A. Pereira, Anderson S. Soares

Abstract: Managing patients with chronic diseases is a major and growing healthcare challenge in several countries. A chronic condition, such as diabetes, is an illness that lasts a long time and does not go away, and often leads to the patient's health gradually getting worse. While recent works involve raw electronic health records (EHR) from hospitals, this work uses only financial records from health plan providers (medical claims) to predict diabetes disease evolution with a self-attentive recurrent neural network. The use of financial data is due to the possibility of being an interface to international standards, as the records standard encodes medical procedures. The main goal was to assess high risk diabetics, so we predict records related to diabetes acute complications such as amputations and debridements, revascularization and hemodialysis. Our work succeeds in anticipating complications between 60 and 240 days with an area under ROC curve ranging from 0.81 to 0.94. In this paper we describe the first half of a work-in-progress developed within a health plan provider with ROC curve ranging from 0.81 to 0.83. This assessment will give healthcare providers the chance to intervene earlier and head off hospitalizations. We are aiming to deliver personalized predictions and personalized recommendations to individual patients, with the goal of improving outcomes and reducing costs.

Submitted 20 March, 2020; v1 submitted 22 November, 2018; originally announced November 2018.

Comments: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:1811.07216

Report number: ML4H/2018/70

arXiv:1808.09058 [pdf, ps, other]

doi 10.1142/S0219749918400051

Quantum enhanced cross-validation for near-optimal neural networks architecture selection

Authors: Priscila G. M. dos Santos, Rodrigo S. Sousa, Ismael C. S. Araujo, Adenilton J. da Silva

Abstract: This paper proposes a quantum-classical algorithm to evaluate and select classical artificial neural networks architectures. The proposed algorithm is based on a probabilistic quantum memory and the possibility to train artificial neural networks in superposition. We obtain an exponential quantum speedup in the evaluation of neural networks. We also verify experimentally through a reduced experimental analysis that the proposed algorithm can be used to select near-optimal neural networks.

Submitted 27 August, 2018; originally announced August 2018.

Journal ref: International Journal of Quantum Information, Volume 16, No. 06, 1840005 (2018)

arXiv:1806.09511 [pdf, other]

A Hierarchical Deep Learning Natural Language Parser for Fashion

Authors: José Marcelino, João Faria, Luís Baía, Ricardo Gamelas Sousa

Abstract: This work presents a hierarchical deep learning natural language parser for fashion. Our proposal intends not only to recognize fashion-domain entities but also to expose syntactic and morphologic insights. We leverage the usage of an architecture of specialist models, each one for a different task (from parsing to entity recognition). Such architecture renders a hierarchical model able to capture the nuances of the fashion language. The natural language parser is able to deal with textual ambiguities which are left unresolved by our currently existing solution. Our empirical results establish a robust baseline, which justifies the use of hierarchical architectures of deep learning models while opening new research avenues to explore.

Submitted 25 June, 2018; originally announced June 2018.

Comments: In Proceedings of KDD 2018 (KDD Workshop on AI for Fashion)

arXiv:1806.09445 [pdf, other]

A Unified Model with Structured Output for Fashion Images Classification

Authors: Beatriz Quintino Ferreira, Luís Baía, João Faria, Ricardo Gamelas Sousa

Abstract: A picture is worth a thousand words. Albeit a cliché, for the fashion industry, an image of a clothing piece allows one to perceive its category (e.g., dress), sub-category (e.g., day dress) and properties (e.g., white colour with floral patterns). The seasonal nature of the fashion industry creates a highly dynamic and creative domain with ever more data, making it impractical to manually describe a large set of images (of products). In this paper, we explore the concept of visual recognition for fashion images through an end-to-end architecture embedding the hierarchical nature of the annotations directly into the model. Towards that goal, and inspired by the work of [7], we have modified and adapted the original architecture proposal. Namely, we have removed the message passing layer symmetry to cope with the Farfetch category tree, added extra layers for hierarchy level specificity, and moved the message passing layer into an enriched latent space. We compare the proposed unified architecture against state-of-the-art models and demonstrate the performance advantage of our model for structured multi-level categorization on a dataset of about 350k fashion product images.

Submitted 25 June, 2018; originally announced June 2018.

Comments: Accepted in KDD 2018's AI for Fashion workshop

arXiv:1712.02824 [pdf, ps, other]

Stacked Denoising Autoencoders and Transfer Learning for Immunogold Particles Detection and Recognition

Authors: Ricardo Gamelas Sousa, Jorge M. Santos, Luís M. Silva, Luís A. Alexandre, Tiago Esteves, Sara Rocha, Paulo Monjardino, Joaquim Marques de Sá, Francisco Figueiredo, Pedro Quelhas

Abstract: In this paper we present a system for the detection of immunogold particles and a Transfer Learning (TL) framework for the recognition of these immunogold particles. Immunogold particles are part of a high-magnification method for the selective localization of biological molecules at the subcellular level only visible through Electron Microscopy. The number of immunogold particles in the cell walls allows the assessment of the differences in their compositions, providing a tool to analyse the quality of different plants. For its quantization one requires a laborious manual labeling (or annotation) of images containing hundreds of particles. The system that is proposed in this paper can significantly alleviate the burden of this manual task. For particle detection we use a LoG filter coupled with an SDA. In order to improve the recognition, we also study the applicability of TL settings for immunogold recognition. TL reuses the learning model of a source problem on other datasets (target problems) containing particles of different sizes. The proposed system was developed to solve a particular problem on maize cells, namely to determine the composition of cell wall ingrowths in endosperm transfer cells. This novel dataset as well as the code for reproducing our experiments is made publicly available. We determined that the LoG detector alone attained more than 84% accuracy with the F-measure. Developing immunogold recognition with TL also provided superior performance when compared with the baseline models, augmenting the accuracy rates by 10%.

Submitted 7 December, 2017; originally announced December 2017.

arXiv:1712.02159 [pdf, ps, other]

Distribution-Based Categorization of Classifier Transfer Learning

Authors: Ricardo Gamelas Sousa, Luís A. Alexandre, Jorge M. Santos, Luís M. Silva, Joaquim Marques de Sá

Abstract: Transfer Learning (TL) aims to transfer knowledge acquired in one problem, the source problem, onto another problem, the target problem, dispensing with the bottom-up construction of the target model. Due to its relevance, TL has gained significant interest in the Machine Learning community since it paves the way to devise intelligent learning models that can easily be tailored to many different applications. As is natural in a fast-evolving area, a wide variety of TL methods, settings and nomenclature have been proposed so far. However, a wide range of works have been reporting different names for the same concepts. This mixture of concepts and terminology, however, contributes to obscuring the TL field, hindering its proper consideration. In this paper we present a review of the literature on the majority of classification TL methods, and also a distribution-based categorization of TL with a common nomenclature suitable to classification problems. Under this perspective three main TL categories are presented, discussed and illustrated with examples.

Submitted 6 December, 2017; originally announced December 2017.

arXiv:1703.00856 [pdf, ps, other]

Araguaia Medical Vision Lab at ISIC 2017 Skin Lesion Classification Challenge

Authors: Rafael Teixeira Sousa, Larissa Vasconcellos de Moraes

Abstract: This paper describes the participation of Araguaia Medical Vision Lab at the International Skin Imaging Collaboration 2017 Skin Lesion Challenge. We describe the use of deep convolutional neural networks in an attempt to classify images of Melanoma and Seborrheic Keratosis lesions. With the use of fine-tuned GoogleNet and AlexNet we attained results of 0.950 and 0.846 AUC on Seborrheic Keratosis and Melanoma respectively.

Submitted 2 March, 2017; originally announced March 2017.

Comments: Abstract submitted as a requirement to ISIC2017 challenge

arXiv:1509.01682 [pdf, other]

Bounded Model Checking of C++ Programs Based on the Qt Framework (extended version)

Authors: Felipe R. M. Sousa, Lucas C. Cordeiro, Eddie B. de Lima Filho

Abstract: The software development process for embedded systems is getting faster and faster, which generally incurs an increase in the associated complexity. As a consequence, consumer electronics companies usually invest a lot of resources in fast and automatic verification processes, in order to create robust systems and reduce product recall rates. Because of that, the present paper proposes a simplified version of the Qt framework, which is integrated into the Efficient SMT-Based Bounded Model Checking tool to verify actual applications that use the mentioned framework. The method proposed in this paper presents a success rate of 94.45%, for the developed test suite.

Submitted 5 September, 2015; originally announced September 2015.

Comments: extended version of paper published at GCCE'15

arXiv:1011.3177 [pdf, ps, other]

The Data Replication Method for the Classification with Reject Option

Authors: Ricardo Sousa, Jaime S. Cardoso

Abstract: Classification is one of the most important tasks of machine learning. Although the most well studied model is the two-class problem, in many scenarios there is the opportunity to label critical items for manual revision, instead of trying to automatically classify every item. In this paper we adapt a paradigm initially proposed for the classification of ordinal data to address the classification problem with reject option. The technique reduces the problem of classifying with reject option to the standard two-class problem. The introduced method is then mapped into support vector machines and neural networks. Finally, the framework is extended to multiclass ordinal data with reject option. An experimental study with synthetic and real data sets verifies the usefulness of the proposed approach.

Submitted 15 July, 2011; v1 submitted 13 November, 2010; originally announced November 2010.

Showing 1–26 of 26 results for author: Sousa, R