Showing 1–26 of 26 results for author: Sousa, R

Searching in archive cs.
  1. arXiv:2404.17129

    cs.AI

    Process Mining Embeddings: Learning Vector Representations for Petri Nets

    Authors: Juan G. Colonna, Ahmed A. Fares, Márcio Duarte, Ricardo Sousa

    Abstract: Process Mining offers a powerful framework for uncovering, analyzing, and optimizing real-world business processes. Petri nets provide a versatile means of modeling process behavior. However, traditional methods often struggle to effectively compare complex Petri nets, hindering their potential for process enhancement. To address this challenge, we introduce PetriNet2Vec, an unsupervised methodolo…

    Submitted 31 July, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

  2. arXiv:2404.14970

    cs.LG

    Integrating Heterogeneous Gene Expression Data through Knowledge Graphs for Improving Diabetes Prediction

    Authors: Rita T. Sousa, Heiko Paulheim

    Abstract: Diabetes is a worldwide health issue affecting millions of people. Machine learning methods have shown promising results in improving diabetes prediction, particularly through the analysis of diverse data types, namely gene expression data. While gene expression data can provide valuable insights, challenges arise from the fact that the sample sizes in expression datasets are usually limited, and…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 11 pages, 4 figures, 7th Workshop on Semantic Web Solutions for Large-scale Biomedical Data Analytics at ESWC2024

    ACM Class: J.3

  3. arXiv:2310.19250

    cs.LG cs.CR

    Assessment of Differentially Private Synthetic Data for Utility and Fairness in End-to-End Machine Learning Pipelines for Tabular Data

    Authors: Mayana Pereira, Meghana Kshirsagar, Sumit Mukherjee, Rahul Dodhia, Juan Lavista Ferres, Rafael de Sousa

    Abstract: Differentially private (DP) synthetic data sets are a solution for sharing data while preserving the privacy of individual data providers. Understanding the effects of utilizing DP synthetic data in end-to-end machine learning pipelines impacts areas such as health care and humanitarian action, where data is scarce and regulated by restrictive privacy laws. In this work, we investigate the extent…

    Submitted 29 October, 2023; originally announced October 2023.

    Comments: arXiv admin note: text overlap with arXiv:2106.10241

  4. arXiv:2308.06165

    cs.CL

    Task Conditioned BERT for Joint Intent Detection and Slot-filling

    Authors: Diogo Tavares, Pedro Azevedo, David Semedo, Ricardo Sousa, João Magalhães

    Abstract: Dialogue systems need to deal with the unpredictability of user intents to track dialogue state and the heterogeneity of slots to understand user preferences. In this paper we investigate the hypothesis that solving these challenges as one unified model will allow the transfer of parameter support data across the different tasks. The proposed principled model is based on a Transformer encoder, tra…

    Submitted 11 August, 2023; originally announced August 2023.

  5. arXiv:2308.03447

    cs.AI

    Biomedical Knowledge Graph Embeddings with Negative Statements

    Authors: Rita T. Sousa, Sara Silva, Heiko Paulheim, Catia Pesquita

    Abstract: A knowledge graph is a powerful representation of real-world entities and their relations. The vast majority of these relations are defined as positive statements, but the importance of negative statements is increasingly recognized, especially under an Open World Assumption. Explicitly considering negative statements has been shown to improve performance on tasks such as entity summarization and…

    Submitted 7 August, 2023; originally announced August 2023.

    Comments: 19 pages, 4 figures

  6. Benchmark datasets for biomedical knowledge graphs with negative statements

    Authors: Rita T. Sousa, Sara Silva, Catia Pesquita

    Abstract: Knowledge graphs represent facts about real-world entities. Most of these facts are defined as positive statements. The negative statements are scarce but highly relevant under the open-world assumption. Furthermore, they have been demonstrated to improve the performance of several applications, namely in the biomedical domain. However, no benchmark dataset supports the evaluation of the methods t…

    Submitted 21 July, 2023; originally announced July 2023.

    Journal ref: International Conference on Principles of Knowledge Representation and Reasoning 2023

  7. arXiv:2306.12687

    cs.LG cs.AI

    Explainable Representations for Relation Prediction in Knowledge Graphs

    Authors: Rita T. Sousa, Sara Silva, Catia Pesquita

    Abstract: Knowledge graphs represent real-world entities and their relations in a semantically-rich structure supported by ontologies. Exploring this data with machine learning methods often relies on knowledge graph embeddings, which produce latent representations of entities that preserve structural and local graph neighbourhood properties, but sacrifice explainability. However, in tasks such as link or r…

    Submitted 22 June, 2023; originally announced June 2023.

    Comments: 16 pages, 3 figures

  8. Tensor Slicing and Optimization for Multicore NPUs

    Authors: Rafael Sousa, Marcio Pereira, Yongin Kwon, Taeho Kim, Namsoon Jung, Chang Soo Kim, Michael Frank, Guido Araujo

    Abstract: Although code generation for Convolution Neural Network (CNN) models has been extensively studied, performing efficient data slicing and parallelization for highly-constrained Multicore Neural Processor Units (NPUs) is still a challenging problem. Given the size of convolutions' input/output tensors and the small footprint of NPU on-chip memories, minimizing memory transactions while maximizing…

    Submitted 6 April, 2023; originally announced April 2023.

    Journal ref: Journal of Parallel and Distributed Computing, Volume 175, May 2023, Pages 66-79

  9. arXiv:2303.04739

    cs.CV cs.AR cs.LG cs.PF

    Advancing Direct Convolution using Convolution Slicing Optimization and ISA Extensions

    Authors: Victor Ferrari, Rafael Sousa, Marcio Pereira, João P. L. de Carvalho, José Nelson Amaral, José Moreira, Guido Araujo

    Abstract: Convolution is one of the most computationally intensive operations that must be performed for machine-learning model inference. A traditional approach to compute convolutions is known as the Im2Col + BLAS method. This paper proposes SConv: a direct-convolution algorithm based on a MLIR/LLVM code-generation toolchain that can be integrated into machine-learning compilers. This algorithm introduce…

    Submitted 8 March, 2023; originally announced March 2023.

    Comments: 15 pages, 11 figures

  10. arXiv:2210.07332

    cs.CR cs.LG

    Secure Multiparty Computation for Synthetic Data Generation from Distributed Data

    Authors: Mayana Pereira, Sikha Pentyala, Anderson Nascimento, Rafael T. de Sousa Jr., Martine De Cock

    Abstract: Legal and ethical restrictions on accessing relevant data inhibit data science research in critical domains such as health, finance, and education. Synthetic data generation algorithms with privacy guarantees are emerging as a paradigm to break this data logjam. Existing approaches, however, assume that the data holders supply their raw data to a trusted curator, who uses it as fuel for synthetic…

    Submitted 28 October, 2022; v1 submitted 13 October, 2022; originally announced October 2022.

  11. arXiv:2105.04944

    cs.LG

    Predicting Gene-Disease Associations with Knowledge Graph Embeddings over Multiple Ontologies

    Authors: Susana Nunes, Rita T. Sousa, Catia Pesquita

    Abstract: Ontology-based approaches for predicting gene-disease associations include the more classical semantic similarity methods and more recently knowledge graph embeddings. While semantic similarity is typically restricted to hierarchical relations within the ontology, knowledge graph embeddings consider their full breadth. However, embeddings are produced over a single graph and complex tasks such as…

    Submitted 31 May, 2021; v1 submitted 11 May, 2021; originally announced May 2021.

    Comments: 4 pages, 1 figure, 2 tables

  12. arXiv:2010.14558

    cs.SI

    Characterizing Human Mobility Patterns During COVID-19 using Cellular Network Data

    Authors: Necati A. Ayan, Nilson L. Damasceno, Sushil Chaskar, Peron R. de Sousa, Arti Ramesh, Anand Seetharam, Antonio A. de A. Rocha

    Abstract: In this paper, our goal is to analyze and compare cellular network usage data from pre-lockdown, during lockdown, and post-lockdown phases surrounding the COVID-19 pandemic to understand and model human mobility patterns during the pandemic, and evaluate the effect of lockdowns on mobility. To this end, we collaborate with one of the main cellular network providers in Brazil, and collect and analy…

    Submitted 27 October, 2020; originally announced October 2020.

    Comments: 12 pages

  13. arXiv:2006.16701

    cs.LG stat.ML

    Hierarchical Qualitative Clustering: clustering mixed datasets with critical qualitative information

    Authors: Diogo Seca, João Mendes-Moreira, Tiago Mendes-Neves, Ricardo Sousa

    Abstract: Clustering can be used to extract insights from data or to verify some of the assumptions held by the domain experts, namely data segmentation. In the literature, few methods can be applied in clustering qualitative values using the context associated with other variables present in the data, without losing interpretability. Moreover, the metrics for calculating dissimilarity between qualitative v…

    Submitted 6 July, 2020; v1 submitted 30 June, 2020; originally announced June 2020.

    Comments: 12 pages, 3 figures, 1 table. For more info see https://github.com/diogoseca/qualitative-clustering

    MSC Class: 68T05 (Primary) 62P05 (Secondary) ACM Class: I.2.6; J.1; G.3

  14. arXiv:2003.09291

    cs.LG stat.ML

    Improving Irregularly Sampled Time Series Learning with Dense Descriptors of Time

    Authors: Rafael T. Sousa, Lucas A. Pereira, Anderson S. Soares

    Abstract: Supervised learning with irregularly sampled time series have been a challenge to Machine Learning methods due to the obstacle of dealing with irregular time intervals. Some papers introduced recently recurrent neural network models that deals with irregularity, but most of them rely on complex mechanisms to achieve a better performance. This work propose a novel method to represent timestamps (ho…

    Submitted 20 March, 2020; originally announced March 2020.

  15. arXiv:2003.05377

    cs.CL cs.IR cs.LG stat.ML

    Brazilian Lyrics-Based Music Genre Classification Using a BLSTM Network

    Authors: Raul de Araújo Lima, Rômulo César Costa de Sousa, Simone Diniz Junqueira Barbosa, Hélio Cortês Vieira Lopes

    Abstract: Organize songs, albums, and artists in groups with shared similarity could be done with the help of genre labels. In this paper, we present a novel approach for automatic classifying musical genre in Brazilian music using only the song lyrics. This kind of classification remains a challenge in the field of Natural Language Processing. We construct a dataset of 138,368 Brazilian song lyrics distrib…

    Submitted 6 March, 2020; originally announced March 2020.

    Comments: 7 pages, 4 figures, 3 tables

    MSC Class: 68T50(Primary); 68T05 (Secondary) ACM Class: I.2.7; I.2.6

  16. arXiv:2001.04798

    quant-ph cs.LG stat.ML

    Parametric Probabilistic Quantum Memory

    Authors: Rodrigo S. Sousa, Priscila G. M. dos Santos, Tiago M. L. Veras, Wilson R. de Oliveira, Adenilton J. da Silva

    Abstract: Probabilistic Quantum Memory (PQM) is a data structure that computes the distance from a binary input to all binary patterns stored in superposition on the memory. This data structure allows the development of heuristics to speed up artificial neural networks architecture selection. In this work, we propose an improved parametric version of the PQM to perform pattern classification, and we also pr…

    Submitted 11 January, 2020; originally announced January 2020.

    Journal ref: Neurocomputing 416 (2020): 360-369

  17. arXiv:1911.05024

    cs.CV

    Pose Guided Attention for Multi-label Fashion Image Classification

    Authors: Beatriz Quintino Ferreira, João P. Costeira, Ricardo G. Sousa, Liang-Yan Gui, João P. Gomes

    Abstract: We propose a compact framework with guided attention for multi-label classification in the fashion domain. Our visual semantic attention model (VSAM) is supervised by automatic pose extraction creating a discriminative feature space. VSAM outperforms the state of the art for an in-house dataset and performs on par with previous works on the DeepFashion dataset, even without using any landmark anno…

    Submitted 12 November, 2019; originally announced November 2019.

    Comments: Published at ICCV 2019 Workshop on Computer Vision for Fashion, Art and Design

  18. arXiv:1811.09350

    cs.LG stat.ML

    Predicting Diabetes Disease Evolution Using Financial Records and Recurrent Neural Networks

    Authors: Rafael T. Sousa, Lucas A. Pereira, Anderson S. Soares

    Abstract: Managing patients with chronic diseases is a major and growing healthcare challenge in several countries. A chronic condition, such as diabetes, is an illness that lasts a long time and does not go away, and often leads to the patient's health gradually getting worse. While recent works involve raw electronic health record (EHR) from hospitals, this work uses only financial records from health pla…

    Submitted 20 March, 2020; v1 submitted 22 November, 2018; originally announced November 2018.

    Comments: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:1811.07216

    Report number: ML4H/2018/70

  19. Quantum enhanced cross-validation for near-optimal neural networks architecture selection

    Authors: Priscila G. M. dos Santos, Rodrigo S. Sousa, Ismael C. S. Araujo, Adenilton J. da Silva

    Abstract: This paper proposes a quantum-classical algorithm to evaluate and select classical artificial neural networks architectures. The proposed algorithm is based on a probabilistic quantum memory and the possibility to train artificial neural networks in superposition. We obtain an exponential quantum speedup in the evaluation of neural networks. We also verify experimentally through a reduced experime…

    Submitted 27 August, 2018; originally announced August 2018.

    Journal ref: International Journal of Quantum Information, Volume 16, No. 06, 1840005 (2018)

  20. arXiv:1806.09511

    cs.IR cs.AI cs.CL

    A Hierarchical Deep Learning Natural Language Parser for Fashion

    Authors: José Marcelino, João Faria, Luís Baía, Ricardo Gamelas Sousa

    Abstract: This work presents a hierarchical deep learning natural language parser for fashion. Our proposal intends not only to recognize fashion-domain entities but also to expose syntactic and morphologic insights. We leverage the usage of an architecture of specialist models, each one for a different task (from parsing to entity recognition). Such architecture renders a hierarchical model able to capture…

    Submitted 25 June, 2018; originally announced June 2018.

    Comments: In Proceedings of KDD 2018 (KDD Workshop on AI for Fashion)

  21. arXiv:1806.09445

    cs.CV

    A Unified Model with Structured Output for Fashion Images Classification

    Authors: Beatriz Quintino Ferreira, Luís Baía, João Faria, Ricardo Gamelas Sousa

    Abstract: A picture is worth a thousand words. Albeit a cliché, for the fashion industry, an image of a clothing piece allows one to perceive its category (e.g., dress), sub-category (e.g., day dress) and properties (e.g., white colour with floral patterns). The seasonal nature of the fashion industry creates a highly dynamic and creative domain with evermore data, making it unpractical to manually describe…

    Submitted 25 June, 2018; originally announced June 2018.

    Comments: Accepted in KDD 2018's AI for Fashion workshop

  22. arXiv:1712.02824

    cs.CV

    Stacked Denoising Autoencoders and Transfer Learning for Immunogold Particles Detection and Recognition

    Authors: Ricardo Gamelas Sousa, Jorge M. Santos, Luís M. Silva, Luís A. Alexandre, Tiago Esteves, Sara Rocha, Paulo Monjardino, Joaquim Marques de Sá, Francisco Figueiredo, Pedro Quelhas

    Abstract: In this paper we present a system for the detection of immunogold particles and a Transfer Learning (TL) framework for the recognition of these immunogold particles. Immunogold particles are part of a high-magnification method for the selective localization of biological molecules at the subcellular level only visible through Electron Microscopy. The number of immunogold particles in the cell wall…

    Submitted 7 December, 2017; originally announced December 2017.

  23. arXiv:1712.02159

    cs.LG

    Distribution-Based Categorization of Classifier Transfer Learning

    Authors: Ricardo Gamelas Sousa, Luís A. Alexandre, Jorge M. Santos, Luís M. Silva, Joaquim Marques de Sá

    Abstract: Transfer Learning (TL) aims to transfer knowledge acquired in one problem, the source problem, onto another problem, the target problem, dispensing with the bottom-up construction of the target model. Due to its relevance, TL has gained significant interest in the Machine Learning community since it paves the way to devise intelligent learning models that can easily be tailored to many different a…

    Submitted 6 December, 2017; originally announced December 2017.

  24. arXiv:1703.00856

    cs.CV

    Araguaia Medical Vision Lab at ISIC 2017 Skin Lesion Classification Challenge

    Authors: Rafael Teixeira Sousa, Larissa Vasconcellos de Moraes

    Abstract: This paper describes the participation of Araguaia Medical Vision Lab at the International Skin Imaging Collaboration 2017 Skin Lesion Challenge. We describe the use of deep convolutional neural networks in attempt to classify images of Melanoma and Seborrheic Keratosis lesions. With use of finetuned GoogleNet and AlexNet we attained results of 0.950 and 0.846 AUC on Seborrheic Keratosis and Melan…

    Submitted 2 March, 2017; originally announced March 2017.

    Comments: Abstract submitted as a requirement to ISIC2017 challenge

  25. arXiv:1509.01682

    cs.LO cs.SE

    Bounded Model Checking of C++ Programs Based on the Qt Framework (extended version)

    Authors: Felipe R. M. Sousa, Lucas C. Cordeiro, Eddie B. de Lima Filho

    Abstract: The software development process for embedded systems is getting faster and faster, which generally incurs an increase in the associated complexity. As a consequence, consumer electronics companies usually invest a lot of resources in fast and automatic verification processes, in order to create robust systems and reduce product recall rates. Because of that, the present paper proposes a simplifie…

    Submitted 5 September, 2015; originally announced September 2015.

    Comments: extended version of paper published at GCCE'15

  26. arXiv:1011.3177

    cs.CV

    The Data Replication Method for the Classification with Reject Option

    Authors: Ricardo Sousa, Jaime S. Cardoso

    Abstract: Classification is one of the most important tasks of machine learning. Although the most well studied model is the two-class problem, in many scenarios there is the opportunity to label critical items for manual revision, instead of trying to automatically classify every item. In this paper we adapt a paradigm initially proposed for the classification of ordinal data to address the classification…

    Submitted 15 July, 2011; v1 submitted 13 November, 2010; originally announced November 2010.