Showing 1–5 of 5 results for author: Pratas, D

  1. arXiv:2401.01786 [pdf, other]

    cs.IT q-bio.GN

    An experimental sorting method for improving metagenomic data encoding

    Authors: Diogo Pratas, Armando J. Pinho

    Abstract: Minimizing data storage poses a significant challenge in large-scale metagenomic projects. In this paper, we present a new method for improving the encoding of FASTQ files generated by metagenomic sequencing. This method incorporates metagenomic classification followed by a recursive filter for clustering reads by DNA sequence similarity to improve the overall reference-free compression. In the re…

    Submitted 3 January, 2024; originally announced January 2024.

  2. Automatic analysis of artistic paintings using information-based measures

    Authors: Jorge Miguel Silva, Diogo Pratas, Rui Antunes, Sérgio Matos, Armando J. Pinho

    Abstract: The artistic community is increasingly relying on automatic computational analysis for authentication and classification of artistic paintings. In this paper, we identify hidden patterns and relationships present in artistic paintings by analysing their complexity, a measure that quantifies the sum of characteristics of an object. Specifically, we apply Normalized Compression (NC) and the Block De…

    Submitted 2 February, 2021; originally announced February 2021.

    Comments: Website: http://panther.web.ua.pt; 24 pages (19 pages article, 5 pages supplementary material)

    Journal ref: Pattern Recognition (2021) 107864

  3. Extended-Alphabet Finite-Context Models

    Authors: João M. Carvalho, Susana Brás, Diogo Pratas, Jacqueline Ferreira, Sandra C. Soares, Armando J. Pinho

    Abstract: The Normalized Relative Compression (NRC) is a recent dissimilarity measure, related to the Kolmogorov Complexity. It has been successfully used in different applications, such as DNA sequences, images, or even ECG (electrocardiographic) signals. It uses a compressor that compresses a target string using exclusively the information contained in a reference string. One possible approach is to use finite…

    Submitted 15 March, 2018; v1 submitted 21 September, 2017; originally announced September 2017.

  4. arXiv:1401.4725 [pdf, ps, other]

    q-bio.GN cs.IT

    Information profiles for DNA pattern discovery

    Authors: Armando J. Pinho, Diogo Pratas, Paulo J. S. G. Ferreira

    Abstract: Finite-context modeling is a powerful tool for compressing and hence for representing DNA sequences. We describe an algorithm to detect genomic regularities, within a blind discovery strategy. The algorithm uses information profiles built using suitable combinations of finite-context models. We used the genome of the fission yeast Schizosaccharomyces pombe strain 972 h- for illustration, unveillin…

    Submitted 19 January, 2014; originally announced January 2014.

    Comments: Full version of DCC 2014 paper "Information profiles for DNA pattern discovery"

  5. arXiv:1401.4134 [pdf, ps, other]

    q-bio.GN cs.IT

    A conditional compression distance that unveils insights of the genomic evolution

    Authors: Diogo Pratas, Armando J. Pinho

    Abstract: We describe a compression-based distance for genomic sequences. Instead of using the usual conjoint information content, as in the classical Normalized Compression Distance (NCD), it uses the conditional information content. To compute this Normalized Conditional Compression Distance (NCCD), we need a normal conditional compressor, that we built using a mixture of static and dynamic finite-context…

    Submitted 16 January, 2014; originally announced January 2014.

    Comments: Full version of DCC 2014 paper "A conditional compression distance that unveils insights of the genomic evolution"
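
For readers unfamiliar with the classical Normalized Compression Distance (NCD) that the last abstract uses as its baseline, the following is a minimal sketch of the standard formula, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(.) is a compressed size. It assumes zlib as a general-purpose stand-in compressor; it is not the finite-context-model compressor the authors describe, and it does not implement the conditional variant (NCCD). The helper names and the synthetic sequences are hypothetical and serve only as illustration.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed input (a stand-in for C(.))."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Classical NCD: (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)  # "conjoint" content: compress the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)


def random_dna(n: int, rng: random.Random) -> bytes:
    """Hypothetical helper: a synthetic sequence of n bases drawn uniformly from ACGT."""
    return "".join(rng.choice("ACGT") for _ in range(n)).encode()


rng = random.Random(0)
x = random_dna(2000, rng)

# y is x with 20 positions redrawn at random, so it stays close to x.
mutated = bytearray(x)
for pos in rng.sample(range(len(mutated)), 20):
    mutated[pos] = ord(rng.choice("ACGT"))
y = bytes(mutated)

# z is an unrelated synthetic sequence of the same length.
z = random_dna(2000, rng)

print("NCD(x, y) =", ncd(x, y))  # near-identical pair: expected to be the smaller value
print("NCD(x, z) =", ncd(x, z))  # unrelated pair: expected to be the larger value
```

With a general-purpose compressor the values are only approximate, and the match window limits how long the inputs can usefully be. This is one reason specialized models are preferred for genomic data.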