Showing 1–32 of 32 results for author: Charalampopoulos, P

  1. arXiv:2408.03610

    cs.DS

    Longest Common Extensions with Wildcards: Trade-off and Applications

    Authors: Gabriel Bathie, Panagiotis Charalampopoulos, Tatiana Starikovskaya

    Abstract: We study the Longest Common Extension (LCE) problem in a string containing wildcards. Wildcards (also called "don't cares" or "holes") are special characters that match any other character in the alphabet, similar to the character "?" in Unix commands or "." in regular expression engines. We consider the problem parametrized by $G$, the number of maximal contiguous groups of wildcards in the inp…

    Submitted 7 August, 2024; originally announced August 2024.

    Comments: Accepted to ESA 2024

  2. Internal Pattern Matching in Small Space and Applications

    Authors: Gabriel Bathie, Panagiotis Charalampopoulos, Tatiana Starikovskaya

    Abstract: In this work, we consider pattern matching variants in small space, that is, in the read-only setting, where we want to bound the space usage on top of storing the strings. Our main contribution is a space-time trade-off for the Internal Pattern Matching (IPM) problem, where the goal is to construct a data structure over a string $S$ of length $n$ that allows one to answer the following type of qu…

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: To be published in CPM 2024

  3. arXiv:2403.06667

    cs.DS math.CO

    Optimal Bounds for Distinct Quartics

    Authors: Panagiotis Charalampopoulos, Paweł Gawrychowski, Samah Ghazawi

    Abstract: A fundamental concept related to strings is that of repetitions. It has been extensively studied in many versions, from both purely combinatorial and algorithmic angles. One of the most basic questions is how many distinct squares, i.e., distinct strings of the form $UU$, a string of length $n$ can contain as fragments. It turns out that this is always $\mathcal{O}(n)$, and the bound cannot be imp…

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: Abstract abridged due to arXiv requirements. 33 pages, 11 figures

  4. arXiv:2402.14550

    cs.DS

    Approximate Circular Pattern Matching under Edit Distance

    Authors: Panagiotis Charalampopoulos, Solon P. Pissis, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń, Wiktor Zuba

    Abstract: In the $k$-Edit Circular Pattern Matching ($k$-Edit CPM) problem, we are given a length-$n$ text $T$, a length-$m$ pattern $P$, and a positive integer threshold $k$, and we are to report all starting positions of the substrings of $T$ that are at edit distance at most $k$ from some cyclic rotation of $P$. In the decision version of the problem, we are to check if any such substring exists. Very re…

    Submitted 22 February, 2024; originally announced February 2024.

    Comments: Full version of a paper accepted to STACS 2024

  5. arXiv:2402.07732

    cs.DS

    Pattern Matching with Mismatches and Wildcards

    Authors: Gabriel Bathie, Panagiotis Charalampopoulos, Tatiana Starikovskaya

    Abstract: In this work, we address the problem of approximate pattern matching with wildcards. Given a pattern $P$ of length $m$ containing $D$ wildcards, a text $T$ of length $n$, and an integer $k$, our objective is to identify all fragments of $T$ within Hamming distance $k$ from $P$. Our primary contribution is an algorithm with runtime $O(n+(D+k)(G+k)\cdot n/m)$ for this problem. Here, $G \le D$ repr…

    Submitted 21 May, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

    Comments: This version contains a fix in the proof of Theorem 3.10 and other minor changes

  6. arXiv:2302.01373

    cs.DS

    Optimal Heaviest Induced Ancestors

    Authors: Panagiotis Charalampopoulos, Bartłomiej Dudek, Paweł Gawrychowski, Karol Pokorski

    Abstract: We revisit the Heaviest Induced Ancestors (HIA) problem that was introduced by Gagie, Gawrychowski, and Nekrich [CCCG 2013] and has a number of applications in string algorithms. Let $T_1$ and $T_2$ be two rooted trees whose nodes have weights that are increasing in all root-to-leaf paths, and labels on the leaves, such that no two leaves of a tree have the same label. A pair of nodes…

    Submitted 2 February, 2023; originally announced February 2023.

  7. arXiv:2208.08915

    cs.DS

    Approximate Circular Pattern Matching

    Authors: Panagiotis Charalampopoulos, Tomasz Kociumaka, Jakub Radoszewski, Solon P. Pissis, Wojciech Rytter, Tomasz Waleń, Wiktor Zuba

    Abstract: We consider approximate circular pattern matching (CPM, in short) under the Hamming and edit distance, in which we are given a length-$n$ text $T$, a length-$m$ pattern $P$, and a threshold $k>0$, and we are to report all starting positions of fragments of $T$ (called occurrences) that are at distance at most $k$ from some cyclic rotation of $P$. In the decision version of the problem, we are to c…

    Submitted 18 August, 2022; originally announced August 2022.

    Comments: Accepted to ESA 2022. Abstract abridged to meet arXiv requirements

  8. arXiv:2204.03087

    cs.DS

    Faster Pattern Matching under Edit Distance

    Authors: Panagiotis Charalampopoulos, Tomasz Kociumaka, Philip Wellnitz

    Abstract: We consider the approximate pattern matching problem under the edit distance. Given a text $T$ of length $n$, a pattern $P$ of length $m$, and a threshold $k$, the task is to find the starting positions of all substrings of $T$ that can be transformed to $P$ with at most $k$ edits. More than 20 years ago, Cole and Hariharan [SODA'98, J. Comput.'02] gave an $\mathcal{O}(n+k^4 \cdot n/ m)$-time algo…

    Submitted 6 April, 2022; originally announced April 2022.

    Comments: 94 pages, 7 figures

  9. arXiv:2106.01763

    cs.DS

    Internal Shortest Absent Word Queries in Constant Time and Linear Space

    Authors: Golnaz Badkobeh, Panagiotis Charalampopoulos, Dmitry Kosolobov, Solon P. Pissis

    Abstract: Given a string $T$ of length $n$ over an alphabet $Σ\subset \{1,2,\ldots,n^{O(1)}\}$ of size $σ$, we are to preprocess $T$ so that given a range $[i,j]$, we can return a representation of a shortest string over $Σ$ that is absent in the fragment $T[i]\cdots T[j]$ of $T$. We present an $O(n)$-space data structure that answers such queries in constant time and can be constructed in $O(n\log_σn)$ tim…

    Submitted 3 June, 2021; originally announced June 2021.

    Comments: 13 pages, 1 figure, 4 tables

  10. arXiv:2105.03106

    cs.DS

    Faster Algorithms for Longest Common Substring

    Authors: Panagiotis Charalampopoulos, Tomasz Kociumaka, Solon P. Pissis, Jakub Radoszewski

    Abstract: In the classic longest common substring (LCS) problem, we are given two strings $S$ and $T$, each of length at most $n$, over an alphabet of size $σ$, and we are asked to find a longest string occurring as a fragment of both $S$ and $T$. Weiner, in his seminal paper that introduced the suffix tree, presented an $\mathcal{O}(n \log σ)$-time algorithm for this problem [SWAT 1973]. For polynomially-b…

    Submitted 7 May, 2021; originally announced May 2021.

  11. arXiv:2103.03294

    cs.DS

    An Almost Optimal Edit Distance Oracle

    Authors: Panagiotis Charalampopoulos, Paweł Gawrychowski, Shay Mozes, Oren Weimann

    Abstract: We consider the problem of preprocessing two strings $S$ and $T$, of lengths $m$ and $n$, respectively, in order to be able to efficiently answer the following queries: Given positions $i,j$ in $S$ and positions $a,b$ in $T$, return the optimal alignment of $S[i \mathinner{.\,.} j]$ and $T[a \mathinner{.\,.} b]$. Let $N=mn$. We present an oracle with preprocessing time $N^{1+o(1)}$ and space…

    Submitted 4 March, 2021; originally announced March 2021.

  12. arXiv:2102.07154

    cs.DS

    Fault-Tolerant Distance Labeling for Planar Graphs

    Authors: Aviv Bar-Natan, Panagiotis Charalampopoulos, Paweł Gawrychowski, Shay Mozes, Oren Weimann

    Abstract: In fault-tolerant distance labeling we wish to assign short labels to the vertices of a graph $G$ such that from the labels of any three vertices $u,v,f$ we can infer the $u$-to-$v$ distance in the graph $G\setminus \{f\}$. We show that any directed weighted planar graph (and in fact any graph in a graph family with $O(\sqrt{n})$-size separators, such as minor-free graphs) admits fault-tolerant di…

    Submitted 14 February, 2021; originally announced February 2021.

  13. arXiv:2006.16137

    cs.DS

    Pattern Masking for Dictionary Matching

    Authors: Panagiotis Charalampopoulos, Huiping Chen, Peter Christen, Grigorios Loukides, Nadia Pisanti, Solon P. Pissis, Jakub Radoszewski

    Abstract: In the Pattern Masking for Dictionary Matching (PMDM) problem, we are given a dictionary $\mathcal{D}$ of $d$ strings, each of length $\ell$, a query string $q$ of length $\ell$, and a positive integer $z$, and we are asked to compute a smallest set $K\subseteq\{1,\ldots,\ell\}$, so that if $q[i]$, for all $i\in K$, is replaced by a wildcard, then $q$ matches at least $z$ strings from…

    Submitted 8 March, 2024; v1 submitted 29 June, 2020; originally announced June 2020.

    Comments: Published in Algorithmica. Abstract abridged due to arXiv requirements

  14. arXiv:2006.15999

    cs.DS cs.DM

    The Number of Repetitions in 2D-Strings

    Authors: Panagiotis Charalampopoulos, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń, Wiktor Zuba

    Abstract: The notions of periodicity and repetitions in strings, and hence those of runs and squares, naturally extend to two-dimensional strings. We consider two types of repetitions in 2D-strings: 2D-runs and quartics (quartics are a 2D-version of squares in standard strings). Amir et al. introduced 2D-runs, showed that there are $O(n^3)$ of them in an $n \times n$ 2D-string and presented a simple constru…

    Submitted 29 June, 2020; originally announced June 2020.

    Comments: To appear in the ESA 2020 proceedings

  15. arXiv:2006.02408

    cs.DS

    Dynamic Longest Common Substring in Polylogarithmic Time

    Authors: Panagiotis Charalampopoulos, Paweł Gawrychowski, Karol Pokorski

    Abstract: The longest common substring problem consists in finding a longest string that appears as a (contiguous) substring of two input strings. We consider the dynamic variant of this problem, in which we are to maintain two dynamic strings $S$ and $T$, each of length at most $n$, that undergo substitutions of letters, in order to be able to return a longest common substring after each substitution. Rece…

    Submitted 3 June, 2020; originally announced June 2020.

    Comments: Full version of a paper that is to appear in the ICALP 2020 proceedings

  16. arXiv:2005.05681

    cs.DS

    Counting Distinct Patterns in Internal Dictionary Matching

    Authors: Panagiotis Charalampopoulos, Tomasz Kociumaka, Manal Mohamed, Jakub Radoszewski, Wojciech Rytter, Juliusz Straszyński, Tomasz Waleń, Wiktor Zuba

    Abstract: We consider the problem of preprocessing a text $T$ of length $n$ and a dictionary $\mathcal{D}$ in order to be able to efficiently answer queries $CountDistinct(i,j)$, that is, given $i$ and $j$ return the number of patterns from $\mathcal{D}$ that occur in the fragment $T[i \mathinner{.\,.} j]$. The dictionary is internal in the sense that each pattern in $\mathcal{D}$ is given as a fragment of…

    Submitted 12 May, 2020; originally announced May 2020.

    Comments: Accepted to CPM 2020

  17. arXiv:2004.08350

    cs.DS

    Faster Approximate Pattern Matching: A Unified Approach

    Authors: Panagiotis Charalampopoulos, Tomasz Kociumaka, Philip Wellnitz

    Abstract: Approximate pattern matching is a natural and well-studied problem on strings: Given a text $T$, a pattern $P$, and a threshold $k$, find (the starting positions of) all substrings of $T$ that are at distance at most $k$ from $P$. We consider the two most fundamental string metrics: the Hamming distance and the edit distance. Under the Hamming distance, we search for substrings of $T$ that have at…

    Submitted 16 November, 2020; v1 submitted 17 April, 2020; originally announced April 2020.

    Comments: 74 pages, 7 figures, FOCS'20

  18. arXiv:1909.11577

    cs.DS

    Internal Dictionary Matching

    Authors: Panagiotis Charalampopoulos, Tomasz Kociumaka, Manal Mohamed, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń

    Abstract: We introduce data structures answering queries concerning the occurrences of patterns from a given dictionary $\mathcal{D}$ in fragments of a given string $T$ of length $n$. The dictionary is internal in the sense that each pattern in $\mathcal{D}$ is given as a fragment of $T$. This way, $\mathcal{D}$ takes space proportional to the number of patterns $d=|\mathcal{D}|$ rather than their total len…

    Submitted 25 September, 2019; originally announced September 2019.

    Comments: A short version of this paper was accepted for presentation at ISAAC 2019

  19. arXiv:1909.11433

    cs.DS

    Weighted Shortest Common Supersequence Problem Revisited

    Authors: Panagiotis Charalampopoulos, Tomasz Kociumaka, Solon P. Pissis, Jakub Radoszewski, Wojciech Rytter, Juliusz Straszyński, Tomasz Waleń, Wiktor Zuba

    Abstract: A weighted string, also known as a position weight matrix, is a sequence of probability distributions over some alphabet. We revisit the Weighted Shortest Common Supersequence (WSCS) problem, introduced by Amir et al. [SPIRE 2011], that is, the SCS problem on weighted strings. In the WSCS problem, we are given two weighted strings $W_1$ and $W_2$ and a threshold $\mathit{Freq}$ on probability, and…

    Submitted 25 September, 2019; originally announced September 2019.

    Comments: Accepted to SPIRE'19

  20. arXiv:1907.01815

    cs.DS

    Circular Pattern Matching with $k$ Mismatches

    Authors: Panagiotis Charalampopoulos, Tomasz Kociumaka, Solon P. Pissis, Jakub Radoszewski, Wojciech Rytter, Juliusz Straszyński, Tomasz Waleń, Wiktor Zuba

    Abstract: The $k$-mismatch problem consists in computing the Hamming distance between a pattern $P$ of length $m$ and every length-$m$ substring of a text $T$ of length $n$, if this distance is no more than $k$. In many real-world applications, any cyclic rotation of $P$ is a relevant pattern, and thus one is interested in computing the minimal distance of every length-$m$ substring of $T$ and any cyclic ro…

    Submitted 13 January, 2020; v1 submitted 3 July, 2019; originally announced July 2019.

    Comments: Extended version of a paper from FCT 2019

  21. arXiv:1811.01551

    cs.DS

    Almost Optimal Distance Oracles for Planar Graphs

    Authors: Panagiotis Charalampopoulos, Paweł Gawrychowski, Shay Mozes, Oren Weimann

    Abstract: We present new tradeoffs between space and query-time for exact distance oracles in directed weighted planar graphs. These tradeoffs are almost optimal in the sense that they are within polylogarithmic, sub-polynomial or arbitrarily small polynomial factors from the naïve linear space, constant query-time lower bound. These tradeoffs include: (i) an oracle with space $\tilde{O}(n^{1+ε})$ and query…

    Submitted 5 November, 2018; originally announced November 2018.

  22. arXiv:1807.11702

    cs.DS

    Efficient Computation of Sequence Mappability

    Authors: Panagiotis Charalampopoulos, Costas S. Iliopoulos, Tomasz Kociumaka, Solon P. Pissis, Jakub Radoszewski, Juliusz Straszyński

    Abstract: In the $(k,m)$-mappability problem, for a given sequence $T$ of length $n$, the goal is to compute a table whose $i$th entry is the number of indices $j \ne i$ such that the length-$m$ substrings of $T$ starting at positions $i$ and $j$ have at most $k$ mismatches. Previous works on this problem focused on heuristics computing a rough approximation of the result or on the case of $k=1$. We present…

    Submitted 16 June, 2021; v1 submitted 31 July, 2018; originally announced July 2018.

    Comments: Accepted to SPIRE 2018

    ACM Class: F.2.2
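
    Illustration (not from the paper): the $(k,m)$-mappability table defined in the abstract above can be computed directly from the definition. The sketch below is a naive baseline that compares every pair of windows; the function name `mappability` and the 0-based positions are assumptions made for this example, and it is not the algorithm the paper presents.

```python
def mappability(text: str, k: int, m: int) -> list[int]:
    """Naive (k, m)-mappability table, computed straight from the definition.

    Entry i counts the start positions j != i whose length-m window is
    within Hamming distance k of the window starting at i (0-based).
    Hypothetical helper for illustration only; quadratic in the window count.
    """
    starts = range(len(text) - m + 1)  # empty when m > len(text)
    table = []
    for i in starts:
        count = 0
        for j in starts:
            if j == i:
                continue
            # Hamming distance between the two length-m windows.
            mismatches = sum(
                a != b for a, b in zip(text[i:i + m], text[j:j + m])
            )
            if mismatches <= k:
                count += 1
        table.append(count)
    return table
```

    For instance, `mappability("abab", 0, 2)` compares the windows "ab", "ba", "ab" and reports that only the two copies of "ab" match each other.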

  23. arXiv:1807.05968

    cs.DS

    Exact Distance Oracles for Planar Graphs with Failing Vertices

    Authors: Panagiotis Charalampopoulos, Shay Mozes, Benjamin Tebeka

    Abstract: We consider exact distance oracles for directed weighted planar graphs in the presence of failing vertices. Given a source vertex $u$, a target vertex $v$ and a set $X$ of $k$ failed vertices, such an oracle returns the length of a shortest $u$-to-$v$ path that avoids all vertices in $X$. We propose oracles that can handle any number $k$ of failures. We show several tradeoffs between space, query…

    Submitted 30 August, 2021; v1 submitted 16 July, 2018; originally announced July 2018.

    Comments: Improved space vs. query time tradeoffs

  24. arXiv:1806.02718

    cs.DS cs.FL

    Alignment-free sequence comparison using absent words

    Authors: Panagiotis Charalampopoulos, Maxime Crochemore, Gabriele Fici, Robert Mercas, Solon P. Pissis

    Abstract: Sequence comparison is a prerequisite to virtually all comparative genomic analyses. It is often realised by sequence alignment techniques, which are computationally expensive. This has led to increased research into alignment-free techniques, which are based on measures referring to the composition of sequences in terms of their constituent patterns. These measures, such as $q$-gram distance, are…

    Submitted 7 June, 2018; originally announced June 2018.

    Comments: Extended version of "Linear-Time Sequence Comparison Using Minimal Absent Words & Applications", Proc. LATIN 2016, arXiv:1506.04917

  25. arXiv:1804.08731

    cs.DS

    Longest Common Substring Made Fully Dynamic

    Authors: Amihood Amir, Panagiotis Charalampopoulos, Solon P. Pissis, Jakub Radoszewski

    Abstract: In the longest common substring (LCS) problem, we are given two strings $S$ and $T$, each of length at most $n$, and we are asked to find a longest string occurring as a fragment of both $S$ and $T$. This is a classical and well-studied problem in computer science with a known $\mathcal{O}(n)$-time solution. In the fully dynamic version of the problem, edit operations are allowed in either of the…

    Submitted 16 July, 2018; v1 submitted 23 April, 2018; originally announced April 2018.
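
    Illustration (not from the paper): the static LCS problem stated in the abstract above can be solved with a textbook dynamic program. The sketch below takes time proportional to $|S| \cdot |T|$, so it is neither the $\mathcal{O}(n)$-time solution the abstract mentions nor the paper's dynamic data structure; the function name `longest_common_substring` is an assumption made for this example.

```python
def longest_common_substring(s: str, t: str) -> str:
    """Return one longest string occurring as a fragment of both s and t.

    Textbook quadratic-time dynamic program, for illustration only:
    cur[j] is the length of the longest common suffix of s[:i] and t[:j].
    """
    best_len, best_end = 0, 0          # best match is s[best_end - best_len:best_end]
    prev = [0] * (len(t) + 1)
    for i in range(1, len(s) + 1):
        cur = [0] * (len(t) + 1)
        for j in range(1, len(t) + 1):
            if s[i - 1] == t[j - 1]:
                # Extend the common suffix ending just before these letters.
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return s[best_end - best_len:best_end]
```

    Only two rows of the table are kept at a time, so the extra space is linear in the length of the second string.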

  26. arXiv:1802.06369

    cs.DS

    Linear-Time Algorithm for Long LCF with $k$ Mismatches

    Authors: Panagiotis Charalampopoulos, Maxime Crochemore, Costas S. Iliopoulos, Tomasz Kociumaka, Solon P. Pissis, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń

    Abstract: In the Longest Common Factor with $k$ Mismatches (LCF$_k$) problem, we are given two strings $X$ and $Y$ of total length $n$, and we are asked to find a pair of maximal-length factors, one of $X$ and the other of $Y$, such that their Hamming distance is at most $k$. Thankachan et al. show that this problem can be solved in $\mathcal{O}(n \log^k n)$ time and $\mathcal{O}(n)$ space for constant $k$.…

    Submitted 18 February, 2018; originally announced February 2018.

    Comments: Submitted to CPM 2018

  27. arXiv:1801.04425

    cs.DS

    Longest Common Prefixes with $k$-Errors and Applications

    Authors: Lorraine A. K. Ayad, Panagiotis Charalampopoulos, Costas S. Iliopoulos, Solon P. Pissis

    Abstract: Although real-world text datasets, such as DNA sequences, are far from being uniformly random, average-case string searching algorithms perform significantly better than worst-case ones in most applications of interest. In this paper, we study the problem of computing the longest prefix of each suffix of a given string of length $n$ over a constant-sized alphabet that occurs elsewhere in the strin…

    Submitted 13 January, 2018; originally announced January 2018.

  28. arXiv:1705.04589

    cs.DS

    How to answer a small batch of RMQs or LCA queries in practice

    Authors: Mai Alzamel, Panagiotis Charalampopoulos, Costas S. Iliopoulos, Solon P. Pissis

    Abstract: In the Range Minimum Query (RMQ) problem, we are given an array $A$ of $n$ numbers and we are asked to answer queries of the following type: for indices $i$ and $j$ between $0$ and $n-1$, query $\text{RMQ}_A(i,j)$ returns the index of a minimum element in the subarray $A[i..j]$. Answering a small batch of RMQs is a core computational task in many real-world applications, in particular due to the c…

    Submitted 12 May, 2017; originally announced May 2017.

    Comments: Accepted to IWOCA 2017
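
    Illustration (not from the paper): the query $\text{RMQ}_A(i,j)$ defined in the abstract above can be answered with a standard sparse table, which spends $O(n \log n)$ time and space on preprocessing and then answers each query in constant time. This is a generic textbook technique, not the batch method the paper proposes; the names `build_sparse_table` and `rmq` are assumptions made for this example.

```python
def build_sparse_table(a: list) -> list[list[int]]:
    """table[p][i] is the index of a minimum of a[i : i + 2**p]."""
    n = len(a)
    table = [list(range(n))]           # windows of length 1
    p = 1
    while (1 << p) <= n:
        prev, half = table[-1], 1 << (p - 1)
        row = []
        for i in range(n - (1 << p) + 1):
            # Combine the two half-length windows covering a[i : i + 2**p].
            left, right = prev[i], prev[i + half]
            row.append(left if a[left] <= a[right] else right)
        table.append(row)
        p += 1
    return table


def rmq(a: list, table: list[list[int]], i: int, j: int) -> int:
    """Index of a minimum element of a[i..j] (inclusive), 0 <= i <= j < len(a)."""
    p = (j - i + 1).bit_length() - 1   # largest power of two fitting in the range
    left, right = table[p][i], table[p][j - (1 << p) + 1]
    return left if a[left] <= a[right] else right
```

    A query covers its range with two possibly overlapping power-of-two windows, which is safe because taking a minimum twice over the overlap does not change the answer.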

  29. arXiv:1705.04022

    cs.DS

    Faster algorithms for 1-mappability of a sequence

    Authors: Mai Alzamel, Panagiotis Charalampopoulos, Costas S. Iliopoulos, Solon P. Pissis, Jakub Radoszewski, Wing-Kin Sung

    Abstract: In the k-mappability problem, we are given a string x of length n and integers m and k, and we are asked to count, for each length-m factor y of x, the number of other factors of length m of x that are at Hamming distance at most k from y. We focus here on the version of the problem where k = 1. The fastest known algorithm for k = 1 requires time O(mn log n/ log log n) and space O(n). We present t…

    Submitted 11 May, 2017; originally announced May 2017.

  30. arXiv:1705.03385

    cs.DS

    Optimal Computation of Overabundant Words

    Authors: Yannis Almirantis, Panagiotis Charalampopoulos, Jia Gao, Costas S. Iliopoulos, Manal Mohamed, Solon P. Pissis, Dimitris Polychronopoulos

    Abstract: The observed frequency of the longest proper prefix, the longest proper suffix, and the longest infix of a word $w$ in a given sequence $x$ can be used for classifying $w$ as avoided or overabundant. The definitions used for the expectation and deviation of $w$ in this statistical model were described and biologically justified by Brendel et al. (J Biomol Struct Dyn 1986). We have very recently in…

    Submitted 9 May, 2017; originally announced May 2017.

  31. arXiv:1703.08931

    cs.DS

    Palindromic Decompositions with Gaps and Errors

    Authors: Michał Adamczyk, Mai Alzamel, Panagiotis Charalampopoulos, Costas S. Iliopoulos, Jakub Radoszewski

    Abstract: Identifying palindromes in sequences has been an interesting line of research in combinatorics on words and also in computational biology, after the discovery of the relation of palindromes in the DNA sequence with the HIV virus. Efficient algorithms for the factorization of sequences into palindromes and maximal palindromes have been devised in recent years. We extend these studies by allowing ga…

    Submitted 27 March, 2017; originally announced March 2017.

    Comments: Accepted to CSR 2017

  32. arXiv:1604.08760

    cs.DS

    Optimal Computation of Avoided Words

    Authors: Yannis Almirantis, Panagiotis Charalampopoulos, Jia Gao, Costas S. Iliopoulos, Manal Mohamed, Solon P. Pissis, Dimitris Polychronopoulos

    Abstract: The deviation of the observed frequency of a word $w$ from its expected frequency in a given sequence $x$ is used to determine whether or not the word is avoided. This concept is particularly useful in DNA linguistic analysis. The value of the standard deviation of $w$, denoted by $std(w)$, effectively characterises the extent of a word by its edge contrast in the context in which it occurs. A wor…

    Submitted 29 April, 2016; originally announced April 2016.