PatMatch: a program for finding patterns in peptide and nucleotide sequences

Nucleic Acids Res. 2005 Jul 1;33(Web Server issue):W262-6. doi: 10.1093/nar/gki368.

Abstract

Here, we present PatMatch, an efficient, web-based pattern-matching program for finding short nucleotide or peptide sequences, such as cis-elements in nucleotide sequences or small domains and motifs in protein sequences. The program finds matches to a user-specified sequence pattern, which can be described using ambiguous sequence codes and a flexible pattern syntax based on regular expressions. A recent upgrade has improved performance, and PatMatch now supports both mismatches and wildcards in a single pattern. This enhancement was achieved by replacing the previous search algorithm, scan_for_matches [D'Souza et al. (1997), Trends in Genetics, 13, 497-498], with nondeterministic-reverse grep (NR-grep), a general pattern-matching tool that allows approximate string matching [Navarro (2001), Software Practice and Experience, 31, 1265-1312]. We have tailored NR-grep for DNA and protein searches with PatMatch. The stand-alone version of the software can be adapted for use with any sequence dataset and is available for download from The Arabidopsis Information Resource (TAIR) at ftp://ftp.arabidopsis.org/home/tair/Software/Patmatch/. The PatMatch server is available on the web at http://www.arabidopsis.org/cgi-bin/patmatch/nph-patmatch.pl for searching Arabidopsis thaliana sequences.
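To illustrate the kind of search described above, the following minimal Python sketch expands the standard IUPAC nucleotide ambiguity codes into a regular expression and also shows a naive scan that tolerates a fixed number of mismatches. It is not PatMatch's implementation and does not use NR-grep or the PatMatch pattern syntax; the function names (iupac_to_regex, find_matches, find_with_mismatches) and the example sequences are hypothetical, chosen only for illustration.

```python
import re

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def iupac_to_regex(pattern):
    """Translate an IUPAC pattern into a regex of character classes."""
    return "".join("[" + IUPAC_DNA[code] + "]" for code in pattern.upper())


def find_matches(pattern, sequence):
    """Return (start, matched_text) for every exact match, overlaps included."""
    # A lookahead with a capture group reports overlapping hits as well.
    regex = re.compile("(?=(" + iupac_to_regex(pattern) + "))")
    return [(m.start(1), m.group(1)) for m in regex.finditer(sequence.upper())]


def find_with_mismatches(pattern, sequence, max_mismatches):
    """Naive sliding-window scan allowing substitutions only (assumption:
    no insertions or deletions). Returns (start, window, mismatch_count)."""
    pattern = pattern.upper()
    sequence = sequence.upper()
    hits = []
    for start in range(len(sequence) - len(pattern) + 1):
        window = sequence[start:start + len(pattern)]
        mismatches = sum(
            base not in IUPAC_DNA[code] for code, base in zip(pattern, window)
        )
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits


if __name__ == "__main__":
    # W matches A or T, so both TATAAA and TATATA are reported.
    print(find_matches("TATAWA", "GGTATAAAGGTATATACC"))
    # One substitution allowed: CACTTG is accepted as a hit for CACGTG.
    print(find_with_mismatches("CACGTG", "AACACTTGTT", 1))
```

The sliding-window scan costs time proportional to sequence length times pattern length, which is acceptable for a demonstration; a dedicated approximate-matching tool such as NR-grep is designed to search far more efficiently.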

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Arabidopsis / genetics
  • Arabidopsis Proteins / chemistry
  • DNA, Plant / chemistry
  • Internet
  • Peptides / chemistry*
  • Sequence Analysis, DNA / methods*
  • Sequence Analysis, Protein / methods*
  • Software*
  • User-Computer Interface

Substances

  • Arabidopsis Proteins
  • DNA, Plant
  • Peptides