Motivation: Most multiple sequence alignment programs use heuristics that sometimes introduce errors into the alignment. The most commonly used methods to correct these errors use iterative techniques to maximize an objective function. We present here an alternative, knowledge-based approach that combines a number of recently developed methods into a two-step refinement process. The alignment is divided horizontally and vertically to form a 'lattice' in which well-aligned regions can be differentiated. Alignment correction is then restricted to the less reliable regions, leading to a more reliable and efficient refinement strategy.
Results: The accuracy and reliability of RASCAL are demonstrated using: (i) alignments from the BAliBASE benchmark database, where significant improvements were often observed, with no deterioration of the existing high-quality regions; (ii) a large-scale study involving 946 alignments from the ProDom protein domain database, where alignment quality was increased in 68% of the cases; and (iii) an automatic pipeline to obtain a high-quality alignment of 695 full-length nuclear receptor proteins, which took 11 min on a DEC Alpha 6100 computer.
Availability: RASCAL is available at ftp://ftp-igbmc.u-strasbg.fr/pub/RASCAL.
Supplementary information: http://bioinfo-igbmc.u-strasbourg.fr/BioInfo/RASCAL/paper/rascal_supp.html