Bayesian adaptive sequence alignment algorithms
Source: Bioinformatics, Volume 14, Number 1, February 1998 , pp. 25-39(15)
Publisher: Oxford University Press
Abstract:The selection of a scoring matrix and gap penalty parameters continues to be an important problem in sequence alignment. We describe here an algorithm, the `Bayes block aligner, which bypasses this requirement. Instead of requiring a fixed set of parameter settings, this algorithm returns the Bayesian posterior probability for the number of gaps and for the scoring matrices in any series of interest. Furthermore, instead of returning the single best alignment for the chosen parameter settings, this algorithm returns the posterior distribution of all alignments considering the full range of gapping and scoring matrices selected, weighing each in proportion to its probability based on the data. We compared the Bayes aligner with the popular SmithWaterman algorithm with parameter settings from the literature which had been optimized for the identification of structural neighbors, and found that the Bayes aligner correctly identified more structural neighbors. In a detailed examination of the alignment of a pair of kinase and a pair of GTPase sequences, we illustrate the algorithm's potential to identify subsequences that are conserved to different degrees. In addition, this example shows that the Bayes aligner returns an alignment-free assessment of the distance between a pair of sequences.Availability: Software is available at http://www.wadsworth.org/res&res/bioinfo/Contact: junzhu, email@example.com, firstname.lastname@example.org
Document Type: Research article
Publication date: 1998-02-01
- The leading journal in its field, Bioinformatics publishes the highest quality scientific papers and review articles of interest to academic and industrial researchers. Its main focus is on new developments in genome bioinformatics and computational biology. Two distinct sections within the journal - Discovery Notes and Application Notes- focus on shorter papers; the former reporting biologically interesting discoveries using computational methods, the latter exploring the applications used for experiments.