Automated Alignment and Nomenclature for Consistent Treatment of Polymorphisms in the Human Mitochondrial DNA Control Region
The standard practice for describing variation in mtDNA sequences is to name each sequence by listing only those sites that differ from a reference sequence. Consistency in nomenclature is desirable so that every sequence in a database that is concordant with an evidentiary sequence will be found when estimating the rarity of that profile. The operational alignment and nomenclature rules suggested for this purpose, i.e., the “Wilson Rules,” do not always guarantee a single consistent sequence description for all observed polymorphisms. In this work, the operational alignment/nomenclature rules were reconfigured into a set of hierarchical rules that better reflect traditional user preferences, and the rules for selecting alignments are described. In addition, to avoid human error and to name mtDNA sequence variants more efficiently, a computer-facilitated method of aligning mtDNA sample sequences with a reference sequence was developed. A comparison of the new hierarchical rules with the data in the SWGDAM (Scientific Working Group on DNA Analysis Methods) database found 33 differences, which corresponds to 99.92% consistency between the new rules and the manual historical nomenclature approach. The data support the reliability of the current SWGDAM database. Because the few discrepancies were changed in favor of the new hierarchical rules, the quality of the SWGDAM database is further improved.
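As an illustration of difference-based naming, the following minimal Python sketch lists the sites at which an already aligned sample differs from a reference. It is not the method developed in this work: the function name `name_variants`, the toy sequences, and the output notation (`3A` for a substitution, `4.1C` for an insertion after position 4, `7del` for a deletion) are assumptions chosen for illustration, and the sketch does not select an alignment itself.

```python
def name_variants(ref_aligned: str, sample_aligned: str, start: int = 1) -> list[str]:
    """List differences between a sample and a reference.

    Both strings must already be aligned, with '-' marking gaps.
    `start` is the reference position of the first reference base.
    The notation is illustrative only (hypothetical helper).
    """
    if len(ref_aligned) != len(sample_aligned):
        raise ValueError("aligned sequences must have equal length")

    variants = []
    pos = start - 1   # position of the last reference base consumed
    ins_count = 0     # inserted bases seen since that position

    for ref_base, sample_base in zip(ref_aligned, sample_aligned):
        if ref_base == "-":
            if sample_base == "-":
                continue  # gap in both rows carries no information
            # Insertion: numbered relative to the preceding reference base.
            ins_count += 1
            variants.append(f"{pos}.{ins_count}{sample_base}")
        else:
            pos += 1
            ins_count = 0
            if sample_base == "-":
                variants.append(f"{pos}del")            # deletion
            elif sample_base != ref_base:
                variants.append(f"{pos}{sample_base}")  # substitution
    return variants


# Toy example: one substitution, one insertion, one deletion.
print(name_variants("ACGT-ACGT", "ACATCAC-T"))

# The same sample (ACCT against reference ACCCT) under two equally
# valid alignments receives two different names.
print(name_variants("ACCCT", "A-CCT"))
print(name_variants("ACCCT", "ACC-T"))
```

The last two calls name the same sample differently depending only on where the gap is placed within the run of C bases. This is the kind of ambiguity that alignment selection rules are meant to remove, so that a database search finds every concordant sequence.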
Document Type: Research Article
Publication date: September 1, 2010