The Role of the COG Database in Comparative and Functional Genomics
A major breakthrough in classifying proteins from different microbial genomes by sequence similarity was the development of the COG concept by Tatusov et al. in 1997. The authors defined Clusters of Orthologous Groups of proteins (COGs) by strictly applying all-against-all BLAST alignments to protein sequences from completely sequenced microbial genomes. The latest update of the COG database covered 66 microbial genomes and additionally included the KOG database, an equivalent built from seven eukaryotic genomes. Although the authors initially provided excellent web-based software tools for analyzing this huge amount of data, many other groups have independently developed more specialized or extended programs that use COG data for diverse purposes. Here, a brief introduction to the concept behind COGs is given, and their potential in comparative and functional genomics is discussed. The review then focuses on the multitude of recently developed web services for mining the COG database and addresses their capabilities for solving diverse problems in biochemistry. To illustrate the broad field of possible applications, a compilation of recently published findings that draw on comparative genomics, with emphasis on data retrieved from the COG database, is given.
Document Type: Research Article
Affiliations: Institute of Neurobiochemistry, The Protein Chemistry Group, Witten/Herdecke University, Stockumer Strasse 10, 58448 Witten, Germany.
Publication date: 01 August 2006
Current Bioinformatics aims to publish all the latest and outstanding developments in bioinformatics. Each issue contains a series of timely, in-depth reviews written by leaders in the field, covering a wide range of topics in the integration of biology with computer and information science.
The journal focuses on reviews of advances in computational molecular/structural biology, encompassing areas such as computing in biomedicine and genomics, computational proteomics and systems biology, and metabolic pathway engineering. Developments in these fields have direct implications for key issues related to health care, medicine, genetic disorders, development of agricultural products, renewable energy, environmental protection, etc.
Current Bioinformatics is an essential journal for all academic and industrial researchers who want expert knowledge on all major advances in bioinformatics.