Bioinformatic Standards for Proteomics-Oriented Mass Spectrometry
A major goal of proteomics is the complete description of all the proteins present in cells, tissues and biological fluids. The method of choice for identifying and characterizing proteins for such purposes is protease digestion coupled with mass spectrometry (MS) and subsequent protein sequence database searching. New software tools that increase the sensitivity and specificity of MS-based protein identification, along with methods for evaluating the validity of peptide-mass spectrum matches, have been developed, and existing software has generally been improved. However, with the ongoing rapid increase in both the volume and the fragmentation of publicly available MS protein data, the development and adoption of data standards have become pivotal to the realization of integrated systems biology investigations. Unfortunately, the native data formats used by each type of mass spectrometer, each database search engine, and each public database currently differ. The diverse, nontransparent nature of these proprietary data structures complicates the necessary data integration and data comparison across experiments. To overcome this problem, data standards based on the Extensible Markup Language (XML) have been developed. To date, the most comprehensive standardization efforts have been conducted in parallel by the Institute for Systems Biology (mzXML, PepXML, ProtXML) and the Proteomics Standards Initiative (mzData, PSI-MI). Their standards eliminate the need to support multiple input formats and significantly facilitate the exchange and publication of MS-based proteomic data. In this article, we also discuss the standards used for biological proteomic data representation in order to facilitate interpretation and dissemination of research results.
Document Type: Research Article
Affiliations: Health and Environment Unit / Eastern Quebec Proteomic Center, Laval University Medical Research Center (CHUL), 2705 Boul. Laurier, Ste-Foy, Quebec, Canada, G1V 4G2.
Publication date: 01 July 2006
- Current Proteomics: Research in the emerging field of proteomics is growing at an extremely rapid rate. The principal aim of Current Proteomics is to publish well-timed review articles in this fast-expanding area on topics relevant and significant to the development of proteomics. Current Proteomics is an essential journal for everyone involved in proteomics and related fields in both academia and industry.