Gene Set Enrichment Analysis (GSEA) for Interpreting Gene Expression Profiles
Gene set enrichment analysis (GSEA) is a statistical method for determining whether predefined sets of genes are differentially expressed between phenotypes. A predefined gene set may consist of genes in a known metabolic pathway, genes located in the same cytogenetic band, genes sharing the same Gene Ontology category, or any user-defined set. In microarray experiments where no single gene shows statistically significant differential expression between phenotypes, GSEA has identified significantly differentially expressed gene sets, even where the average difference in expression between two phenotypes is only 20% for genes in the set. The gene set identified in the first GSEA analysis (oxidative phosphorylation genes differentially expressed in diabetic versus non-diabetic patients) was subsequently confirmed by independent laboratory studies published in the New England Journal of Medicine. Since the first paper on GSEA was published, many extensions and alternative methods have been described in the literature. In this paper, we describe the original GSEA algorithm, subsequent extensions and alternatives, results of some applications, some limitations of the methods and caveats for users, and possible future research directions. GSEA and related methods are complementary to conventional single-gene methods. Single-gene methods work best when individual genes have large effects and the variance within each phenotype is small. GSEA is likely to be more powerful than conventional single-gene methods for studying the large number of common diseases in which many genes each make subtle contributions. It is a tool that deserves a place in the toolbox of bioinformatics practitioners.
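To make the idea of set-level enrichment concrete, the following is a minimal sketch of a Kolmogorov-Smirnov-style running-sum enrichment score, one common way such a statistic is formulated. The abstract does not give the exact statistic, so the function name `enrichment_score`, the unweighted step sizes, and the toy gene identifiers are all assumptions made for illustration, not the authors' implementation. The sketch also assumes the ranked list contains no duplicate genes.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score (illustrative sketch).

    ranked_genes: gene identifiers ordered from most to least associated
                  with the phenotype (assumed unique).
    gene_set:     collection of identifiers forming the predefined set.
    Returns the running-sum value with the largest absolute deviation
    from zero; positive means the set clusters near the top of the list.
    """
    members = set(gene_set) & set(ranked_genes)
    n_hit = len(members)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list only partially")

    hit_step = 1.0 / n_hit    # assumed step up for a gene in the set
    miss_step = 1.0 / n_miss  # assumed step down for a gene outside the set

    running = 0.0
    extreme = 0.0
    for gene in ranked_genes:
        running += hit_step if gene in members else -miss_step
        if abs(running) > abs(extreme):
            extreme = running
    return extreme


# Hypothetical toy ranking: g1 is most associated with the phenotype.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
top_score = enrichment_score(ranked, {"g1", "g2"})
bottom_score = enrichment_score(ranked, {"g5", "g6"})
print(top_score, bottom_score)
```

A set concentrated at the top of the ranking scores strongly positive and one concentrated at the bottom scores strongly negative, even if no single gene stands out on its own. In practice the significance of such a score is typically assessed by permutation, which this sketch omits.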
Document Type: Research Article
Affiliations: Biomedical Informatics Program, MSOB X-215, 251 Campus Dr., Stanford, CA 94305, USA.
Publication date: May 1, 2007