Thanks to the availability of high-throughput omics data, bioinformatics approaches can hypothesize previously undocumented genetic interactions. However, because these data are noisy, inferences based on a single data source are often unreliable. A popular way to overcome this problem is to integrate different data sources. In this study, we describe DISTILLER, a novel data integration framework that simultaneously analyzes microarray and motif information to find modules consisting of genes that are co-expressed in a subset of conditions, together with their corresponding regulators. By applying our method to publicly available data, we evaluated the condition-specific transcriptional network of Escherichia coli. DISTILLER confirmed 62% of 736 interactions described in RegulonDB and predicted 278 novel interactions.
Keywords: frequent itemset mining
Document Type: Research Article
Department of Electrical Engineering, Katholieke Universiteit Leuven, Leuven, Belgium
Department of Microbial and Molecular Systems, Katholieke Universiteit Leuven, Leuven, Belgium
March 1, 2009