TomExpress, a unified tomato RNA‐Seq platform for visualization of expression data, clustering and correlation networks
The TomExpress platform was developed to provide the tomato research community with a browser and integrated web tools for visualizing and mining public RNA‐Seq data. To avoid the major biases that can result from the use of different mapping and statistical processing methods, raw RNA‐Seq sequence data available in public databases were remapped onto a single tomato reference genome sequence and post‐processed using the same pipeline with accurate parameters. Following calculation of the number of counts per gene in each RNA‐Seq sample, a common global normalization method was applied to all expression values, which unifies the whole set of expression data and makes the values comparable. A database was designed in which each expression value is associated with the corresponding experimental annotations. Sample details were manually curated to be easily understandable by biologists. To make the data easily searchable, a user‐friendly web interface was developed that provides versatile data mining tools through on‐the‐fly generation of output graphics, such as expression bar plots, comprehensive in planta representations and heatmaps of hierarchically clustered expression data. In addition, the interface allows the identification of co‐expressed genes and the visualization of correlation networks of co‐regulated gene groups. TomExpress provides one of the most complete free resources of publicly available tomato RNA‐Seq data and allows immediate interrogation of the transcriptional programs that regulate vegetative and reproductive development in tomato under diverse conditions. The design of the pipeline enables easy updating of the database with newly published RNA‐Seq data, thereby allowing continuous enrichment of the resource.
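The normalization and co‐expression steps described above can be illustrated with a minimal sketch. It is not the TomExpress implementation: the abstract does not name the normalization method or the correlation measure, so counts‐per‐million scaling with a log2 transform and Pearson correlation are assumed here purely for illustration, and the gene identifiers, count values and the 0.9 threshold are hypothetical.

```python
import numpy as np


def normalize_counts(counts):
    """Scale each sample (column) to counts per million, then apply log2(x + 1).

    Assumption: CPM + log2 stands in for the unspecified global normalization.
    """
    counts = np.asarray(counts, dtype=float)
    library_sizes = counts.sum(axis=0)      # total counts per sample
    cpm = counts / library_sizes * 1e6      # broadcast over columns
    return np.log2(cpm + 1.0)


def coexpression_edges(expr, gene_ids, threshold=0.9):
    """Return (gene_a, gene_b, r) for gene pairs with Pearson r >= threshold."""
    corr = np.corrcoef(expr)                # rows are genes, columns are samples
    edges = []
    for i in range(len(gene_ids)):
        for j in range(i + 1, len(gene_ids)):
            if corr[i, j] >= threshold:
                edges.append((gene_ids[i], gene_ids[j], float(corr[i, j])))
    return edges


# Hypothetical raw counts: 3 genes (rows) across 4 samples (columns).
gene_ids = ["geneA", "geneB", "geneC"]
counts = [
    [10, 20, 40, 80],   # geneA rises across samples
    [5, 10, 20, 40],    # geneB follows geneA at half the level
    [80, 40, 20, 10],   # geneC falls across samples
]

expr = normalize_counts(counts)
edges = coexpression_edges(expr, gene_ids, threshold=0.9)
for gene_a, gene_b, r in edges:
    print(f"{gene_a} -- {gene_b}: r = {r:.3f}")
```

Each returned tuple corresponds to one edge of a correlation network. Because every sample is processed by the same function, the resulting values can be compared across experiments, which is the point of applying a single pipeline to all datasets.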