Skip to main content

Robust detection and identification of sparse segments in ultrahigh dimensional data analysis

Buy Article:

$43.00 plus tax (Refund Policy)

Summary.  Copy number variants (CNVs) are alternations of DNA of a genome that result in the cell having less or more than two copies of segments of the DNA. CNVs correspond to relatively large regions of the genome, ranging from about one kilobase to several megabases, that are deleted or duplicated. Motivated by CNV analysis based on next generation sequencing data, we consider the problem of detecting and identifying sparse short segments hidden in a long linear sequence of data with an unspecified noise distribution. We propose a computationally efficient method that provides a robust and near optimal solution for segment identification over a wide range of noise distributions. We theoretically quantify the conditions for detecting the segment signals and show that the method near optimally estimates the signal segments whenever it is possible to detect their existence. Simulation studies are carried out to demonstrate the efficiency of the method under various noise distributions. We present results from a CNV analysis of a HapMap Yoruban sample to illustrate the theory and the methods further.
No References
No Citations
No Supplementary Data
No Article Media
No Metrics

Document Type: Research Article

Affiliations: University of Pennsylvania, Philadelphia, USA

Publication date: 2012-11-01

  • Access Key
  • Free content
  • Partial Free content
  • New content
  • Open access content
  • Partial Open access content
  • Subscribed content
  • Partial Subscribed content
  • Free trial content
Cookie Policy
Cookie Policy
Ingenta Connect website makes use of cookies so as to keep track of data that you have filled in. I am Happy with this Find out more