A Novel Feature Vector for Short Coding Region Recognition of Human Gene
Abstract: Protein coding region recognition has been one of the main topics in computational biology. In this study, we address the problem with new features. We derived two new features by integrating information on the distribution of stop codons with information on base compositional bias. We also introduce pseudo-base composition features, which capture the interactions between bases at different positions. The accuracy of the algorithm was tested on a large database of human genes. The average accuracy achieved with three new features was as high as 92.73% for fragments 192 base pairs long, and the accuracy with 15 features reached 95.65% at the same length. We find that combining the two new features with the pseudo-base composition features improves the accuracy of coding region recognition.
Document Type: Research Article
Publication date: 2012-01-01
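The abstract does not give the feature definitions, so the following is only a minimal sketch of the two kinds of signal it mentions: stop codon distribution across reading frames and base compositional bias by codon position. The function names, the choice of the three forward frames, and the frequency encoding are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter

# Stop codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}
BASES = "ACGT"


def stop_codon_counts(seq):
    """Count stop codons in each of the three forward reading frames.

    Hypothetical helper: a true coding fragment read in frame is expected
    to contain few or no in-frame stop codons.
    """
    seq = seq.upper()
    counts = []
    for frame in range(3):
        codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
        counts.append(sum(codon in STOP_CODONS for codon in codons))
    return counts


def positional_base_frequencies(seq):
    """Return 12 values: frequency of A, C, G, T at codon positions 1, 2, 3.

    Hypothetical helper: assumes the fragment starts in frame 0.
    """
    seq = seq.upper()
    freqs = []
    for pos in range(3):
        bases = seq[pos::3]
        tally = Counter(bases)
        total = len(bases) or 1  # avoid division by zero on empty input
        freqs.extend(tally[b] / total for b in BASES)
    return freqs


fragment = "ATGGCCTAAGGATGA"  # toy sequence, not from the paper's data set
print(stop_codon_counts(fragment))
print(positional_base_frequencies(fragment))
```

In a classifier, vectors like these would be computed per fragment (for example, per 192 bp window) and passed to whatever discriminant method is used; the abstract does not specify that method.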
- Journal of Computational and Theoretical Nanoscience is an international peer-reviewed journal with wide-ranging coverage that consolidates research activities in all aspects of computational and theoretical nanoscience into a single reference source. It publishes original full papers, timely state-of-the-art reviews, and short communications covering fundamental and applied research in computational and theoretical nanoscience and nanotechnology across chemistry, physics, materials science, engineering, and biology.