nDNA-Prot: identification of DNA-binding proteins based on unbalanced classification

BMC bioinformatics, 2014-09, Vol.15 (1), p.298-298, Article 298 [Peer Reviewed Journal]

COPYRIGHT 2014 BioMed Central Ltd.; 2014 Song et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. ISSN: 1471-2105; EISSN: 1471-2105; DOI: 10.1186/1471-2105-15-298; PMID: 25196432

  • Title:
    nDNA-Prot: identification of DNA-binding proteins based on unbalanced classification
  • Author: Song, Li ; Li, Dapeng ; Zeng, Xiangxiang ; Wu, Yunfeng ; Guo, Li ; Zou, Quan
  • Subjects: Accuracy ; Algorithms ; Artificial Intelligence ; Cellular ; Classifiers ; Computational Biology - methods ; Databases, Protein ; DNA-Binding Proteins - chemistry ; DNA-Binding Proteins - metabolism ; Feature extraction ; Mathematical models ; Proteins ; Redundancy
  • Is Part Of: BMC bioinformatics, 2014-09, Vol.15 (1), p.298-298, Article 298
  • Description: DNA-binding proteins are vital for the study of cellular processes. In recent genome engineering studies, identifying proteins with particular functions has become increasingly important and needs to be done rapidly and efficiently. Several approaches have been developed in previous years to improve the identification of DNA-binding proteins, but the currently available resources are insufficient to identify these proteins accurately: previous research has been limited by the relatively unbalanced accuracy rate and the low identification success of current methods. In this paper, we explored the practicality of modelling DNA-binding identification with an ensemble classifier and designed a new predictor, nDNA-Prot. The framework comprises two stages: a 188-dimension feature extraction method to describe the protein structure, and an ensemble classifier designated imDC. Experiments on different datasets showed that our method is more successful than traditional methods in identifying DNA-binding proteins. Identification was conducted on features selected by minimum Redundancy Maximum Relevance (mRMR). Cross-validation yielded an accuracy of 95.80% and an Area Under the Curve (AUC) of 0.986. On a test dataset, our method achieved 86% accuracy, versus 76% for iDNA-Prot and 68% for DNA-Prot. Our method can help to identify DNA-binding proteins accurately, and the web server is accessible at http://datamining.xmu.edu.cn/~songli/nDNA. In addition, we predicted possible DNA-binding protein sequences among all sequences in the UniProtKB/Swiss-Prot database.
  • Publisher: England: BioMed Central Ltd
  • Language: English
  • Identifier: ISSN: 1471-2105
    EISSN: 1471-2105
    DOI: 10.1186/1471-2105-15-298
    PMID: 25196432
  • Source: Freely Accessible Journals
    SpringerOpen
    MEDLINE
    PubMed Central
    ROAD: Directory of Open Access Scholarly Resources
    ProQuest Central
    DOAJ Directory of Open Access Journals
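
The Description mentions an ensemble classifier for unbalanced classification without giving its details. The following is a minimal sketch of one generic way such an ensemble can be built, not the authors' imDC algorithm: each member is trained on a balanced random subsample of the two classes, and members vote on the final label. The nearest-centroid base learner, the 2-D toy features (standing in for the 188-dimension vectors), and all function names are assumptions made for illustration.

```python
import random
from statistics import mean


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [mean(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_balanced_ensemble(pos, neg, n_members=11, seed=0):
    """Train nearest-centroid members, each on a balanced subsample.

    Hypothetical helper: every member sees k positives and k negatives,
    where k is the size of the smaller class, so no member is dominated
    by the majority class.
    """
    rng = random.Random(seed)
    k = min(len(pos), len(neg))
    members = []
    for _ in range(n_members):
        pos_sample = rng.sample(pos, k)
        neg_sample = rng.sample(neg, k)
        members.append((centroid(pos_sample), centroid(neg_sample)))
    return members


def predict(members, x):
    """Return 1 (positive class) if a strict majority of members vote positive."""
    votes = sum(1 for c_pos, c_neg in members if sq_dist(x, c_pos) < sq_dist(x, c_neg))
    return 1 if 2 * votes > len(members) else 0


def make_toy_data(seed=1):
    """Synthetic, well-separated data with a 1:10 class imbalance (assumption)."""
    rng = random.Random(seed)
    pos = [[rng.gauss(2.0, 0.5), rng.gauss(2.0, 0.5)] for _ in range(20)]
    neg = [[rng.gauss(-2.0, 0.5), rng.gauss(-2.0, 0.5)] for _ in range(200)]
    return pos, neg


pos, neg = make_toy_data()
ensemble = train_balanced_ensemble(pos, neg)
print(predict(ensemble, [2.0, 2.0]), predict(ensemble, [-2.0, -2.0]))
```

An odd number of members avoids tied votes. In practice the base learner would be a stronger classifier and the inputs would be the extracted sequence features; the balanced-subsample-and-vote structure is the only point this sketch is meant to show.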
