Minimum entropy decomposition: unsupervised oligotyping for sensitive partitioning of high-throughput marker gene sequences

The ISME Journal, 2015-03, Vol.9 (4), p.968-979 [Peer Reviewed Journal]

Copyright Nature Publishing Group Mar 2015; Distributed under a Creative Commons Attribution 4.0 International License; Copyright © 2015 International Society for Microbial Ecology; ISSN: 1751-7362; EISSN: 1751-7370; DOI: 10.1038/ismej.2014.195; PMID: 25325381

  • Title:
    Minimum entropy decomposition: unsupervised oligotyping for sensitive partitioning of high-throughput marker gene sequences
  • Author: Eren, A Murat ; Morrison, Hilary G ; Lescault, Pamela J ; Reveillaud, Julie ; Vineis, Joseph H ; Sogin, Mitchell L
  • Subjects: Algorithms ; Animals ; Biodiversity ; Biodiversity and Ecology ; Environmental Sciences ; Genetic Markers ; High-Throughput Nucleotide Sequencing - methods ; Humans ; Microbiota ; Mouth - microbiology ; Original ; Phylogeny ; Porifera - microbiology ; Sequence Analysis, DNA - methods
  • Is Part Of: The ISME Journal, 2015-03, Vol.9 (4), p.968-979
  • Description: Molecular microbial ecology investigations often employ large marker gene datasets, for example, ribosomal RNAs, to represent the occurrence of single-cell genomes in microbial communities. Massively parallel DNA sequencing technologies enable extensive surveys of marker gene libraries that sometimes include nearly identical sequences. Computational approaches that rely on pairwise sequence alignments for similarity assessment and de novo clustering with de facto similarity thresholds to partition high-throughput sequencing datasets constrain fine-scale resolution descriptions of microbial communities. Minimum Entropy Decomposition (MED) provides a computationally efficient means to partition marker gene datasets into 'MED nodes', which represent homogeneous operational taxonomic units. By employing Shannon entropy, MED uses only the information-rich nucleotide positions across reads and iteratively partitions large datasets while omitting stochastic variation. When applied to analyses of microbiomes from two deep-sea cryptic sponges Hexadella dedritifera and Hexadella cf. dedritifera, MED resolved a key Gammaproteobacteria cluster into multiple MED nodes that are specific to different sponges, and revealed that these closely related sympatric sponge species maintain distinct microbial communities. MED analysis of a previously published human oral microbiome dataset also revealed that taxa separated by less than 1% sequence variation distributed to distinct niches in the oral cavity. The information theory-guided decomposition process behind the MED algorithm enables sensitive discrimination of closely related organisms in marker gene amplicon datasets without relying on extensive computational heuristics and user supervision.
  • Publisher: England: Nature Publishing Group
  • Language: English
  • Identifier: ISSN: 1751-7362
    EISSN: 1751-7370
    DOI: 10.1038/ismej.2014.195
    PMID: 25325381
  • Source: Open Access: PubMed Central
    AUTh Library subscriptions: ProQuest Central
    MEDLINE
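
The Description above outlines the core idea: compute Shannon entropy at each nucleotide position, then split the reads on the most information-rich position and repeat. A minimal Python sketch of that idea follows. It is not the authors' implementation: the function names, the `min_entropy` threshold and its default value are assumptions made for illustration, reads are assumed to be aligned and of equal length, and the additional noise handling of the published method is not reproduced.

```python
import math
from collections import Counter, defaultdict


def column_entropy(reads, pos):
    """Shannon entropy (in bits) of the nucleotides at one aligned position."""
    counts = Counter(read[pos] for read in reads)
    total = len(reads)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def decompose(reads, min_entropy=0.2):
    """Recursively split equal-length reads on their highest-entropy position.

    min_entropy is a hypothetical cut-off: a group whose peak entropy does
    not exceed it is treated as homogeneous and returned as a single node.
    """
    if min_entropy < 0:
        raise ValueError("min_entropy must be non-negative")
    if not reads:
        return []
    length = len(reads[0])
    entropies = [column_entropy(reads, p) for p in range(length)]
    best = max(range(length), key=entropies.__getitem__)
    if entropies[best] <= min_entropy:
        return [reads]
    # Partition the reads by the nucleotide found at the chosen position.
    groups = defaultdict(list)
    for read in reads:
        groups[read[best]].append(read)
    nodes = []
    for base in sorted(groups):
        nodes.extend(decompose(groups[base], min_entropy))
    return nodes
```

For example, `decompose(["ACGT", "ACGT", "ACCT", "ACCT"])` separates the reads at the third position into two groups, while a single variant read among 99 identical ones stays below the assumed threshold and is left in the same group as the rest.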