Heavy-tailed prior distributions for sequence count data: removing the noise and preserving large differences

Bioinformatics (Oxford, England), 2019-06, Vol.35 (12), p.2084-2092 [Peer Reviewed Journal]

The Author(s) 2018. Published by Oxford University Press. ;ISSN: 1367-4803 ;EISSN: 1367-4811 ;DOI: 10.1093/bioinformatics/bty895 ;PMID: 30395178

  • Title:
    Heavy-tailed prior distributions for sequence count data: removing the noise and preserving large differences
  • Author: Zhu, Anqi ; Ibrahim, Joseph G ; Love, Michael I
  • Stegle, Oliver
  • Subjects: Original Papers
  • Is Part Of: Bioinformatics (Oxford, England), 2019-06, Vol.35 (12), p.2084-2092
  • Description: In RNA-seq differential expression analysis, investigators aim to detect those genes with changes in expression level across conditions, despite technical and biological variability in the observations. A common task is to accurately estimate the effect size, often in terms of a logarithmic fold change (LFC). When the read counts are low or highly variable, the maximum likelihood estimates for the LFCs have high variance, leading to large estimates not representative of true differences, and poor ranking of genes by effect size. One approach is to introduce filtering thresholds and pseudocounts to exclude or moderate estimated LFCs. Filtering may result in a loss of genes from the analysis with true differences in expression, while pseudocounts provide a limited solution that must be adapted per dataset. Here, we propose the use of a heavy-tailed Cauchy prior distribution for effect sizes, which avoids the use of filter thresholds or pseudocounts. The proposed method, Approximate Posterior Estimation for generalized linear model, apeglm, has lower bias than previously proposed shrinkage estimators, while still reducing variance for those genes with little information for statistical inference. The apeglm package is available as an R/Bioconductor package at https://bioconductor.org/packages/apeglm, and the methods can be called from within the DESeq2 software. Supplementary data are available at Bioinformatics online.
  • Publisher: England: Oxford University Press
  • Language: English
  • Identifier: ISSN: 1367-4803
    EISSN: 1367-4811
    DOI: 10.1093/bioinformatics/bty895
    PMID: 30395178
  • Source: Journals@Ovid Open Access Journal Collection Rolling
    Geneva Foundation Free Medical Journals at publisher websites
    PubMed Central
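
The shrinkage idea summarized in the Description can be illustrated with a toy sketch. This is not the apeglm implementation: it assumes a normal approximation to the likelihood of a single LFC around its maximum likelihood estimate, and it finds the posterior mode by a plain grid search. The function names, the prior scale of 1.0, and the example values are hypothetical choices made only for illustration.

```python
import math


def log_posterior(beta, beta_hat, se, prior, scale=1.0):
    """Unnormalized log posterior of a log fold change (LFC) `beta`.

    Assumption: the likelihood is approximated by a normal density centered
    on the maximum likelihood estimate `beta_hat` with standard error `se`.
    """
    log_lik = -0.5 * ((beta_hat - beta) / se) ** 2
    if prior == "cauchy":
        # Heavy-tailed prior: the penalty grows only logarithmically in beta.
        log_prior = -math.log(1.0 + (beta / scale) ** 2)
    elif prior == "normal":
        # Light-tailed prior: the penalty grows quadratically in beta.
        log_prior = -0.5 * (beta / scale) ** 2
    else:
        raise ValueError("prior must be 'cauchy' or 'normal'")
    return log_lik + log_prior


def map_estimate(beta_hat, se, prior, scale=1.0, half_width=20.0, step=0.001):
    """Posterior mode found by grid search over [-half_width, half_width]."""
    n = int(round(half_width / step))
    grid = (i * step for i in range(-n, n + 1))
    return max(grid, key=lambda b: log_posterior(b, beta_hat, se, prior, scale))


if __name__ == "__main__":
    # Hypothetical genes: one large, precisely estimated LFC and one noisy LFC.
    examples = [("large, precise", 8.0, 0.5), ("moderate, noisy", 3.0, 2.0)]
    for label, beta_hat, se in examples:
        cauchy = map_estimate(beta_hat, se, "cauchy")
        normal = map_estimate(beta_hat, se, "normal")
        print(f"{label}: MLE={beta_hat}, Cauchy MAP={cauchy:.2f}, Normal MAP={normal:.2f}")
```

Under these assumptions, the precisely estimated large LFC stays close to its maximum likelihood estimate with the Cauchy prior, while a normal prior of the same scale pulls it noticeably toward zero. The noisy LFC is shrunk strongly toward zero under both priors.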