Introduction

Differential expression analysis identifies genes that are significantly up- or down-regulated between conditions. Many differential expression algorithms exist, but their performance can vary on scRNA-Seq datasets.

limma, DESeq2, and ANOVA

Users can identify differentially expressed genes with limma, DESeq2, or an ANOVA by selecting one or more condition variables present in the annotation information. After choosing the assay of interest in the “Select Assay” field and the algorithm in the “Select Method” field, choose the experimental condition and any additional covariates to include in the differential expression model.

For a limma analysis of a condition with more than two categories, a “Factor of Interest” must be selected. For a DESeq2 analysis of a condition with more than two categories, users can select one of three modes. In Biomarker mode, the user selects a factor of interest, and the resulting genes are differentially expressed between the factor of interest and all other cells. In factor of interest vs. control factor mode, the user selects a factor of interest and a control factor, and the resulting genes are significantly differentially expressed between these two factor levels. Finally, in ANOVA mode, DESeq2 is run using a likelihood ratio test.

Users can customize the differential expression results by changing the number of genes to return, the p-value significance cutoff, and the p-value correction method applied to the results.
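To make the three result options concrete, the following is a minimal sketch of how a correction method, a significance cutoff, and a gene limit interact. It is written in plain Python and is not the SCTK's own code. It assumes the Benjamini-Hochberg procedure as the correction method, which is only one of the methods a user might choose. The function names, the default values, and the gene names and p-values are all hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def top_genes(raw_pvalues, cutoff=0.05, n_genes=500):
    """Correct the p-values, keep genes at or below the cutoff,
    and return at most n_genes (gene, adjusted p-value) pairs."""
    genes = list(raw_pvalues)
    adjusted = benjamini_hochberg([raw_pvalues[g] for g in genes])
    hits = sorted((p, g) for p, g in zip(adjusted, genes) if p <= cutoff)
    return [(g, p) for p, g in hits[:n_genes]]


# Hypothetical raw p-values for five genes (illustration only).
raw = {"GeneA": 0.001, "GeneB": 0.008, "GeneC": 0.039,
       "GeneD": 0.041, "GeneE": 0.60}

for gene, adj_p in top_genes(raw, cutoff=0.05, n_genes=3):
    print(gene, round(adj_p, 4))
```

In this toy input, GeneC and GeneD have raw p-values below 0.05 but adjusted p-values above it, so they are dropped. A gene list can therefore shrink when a correction method is applied, even if the cutoff is unchanged.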

Visualization

The resulting gene list is displayed as a table in the “Results Table” tab and as a heatmap in the “Heatmap” tab; the heatmap can be customized using the options in the “Options” tab. Users can download the gene list directly or create a biomarker list for a specific cell type or cell cluster, which can be stored in the gene annotation information of the SCtkExperiment object.

MAST

MAST (Model-based Analysis of Single-cell Transcriptomics) is a differential expression analysis tool designed specifically for single-cell RNA-Seq data. It uses a hurdle model to account for the missingness in scRNA-Seq data. MAST has been implemented within the SCTK. Users can choose whether to use MAST’s adaptive thresholding model, set fold change and expression thresholds, and identify significant genes based on conditions present in the annotation information. The results can be viewed as a table, as violin plots, or as a heatmap. For detailed information about MAST analysis, see the MAST documentation.
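The idea behind a hurdle model can be illustrated with a small sketch that splits one gene's expression into two parts: how often the gene is detected at all, and how strongly it is expressed in the cells where it is detected. This plain-Python example only computes descriptive summaries of those two parts. It is not MAST's implementation, which, to our understanding, fits a regression model to each part and combines the evidence into a single test. The function name, the `threshold` parameter, and the expression values are hypothetical.

```python
import math


def hurdle_summary(values, threshold=0.0):
    """Summarize one gene in one group of cells as two hurdle components.

    Returns (detection_rate, mean_log2_expression_among_detected_cells).
    """
    if not values:
        raise ValueError("values must contain at least one cell")
    # Discrete part: which cells express the gene above the threshold?
    detected = [v for v in values if v > threshold]
    detection_rate = len(detected) / len(values)
    # Continuous part: average log2(x + 1) over detected cells only,
    # so zeros do not pull the expression level down.
    if detected:
        mean_log = sum(math.log2(v + 1) for v in detected) / len(detected)
    else:
        mean_log = float("nan")
    return detection_rate, mean_log


# Hypothetical expression values for one gene in two conditions.
control = [0, 0, 0, 4, 0, 3, 0, 0]
treated = [5, 0, 7, 6, 0, 8, 4, 0]

for name, cells in (("control", control), ("treated", treated)):
    rate, level = hurdle_summary(cells)
    print(f"{name}: detected in {rate:.0%} of cells, "
          f"mean log2 expression when detected = {level:.2f}")
```

Keeping the two components separate lets a difference in detection rate and a difference in expression level each contribute to the comparison, rather than treating every zero as a true measurement of no expression.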

Saving a Biomarker

After differential expression analysis has been performed, the resulting gene list can be downloaded or stored in the SCtkExperiment object in the gene annotation data frame.

Session info

## R version 3.6.0 (2019-04-26)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS Mojave 10.14.4
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] BiocStyle_2.12.0
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.1         rstudioapi_0.10    knitr_1.22        
##  [4] xml2_1.2.0         magrittr_1.5       roxygen2_6.1.1    
##  [7] MASS_7.3-51.4      R6_2.4.0           rlang_0.3.4       
## [10] stringr_1.4.0      tools_3.6.0        xfun_0.6          
## [13] htmltools_0.3.6    commonmark_1.7     yaml_2.2.0        
## [16] digest_0.6.18      assertthat_0.2.1   rprojroot_1.3-2   
## [19] bookdown_0.9       pkgdown_1.3.0      crayon_1.3.4      
## [22] BiocManager_1.30.4 fs_1.3.0           memoise_1.1.0     
## [25] evaluate_0.13      rmarkdown_1.12     stringi_1.4.3     
## [28] compiler_3.6.0     desc_1.2.0         backports_1.1.4