Provides cluster assignments for all genes in a single-cell sequencing count matrix, using the celda Bayesian hierarchical model.

celda_G(counts, L, beta = 1, delta = 1, gamma = 1, stop.iter = 10,
  max.iter = 200, split.on.iter = 10, split.on.last = TRUE,
  seed = 12345, nchains = 3, initialize = c("random", "split"),
  count.checksum = NULL, y.init = NULL, logfile = NULL, verbose = TRUE)

Arguments

counts

Integer matrix. Rows represent features and columns represent cells.
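
As a minimal sketch of the expected input shape, the following builds a small toy matrix with features in rows and cells in columns, then converts it to integer storage. The gene and cell names and the count values are made up for illustration.

```r
# Toy counts entered as doubles: 3 features (rows) by 4 cells (columns).
# Feature and cell names are illustrative only.
counts <- matrix(c(0, 3, 1,
                   2, 0, 5,
                   4, 1, 0,
                   1, 2, 2),
                 nrow = 3, ncol = 4,
                 dimnames = list(paste0("Gene_", 1:3), paste0("Cell_", 1:4)))

# Confirm the values are whole numbers before converting to integer storage.
stopifnot(all(counts == round(counts)))
storage.mode(counts) <- "integer"

is.integer(counts)  # TRUE
dim(counts)         # 3 4
```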

L

Integer. Number of feature modules.

beta

Numeric. Concentration parameter for Phi. Adds a pseudocount to each feature module in each cell. Default 1.

delta

Numeric. Concentration parameter for Psi. Adds a pseudocount to each feature in each module. Default 1.

gamma

Numeric. Concentration parameter for Eta. Adds a pseudocount to the number of features in each module. Default 1.
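
To make the pseudocount wording concrete, the sketch below shows how adding a constant such as `beta` to a vector of counts smooths the resulting proportions. The vector `module.counts` is hypothetical, and the arithmetic illustrates the general Dirichlet-style smoothing idea rather than celda's internal code.

```r
# Hypothetical counts for one cell, summed within each of 3 feature modules.
module.counts <- c(8, 0, 2)
beta <- 1

# Without a pseudocount, the empty module gets a proportion of exactly zero.
raw <- module.counts / sum(module.counts)

# Adding beta to every module keeps all proportions strictly positive.
smoothed <- (module.counts + beta) / sum(module.counts + beta)

raw       # 0.8 0.0 0.2
smoothed  # 9/13, 1/13, 3/13
```

Larger values of the concentration parameter pull the smoothed proportions closer to uniform; smaller values stay closer to the observed counts.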

stop.iter

Integer. Inference stops after this many consecutive iterations without improvement in the log likelihood. Default 10.

max.iter

Integer. Maximum number of iterations of Gibbs sampling to perform. Default 200.
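
The interaction between `stop.iter` and `max.iter` can be sketched as a simple counter over a log-likelihood trace. The helper `stop_iteration` below is hypothetical and is not part of celda; it assumes that only a strict increase over the best value so far counts as an improvement. The trace uses rounded values loosely based on the first chain in the example output.

```r
# Hypothetical helper: returns the iteration at which sampling would stop,
# given a vector of per-iteration log likelihoods.
stop_iteration <- function(log.lik, stop.iter = 10, max.iter = 200) {
  best <- -Inf
  since.best <- 0
  n <- min(length(log.lik), max.iter)
  for (i in seq_len(n)) {
    if (log.lik[i] > best) {
      # Strict improvement: record it and reset the counter.
      best <- log.lik[i]
      since.best <- 0
    } else {
      since.best <- since.best + 1
    }
    if (since.best >= stop.iter) return(i)
  }
  n  # Reached max.iter (or the end of the trace) without triggering the rule.
}

# The best value is reached at iteration 3 and never improves afterwards.
ll <- c(-291426.7, -286595.3, -286579.9, rep(-286579.9, 20))
stop_iteration(ll)                # 13
stop_iteration(ll, max.iter = 5)  # 5
```

Under these assumptions the rule fires 10 iterations after the last improvement, unless `max.iter` is reached first.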

split.on.iter

Integer. Every `split.on.iter` iterations, a heuristic is applied to determine whether one feature module should be reassigned and another feature module split into two clusters. To disable splitting, set to -1. Default 10.

split.on.last

Logical. After the chain has converged according to `stop.iter`, a heuristic will be applied to determine if a feature module should be reassigned and another feature module should be split into two clusters. If a split occurs, then `stop.iter` will be reset. Default TRUE.

seed

Integer. Passed to set.seed(). Default 12345.

nchains

Integer. Number of random cluster initializations. Default 3.
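
In the example output further down, the three chains report seeds 12345, 12346 and 12347. The sketch below reproduces that pattern, assuming each chain uses the base seed incremented by its zero-based chain index. This is inferred from the log only, not from the package source, and `chain.seeds` is a name chosen for illustration.

```r
seed <- 12345
nchains <- 3

# Assumed pattern: chain i uses seed + (i - 1).
chain.seeds <- seed + seq_len(nchains) - 1
chain.seeds  # 12345 12346 12347
```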

initialize

Character. One of 'random' or 'split'. With 'random', features are randomly assigned to clusters. With 'split', feature clusters will be recursively split into two clusters using `celda_G` until the specified L is reached. Default 'random'.

count.checksum

Character. An MD5 checksum for the `counts` matrix. Default NULL.

y.init

Integer vector. Sets initial starting values of y. If NULL, starting values for each feature will be randomly sampled from 1:L. 'y.init' can only be used when 'initialize' = "random". Default NULL.
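
A starting vector for `y.init` can be sketched as one label per feature drawn from 1:L, mirroring the random default described above. The values of `L` and `n.features` are illustrative; in practice `n.features` would be `nrow(counts)`.

```r
set.seed(12345)
L <- 3
n.features <- 10  # stands in for nrow(counts)

# One module label per feature, sampled with replacement from 1:L.
y.init <- sample(seq_len(L), size = n.features, replace = TRUE)

length(y.init) == n.features  # TRUE
all(y.init %in% seq_len(L))   # TRUE

# The vector would then be supplied together with initialize = "random":
# celda_G(counts, L = L, y.init = y.init, initialize = "random")
```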

logfile

Character. Messages will be redirected to a file named `logfile`. If NULL, messages will be printed to stdout. Default NULL.

verbose

Logical. Whether to print log messages. Default TRUE.

Value

An object of class celda_G with clustering results and various sampling statistics.

Examples

celda.sim = simulateCells(model="celda_G")
celda.mod = celda_G(celda.sim$counts, L=celda.sim$L)
#> --------------------------------------------------------------------
#> Starting Celda_G: Clustering genes.
#> --------------------------------------------------------------------
#> Thu Sep 06 12:56:28 2018 .. Initializing chain 1 with 'random' (seed=12345)
#> Thu Sep 06 12:56:28 2018 .... Completed iteration: 1 | logLik: -291426.665356919
#> Thu Sep 06 12:56:29 2018 .... Completed iteration: 2 | logLik: -286595.332370062
#> Thu Sep 06 12:56:29 2018 .... Completed iteration: 3 | logLik: -286579.864858824
#> Thu Sep 06 12:56:29 2018 .... Completed iteration: 4 | logLik: -286579.864858824
#> Thu Sep 06 12:56:29 2018 .... Completed iteration: 5 | logLik: -286579.864858824
#> Thu Sep 06 12:56:29 2018 .... Completed iteration: 6 | logLik: -286579.864858824
#> Thu Sep 06 12:56:29 2018 .... Completed iteration: 7 | logLik: -286579.864858824
#> Thu Sep 06 12:56:29 2018 .... Completed iteration: 8 | logLik: -286579.864858824
#> Thu Sep 06 12:56:29 2018 .... Completed iteration: 9 | logLik: -286579.864858824
#> Thu Sep 06 12:56:29 2018 .... Determining if any gene clusters should be split.
#> Thu Sep 06 12:56:29 2018 .... No additional splitting was performed.
#> Thu Sep 06 12:56:29 2018 .... Completed iteration: 10 | logLik: -286579.864858824
#> Thu Sep 06 12:56:29 2018 .... Completed iteration: 11 | logLik: -286579.864858824
#> Thu Sep 06 12:56:29 2018 .... Completed iteration: 12 | logLik: -286579.864858824
#> Thu Sep 06 12:56:29 2018 .... Determining if any gene clusters should be split.
#> Thu Sep 06 12:56:30 2018 .... No additional splitting was performed.
#> Thu Sep 06 12:56:30 2018 .... Completed iteration: 13 | logLik: -286579.864858824
#> Thu Sep 06 12:56:30 2018 .. Finished chain 1 with seed 12345
#> Thu Sep 06 12:56:30 2018 .. Initializing chain 2 with 'random' (seed=12346)
#> Thu Sep 06 12:56:30 2018 .... Completed iteration: 1 | logLik: -287718.041765169
#> Thu Sep 06 12:56:30 2018 .... Completed iteration: 2 | logLik: -286579.864858824
#> Thu Sep 06 12:56:30 2018 .... Completed iteration: 3 | logLik: -286579.864858824
#> Thu Sep 06 12:56:30 2018 .... Completed iteration: 4 | logLik: -286579.864858824
#> Thu Sep 06 12:56:30 2018 .... Completed iteration: 5 | logLik: -286579.864858824
#> Thu Sep 06 12:56:30 2018 .... Completed iteration: 6 | logLik: -286579.864858824
#> Thu Sep 06 12:56:30 2018 .... Completed iteration: 7 | logLik: -286579.864858824
#> Thu Sep 06 12:56:30 2018 .... Completed iteration: 8 | logLik: -286579.864858824
#> Thu Sep 06 12:56:30 2018 .... Completed iteration: 9 | logLik: -286579.864858824
#> Thu Sep 06 12:56:30 2018 .... Determining if any gene clusters should be split.
#> Thu Sep 06 12:56:30 2018 .... No additional splitting was performed.
#> Thu Sep 06 12:56:30 2018 .... Completed iteration: 10 | logLik: -286579.864858824
#> Thu Sep 06 12:56:30 2018 .... Completed iteration: 11 | logLik: -286579.864858824
#> Thu Sep 06 12:56:31 2018 .... Determining if any gene clusters should be split.
#> Thu Sep 06 12:56:31 2018 .... No additional splitting was performed.
#> Thu Sep 06 12:56:31 2018 .... Completed iteration: 12 | logLik: -286579.864858824
#> Thu Sep 06 12:56:31 2018 .. Finished chain 2 with seed 12346
#> Thu Sep 06 12:56:31 2018 .. Initializing chain 3 with 'random' (seed=12347)
#> Thu Sep 06 12:56:31 2018 .... Completed iteration: 1 | logLik: -290090.898941072
#> Thu Sep 06 12:56:31 2018 .... Completed iteration: 2 | logLik: -286579.864858824
#> Thu Sep 06 12:56:31 2018 .... Completed iteration: 3 | logLik: -286579.864858824
#> Thu Sep 06 12:56:31 2018 .... Completed iteration: 4 | logLik: -286579.864858824
#> Thu Sep 06 12:56:31 2018 .... Completed iteration: 5 | logLik: -286579.864858824
#> Thu Sep 06 12:56:31 2018 .... Completed iteration: 6 | logLik: -286579.864858824
#> Thu Sep 06 12:56:31 2018 .... Completed iteration: 7 | logLik: -286579.864858824
#> Thu Sep 06 12:56:31 2018 .... Completed iteration: 8 | logLik: -286579.864858824
#> Thu Sep 06 12:56:31 2018 .... Completed iteration: 9 | logLik: -286579.864858824
#> Thu Sep 06 12:56:31 2018 .... Determining if any gene clusters should be split.
#> Thu Sep 06 12:56:32 2018 .... No additional splitting was performed.
#> Thu Sep 06 12:56:32 2018 .... Completed iteration: 10 | logLik: -286579.864858824
#> Thu Sep 06 12:56:32 2018 .... Completed iteration: 11 | logLik: -286579.864858824
#> Thu Sep 06 12:56:32 2018 .... Determining if any gene clusters should be split.
#> Thu Sep 06 12:56:32 2018 .... No additional splitting was performed.
#> Thu Sep 06 12:56:32 2018 .... Completed iteration: 12 | logLik: -286579.864858824
#> Thu Sep 06 12:56:32 2018 .. Finished chain 3 with seed 12347
#> --------------------------------------------------------------------
#> Completed Celda_G. Total time: 3.479028 secs
#> --------------------------------------------------------------------