celda_CG.Rd
celda Cell and Gene Clustering Model
celda_CG(counts, sample.label = NULL, K, L, alpha = 1, beta = 1, delta = 1, gamma = 1, algorithm = c("EM", "Gibbs"), stop.iter = 10, max.iter = 200, split.on.iter = 10, split.on.last = TRUE, seed = 12345, nchains = 3, initialize = c("random", "split"), count.checksum = NULL, z.init = NULL, y.init = NULL, logfile = NULL, verbose = TRUE)
Argument | Description |
---|---|
counts | Integer matrix. Rows represent features and columns represent cells. |
sample.label | Vector or factor. Denotes the sample label for each cell (column) in the count matrix. |
K | Integer. Number of cell populations. |
L | Integer. Number of feature modules. |
alpha | Numeric. Concentration parameter for Theta. Adds a pseudocount to each cell population in each sample. Default 1. |
beta | Numeric. Concentration parameter for Phi. Adds a pseudocount to each feature module in each cell population. Default 1. |
delta | Numeric. Concentration parameter for Psi. Adds a pseudocount to each feature in each module. Default 1. |
gamma | Numeric. Concentration parameter for Eta. Adds a pseudocount to the number of features in each module. Default 1. |
algorithm | String. Algorithm to use for clustering cell subpopulations. One of 'EM' or 'Gibbs'. Default 'EM'. |
stop.iter | Integer. Number of iterations without improvement in the log likelihood to stop inference. Default 10. |
max.iter | Integer. Maximum number of iterations of Gibbs sampling to perform. Default 200. |
split.on.iter | Integer. On every `split.on.iter` iteration, a heuristic will be applied to determine if a cell population or feature module should be reassigned and another cell population or feature module should be split into two clusters. To disable splitting, set to -1. Default 10. |
split.on.last | Logical. After the chain has converged according to `stop.iter`, a heuristic will be applied to determine if a cell population or feature module should be reassigned and another cell population or feature module should be split into two clusters. If a split occurs, then `stop.iter` will be reset. Default TRUE. |
seed | Integer. Passed to set.seed(). Default 12345. |
nchains | Integer. Number of random cluster initializations. Default 3. |
initialize | Character. One of 'random' or 'split'. With 'random', cells and features are randomly assigned to clusters. With 'split', cell and feature clusters will be recursively split into two clusters using `celda_C` and `celda_G`, respectively, until the specified K and L are reached. Default 'random'. |
count.checksum | Character. An MD5 checksum for the `counts` matrix. Default NULL. |
z.init | Integer vector. Sets initial starting values of z. If NULL, starting values for each cell will be randomly sampled from 1:K. 'z.init' can only be used when 'initialize' = "random". Default NULL. |
y.init | Integer vector. Sets initial starting values of y. If NULL, starting values for each feature will be randomly sampled from 1:L. 'y.init' can only be used when 'initialize' = "random". Default NULL. |
logfile | Character. Messages will be redirected to a file named `logfile`. If NULL, messages will be printed to stdout. Default NULL. |
verbose | Logical. Whether to print log messages. Default TRUE. |
An object of class celda_CG with clustering results and various sampling statistics.
celda.sim = simulateCells(model="celda_CG")
celda.mod = celda_CG(celda.sim$counts, K=celda.sim$K, L=celda.sim$L, sample.label=celda.sim$sample.label, nchains=1)
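The `z.init` and `y.init` arguments expect one starting label per cell and per feature, respectively. The following is a minimal base-R sketch of how such vectors could be built. The toy matrix and the values of `K` and `L` are illustrative assumptions, not package defaults, and the `celda_CG` call is left commented out because it has not been run here.

```r
# Toy integer count matrix (an assumption for illustration):
# 20 features (rows) x 12 cells (columns).
set.seed(12345)
counts <- matrix(rpois(20 * 12, lambda = 5), nrow = 20, ncol = 12)
storage.mode(counts) <- "integer"

K <- 3  # number of cell populations (illustrative value)
L <- 4  # number of feature modules (illustrative value)

# One starting label per cell (column), drawn from 1:K,
# and one per feature (row), drawn from 1:L.
z.init <- sample(1:K, ncol(counts), replace = TRUE)
y.init <- sample(1:L, nrow(counts), replace = TRUE)

# Sanity checks: lengths match the matrix dimensions and labels are in range.
stopifnot(length(z.init) == ncol(counts), all(z.init %in% 1:K))
stopifnot(length(y.init) == nrow(counts), all(y.init %in% 1:L))

# Not run: per the argument table, z.init and y.init can only be used
# with initialize = "random".
# celda.mod <- celda_CG(counts, K = K, L = L, initialize = "random",
#                       z.init = z.init, y.init = y.init, nchains = 1)
```

Random draws like these do not guarantee that every label in `1:K` or `1:L` appears at least once, so it may be worth checking `table(z.init)` and `table(y.init)` before passing them on.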