scClusterCells
scClusterCells clusters cells based on the input count matrix (output of scCountReads) and
performs dimensionality reduction, community detection and 2D projection (UMAP) of the cells. The
result is an updated h5ad object, and (optionally) a plot file and a tsv file with UMAP coordinates
and corresponding cluster id for each barcode.
usage: scClusterCells -i cellCounts.h5ad -o clustered.h5ad -op umap.png
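For scripted pipelines, the same call can be assembled from Python. The sketch below uses only flags documented on this page. The file names are placeholders, and the helper name `build_cluster_command` is hypothetical, not part of the package. The tool is executed only if `scClusterCells` is found on the PATH and the input file exists; otherwise the command is just printed.

```python
import os
import shutil
import subprocess

# Choices copied from the --method option documented below.
METHODS = ("logPCA", "LSA", "LDA", "glmPCA")


def build_cluster_command(in_h5ad, out_h5ad, plot=None, method="LSA",
                          n_comps=20, n_neighbors=30, resolution=1.0):
    """Assemble an scClusterCells argument list (hypothetical helper)."""
    if method not in METHODS:
        raise ValueError(f"method must be one of {METHODS}, got {method!r}")
    cmd = ["scClusterCells", "-i", in_h5ad, "-o", out_h5ad,
           "--method", method,
           "--nPrinComps", str(n_comps),
           "--nNeighbors", str(n_neighbors),
           "--clusterResolution", str(resolution)]
    if plot is not None:
        # Requesting a plot also produces the .tsv with raw UMAP coordinates.
        cmd += ["--outFileUMAP", plot]
    return cmd


cmd = build_cluster_command("cellCounts.h5ad", "clustered.h5ad", plot="umap.png")
print(" ".join(cmd))

# Run the tool only when it is installed and the placeholder input is present.
if shutil.which("scClusterCells") and os.path.exists("cellCounts.h5ad"):
    subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with unusual file names.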
Input/Output options
- --input, -i
Input file in .h5ad format.
- --outFile, -o
The file to write results to. For scFilterStats, scFilterBarcodes and scJSD, the output is a .tsv file. For all other tools, including scClusterCells, the output is an updated .h5ad object containing the result of the requested operation.
Clustering options
- --outFileUMAP, -op
The output plot file (for UMAP). If you specify this option, another file with the same prefix (and .tsv extension) is also created, containing the raw UMAP coordinates.
- --outFileTrainedModel, -om
The output file for the trained LSI model. The saved model can be used later to embed/compare new cells to the existing cluster of cells.
- --method, -m
Possible choices: logPCA, LSA, LDA, glmPCA
The dimensionality reduction method for clustering. (Default: 'LSA')
- --glmPCAfamily, -gf
Possible choices: gaussian, poisson, bernoulli, beta, gamma, lognormal, log_normal, sigmoid_beta
The choice of exponential family distribution to use for glmPCA method. (Default: 'poisson')
- --binarize
Binarize the counts per region before dimensionality reduction (applies only to the LSA and LDA methods).
- --nPrinComps, -n
Number of principal components to reduce the dimensionality to. Use a higher number for samples with more expected heterogeneity. (Default: 20)
- --nNeighbors, -nk
Number of nearest neighbours to consider for clustering and UMAP. Choose this number with the total number of cells and the expected number of clusters in mind. A smaller number leads to more fragmented clusters. (Default: 30)
- --clusterResolution, -cr
Resolution parameter for clustering. Values lower than 1.0 result in fewer clusters, while higher values split clusters further. In most cases, the optimum value lies between 0.8 and 1.2. (Default: 1.0)
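When `--outFileUMAP` is used, the accompanying .tsv file can be summarised without loading the h5ad object. The following is a minimal sketch, assuming that "same prefix" means the plot extension is swapped for `.tsv` and that the table starts with a header row. The column names are not documented here, so `cluster_column` is a placeholder to set after inspecting the header. All three function names are hypothetical.

```python
import csv
from collections import Counter
from pathlib import Path


def umap_tsv_path(plot_path):
    """Guess the coordinate table written next to the UMAP plot.

    Assumption: 'umap.png' -> 'umap.tsv' (extension swapped).
    """
    return Path(plot_path).with_suffix(".tsv")


def read_header(tsv_path):
    """Return the column names from the first row of the table."""
    with open(tsv_path, newline="") as handle:
        return next(csv.reader(handle, delimiter="\t"))


def cells_per_cluster(tsv_path, cluster_column):
    """Count barcodes per cluster id found in the given column."""
    with open(tsv_path, newline="") as handle:
        rows = csv.DictReader(handle, delimiter="\t")
        return Counter(row[cluster_column] for row in rows)
```

Call `read_header(umap_tsv_path("umap.png"))` first to see the actual column names, then pass the cluster column to `cells_per_cluster`. Many very small clusters can be a hint that `--nNeighbors` is too low or `--clusterResolution` is too high for the dataset.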
Plot options
- --plotWidth
Output plot width (in cm). (Default: 10)
- --plotHeight
Output plot height (in cm). (Default: 10)
- --plotFileFormat
Possible choices: png, pdf, svg, eps
Image format type. If given, this option overrides the image format inferred from the plot file extension. (Default: 'png')
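Plot dimensions are given in centimetres, while figure guidelines are often stated in inches or pixels. A small conversion sketch follows. The resolution used by the tool for raster output is not stated here, so the 300 DPI value is purely an illustrative assumption, and the helper names are hypothetical.

```python
CM_PER_INCH = 2.54


def cm_to_inches(cm):
    """Convert a length in centimetres to inches."""
    return cm / CM_PER_INCH


def cm_to_pixels(cm, dpi):
    """Approximate pixel count for a raster image at the given DPI."""
    return round(cm_to_inches(cm) * dpi)


# Default plot width of 10 cm; the DPI is an assumption for illustration only.
print(f"{cm_to_inches(10):.2f} in")                # 3.94 in
print(cm_to_pixels(10, dpi=300), "px at 300 DPI")  # 1181 px at 300 DPI
```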
Other options
- --verbose, -v
Set to see processing messages.
- --version
Show the program's version number and exit.