scJSD

This tool samples regions of the genome from BAM files and compares the cumulative read coverage of each cell on those regions to that of a synthetic cell with Poisson-distributed reads, using the Jensen-Shannon distance (JSD). Cells with highly enriched signal show a higher JSD than cells whose signal is homogeneously distributed.
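As a rough illustration of the idea, the sketch below compares the distribution of per-bin read counts of one cell against a Poisson distribution with the same mean, using the Jensen-Shannon distance in base 2. This is a simplified stand-in, not the code scJSD runs: the function names and the choice to compare count histograms rather than cumulative coverage curves are assumptions made for the example.

```python
import math
from collections import Counter


def poisson_pmf(k, lam):
    # P(X = k) for a Poisson variable with mean lam, computed in log space.
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def js_distance(p, q):
    # Jensen-Shannon distance (base 2): square root of the JS divergence.
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return math.sqrt(max(0.0, 0.5 * kl(p, m) + 0.5 * kl(q, m)))


def jsd_vs_poisson(bin_counts):
    # Hypothetical helper: distance between observed per-bin counts and a
    # Poisson "synthetic cell" with the same mean coverage.
    lam = sum(bin_counts) / len(bin_counts)
    if lam == 0:
        return 0.0  # no reads at all, nothing to compare
    max_k = max(bin_counts)
    hist = Counter(bin_counts)
    observed = [hist.get(k, 0) / len(bin_counts) for k in range(max_k + 1)]
    expected = [poisson_pmf(k, lam) for k in range(max_k + 1)]
    total = sum(expected)
    expected = [e / total for e in expected]  # renormalise the truncated PMF
    return js_distance(observed, expected)


enriched = [0] * 95 + [40] * 5            # reads piled into a few bins
flat = [1] * 25 + [2] * 50 + [3] * 25     # reads spread evenly
print(jsd_vs_poisson(enriched) > jsd_vs_poisson(flat))
```

In this toy input both cells have the same mean coverage, but the cell with reads concentrated in a few bins lies further from the Poisson reference.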

An example usage is: scJSD -b treatment.bam control.bam -bc barcodes.txt -o scJSD_results.txt

Input/Output options

--bamfiles, -b

List of indexed bam files separated by spaces.

--barcodes, -bc

A single-column file containing barcodes (whitelist) to be used for the analysis.

--outFile, -o

The file to write results to. For scFilterStats, scFilterBarcodes and scJSD, the output file is a .txt file. For other tools, the output file is an updated .loom object with the result of the requested operation.

BAM processing options

--cellTag, -ct

Name of the BAM tag from which to extract barcodes.

--groupTag, -gt

In case of a grouped BAM file, such as one containing a Read Group (RG) or Sample (SM) tag, it is possible to group the reads using the provided --groupTag argument. NOTE: For such input, please ensure that the --labels argument lists the group labels expected in the BAM files. The --groupTag, together with the --cellTag, is then used to identify unique samples (cells) in the input.

--numberOfProcessors, -p

Number of processors to use. Type “max/2” to use half the maximum number of processors or “max” to use all available processors. (Default: 1)

--labels, -l

User-defined labels to use instead of the default labels derived from file names. Multiple labels have to be separated by a space, e.g. --labels sample1 sample2 sample3

--smartLabels

Instead of manually specifying labels for the input BAM files, this causes sincei to use the file name after removing the path and extension.
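A minimal sketch of what such a label looks like, assuming only the last extension is stripped (how sincei treats names with several dots is not stated here). The helper name smart_label is hypothetical.

```python
import os


def smart_label(path):
    # Drop the directory part, then the final extension (assumed behaviour).
    return os.path.splitext(os.path.basename(path))[0]


print(smart_label("/data/run1/treatment.bam"))
```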

--blackListFileName, -bl

A BED or GTF file containing regions that should be excluded from all analyses. Currently this works by rejecting genomic chunks that happen to overlap an entry. Consequently, for BAM files, if a read partially overlaps a blacklisted region or a fragment spans over it, then the read/fragment might still be considered. Please note that you should adjust the effective genome size, if relevant.
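The chunk-level rejection described above can be sketched as follows. This is an illustration under stated assumptions, not sincei's code: intervals are (chrom, start, end) tuples with half-open BED-style coordinates, and the function names are hypothetical.

```python
def overlaps(a_start, a_end, b_start, b_end):
    # Half-open intervals overlap when each one starts before the other ends.
    return a_start < b_end and b_start < a_end


def keep_chunks(chunks, blacklist):
    # Reject every genomic chunk that overlaps any blacklisted entry.
    return [
        c for c in chunks
        if not any(c[0] == b[0] and overlaps(c[1], c[2], b[1], b[2])
                   for b in blacklist)
    ]


chunks = [("chr1", 0, 10000), ("chr1", 10000, 20000), ("chr2", 0, 10000)]
blacklist = [("chr1", 9990, 10010)]
print(keep_chunks(chunks, blacklist))
```

Because the test is applied to chunks rather than to individual reads, a read that only partly overlaps a blacklisted region can still be counted through a neighbouring chunk that was kept.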

--binSize, -bs

Size of the bins, in bases, to calculate coverage (Default: 10000)

Read Processing Options

--minMappingQuality

If set, only reads that have a mapping quality score of at least this are considered.

--samFlagInclude

Include reads based on the SAM flag. For example, to get only reads that are the first mate, use a flag of 64. This is useful to count properly paired reads only once, as otherwise the second mate will be also considered for the coverage. (Default: None)

--samFlagExclude

Exclude reads based on the SAM flag. For example, to get only reads that map to the forward strand, use --samFlagExclude 16, where 16 is the SAM flag for reads that map to the reverse strand. (Default: None)
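SAM flags are bit fields, so both options amount to bitwise tests. The sketch below assumes that a read must carry every bit of the include value and none of the bits of the exclude value; the function name is hypothetical and the exact semantics used by sincei are not confirmed here.

```python
FIRST_MATE = 64       # 0x40, read is the first mate of a pair
REVERSE_STRAND = 16   # 0x10, read maps to the reverse strand


def passes_flag_filters(flag, include=None, exclude=None):
    # Keep the read only if all "include" bits are set ...
    if include is not None and (flag & include) != include:
        return False
    # ... and none of the "exclude" bits are set.
    if exclude is not None and (flag & exclude) != 0:
        return False
    return True


# 99 = paired, properly paired, mate on reverse strand, first mate
# 147 = paired, properly paired, read on reverse strand, second mate
print(passes_flag_filters(99, include=FIRST_MATE))
print(passes_flag_filters(147, exclude=REVERSE_STRAND))
```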

--minFragmentLength

The minimum fragment length needed for read/pair inclusion. This option is primarily useful in ATAC-seq experiments, for filtering mono- or di-nucleosome fragments. (Default: 0)

--maxFragmentLength

The maximum fragment length allowed for read/pair inclusion. (Default: 0)

Read Filtering Options

--duplicateFilter

Possible choices: start_bc, start_bc_umi, start_end_bc, start_end_bc_umi

How to filter duplicates. Different combinations (using start/end/umi) are possible. Read start position and read barcode are always considered. The default (None) keeps all reads. Note that for paired-end data, both reads in the fragment are considered (and kept), so if you wish to keep only read1, combine this option with --samFlagInclude.
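One way to picture the four choices is as different deduplication keys. The sketch below is an assumption-laden illustration rather than sincei's implementation: reads are plain dictionaries with hypothetical field names, and the chromosome is added to the key so that equal positions on different chromosomes do not collide.

```python
def duplicate_key(read, mode):
    # Start position and barcode are always part of the key.
    parts = mode.split("_")
    key = [read["chrom"], read["start"], read["barcode"]]
    if "end" in parts:
        key.append(read["end"])
    if "umi" in parts:
        key.append(read["umi"])
    return tuple(key)


def filter_duplicates(reads, mode=None):
    # mode=None mirrors the default: every read is kept.
    if mode is None:
        return list(reads)
    seen, kept = set(), []
    for read in reads:
        key = duplicate_key(read, mode)
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return kept


reads = [
    {"chrom": "chr1", "start": 100, "end": 250, "barcode": "AAAC", "umi": "GT"},
    {"chrom": "chr1", "start": 100, "end": 250, "barcode": "AAAC", "umi": "CA"},
    {"chrom": "chr1", "start": 100, "end": 300, "barcode": "AAAC", "umi": "GT"},
    {"chrom": "chr1", "start": 100, "end": 250, "barcode": "TTTG", "umi": "GT"},
]
for mode in (None, "start_bc", "start_bc_umi", "start_end_bc", "start_end_bc_umi"):
    print(mode, len(filter_duplicates(reads, mode)))
```

The more fields enter the key, the fewer reads are treated as duplicates of one another.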

Optional arguments

--numberOfSamples, -n

The number of bins that are sampled from the genome, for which the overlapping number of reads is computed. (Default: 100000.0)

--skipZeros

If set, then regions with zero overlapping reads for all given BAM files are ignored. This results in fewer read counts than the number specified in --numberOfSamples.
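The effect can be sketched with a small table of counts, one row per sampled region and one column per BAM file. The layout and the helper name are assumptions for the example only.

```python
def drop_all_zero_regions(counts_per_region):
    # Keep a region if at least one BAM file has a non-zero count there.
    return [row for row in counts_per_region if any(c != 0 for c in row)]


counts = [
    [0, 0],   # empty in both files, dropped
    [3, 0],   # covered in the first file, kept
    [0, 0],
    [1, 2],
]
kept = drop_all_zero_regions(counts)
print(len(counts), len(kept))
```

Four regions were sampled but only two remain, which is why the final number of counts can be lower than the requested number of samples.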

Other options

--verbose, -v

Set to see processing messages.

--version

Show program’s version number and exit.