sincei.WriteBedGraph module


sincei.WriteBedGraph.scaleCoverage(tile_coverage, args)

Return the coverage per cluster as the sum over cells. tile_coverage should be a list with only one element.

class sincei.WriteBedGraph.WriteBedGraph(bamFilesList, binLength=50, barcodes=None, cellTag=None, groupTag=None, groupLabels=None, clusterInfo=None, motifFilter=None, genome2bit=None, GCcontentFilter=None, numberOfSamples=None, numberOfProcessors=1, verbose=False, region=None, bedFile=None, extendReads=False, genomeChunkSize=None, blackListFileName=None, minMappingQuality=None, duplicateFilter=None, chrsToSkip=[], stepSize=None, center_read=False, samFlag_include=None, samFlag_exclude=None, zerosToNans=False, skipZeroOverZero=False, smoothLength=0, minFragmentLength=0, maxFragmentLength=0, minAlignedFraction=0, out_file_for_raw_data=None, bed_and_bin=False, sumCoveragePerBin=False, binarizeCoverage=False, statsList=[], mappedList=[])

Bases: CountReadsPerBin

Reads bam file coverages and writes a bedgraph or bigwig file.

Extends the CountReadsPerBin object such that the coverage of bam files is written to multiple bedgraph files at once.

The bedgraph files are later merged into one and converted into a bigwig file if necessary.

The constructor arguments are the same as for CountReadsPerBin. However, when calling the run method, the extra parameters func_to_call, func_args and out_file_prefix need to be passed.

Examples

Given the following distribution of reads that cover 200 bp on a chromosome named '3R':

  0                              100                           200
  |------------------------------------------------------------|
A                                ===============
                                                ===============


B                 ===============               ===============
                                 ===============
                                                ===============
>>> import os
>>> import tempfile
>>> test_path = os.path.dirname(os.path.abspath(__file__)) + "/test/test_data/"
>>> outFile = tempfile.NamedTemporaryFile()
>>> bam_file = test_path +  "testA.bam"

For this example, a simple scaling function is used. It takes the coverage found at each region and multiplies it by the scaling factor, which in this case is 1.5.

>>> function_to_call = scaleCoverage
>>> funcArgs = {'scaleFactor': 1.5}
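To make the callback contract concrete, here is a minimal sketch of a scaling function with the same (tile_coverage, args) signature. The name scale_coverage_sketch is hypothetical and the body is an assumption based on the description above, not the actual sincei implementation.

```python
def scale_coverage_sketch(tile_coverage, args):
    """Hypothetical stand-in for a scaling callback (not sincei's own code).

    tile_coverage: list expected to hold exactly one coverage value.
    args: dict carrying the 'scaleFactor' entry.
    """
    # Multiply the single coverage value by the scaling factor.
    return args['scaleFactor'] * tile_coverage[0]


func_args = {'scaleFactor': 1.5}

# Apply the callback to a few made-up per-bin coverage values.
scaled = [scale_coverage_sketch([cov], func_args) for cov in [0, 1, 2]]
print(scaled)  # [0.0, 1.5, 3.0]
```

Any function that accepts these two arguments and returns a single number per tile could be passed in the same way.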

Restrict processing to the region between positions 0 and 200 of chromosome 3R:

>>> region = '3R:0:200'

Set up the run so that coverage is computed for consecutive bins of length 25 bp:

>>> bin_length = 25
>>> step_size = 25

>>> num_sample_sites = 0 #overruled by step_size
>>> c = WriteBedGraph([bam_file], binLength=bin_length, region=region, stepSize=step_size)
>>> c.run(function_to_call, funcArgs, outFile.name)
>>> f = open(outFile.name, 'r')
>>> f.readlines()
['3R\t0\t100\t0\n', '3R\t100\t200\t1.5\n']
>>> f.close()
>>> outFile.close()

run(func_to_call, func_args, out_file_prefix, blackListFileName=None, format='bedgraph', smoothLength=0, normUsing=None)

Given a list of bam files, a function, and function arguments, this method writes a bedgraph (or bigwig) file. The genome is partitioned into tiles of the given size, and each tile receives a value computed by the given function from the coverage underlying that tile.

Parameters

func_to_call : str

Name of the function called to convert the list of coverages computed for each bam file at each position into a single value.

func_args : dict

Dictionary of arguments to pass to func_to_call, e.g. {'scaleFactor': 1.0}.

out_file_prefix : str

Name of the file in which to save the resulting data.
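In the example output, the eight 25 bp bins appear as only two intervals (0 to 100 and 100 to 200), which suggests that adjacent bins with equal values are collapsed into one bedgraph line. The sketch below illustrates that idea in plain Python. The helper merge_bins_to_bedgraph and its number formatting are assumptions made for illustration, not part of the sincei API.

```python
def merge_bins_to_bedgraph(chrom, start, bin_length, values):
    """Collapse consecutive equal-valued bins into bedgraph lines.

    Hypothetical helper: it mimics the shape of the example output only.
    """
    lines = []
    if not values:
        return lines
    run_start = start
    run_value = values[0]
    for i, value in enumerate(values[1:], start=1):
        if value != run_value:
            # Close the current run at the start of bin i.
            run_end = start + i * bin_length
            lines.append(f"{chrom}\t{run_start}\t{run_end}\t{run_value:g}\n")
            run_start, run_value = run_end, value
    # Emit the final run, which ends after the last bin.
    end = start + len(values) * bin_length
    lines.append(f"{chrom}\t{run_start}\t{end}\t{run_value:g}\n")
    return lines


# Eight 25 bp bins over 3R:0-200, already scaled by 1.5.
bin_values = [0, 0, 0, 0, 1.5, 1.5, 1.5, 1.5]
lines = merge_bins_to_bedgraph("3R", 0, 25, bin_values)
print(lines)  # ['3R\t0\t100\t0\n', '3R\t100\t200\t1.5\n']
```

Collapsing runs keeps the output small for long stretches of uniform coverage while describing exactly the same signal.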