Why is possibleIntraInRangeCount calculated this way?

yinjian...@gmail.com
Jul 12, 2021, 10:01:41 PM
to Fit-Hi-C
Hi,
I am reading the fithic.py script and find lines 622-642 hard to understand. Their effect seems to be to double the value of possibleIntraInRangeCountPerChr computed at line 618. As I understand it, possibleIntraInRangeCount is the total number of tests used in the multiple-testing correction for the q-value calculation, so I would expect it to be simply the sum of (n bins - d), taken over all chromosomes and over all distances d within the 30 kb-2 Mb range. Could someone explain why this sum has to be doubled?
Thanks,
Jenny

Attached here are lines 622-642 of fithic.py:

    # condition added - sourya
    if (len(binStats) > 0) and (binTracker in binStats):
        currBin = binStats[binTracker]
        minOfBin = currBin[0][0]
        maxOfBin = currBin[0][1]
        while not (minOfBin<=intxnDistance<=maxOfBin):
            binTracker += 1
            if binTracker not in binStats:
                binTracker-=1
                currBin = binStats[binTracker]
                minOfBin = currBin[0][0]
                maxOfBin = currBin[0][1]
                break
            else:
                currBin = binStats[binTracker]
                minOfBin = currBin[0][0]
                maxOfBin = currBin[0][1]
        currBin[7]+=npairs
        currBin[1]+=npairs
        currBin[3]+=(float(intxnDistance/distScaling)*npairs)
        possibleIntraInRangeCountPerChr += npairs
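
To make the count I expected concrete, here is a minimal sketch of the sum of (n bins - d) described above. It is not taken from fithic.py: the function name, the toy chromosome sizes, and the 10 kb resolution are all hypothetical, and I am assuming the distance bounds are inclusive.

```python
def expected_intra_in_range_pairs(chrom_bin_counts, resolution, lower, upper):
    """Count intra-chromosomal bin pairs whose separation lies in [lower, upper].

    chrom_bin_counts maps a chromosome name to its number of bins (hypothetical input).
    Bounds are treated as inclusive, which is an assumption.
    """
    total = 0
    for n_bins in chrom_bin_counts.values():
        for d in range(1, n_bins):           # d = separation measured in bins
            distance = d * resolution        # separation in base pairs
            if lower <= distance <= upper:
                total += n_bins - d          # number of pairs (i, i + d) on this chromosome
    return total


# Hypothetical toy genome: two chromosomes binned at 10 kb resolution.
toy_genome = {"chrA": 300, "chrB": 150}
expected = expected_intra_in_range_pairs(toy_genome, 10_000, 30_000, 2_000_000)
print(expected, 2 * expected)
```

If npairs is added to possibleIntraInRangeCountPerChr once at line 618 and once more in the block quoted above, the stored total would be twice this value, which is the doubling I am asking about.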

