Understanding output in *_allValidPairs file, allele specific Hi-C Pro


tkat...@ucr.edu

Oct 4, 2017, 7:50:38 PM
to HiC-Pro
Hello, I'm new to this sort of work, so I apologize for what may be an obvious question, but could someone point me to the documentation describing the output of the *_allValidPairs file? I have read the online manual, which describes the categories for the .validpairs file, but the _allValidPairs file seems to have additional information after the strand of read 2 and before the fragment size. An example of the output looks like this:

SRR2240738.552429       chr1    3006382 -       chr1    29614403        +       691     HIC_chr1_3      HIC_chr1_8715   42      42      0-0


I am specifically wondering about the 8th, 9th and 10th columns: "691", "HIC_chr1_3" and "HIC_chr1_8715" in the example above.
Thank you for your time,
Theo Kataras

nservant

Oct 5, 2017, 4:23:02 AM
to HiC-Pro
Hi Theo,

So the columns are:
read_name - chromosome of first locus - read position of first locus (currently the middle of the read) - strand of first locus - chromosome of second locus - read position of second locus - strand of second locus - DNA fragment length
Then, the next 4 columns were added for Juicebox compatibility:
name of first restriction fragment - name of second restriction fragment - mapping quality of first read - mapping quality of second read
Finally, the last column is for allele-specific analysis:
0-0 = first read from genome1 and second read from genome1
1-0 = first read from genome2 and second read from genome1
0-1 = first read from genome1 and second read from genome2
1-1 = first read from genome2 and second read from genome2
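
To make the column order above concrete, here is a minimal Python sketch that splits one _allValidPairs line into named fields. The ValidPair type, its field names, and the parse_valid_pair function are hypothetical names chosen for illustration; they are not part of HiC-Pro. The sketch assumes 13 whitespace-separated columns, as in the example line from the question, and keeps the allele tag as two raw integers, to be read as described above.

```python
from typing import NamedTuple, Tuple


class ValidPair(NamedTuple):
    """One line of an _allValidPairs file (field names are illustrative)."""
    read_name: str
    chrom1: str
    pos1: int
    strand1: str
    chrom2: str
    pos2: int
    strand2: str
    fragment_length: int
    frag_name1: str            # name of first restriction fragment
    frag_name2: str            # name of second restriction fragment
    mapq1: int                 # mapping quality of first read
    mapq2: int                 # mapping quality of second read
    allele_tags: Tuple[int, int]  # e.g. "0-1" becomes (0, 1)


def parse_valid_pair(line: str) -> ValidPair:
    """Split a line on whitespace and convert the numeric columns."""
    fields = line.split()
    if len(fields) != 13:
        # Assumption: files produced without allele-specific mode or by
        # other versions may have a different column count.
        raise ValueError(f"expected 13 columns, got {len(fields)}")
    tag1, tag2 = fields[12].split("-")
    return ValidPair(
        fields[0],
        fields[1], int(fields[2]), fields[3],
        fields[4], int(fields[5]), fields[6],
        int(fields[7]),
        fields[8], fields[9],
        int(fields[10]), int(fields[11]),
        (int(tag1), int(tag2)),
    )


# The example line from the question.
example = (
    "SRR2240738.552429       chr1    3006382 -       chr1    29614403"
    "        +       691     HIC_chr1_3      HIC_chr1_8715   42      42      0-0"
)
pair = parse_valid_pair(example)
print(pair.fragment_length, pair.frag_name1, pair.frag_name2)
```

For the example line, the 8th column (691) is therefore the DNA fragment length, and the 9th and 10th columns are the names of the two restriction fragments.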

Hope it helps
N