What does Pairs_with_singleton refer to in the statistics?

Clifford Rostomily

Oct 4, 2022, 6:15:34 PM
to HiC-Pro
Hello,

I'm trying to troubleshoot an issue with a dataset we are running HiC-Pro on. We are getting a very large number of "Pairs_with_singleton" reads, which could indicate a data quality issue, but I am a bit confused about what this statistic means. In the mergeSAM.py file, the singleton_counter variable tracks the number of singletons and is what gets written to the "Pairs_with_singleton" line. However, this counter only increments if exactly one read of the pair is not mapped. Where I get confused is that Fig. 4 in the paper indicates that singleton pairs result from self-ligation products or dangling ends. I'm assuming these are filtered at a different step and that "Pairs_with_singleton" just refers to read pairs where one read was not mapped. Is this the correct interpretation?

Thanks,

Clifford Rostomily

nservant

Oct 5, 2022, 2:38:29 AM
to HiC-Pro
Hi,
Singleton means that only one of the two reads of a pair is aligned to the reference genome. The other one is not aligned.
In Fig. 4 of the paper, singletons are part of the 'invalid pairs', together with dangling-end, self-ligation and dumped pairs.
Hope this helps
N
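
For readers who want to see the definition above in code, here is a minimal sketch of counting singletons from the SAM flags of the two mates. It is not the code from mergeSAM.py: the function names classify_pair and count_pairs and the category labels are hypothetical, and the only thing taken from the SAM format is that flag bit 0x4 marks an unmapped read. The real script may apply further criteria before counting.

```python
from collections import Counter

UNMAPPED = 0x4  # SAM flag bit meaning "segment unmapped"


def classify_pair(flag_r1, flag_r2):
    """Classify a read pair from the SAM flags of its two mates.

    Hypothetical helper for illustration only.
    """
    mapped = [not (flag & UNMAPPED) for flag in (flag_r1, flag_r2)]
    if all(mapped):
        return "both_mapped"
    if any(mapped):
        # Exactly one mate aligned: this is the singleton case.
        return "singleton"
    return "unmapped_pair"


def count_pairs(flag_pairs):
    """Tally pair categories over an iterable of (flag_r1, flag_r2) tuples."""
    return Counter(classify_pair(f1, f2) for f1, f2 in flag_pairs)


# Toy input: flag 0 or 16 = mapped (forward/reverse), flag 4 = unmapped.
example = [(0, 16), (0, 4), (4, 16), (4, 4)]
counts = count_pairs(example)
print(counts["singleton"])
```

In this toy input, the second and third pairs each have exactly one unmapped mate, so they are the ones counted as singletons. Dangling-end and self-ligation pairs need both mates aligned before they can be recognised, so they would not fall into this category.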