CITS calling discrepancy? Determining binding site at particular region (e.g. an exon with a large peak) at high resolution


Steve Coyne

May 13, 2020, 12:45:43 PM
to CTK User Group
Hi Zhang lab,

Prior to our lab shutting down for coronavirus, I was preparing my eCLIP samples for sequencing. I was unfortunately unable to finish before things shut down, but during this time I have made use of your CTK toolset to analyze existing eCLIP data in preparation for my own. Thankfully, there is data relevant to my cell type out there from Van Nostrand.

I have followed and completed your guide to using CTK for eCLIP. I have been starting to try different motif enrichment tools and had some success with MEME-ChIP.

I recently came across your mCross paper and looked up your mCrossBase results for this dataset (I presume it is the Van Nostrand data): https://zhanglab.c2b2.columbia.edu/mCrossBase/rbp.php?id=K562.IGF2BP1

That page lists the following numbers:

Crosslink sites: 707 (CIMS, deletion); 1050 (CIMS, insertion); 14461 (CIMS, substitution); 12770 (CITS)

These differ from my results, though in most cases not drastically; CITS is the exception. I took my numbers from the corresponding s30 files (e.g. *tag.uniq.del.CIMS.s30.bed) with wc -l:

Crosslink sites: 381 (CIMS, deletion); 607 (CIMS, insertion); 10720 (CIMS, substitution); 228 (CITS)

For CITS, the wc -l was taken from the tag.uniq.clean.CITS.s30.bed
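
To make the size of each gap explicit, here is a minimal Python sketch that puts the two sets of counts side by side. The numbers are the ones quoted above; the dictionary and function names are illustrative only, and count_bed_sites is a hypothetical helper that is assumed to be equivalent to wc -l only when the BED file has no header or blank lines.

```python
# Counts copied from this post; the variable names are illustrative.
published = {"CIMS deletion": 707, "CIMS insertion": 1050,
             "CIMS substitution": 14461, "CITS": 12770}
mine = {"CIMS deletion": 381, "CIMS insertion": 607,
        "CIMS substitution": 10720, "CITS": 228}


def count_bed_sites(path):
    """Count data lines in a BED file, skipping blank and header lines.

    Hypothetical helper: matches `wc -l` only if the file has no
    track/browser/# lines and no blank lines.
    """
    n = 0
    with open(path) as handle:
        for line in handle:
            stripped = line.strip()
            if stripped and not stripped.startswith(("track", "browser", "#")):
                n += 1
    return n


def ratios(own, reference):
    """Return own/reference for every site type in the reference table."""
    return {name: own[name] / reference[name] for name in reference}


for name, ratio in ratios(mine, published).items():
    print(f"{name}: {mine[name]} / {published[name]} = {ratio:.3f}")
```

The three CIMS ratios fall between roughly 0.5 and 0.75, while the CITS ratio is below 0.02, which suggests the CITS step is the one to troubleshoot first.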

How can I go about troubleshooting this discrepancy?

Moving forward, I would like to identify at high resolution the binding site at specific genes that are of interest to me and highly bound in this dataset.

To do this properly, I would imagine accurate calling of CITS is essential, but to be honest, I am not sure how to proceed with this question even if I had confidence in my CITS.
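
As a starting point for the per-gene question, one option is to pull out the called sites that fall inside a region of interest (for example one exon) and rank them by the BED score column. The sketch below is only an illustration: the function name, the example coordinates, and the site names are hypothetical, it assumes a tab-delimited 6-column BED file, and it makes no claim about what the score column means in CTK output.

```python
def sites_in_region(bed_lines, chrom, start, end, strand=None):
    """Return sites overlapping the half-open interval [start, end).

    Each hit is (chrom, start, end, name, score, strand), sorted by the
    score column (column 5) in descending order. Lines with fewer than
    six tab-separated fields (e.g. headers) are skipped.
    """
    hits = []
    for line in bed_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6 or fields[0] != chrom:
            continue
        site_start, site_end = int(fields[1]), int(fields[2])
        overlaps = site_start < end and site_end > start
        if overlaps and (strand is None or fields[5] == strand):
            hits.append((fields[0], site_start, site_end,
                         fields[3], float(fields[4]), fields[5]))
    return sorted(hits, key=lambda hit: hit[4], reverse=True)


# Hypothetical example lines; real input would be the lines of a CITS BED file.
example = [
    "chr1\t100\t101\tsiteA\t5\t+\n",
    "chr1\t150\t151\tsiteB\t9\t+\n",
    "chr1\t150\t151\tsiteC\t7\t-\n",
    "chr2\t120\t121\tsiteD\t3\t+\n",
]
for hit in sites_in_region(example, "chr1", 90, 200, strand="+"):
    print(hit)
```

With a real file, the lines could be passed in directly from an open file handle, and the region coordinates would come from the gene annotation in use.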

Kind regards,

Steve

Chaolin Zhang

May 13, 2020, 4:58:47 PM
to Steve Coyne, CTK User Group
Did you combine the two replicates? Are you trying to follow precisely what we did (as described in the Methods section of the paper)? There might be some subtle differences between the analysis we did for the paper and the recommended protocol on our website, as we have learned from the process.

Chaolin



--
You received this message because you are subscribed to the Google Groups "CTK User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ctk-user-grou...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ctk-user-group/710d6249-544e-4bcd-9254-b724a768299d%40googlegroups.com.

Steve Coyne

May 13, 2020, 5:05:58 PM
to CTK User Group
Yes, I did combine the two replicates.

I closely followed the posted protocol from your lab website rather than the paper itself.

I'll try to dig into that for potential discrepancies.

Steve

Chaolin Zhang

May 13, 2020, 5:08:24 PM
to Steve Coyne, CTK User Group
One recommendation I have is to use RBFOX2 as a positive control to see whether you can replicate our results. If you combined the replicates, I do not expect the subtle differences to explain the discrepancy you saw.

Chaolin



