Re: Leafcutter results


durgabhavan...@gmail.com

Mar 5, 2019, 1:00:45 AM
to leafcutter-users

I have successfully run the commands and finished the analysis using LeafCutter. However, at an FDR-adjusted p-value < 0.05 I get more than 10,000 differentially spliced introns, whereas another publication reported only 80-100. I would like to know where I have gone wrong. Below is the list of commands I used. It would be great if anyone could help me with this.

 

Step1) 


# Iterate with a glob instead of parsing `ls`, and quote paths so spaces do not split them.
for bamfile in /data/STAR_BAM/*.bam
do
    echo "Converting $bamfile to $bamfile.junc"
    sh /home/softwares/leafcutter/leafcutter/scripts/bam2junc.sh "$bamfile" "$bamfile.junc"
    echo "$bamfile.junc" >> test_juncfiles.txt
done


Step2)


 python ../clustering/leafcutter_cluster.py -j test_juncfiles.txt -m 50 -o testYRIvsEU -l 500000


Step3)


/leafcutter/scripts/leafcutter_ds.R --num_threads 16 --exon_file=/data/leafcutter/scripts/gencode.v29.exons.txt.gz ./testYRIvsEU_perind_numers.counts.gz ./groups_file.txt

Jack Humphrey

Mar 5, 2019, 8:21:59 AM
to leafcutter-users
Hi! Thanks for running Leafcutter. 

Have you tried exploring your results with LeafViz? I'd be curious to see what your PCA looks like to have so many significant introns. How many samples are you working with? Is there a potential batch effect you're not correcting for?



Yang Li

Mar 5, 2019, 8:43:55 AM
to durgabhavan...@gmail.com, leafcutter-users
Hi there,

I would also rank by the delta PSI. Small differences in PSI may be statistically significant but not biologically significant.

Best,
Yang
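
Ranking by delta PSI, as suggested above, can be sketched in a few lines of Python. This is only an illustration: the table below is made-up data, and the column names (`intron`, `logef`, `deltapsi`) are assumptions about the effect-sizes output that should be checked against your own file.

```python
import csv
import io

# Hypothetical excerpt of an effect-sizes table. The values are made up and
# the column names are assumptions for illustration only.
EFFECT_SIZES_TSV = """intron\tlogef\tdeltapsi
chr1:100:200:clu_1\t0.05\t0.004
chr1:100:300:clu_1\t-0.90\t-0.210
chr2:500:900:clu_2\t0.40\t0.120
chr3:10:80:clu_3\t-0.02\t-0.001
"""


def rank_by_abs_dpsi(tsv_text):
    """Parse a tab-separated table and sort rows by |deltapsi|, largest first."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    for row in rows:
        row["deltapsi"] = float(row["deltapsi"])
    return sorted(rows, key=lambda r: abs(r["deltapsi"]), reverse=True)


ranked = rank_by_abs_dpsi(EFFECT_SIZES_TSV)
for row in ranked:
    print(f'{row["intron"]}\t{row["deltapsi"]:+.3f}')
```

Sorting on the absolute value keeps large decreases and large increases together at the top of the list.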

--
You received this message because you are subscribed to the Google Groups "leafcutter-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to leafcutter-use...@googlegroups.com.
To post to this group, send email to leafcutt...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/leafcutter-users/05fc8355-d36b-460b-85eb-043e7811dfce%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

durgabhavan...@gmail.com

Mar 6, 2019, 4:59:54 AM
to leafcutter-users
Hi Jack 

Thanks for the prompt reply. 

I am working with 250 samples, and I have not used the WASP tool. Is that the reason for so many thousands of differentially spliced introns?
I have also explored the results with LeafViz, and it shows all the clusters as annotated.


Regards
DB

durgabhavan...@gmail.com

Mar 6, 2019, 5:10:53 AM
to leafcutter-users
Dear Yang

Thanks for the response.

I tried that, and I am still getting 8,000 differentially spliced introns. Is using WASP, as mentioned in the LeafCutter workflow, essential? It is explained in the LeafCutter article quoted below.

But WASP is not used in the main LeafCutter usage tutorial, which is a bit confusing.

"Overview of LeafCutter. (a) LeafCutter uses split reads to uncover alternative choices of intron excision by finding introns that share splice sites. In this example, LeafCutter identifies two clusters of variably excised introns. (b) LeafCutter workflow. First, short reads are mapped to the genome. When SNP data are available, WASP should be used to filter allele-specific reads that map with a bias. Next, LeafCutter extracts junction reads from .bam files, identifies alternatively excised intron clusters, and summarizes intron usage as counts or proportions"



Thanks
DB

atla goutham

Mar 6, 2019, 5:22:14 AM
to leafcutter-users
If you have 250 samples, I would imagine there will be a lot of heterogeneity. To remove splicing events that occur just by chance, I would suggest raising the minimum number of samples in which an intron excision ratio must be detected to ~60% of the samples in one group. In the leafcutter_ds script this is currently set to a very low number (I think it's 3 or 4).

Applying WASP may improve the results, but it doesn't change them drastically.

Goutham A
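
The "~60% of the samples in one group" rule of thumb above can be turned into a concrete number before rerunning the analysis. The sketch below is only an illustration: the sample names, group labels, and group sizes are hypothetical, and which leafcutter_ds option the number should be passed to is left to the discussion in this thread.

```python
from collections import Counter

# Hypothetical (sample, group) pairs standing in for the rows of a groups file.
SAMPLE_GROUPS = (
    [(f"yri{i}", "YRI") for i in range(10)]
    + [(f"eu{i}", "EU") for i in range(7)]
)


def min_samples_threshold(sample_groups, percent=60):
    """Return {group: ceil(percent% of the group size)}.

    Integer arithmetic is used so the ceiling is not affected by
    floating-point rounding.
    """
    sizes = Counter(group for _sample, group in sample_groups)
    return {group: -(-n * percent // 100) for group, n in sizes.items()}


thresholds = min_samples_threshold(SAMPLE_GROUPS)
# The smaller group is the limiting one if a single value is needed.
smallest = min(thresholds.values())
print(thresholds, smallest)
```

With unequal group sizes, using the value from the smaller group is the more conservative choice, since a threshold derived from the larger group could never be met by the smaller one.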

durgabhavan...@gmail.com

Mar 6, 2019, 5:41:41 AM
to leafcutter-users
Hi 

Below are the PCA and summary 

PCA.JPG

result.JPG


Sample size is 151.


Thanks
DB


durgabhavan...@gmail.com

Mar 6, 2019, 8:14:45 AM
to leafcutter-users
Hi Goutham

Thanks for the reply. Yes, it is set to 3 now. I will include -g 50 and check whether the number of DSIs decreases.


Regards
DB

atla goutham

Mar 6, 2019, 9:30:35 AM
to leafcutter-users
I think you should consider --min_samples_per_intron rather than --min_samples_per_group. You can also tweak --min_coverage.

Yang Li

Mar 6, 2019, 10:02:32 AM
to durgabhavan...@gmail.com, leafcutter-users
WASP is not required for differential splicing analysis unless your case and control samples come from just 2 individuals.

Yang Li

Mar 6, 2019, 10:04:34 AM
to durgabhavan...@gmail.com, leafcutter-users
Finding more differential splicing events should be a good thing. I would first advise checking whether you can recapitulate the findings from the paper you mention.


durgabhavan...@gmail.com

Mar 11, 2019, 2:05:03 AM
to leafcutter-users

Hi Jack 

I tried the analysis with fewer samples, i.e. 18, and got only 8 DSEs. But with 150 samples it gives me more than 10,000 DSEs. Any idea why?


Thanks
DB


durgabhavan...@gmail.com

Mar 11, 2019, 2:05:27 AM
to leafcutter-users
No difference, Goutham.

atla goutham

Mar 11, 2019, 6:21:37 AM
to leafcutter-users
Can you make boxplots of the intron ratios in the two conditions for a few of the significant splicing events? Randomly choose 10 events and plot boxplots or swarm plots.
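
The check suggested above needs per-sample intron ratios split by condition. A minimal sketch of that preparation step follows, assuming that an intron's ratio is its junction count divided by the total count of its cluster in the same sample, and that the cluster name is the last colon-separated field of the intron ID. The counts, sample names, and group labels are made up for illustration.

```python
from collections import defaultdict
from statistics import median

# Made-up junction counts: intron ID -> {sample: read count}.
COUNTS = {
    "chr1:100:200:clu_1": {"s1": 30, "s2": 28, "s3": 10, "s4": 12},
    "chr1:100:300:clu_1": {"s1": 10, "s2": 12, "s3": 30, "s4": 28},
}
# Hypothetical sample -> condition mapping.
GROUPS = {"s1": "A", "s2": "A", "s3": "B", "s4": "B"}


def cluster_of(intron_id):
    """Assumed ID layout: chrom:start:end:cluster."""
    return intron_id.rsplit(":", 1)[1]


def intron_ratios(counts):
    """Divide each intron count by its cluster total in the same sample."""
    totals = defaultdict(lambda: defaultdict(int))
    for intron, per_sample in counts.items():
        for sample, n in per_sample.items():
            totals[cluster_of(intron)][sample] += n
    ratios = {}
    for intron, per_sample in counts.items():
        cluster_totals = totals[cluster_of(intron)]
        ratios[intron] = {
            s: n / cluster_totals[s]
            for s, n in per_sample.items()
            if cluster_totals[s] > 0  # skip samples with no reads in the cluster
        }
    return ratios


def ratios_by_group(ratios, groups):
    """Collect the ratios of each intron into one list per condition."""
    out = {}
    for intron, per_sample in ratios.items():
        grouped = defaultdict(list)
        for sample, ratio in per_sample.items():
            grouped[groups[sample]].append(ratio)
        out[intron] = dict(grouped)
    return out


grouped = ratios_by_group(intron_ratios(COUNTS), GROUPS)
for intron, by_group in grouped.items():
    for group, values in sorted(by_group.items()):
        print(intron, group, f"median={median(values):.3f}", f"n={len(values)}")
```

The per-condition lists in `grouped` can be handed to any plotting library to draw the boxplots or swarm plots. Looking at the medians and spread first is a quick way to see whether a significant intron shows a visible shift between conditions.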

durgabhavan...@gmail.com

Mar 11, 2019, 7:27:59 AM
to leafcutter-users
Hi Goutham

I tried the analysis with another comparison of around 100 samples and got only 18 DSEs. But with 150 samples it gives me more than 10,000 DSEs. Any idea why?


Thanks
DB

durgabhavan...@gmail.com

Mar 14, 2019, 6:46:43 AM
to leafcutter-users
What is the cutoff value for delta PSI? Is it 0.1?


Thanks
DB


Jack Humphrey

Mar 14, 2019, 3:44:35 PM
to leafcutter-users
The delta PSI threshold is up to you, but people commonly use 10% in the literature.
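
Combining the significance cutoff from the original question (FDR < 0.05) with a 10% delta PSI threshold might look like the sketch below. The records are made-up values, and the `IntronResult` type is a hypothetical container for results that have already been merged into one row per intron. It is not a LeafCutter data structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntronResult:
    """Hypothetical merged record: one intron with its adjusted p-value and delta PSI."""
    intron: str
    p_adjust: float
    deltapsi: float


# Made-up results for illustration.
RESULTS = [
    IntronResult("chr1:100:300:clu_1", 0.001, -0.21),  # significant, large change
    IntronResult("chr1:100:200:clu_1", 0.001, 0.004),  # significant, tiny change
    IntronResult("chr2:500:900:clu_2", 0.20, 0.15),    # large change, not significant
    IntronResult("chr4:40:90:clu_4", 0.03, 0.12),      # significant, large change
]


def filter_hits(results, fdr=0.05, min_abs_dpsi=0.10):
    """Keep introns that pass both the FDR cutoff and the |delta PSI| cutoff."""
    return [
        r for r in results
        if r.p_adjust < fdr and abs(r.deltapsi) >= min_abs_dpsi
    ]


hits = filter_hits(RESULTS)
for hit in hits:
    print(hit.intron, hit.p_adjust, hit.deltapsi)
```

Both thresholds are parameters, so the same function can be rerun with a stricter or looser delta PSI cutoff to see how sensitive the number of reported introns is to that choice.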