Hi,
When the strand-specific option is used, it is expected that the /2 reads are in the sense orientation whereas the /1 reads are in the anti-sense orientation. This is what dUTP strand-specific RNA-seq reads would look like and it is also illustrated on our wiki:
https://github.com/bcgsc/transabyss/wiki#17-strand-specific-assembly
So, this means your reads are in the exact opposite orientation than expected.
To work around the issue, you just need to rename the /2 reads with a /1 suffix and vice versa.
Hope that helps!
Ka Ming
--
Ka Ming Nip
Graduate Student | Dr. Inanc Birol Lab (BTL)
Canada's Michael Smith Genome Sciences Centre
________________________________________
From:
trans...@googlegroups.com [
trans...@googlegroups.com] On Behalf Of Brian Lawney [
brian...@gmail.com]
Sent: February 8, 2018 9:23 AM
To: Trans-ABySS
Subject: Assembly with and without stranded option
Hello,
I'm playing with TransAbyss by using some mock data and I'm looking for some help with interpreting the results. Namely, I have simulated strand-specific RNA-seq reads (using BioC Polyester), but transAbyss fails when I use the --SS flag.
Details:
I created a mock genome by sampling some human genes (all from the + strand) and creating an accompanying GTF file. I use those files to simulate paired-end RNA-seq reads using Bioconductor's Polyester package. The command (run from within in R session) was:
simulate_experiment(seqpath=SEQPATH, gtf=GTF,
<
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s1600/igv_ss_stranded.png>
strand_specific=T,
<
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s1600/igv_ss_stranded.png>
num_reps=c(1,1),
<
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s1600/igv_ss_stranded.png>
paired=T,
<
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s1600/igv_ss_stranded.png>
fold_changes=c(1,1),
<
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s1600/igv_ss_stranded.png>
reads_per_transcript=1000)
This created paired FASTA for two samples, but I'm only concerned with one of the samples here. Note the "strand_specific=T" tag, which allegedly generated strand-specific simulated reads. I aligned these reads with STAR aligner and I get the IGV screenshot below (colored by "first-of-pair strand") which leads me to believe the reads are indeed strand-specific since all the R1 reads directly align to the + strand. Additionally if I take the R2 read, reverse-complement it, then it aligns to my genome.
If necessary, I put the fasta files in a public storage bucket here:
https://storage.googleapis.com/transabyss-ss-query/sample_01_1.fasta
https://storage.googleapis.com/transabyss-ss-query/sample_01_2.fasta
Anyway, assuming the reads are indeed strand-specific, I put these fasta files through transabyss using the following command (note --SS flag):<
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s1600/igv_ss_stranded.png>
<
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s1600/igv_ss_stranded.png>
/trans-abyss/transabyss-1.5.5/transabyss --pe sample_01_1.fasta sample_01_2.fasta --SS --threads 32
<
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s1600/igv_ss_stranded.png>
<
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s1600/igv_ss_stranded.png>
<
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s1600/igv_ss_stranded.png>
<
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s1600/igv_ss_stranded.png>
Building the character occurrence table...
Mateless 0
Unaligned 0
Singleton 49 0.219%
FR 0
RF 0
FF 0
Different 22322 99.8%
Total 22371
error: the histogram `transabyss-3.hist' is empty
sort: write failed: standard output: Broken pipe
<
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s1600/igv_ss_stranded.png>
If I remove the --SS flag, it "works", although I'm not certain it's doing the correct thing. The full log is athttps://
storage.googleapis.com/transabyss-ss-query/log_without_ss.txt <
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s1600/igv_ss_stranded.png>
In particular, does this seem reasonable? When I have run TransAbyss on real data before, the FR field was 54.7% and here it is VERY low:<
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s1600/igv_ss_stranded.png>
Building the character occurrence table...
Mateless 0
Unaligned 0
Singleton 55 0.246%
FR 199 0.89%
RF 0
FF 0
Different 22117 98.9%
Total 22371
<
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s1600/igv_ss_stranded.png>
Any takes on this? Is it just because I have a very small/mock dataset? Or am I missing something? I will still take the successful final contigs and see how they align to my genome, etc. and maybe that will shed some light. Thanks!
<
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s1600/igv_ss_stranded.png>
<
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s1600/igv_ss_stranded.png>
<
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s1600/igv_ss_stranded.png>
<
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s1600/igv_ss_stranded.png>
<
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s1600/igv_ss_stranded.png>
<
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s1600/igv_ss_stranded.png>
<
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s1600/igv_ss_stranded.png>
<
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s1600/igv_ss_stranded.png>
<
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s1600/igv_ss_stranded.png>
<
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s1600/igv_ss_stranded.png>
<
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s1600/igv_ss_stranded.png>
<
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s1600/igv_ss_stranded.png>
<
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s1600/igv_ss_stranded.png>
[
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s320/igv_ss_stranded.png]<
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s1600/igv_ss_stranded.png>
[
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s320/igv_ss_stranded.png]<
https://lh3.googleusercontent.com/-PclRhlSpsbo/Wnx_-exLGBI/AAAAAAAACNk/gvW1yUDsd0Y4CHKwJYTmBdy74mcXgjcKwCLcBGAs/s1600/igv_ss_stranded.png>
--
You received this message because you are subscribed to the Google Groups "Trans-ABySS" group.
To unsubscribe from this group and stop receiving emails from it, send an email to
trans-abyss...@googlegroups.com<mailto:
trans-abyss...@googlegroups.com>.
For more options, visit
https://groups.google.com/d/optout.