I am having difficulty getting Ballgown to read the output from StringTie.
I map reads with tophat2, find transcripts with stringtie, merge the transcripts from the different samples with cuffmerge, and then generate count data with stringtie like this:
stringtie accepted_hits.bam -o ./stringtie2.gtf -p 4 -G merged.gtf -B -e
This generates the same stringtie2.gtf for each sample, but the i_data.ctab files differ: some lines present in one sample are missing from another, e.g.
i_id chr strand start end rcount ucount mrcount
1 chr1 + 12228 12612 0 0 0.00
2 chr1 + 12722 13220 1 0 0.33
3 chr1 - 14830 14969 83 0 26.92
4 chr1 - 15039 15795 22 0 6.78
5 chr1 - 15948 16606 25 0 9.17
6 chr1 - 16766 16857 25 0 12.50
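Compare the row with i_id 5 above (chr1, 15948-16606) with the same file for another sample below, where that intron is missing and i_id 5 is assigned to the next intron instead: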
i_id chr strand start end rcount ucount mrcount
1 chr1 + 12228 12612 0 0 0.00
2 chr1 + 12722 13220 0 0 0.00
3 chr1 - 14830 14969 155 0 44.92
4 chr1 - 15039 15795 45 0 10.18
5 chr1 - 16766 16857 204 0 98.65
Ballgown complains about the input:
Error in ballgown(samples = c(... :
intron ids were either not the same or not in the same order across samples. double check i_data.ctab for each sample.
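To pin down exactly which introns differ between two samples, a comparison along these lines may help. This is only a minimal sketch and not part of StringTie or Ballgown: the function names `read_introns` and `compare_introns` are hypothetical, and it assumes each i_data.ctab is whitespace- or tab-delimited with the header shown above (i_id chr strand start end rcount ucount mrcount). Introns are matched by coordinates rather than by i_id, since the ids shift once a row is missing.

```python
import sys


def read_introns(path):
    """Return the set of (chr, strand, start, end) tuples found in one i_data.ctab file."""
    introns = set()
    with open(path) as handle:
        # Assumed header: i_id chr strand start end rcount ucount mrcount
        header = handle.readline().split()
        col = {name: idx for idx, name in enumerate(header)}
        for line in handle:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            introns.add((
                fields[col["chr"]],
                fields[col["strand"]],
                int(fields[col["start"]]),
                int(fields[col["end"]]),
            ))
    return introns


def compare_introns(path_a, path_b):
    """Return (introns only in A, introns only in B), each sorted by coordinate."""
    a = read_introns(path_a)
    b = read_introns(path_b)
    return sorted(a - b), sorted(b - a)


# Command-line use (hypothetical script name): python compare_ctab.py A/i_data.ctab B/i_data.ctab
if __name__ == "__main__" and len(sys.argv) == 3:
    only_a, only_b = compare_introns(sys.argv[1], sys.argv[2])
    for label, rows in (("only in first", only_a), ("only in second", only_b)):
        for chrom, strand, start, end in rows:
            print(label, chrom, strand, start, end, sep="\t")
```

Run on the two excerpts shown above, this would report the chr1 minus-strand intron 15948-16606 as present only in the first sample.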
I can try using Tablemaker per the Ballgown vignette, but the stringtie manual suggests that I should be able to use stringtie output directly.
I am using stringtie-1.0.4.Linux_x86_64 and ballgown_2.0.0 with R version 3.2.0 (2015-04-16).
Thanks,
Vince