Different length in different conditions

Torstein Tengs

Jun 26, 2024, 4:21:37 AM
to rMATS User Group
Hi!

I have a data set where all reads within a given input FASTQ file have the same length, but the read length differs between conditions. For instance, say I want to contrast a set of paired-end reads with read length 100 from one condition against a set of paired-end reads with read length 75 from another. What should I set --readLength to, and should I use the --variable-read-length option?

Thanks,

-Torstein

kutsc...@gmail.com

Jun 26, 2024, 9:06:55 AM
to rMATS User Group
One option is to use --variable-read-length. Another option is to run multiple prep steps where each prep step uses the correct --readLength for that file. See this post: https://github.com/Xinglab/rmats-turbo/issues/83

Eric
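
The two options above can be sketched as command lines. This is only a rough illustration, not an official recipe: the entry point `rmats.py`, all file names, and the helper functions are hypothetical, and the flags are the ones I understand rMATS turbo to accept (check `python rmats.py --help` for your version). The thread does not say which --readLength value to pair with --variable-read-length, so it is left as a parameter here. Combining the per-group prep outputs and running the post step is not shown; see the linked issue for that.

```python
import shlex

# Assumed entry point; adjust to however rMATS is installed on your system.
RMATS = ["python", "rmats.py"]


def variable_length_command(b1, b2, gtf, read_length, od, tmp):
    """Option 1: a single run that tolerates mixed read lengths."""
    return RMATS + [
        "--b1", b1,
        "--b2", b2,
        "--gtf", gtf,
        "-t", "paired",
        "--readLength", str(read_length),
        "--variable-read-length",
        "--od", od,
        "--tmp", tmp,
    ]


def prep_command(bam_list, gtf, read_length, od, tmp):
    """Option 2: a prep-only run for one group, using that group's read length."""
    return RMATS + [
        "--b1", bam_list,
        "--gtf", gtf,
        "-t", "paired",
        "--readLength", str(read_length),
        "--od", od,
        "--tmp", tmp,
        "--task", "prep",
    ]


if __name__ == "__main__":
    # Hypothetical inputs: one BAM list per read length.
    for bam_list, length in [("bams_100.txt", 100), ("bams_75.txt", 75)]:
        cmd = prep_command(bam_list, "genes.gtf", length, "out", f"tmp_{length}")
        print(shlex.join(cmd))
```

Building the commands as lists keeps each group's read length explicit and makes it easy to pass them to `subprocess.run` once the paths are real.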

Thomas Danhorn

Jun 26, 2024, 2:22:25 PM
to kutsc...@gmail.com, rMATS User Group
To maximize comparability, you can also trim the 100-nt reads back to 75
nt (at the cost of losing information). But be aware that if the read
length is different, a lot of other things are likely different as well
(kits for RNA extraction, library prep, etc., maybe other variables
besides the one you are controlling), so you won't know if the results you are
picking up are actually because of your condition or because of batch
effects. If you need to have batches for some reason, the only proper way
is to not confound them with your groups.

Best,

Thomas
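
The trimming suggested above (cutting longer reads back to a common length) can be sketched in a few lines of Python. This is a minimal illustration, assuming plain four-line FASTQ records with no trailing blank lines; the function names and file names are hypothetical, and for real data a dedicated trimming tool is the usual choice. For paired-end data, each mate file would be trimmed separately, which preserves read order.

```python
import gzip
from itertools import islice


def _open(path, mode):
    """Open a plain or gzip-compressed text file based on its extension."""
    if str(path).endswith(".gz"):
        return gzip.open(path, mode)
    return open(path, mode)


def trim_fastq(in_path, out_path, length):
    """Truncate every read (and its quality string) to at most `length` bases.

    Reads already shorter than `length` are written unchanged.
    Returns the number of records written.
    """
    count = 0
    with _open(in_path, "rt") as fin, _open(out_path, "wt") as fout:
        while True:
            # A FASTQ record is four lines: header, sequence, separator, quality.
            record = [line.rstrip("\n") for line in islice(fin, 4)]
            if not record:
                break
            if len(record) != 4:
                raise ValueError("truncated FASTQ record")
            header, seq, plus, qual = record
            fout.write(f"{header}\n{seq[:length]}\n{plus}\n{qual[:length]}\n")
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical file names: trim 100-nt reads down to 75 nt.
    n = trim_fastq("sample_R1.fastq.gz", "sample_R1.trim75.fastq.gz", 75)
    print(f"wrote {n} reads")
```

Trimming from the 3' end keeps the start of each read, which is the part that is present in both the longer and the shorter libraries.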

Torstein Tengs

Jun 27, 2024, 9:47:32 AM
to rMATS User Group
Thanks! Turns out only a small number of the samples I have downloaded data from have read length 75, so I just excluded those. The rest either have 100 or 101, so I just trimmed them down to 100 and omitted --variable-read-length. Seems to work really well - and fast :-)