Hello Bo,
Thank you for your answer!
I tried aligning against the transcripts, which resolved the warning, but the names do not match:
$ rsem-calculate-expression --alignments 361174.bam ../../reference/rsem/gencode.v24 361174
rsem-parse-alignments ../../reference/rsem/gencode.v24 361174.temp/361174 361174.stat/361174 361174.bam 1 -tag XM
RSEM can not recognize reference sequence name ENST00000456328.2|ENSG00000223972.5|OTTHUMG00000000961.2|OTTHUMT00000362751.1|DDX11L1-002|DDX11L1|1657|processed_transcript|!
"rsem-parse-alignments ../../reference/rsem/gencode.v24 361174.temp/361174 361174.stat/361174 361174.bam 1 -tag XM" failed! Plase check if you provide correct parameters/options for the pipeline!
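For reference, one possible way to make the names match would be to cut each GENCODE FASTA header down to its first `|`-separated field (the transcript ID) before building the aligner index. This is only a sketch: the function name `shorten_gencode_headers` is made up for illustration, and it assumes the RSEM reference built with the GTF names transcripts by their `transcript_id` (e.g. `ENST00000456328.2`).

```shell
# Hypothetical helper: reads FASTA on stdin, writes FASTA on stdout.
# On header lines only (those starting with ">"), drop everything from
# the first "|" onward, so
#   >ENST00000456328.2|ENSG00000223972.5|...|processed_transcript|
# becomes
#   >ENST00000456328.2
# Sequence lines are passed through unchanged.
shorten_gencode_headers() {
  sed -e '/^>/ s/|.*$//'
}

# Usage (output file name is an assumption):
# shorten_gencode_headers < gencode.v24.transcripts.fa > gencode.v24.transcripts.short.fa
```

The aligner index would then have to be rebuilt from the shortened FASTA so that the BAM header carries the short names.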
Then I simplified the RSEM reference by removing the GTF:
$ rsem-prepare-reference ../gencode-GRC38/gencode.v24.transcripts.fa gencode.v24.trans
Finally, the names matched, but I got the following:
$ rsem-calculate-expression --alignments 361174.bam ../../reference/rsem/gencode.v24.trans 361174
rsem-parse-alignments ../../reference/rsem/gencode.v24.trans 361174.temp/361174 361174.stat/361174 361174.bam 1 -tag XM
Read WINDU:108:C9AY9ANXX:2:1308:9460:47184: RSEM currently does not support gapped alignments, sorry!
"rsem-parse-alignments ../../reference/rsem/gencode.v24.trans 361174.temp/361174 361174.stat/361174 361174.bam 1 -tag XM" failed! Plase check if you provide correct parameters/options for the pipeline!
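To get a rough idea of how many alignments are affected, the CIGAR strings can be inspected. The sketch below counts SAM records whose CIGAR contains an `I`, `D`, or `N` operation (insertion, deletion, skipped region). The function name `count_gapped` is made up for illustration, and the exact condition RSEM checks internally may differ from this one.

```shell
# Hypothetical helper: reads SAM records on stdin and prints the number
# of alignments whose CIGAR (column 6) contains I, D, or N.
# Header lines (starting with "@") are skipped.
count_gapped() {
  awk -F'\t' '!/^@/ && $6 ~ /[IDN]/ { n++ } END { print n + 0 }'
}

# Usage (assumes samtools is installed):
# samtools view 361174.bam | count_gapped
```

A count of 0 would suggest the error has another cause; a non-zero count means the aligner is emitting indels or spliced alignments.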
I do not know whether any HISAT2 option would make this work, but at this point I suspect it is not (easily) doable. For me, the main benefits of HISAT2 would be the inclusion of SNPs, which slightly increases the mapping rate, and the fact that it can run on a machine with less than 32 GB of RAM. I don't know if you are willing to consider integrating HISAT2 in a future RSEM version (if it's doable!), but I'll just use STAR+RSEM for the moment.
Thanks a lot for your help!
Petros