STARconsensus: Map reference transcriptome to panhuman reference

Stevie Pederson

Sep 20, 2023, 1:34:07 PM

to rna-star

Hi,

I've created a panhuman reference following the steps in the Kaminow et al. paper. As a simple sanity check, I would like to map the GENCODE reference transcriptome onto the panhuman reference to see if any transcripts are excluded from the new reference. This appears to be more challenging than I'd expected.

Initially, using a FASTA file for the reference transcripts (--readFilesIn), only the first 30 transcripts were aligned. I then converted the FASTA file to a FASTQ file using reformat.sh from the bbmap suite of tools, setting the fake quality scores to be uniformly 40, as follows:

reformat.sh in=gencode.v44.transcripts.fa.gz out1=gencode.v44.transcripts.fq.gz qfake=40

I then tried again, running STAR with the parameters for long reads (--outFilterMismatchNmax 999 --alignIntronMin 20 --alignIntronMax 1000000). At this point, STAR flagged the first read in the FASTQ file as having a quality string whose length differs from that of the sequence:

EXITING because of FATAL ERROR in reads input: quality string length is not equal to sequence length

Manually checking this revealed the file structure to be correct, as shown by the line lengths for the first read (read & score only):

gunzip -c gencode.v44.transcripts.fq.gz | head -n4 | awk '{print length; }'
92
1657
1
1657
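
The check above covers only the first record. A minimal sketch of the same comparison applied to every record is shown below. It assumes strict four-line FASTQ records (no wrapped sequence lines), and the function name check_fastq_lengths is hypothetical, not part of STAR, bbmap, or seqtk.

```shell
# Hypothetical helper: compare sequence and quality lengths for every record
# in a gzipped FASTQ file. Assumes each record is exactly four lines.
check_fastq_lengths() {
  gunzip -c "$1" | awk '
    NR % 4 == 1 { header = $0 }            # @name line
    NR % 4 == 2 { seqlen = length($0) }    # sequence line
    NR % 4 == 0 {                          # quality line
      if (length($0) != seqlen) {
        bad++
        printf "mismatch at record %d: %s\n", NR / 4, header
      }
    }
    END {
      if (NR % 4 != 0)
        printf "truncated: line count %d is not a multiple of 4\n", NR
      printf "%d mismatched of %d records\n", bad, NR / 4
    }'
}

# Example call (file name taken from the post above):
# check_fastq_lengths gencode.v44.transcripts.fq.gz
```

If this reports no mismatches across the whole file, the file itself is unlikely to be the cause of the error.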

I also tried using seqtk for the conversion from .fa to .fq and received the same error, which leads me to think something else is going wrong and the error isn't what it appears to be. I'm not sure if it's a bug or if there's something I'm missing.

Has anyone else tried this and is there a way to perform this task?

Thanks in advance,

Stevie

Alexander Dobin

Sep 22, 2023, 8:56:07 AM

to rna-star

Hi Stephen,

Are you trying to map full-length transcripts? This would require STARlong, as standard STAR works with read lengths up to 300 bases.
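
As a rough sketch of what that could look like, the function below assembles a STARlong call using the long-read parameters quoted in the original post. The function name run_starlong, the DRY_RUN switch, the thread count, and all paths are assumptions for illustration. Whether these settings suffice for the longest transcripts is not verified here.

```shell
# Hypothetical wrapper around STARlong (the long-read binary shipped with STAR).
# $1: genome index directory, $2: gzipped reads file, $3: output prefix.
# Set DRY_RUN=1 to print the command instead of running it.
run_starlong() {
  set -- STARlong \
    --runThreadN 4 \
    --genomeDir "$1" \
    --readFilesIn "$2" \
    --readFilesCommand gunzip -c \
    --outFileNamePrefix "$3" \
    --outFilterMismatchNmax 999 \
    --alignIntronMin 20 \
    --alignIntronMax 1000000
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$@"      # show the assembled command for review
  else
    "$@"           # run STARlong with the assembled arguments
  fi
}

# Example call with placeholder paths:
# DRY_RUN=1 run_starlong panhuman_index gencode.v44.transcripts.fq.gz transcripts_
```

Printing the command first makes it easy to confirm the index directory and read file before starting a long alignment run.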
