Use of long-read kallisto output

Annika Müller

Jun 18, 2026, 4:03:41 PM

to kallisto and applications

Heya,

I used lr-kallisto (v0.52.0) for pseudoalignment and quantification of long-read bulk RNA-seq data, and I plan to run differential expression analysis with DESeq2.

I have used kallisto with short-read data in the past, where I got an abundance.h5 file that I could import into R using tximport. Since this file does not exist for the long-read approach, I would like to check whether I have understood correctly how to proceed.

As I understand it, I take the .mtx file, skip the first two lines, and combine it with the transcripts.txt file. The number in column 2 of the .mtx file is the position of the transcript in transcripts.txt. So if a line reads 1 23 1.41, I ignore the 1 (I have bulk sequencing and therefore no cell barcodes) and assign the estimated abundance of 1.41 to the transcript on line 23 of transcripts.txt. Is that right?
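
To make the question concrete, here is a minimal bash/awk sketch of the join described above. It rests on the same unconfirmed assumption: that column 2 of each matrix entry is the 1-based line number in transcripts.txt. The function name mtx_to_tsv is made up for illustration. Instead of skipping a fixed number of lines, it skips every line starting with % and then the first remaining line (the MatrixMarket size line), so it does not depend on how many header lines a given file has. Transcripts that do not appear in the sparse matrix are written with a value of 0.

```shell
#!/usr/bin/env bash
# Sketch only: join a single-sample MatrixMarket file with transcripts.txt.
# ASSUMPTION: column 2 of each entry is the 1-based line number in transcripts.txt.
# mtx_to_tsv is a hypothetical helper name, not part of kallisto or bustools.
mtx_to_tsv() {
  local mtx="$1" transcripts="$2"
  awk -v OFS='\t' '
    NR == FNR  { name[FNR] = $1; n = FNR; next }  # first file: transcript IDs by line number
    /^%/       { next }                           # MatrixMarket header/comment lines
    !seen_dims { seen_dims = 1; next }            # size line: rows cols nonzeros
               { value[$2] = $3 }                 # entry: row (sample) col (transcript) value
    END {
      # Sparse format: anything not listed in the matrix is zero.
      for (i = 1; i <= n; i++) print name[i], (i in value ? value[i] : 0)
    }
  ' "$transcripts" "$mtx"
}
```

It would be called as, for example, mtx_to_tsv "${output}/matrix.abundance.mtx" "${output}/transcripts.txt" > sample.tsv, which gives a two-column table (transcript ID, value) that R can read with read.delim(header = FALSE).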

I am wondering why there is no merged file, since anything done manually is more error-prone. I took a look at the notebook, but since it is all Python and I work with a combination of bash and R, I am not sure I followed it correctly.
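
One cheap guard against a manual mix-up would be to compare the column count on the matrix size line with the number of lines in transcripts.txt. This is a sketch under the assumption that matrix columns correspond to transcripts; check_mtx_dims is a made-up name.

```shell
#!/usr/bin/env bash
# Sketch only: check that the matrix column count matches transcripts.txt.
# ASSUMPTION: matrix columns correspond to lines of transcripts.txt.
# check_mtx_dims is a hypothetical helper name.
check_mtx_dims() {
  local mtx="$1" transcripts="$2"
  local n_cols n_tx
  # The first line not starting with '%' is the size line: rows cols nonzeros.
  n_cols=$(awk '!/^%/ { print $2; exit }' "$mtx")
  # Count lines, including a final line without a trailing newline.
  n_tx=$(awk 'END { print NR }' "$transcripts")
  if [ "$n_cols" -eq "$n_tx" ]; then
    echo "OK: $n_tx transcripts"
  else
    echo "MISMATCH: matrix has $n_cols columns, transcripts.txt has $n_tx lines" >&2
    return 1
  fi
}
```

A mismatch would suggest that the matrix and the transcript list come from different runs or indices, or that the column-to-line assumption does not hold.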

Any help would be greatly appreciated!

I followed the approach recommended in "Using lr-kallisto for quantifying long-read bulk RNA-seq · Issue #456 · pachterlab/kallisto":

apptainer exec \
    lr-kallisto.sif \
    kallisto index -k 63 -i "${CDS}.idx" "${CDS}"

apptainer exec \
    lr-kallisto.sif \
    kallisto bus -t 32 --long --threshold 0.7 -x bulk \
    -i "${CDS}.idx" -o "${output}" "${reads}/${1}.flnc.fastq.gz"

bustools sort -t 32 "${output}/output.bus" \
    -o "${output}/sorted.bus"
bustools count "${output}/sorted.bus" \
    -t "${output}/transcripts.txt" \
    -e "${output}/matrix.ec" \
    -o "${output}/count" --cm -m \
    -g "${CDS}.t2g"

apptainer exec \
    lr-kallisto.sif \
    kallisto quant-tcc -t 32 \
    --long -P ONT -f "${output}/flens.txt" \
    "${output}/count.mtx" -i "${CDS}.idx" \
    -e "${output}/count.ec.txt" \
    -o "${output}"

The output is the following:

total 14M
-rw-r--r--. 1 17 Jun 18 21:08 count.barcodes.txt
-rw-r--r--. 1 162K Jun 18 21:08 count.ec.txt
-rw-r--r--. 1 123K Jun 18 21:08 count.mtx
-rw-r--r--. 1 104K Jun 18 13:45 flens.txt
-rw-r--r--. 1 473K Jun 18 13:45 index.saved
-rw-r--r--. 1 119K Jun 18 21:16 matrix.abundance.mtx
-rw-r--r--. 1 169K Jun 18 21:16 matrix.abundance.tpm.mtx
-rw-r--r--. 1 7 Jun 18 13:45 matrix.cells
-rw-r--r--. 1 162K Jun 18 13:45 matrix.ec
-rw-r--r--. 1 104K Jun 18 21:16 matrix.efflens.mtx
-rw-r--r--. 1 6 Jun 18 21:16 matrix.fld.tsv
-rw-r--r--. 1 17 Jun 18 13:45 matrix.sample.barcodes
-rw-r--r--. 1 178 Jun 18 13:45 novel.fastq
-rw-r--r--. 1 11M Jun 18 13:45 output.bus
-rw-r--r--. 1 687 Jun 18 13:45 run_info.json
-rw-r--r--. 1 410K Jun 18 21:08 sorted.bus
-rw-r--r--. 1 322K Jun 18 21:16 transcript_lengths.txt
-rw-r--r--. 1 219K Jun 18 21:16 transcripts.txt
