Hi,
Hoping someone can help. rsem-calculate-expression seems to be taking a long time on my samples, but this is my first time using RSEM, so this may be normal. Here's the code I used:
I built the reference based on the tutorial:
rsem-refseq-extract-primary-assembly \
ref/GCF_000001405.39_GRCh38.p13_genomic.fna ref/GCF_000001405.39_GRCh38.p13_genomic.primary_assembly.fna
rsem-prepare-reference \
--gtf GCF_000001405.39_GRCh38.p13_genomic.gtf \
--trusted-sources BestRefSeq,Curated\ Genomic \
--bowtie2 \
GCF_000001405.39_GRCh38.p13_genomic.primary_assembly.fna \
ref/human_refseq
Then I ran the expression step:
rsem-calculate-expression -p 32 \
--paired-end \
--bowtie2 \
--estimate-rspd \
--append-names \
--output-genome-bam \
--time \
${filename}_R1_001.fastq \
${filename}_R2_001.fastq \
ref/RefSeq/human_refseq \
exp/$folder
For one particular sample, I have 120M reads.
The time output is:
Aligning reads: 8136 s. ~ 2 hrs
Estimating expression levels: 43243 s. ~ 12 hrs
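For reference, the hour figures are just the reported seconds divided by 3600. A minimal shell sketch of that conversion, assuming awk is available (the function name secs_to_hours is made up for illustration and is not part of RSEM):

```shell
# Hypothetical helper (not an RSEM command): convert seconds to hours,
# rounded to one decimal place.
secs_to_hours() {
  awk -v s="$1" 'BEGIN { printf "%.1f\n", s / 3600 }'
}

secs_to_hours 8136    # time spent aligning reads
secs_to_hours 43243   # time spent estimating expression levels
```

This gives about 2.3 hours for alignment and 12.0 hours for expression estimation, so the estimation step accounts for most of the run time.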
My computer runs Linux, with an Intel Core i9 CPU (36 cores) and 134 GB of RAM.
Is this normal, or is there anything I can do to speed it up?
Thanks,
Jared