Kallisto using absurd amounts of memory building an index

Samuel Clamons

Jun 24, 2022, 8:02:37 PM
Hi everyone!

I'm trying to build an index of a modified mouse reference (mouse + some spike-in controls). When I run

kallisto index -i genome.idx genome.fa

it gives me the following:

[build] loading fasta file genome.fa
[build] k-mer length: 31
[build] warning: clipped off poly-A tail (longer than 10)
        from 91 target sequences
[build] warning: replaced 78088275 non-ACGUT characters in the input sequence
        with pseudorandom nucleotides
[build] counting k-mers ...
Job 3149667.1 memory limit 128G exceeded.

The last two lines are the SGE cluster kicking me off for using more than my allotted 128G of RAM! Any idea what could be causing this problem?

I'm using kallisto 0.46.1.
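
In case it helps with diagnosis, the counts in the log above can be cross-checked independently of kallisto. The following is only a minimal sketch, not anything kallisto provides: `fasta_stats` is a hypothetical helper name, and it assumes a plain-text (uncompressed) FASTA and treats upper- and lower-case A, C, G, T and U as valid, which may not match kallisto's own accounting exactly.

```python
# Hypothetical helper (not part of kallisto): summarize a FASTA file so the
# number of sequences, total bases, and non-ACGUT characters can be compared
# with what the indexing log reports.

# Assumption: matching is case-insensitive.
VALID = set("ACGTUacgtu")


def fasta_stats(lines):
    """Return sequence count, total bases, and non-ACGUT count for FASTA lines."""
    n_seqs = 0
    total = 0
    non_acgut = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            n_seqs += 1  # header line starts a new record
            continue
        total += len(line)
        non_acgut += sum(1 for ch in line if ch not in VALID)
    return {"sequences": n_seqs, "bases": total, "non_acgut": non_acgut}


# Example usage with the file name from the command above:
# with open("genome.fa") as handle:
#     print(fasta_stats(handle))
```

The function reads line by line, so its own memory use stays small even for a very large input.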



Matthew Harke

Aug 4, 2022, 11:01:02 AM
I have a similar question and wonder if there is a way to limit the memory usage of the indexing process to accommodate larger references. For instance, I have a reference that includes 6,357,353 reconstructed genes from the Tara Oceans expedition that I'd like to use kallisto on for some metatranscriptomes. I've tried allocating half a terabyte of memory, but it still fails with a memory-limit-exceeded error.
