Hi,
I want to assemble the animal mitogenome for one bird species. My first run ran out of memory when using one thread and 34 GB of RAM. Is there any way to estimate the amount of memory needed? And is it better to use a single thread or multiple threads?
This is the command I used:
get_organelle_from_reads.py -1 D2102046629_reads.1.fq.gz -2 D2102046629_reads.2.fq.gz -R 10 -k 21,45,65,85,105 -F animal_mt -o getOrganbelle_D2102046629_reads_animal_mt_out
The fastq files are quite large, ~64 GB each (gzipped), which should be enough for around 75X coverage of a 1.1 Gb genome.
Should I also reduce "--max-reads"?
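To make the coverage figures concrete, here is a rough sketch of the arithmetic. It assumes 100 bp reads and 300000000+300000000 reads used (both taken from the log below) and the 1.1 Gb genome size mentioned above. The function names are only illustrative and are not part of GetOrganelle. It says nothing about memory use, only how many read pairs correspond to a given nuclear coverage.

```python
import math


def nuclear_coverage(read_pairs, read_len_bp, genome_size_bp):
    """Approximate nuclear genome coverage from paired-end reads."""
    total_bases = 2 * read_pairs * read_len_bp  # two reads per pair
    return total_bases / genome_size_bp


def pairs_for_coverage(target_cov, read_len_bp, genome_size_bp):
    """Read pairs needed to reach a target coverage (rounded up)."""
    return math.ceil(target_cov * genome_size_bp / (2 * read_len_bp))


GENOME_SIZE_BP = 1_100_000_000  # 1.1 Gb, as stated in the post
READ_LEN_BP = 100               # mean read length reported in the log

# Coverage represented by the 300000000+300000000 reads the log says were used
print(round(nuclear_coverage(300_000_000, READ_LEN_BP, GENOME_SIZE_BP), 1))

# Read pairs that would correspond to the ~75X estimate for the full data set
print(pairs_for_coverage(75, READ_LEN_BP, GENOME_SIZE_BP))
```

So the reads actually used in this run correspond to roughly 55X of the nuclear genome, somewhat less than the ~75X estimated for the full files.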
Here is the log file:
2022-01-19 14:18:21,759 - INFO: Pre-reading fastq ...
2022-01-19 14:18:21,759 - INFO: Estimating reads to use ... (to use all reads, set '--reduce-reads-for-coverage inf --max-reads inf')
2022-01-19 14:18:21,966 - INFO: Tasting 100000+100000 reads ...
2022-01-19 14:20:22,906 - INFO: Tasting 500000+500000 reads ...
2022-01-19 14:23:29,634 - INFO: Tasting 2500000+2500000 reads ...
2022-01-19 14:31:51,227 - INFO: Tasting 12500000+12500000 reads ...
2022-01-19 14:57:30,490 - INFO: Tasting 62500000+62500000 reads ...
2022-01-19 17:00:33,719 - INFO: Estimating reads to use finished.
2022-01-19 17:00:33,722 - INFO: Unzipping reads file: D2102046629_reads.1.fq.gz (68180469154 bytes)
2022-01-19 17:26:56,642 - INFO: Unzipping reads file: D2102046629_reads.2.fq.gz (64799842753 bytes)
2022-01-19 17:53:35,958 - INFO: Counting read qualities ...
2022-01-19 17:53:36,569 - INFO: Identified quality encoding format = Sanger
2022-01-19 17:53:36,570 - INFO: Phred offset = 33
2022-01-19 17:53:36,573 - INFO: Trimming bases with qualities (0.00%): 33..33 !
2022-01-19 17:53:36,817 - INFO: Mean error rate = 0.0057
2022-01-19 17:53:36,818 - INFO: Counting read lengths ...
2022-01-19 18:29:57,371 - INFO: Mean = 100.0 bp, maximum = 100 bp.
2022-01-19 18:29:57,372 - INFO: Reads used = 300000000+300000000
2022-01-19 18:29:57,372 - INFO: Pre-reading fastq finished.
2022-01-19 18:29:57,373 - INFO: Making seed reads ...
2022-01-19 18:29:57,467 - INFO: Seed bowtie2 index existed!
2022-01-19 18:29:57,468 - INFO: Mapping reads to seed bowtie2 index ...
2022-01-20 01:49:15,762 - INFO: Mapping finished.
2022-01-20 01:49:15,787 - INFO: Seed reads made: getOrganbelle_D2102046629_reads_animal_mt_out/seed/animal_mt.initial.fq (1244321 bytes)
2022-01-20 01:49:15,794 - INFO: Making seed reads finished.
2022-01-20 01:49:15,794 - INFO: Checking seed reads and parameters ...
2022-01-20 01:49:15,794 - INFO: The automatically-estimated parameter(s) do not ensure the best choice(s).
2022-01-20 01:49:15,795 - INFO: If the result graph is not a circular organelle genome,
2022-01-20 01:49:15,796 - INFO: you could adjust the value(s) of '-w'/'-R' for another new run.
2022-01-20 01:49:25,941 - INFO: Pre-assembling mapped reads ...
2022-01-20 01:50:54,887 - INFO: Pre-assembling mapped reads finished.
2022-01-20 01:50:54,888 - INFO: Estimated animal_mt-hitting base-coverage = 40.13
2022-01-20 01:50:55,290 - INFO: Estimated word size(s): 61
2022-01-20 01:50:55,291 - INFO: Setting '-w 61'
2022-01-20 01:50:55,291 - INFO: Setting '--max-extending-len inf'
2022-01-20 01:50:55,404 - INFO: Checking seed reads and parameters finished.
2022-01-20 01:50:55,405 - INFO: Making read index ...
slurmstepd: error: Job 30000873 exceeded memory limit (35656317952 > 35651584000), being killed
slurmstepd: error: Exceeded job memory limit
slurmstepd: error: *** JOB 30000873 ON node920 CANCELLED AT 2022-01-20T02:58:01 ***