Hi,
I'm running populations (2.41) and trying to write output in --phylip
format. I have 47 individuals, and my population map puts each one in
its own population (47 populations in total). The dataset is not that
big (catalog.calls is 79M and catalog.fa.gz is 6.9MB). However,
populations dies after less than a minute. It seems to be a memory
issue: I can see memory usage climb steadily to >14GB before the
process is killed. It's odd that it would need that much memory, given
that the sample size and data are not that big. I suspect this has to
do with defining 47 populations for 47 individuals, but why would that
be invalid or consume that much memory? (I want to get one sequence
per individual.)
Thanks for your help,
sebastien
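P.S. For reference, here is a minimal sketch of how a population map like mine (one population per individual) could be generated. It assumes the usual tab-separated "sample<TAB>population" layout and that each population label is just the sample name without the "_sorted" suffix, as in the log below. The function name build_popmap is made up for illustration.

```python
# Sketch only: build a population map that places every sample in its own
# population. The "_sorted" suffix and the tab-separated layout are
# assumptions based on the sample/population names shown in the log.

def build_popmap(sample_names, suffix="_sorted"):
    """Return popmap text with one 'sample<TAB>population' line per sample."""
    lines = []
    for sample in sample_names:
        # Use the sample name without the suffix as its population label.
        if suffix and sample.endswith(suffix):
            population = sample[:-len(suffix)]
        else:
            population = sample
        lines.append(f"{sample}\t{population}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Three of the 47 samples, for illustration only.
    print(build_popmap(["Alyssa_sorted", "B08_sorted", "CAN100_01_sorted"]), end="")
```

Writing the returned text to the file passed to -M (pop_labels in my case) should give 47 samples in 47 populations, which matches what populations reports when it parses the map.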
$ populations -P . -t 2 -M pop_labels --phylip
populations -P . -t 2 -M pop_labels --phylip
Logging to './populations.log'.
Locus/sample distributions will be written to './populations.log.distribs'.
populations parameters selected:
Percent samples limit per population: 0
Locus Population limit: 1
Percent samples overall: 0
Minor allele frequency cutoff: 0
Maximum observed heterozygosity cutoff: 1
Applying Fst correction: none.
Pi/Fis kernel smoothing: off
Fstats kernel smoothing: off
Bootstrap resampling: off
Parsing population map...
The population map contained 47 samples, 47 population(s), 1 group(s).
Working on 47 samples.
Working on 47 population(s):
Alyssa: Alyssa_sorted
B08: B08_sorted
Bialobrzeskie: Bialobrzeskie_sorted
C11: C11_sorted
CAN100_01: CAN100_01_sorted
CAN16_94: CAN16_94_sorted
CAN17_95: CAN17_95_sorted
CAN18_95: CAN18_95_sorted
CAN19_87: CAN19_87_sorted
CAN20_02: CAN20_02_sorted
CAN22_88: CAN22_88_sorted
CAN23_99: CAN23_99_sorted
CAN24_89: CAN24_89_sorted
CAN26_93: CAN26_93_sorted
CAN28_01: CAN28_01_sorted
CAN29_94: CAN29_94_sorted
CAN37_97: CAN37_97_sorted
CAN39_98: CAN39_98_sorted
CAN40_99: CAN40_99_sorted
Carmagnola: Carmagnola_sorted
Carmen: Carmen_sorted
Chameleon: Chameleon_sorted
D12: D12_sorted
Delores: Delores_sorted
E11: E11_sorted
F01: F01_sorted
Fedora17: Fedora17_sorted
Fedora19: Fedora19_sorted
Fedrina: Fedrina_sorted
Felina: Felina_sorted
Ferimon: Ferimon_sorted
Futura77: Futura77_sorted
Jus: Jus_sorted
K110: K110_sorted
Kompolti: Kompolti_sorted
LKCSD: LKCSD_sorted
Novosadska: Novosadska_sorted
Petera: Petera_sorted
Silesia: Silesia_sorted
Suditalien: Suditalien_sorted
Tygra: Tygra_sorted
Uniko: Uniko_sorted
VIR541: VIR541_sorted
VIR569: VIR569_sorted
VIR575: VIR575_sorted
VIR577: VIR577_sorted
Zolotonsha15: Zolotonsha15_sorted
Working on 1 group(s) of populations:
defaultgrp: Alyssa, B08, Bialobrzeskie, C11, CAN100_01, CAN16_94,
CAN17_95, CAN18_95, CAN19_87, CAN20_02, CAN22_88, CAN23_99, CAN24_89,
CAN26_93, CAN28_01, CAN29_94, CAN37_97, CAN39_98, CAN40_99, Carmagnola,
Carmen, Chameleon, D12, Delores, E11, F01, Fedora17, Fedora19, Fedrina,
Felina, Ferimon, Futura77, Jus, K110, Kompolti, LKCSD, Novosadska,
Petera, Silesia, Suditalien, Tygra, Uniko, VIR541, VIR569, VIR575,
VIR577, Zolotonsha15
Fixed difference sites in Phylip format will be written to './populations.fixed.phylip'
Genotyping markers will be written to './populations.markers.tsv'
Raw Genotypes/Haplotypes will be written to './populations.haplotypes.tsv'
Population-level summary statistics will be written to './populations.sumstats.tsv'
Population-level haplotype summary statistics will be written to './populations.hapstats.tsv'
Processing data in batches:
* load a batch of catalog loci and apply filters
* compute SNP- and haplotype-wise per-population statistics
* write the above statistics in the output files
* export the genotypes/haplotypes in specified format(s)
More details in './populations.log.distribs'.
Now processing...
Killed
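
In case it helps with reproducing this, below is a minimal sketch of how the peak memory of the run could be recorded instead of watching it by eye. It assumes a Linux machine, where ru_maxrss is reported in kilobytes (other systems use different units), and the helper name run_and_report_peak is made up for illustration. A negative exit code means the process was terminated by a signal; -9 would be consistent with the "Killed" message above, although I have not confirmed what killed it.

```python
# Sketch only: run a command and report the peak resident memory of its
# child process. Assumes Linux, where ru_maxrss is given in kilobytes.
import resource
import subprocess
import sys


def run_and_report_peak(cmd):
    """Run cmd and return (exit code, peak RSS of waited-for children)."""
    completed = subprocess.run(cmd)
    # RUSAGE_CHILDREN covers child processes that have finished and been waited for.
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return completed.returncode, peak


if __name__ == "__main__":
    cmd = sys.argv[1:]
    if not cmd:
        print("usage: python peak_mem.py <command> [args...]")
    else:
        code, peak_kb = run_and_report_peak(cmd)
        # A negative code means the child was terminated by that signal number.
        print(f"exit code: {code}, peak RSS: {peak_kb / 1024 ** 2:.2f} GB")
```

Saved under a hypothetical name such as peak_mem.py, it could be invoked as: python peak_mem.py populations -P . -t 2 -M pop_labels --phylip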