Training TDNN chain model on 25,000 hours of data

Shruthi BS

Sep 15, 2025, 6:50:56 PM

to kaldi-help

Dear All,

We are training a TDNN chain model on around 25k hours of People Speech English data. Our hardware is 3 L40S GPUs with 46 GB each, 756 GB RAM, and 128 cores.

We are running out of memory during fMLLR alignment in the chain/run_ivector_common.sh script; the terminal just shows "Aborted". We have reduced nj to 50, but the issue persists.

What would be the ideal way forward? Please suggest.

Also, please suggest ways to monitor memory usage for big datasets.

Thanks,

Shruthi

Jan Yenda Trmal

Sep 17, 2025, 3:39:07 AM

to kaldi...@googlegroups.com

Hi,

Are you using run.pl? I suggest setting up Slurm, even on a single machine. run.pl does not do any resource management.

And you will have to investigate logs for specific errors.

y.
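
As an illustration of the log check suggested above, here is a minimal Python sketch that walks a log directory and prints lines that look like failures. It is not part of Kaldi: the function name `scan_logs` and the marker strings in `PATTERNS` are assumptions chosen for illustration, so adjust them to whatever your failing jobs actually write.

```python
import re
from pathlib import Path

# Assumed failure markers (not an official Kaldi list); adjust as needed.
PATTERNS = re.compile(r"ERROR|bad_alloc|Cannot allocate memory|Killed|Aborted")


def scan_logs(log_dir):
    """Return (file name, line number, text) for each suspicious log line."""
    hits = []
    # Visit every *.log file under log_dir, in a stable order.
    for path in sorted(Path(log_dir).rglob("*.log")):
        with open(path, errors="replace") as f:
            for lineno, line in enumerate(f, start=1):
                if PATTERNS.search(line):
                    hits.append((path.name, lineno, line.rstrip()))
    return hits


# Usage (the directory is a placeholder for your own log directory):
# for name, lineno, text in scan_logs("path/to/log"):
#     print(f"{name}:{lineno}: {text}")
```

If the only hit is a bare "Killed" or "Aborted" with no message from the program itself, that may point to the process being terminated from outside (for example by the kernel when memory runs out) rather than exiting on its own error.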

--
Go to http://kaldi-asr.org/forums.html to find out how to join the kaldi-help group
---
You received this message because you are subscribed to the Google Groups "kaldi-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kaldi-help+...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/kaldi-help/0291671b-7e4c-4aec-8d25-0d1d61d7abdcn%40googlegroups.com.

Shruthi BS

Sep 17, 2025, 4:23:12 AM

to kaldi...@googlegroups.com

Thank you for the reply. Yes, we are using run.pl, and it is on a single machine. Is just using slurm.pl in cmd.sh sufficient?

Regards,
Shruthi

Jan Yenda Trmal

Sep 17, 2025, 4:30:22 AM

to kaldi...@googlegroups.com

Unfortunately, no -- Slurm is a software package you have to install and configure (https://slurm.schedmd.com/).

y.
