Running mfa validate -j 36 only uses one CPU

David Lukeš

Dec 2, 2021, 9:42:51 AM
to MFA Users

Is this expected behavior? Maybe it runs more jobs in parallel for brief stints that I never notice, but most of the time (and I keep checking regularly), there’s just one CPU running at full throttle.

It’s been a few months since I last used MFA but I don’t remember validation taking such a long time (when ignoring acoustics, which I am).

I’m using 2.0.0b7 on Linux.

Thanks for any advice, including “this is how it’s supposed to work” — at least I’ll know I’m not doing anything wrong :)

Best,

David

David Lukeš

Dec 2, 2021, 5:09:38 PM
to MFA Users
A few more details: the behavior described pertains to the phase after
“Setting up training data” is logged. I see the same behavior when
running mfa train -j 36.

When I kill the process, there’s no remote traceback, just a regular
one, which suggests multiprocessing is not being used at that point.
Anyway, here it is:

Cleaning old directory!
INFO - Setting up corpus information...
INFO - Number of speakers in corpus: 592, average number of utterances per speaker: 186.69932432432432
INFO - Parsing dictionary "mfa_dict" without pronunciation probabilities without silence probabilities
INFO - Creating dictionary information...
INFO - Setting up training data...
^CTraceback (most recent call last):
  File "/home/lukes/miniconda3/envs/aligner/bin/mfa", line 11, in <module>
    sys.exit(main())
  File "/home/lukes/miniconda3/envs/aligner/lib/python3.9/site-packages/montreal_forced_aligner/command_line/mfa.py", line 797, in main
    run_train_acoustic_model(args, unknown)
  File "/home/lukes/miniconda3/envs/aligner/lib/python3.9/site-packages/montreal_forced_aligner/command_line/train_acoustic_model.py", line 251, in run_train_acoustic_model
    train_acoustic_model(args, unknown_args)
  File "/home/lukes/miniconda3/envs/aligner/lib/python3.9/site-packages/montreal_forced_aligner/command_line/train_acoustic_model.py", line 156, in train_acoustic_model
    a = TrainableAligner(
  File "/home/lukes/miniconda3/envs/aligner/lib/python3.9/site-packages/montreal_forced_aligner/aligner/trainable.py", line 65, in __init__
    super(TrainableAligner, self).__init__(
  File "/home/lukes/miniconda3/envs/aligner/lib/python3.9/site-packages/montreal_forced_aligner/aligner/base.py", line 92, in __init__
    self.setup()
  File "/home/lukes/miniconda3/envs/aligner/lib/python3.9/site-packages/montreal_forced_aligner/aligner/base.py", line 100, in setup
    self.corpus.initialize_corpus(self.dictionary, self.align_config.feature_config)
  File "/home/lukes/miniconda3/envs/aligner/lib/python3.9/site-packages/montreal_forced_aligner/corpus/base.py", line 725, in initialize_corpus
    self.split()
  File "/home/lukes/miniconda3/envs/aligner/lib/python3.9/site-packages/montreal_forced_aligner/corpus/base.py", line 942, in split
    job.output_to_directory(split_dir)
  File "/home/lukes/miniconda3/envs/aligner/lib/python3.9/site-packages/montreal_forced_aligner/multiprocessing/classes.py", line 2457, in output_to_directory
    text_int = self.text_int_scp_data()
  File "/home/lukes/miniconda3/envs/aligner/lib/python3.9/site-packages/montreal_forced_aligner/multiprocessing/classes.py", line 672, in text_int_scp_data
    data[key][u.name] = " ".join(map(str, u.text_int_for_scp()))
  File "/home/lukes/miniconda3/envs/aligner/lib/python3.9/site-packages/montreal_forced_aligner/corpus/classes.py", line 955, in text_int_for_scp
    lookup = self.speaker.dictionary.to_int(t)
  File "/home/lukes/miniconda3/envs/aligner/lib/python3.9/site-packages/montreal_forced_aligner/dictionary/base_dictionary.py", line 391, in to_int
    return self.data().to_int(item)
  File "/home/lukes/miniconda3/envs/aligner/lib/python3.9/site-packages/montreal_forced_aligner/dictionary/base_dictionary.py", line 216, in data
    reversed_word_mapping = self.reversed_word_mapping
  File "/home/lukes/miniconda3/envs/aligner/lib/python3.9/site-packages/montreal_forced_aligner/dictionary/base_dictionary.py", line 432, in reversed_word_mapping
    for k, v in self.words_mapping.items():
KeyboardInterrupt

I first pip installed 2.0.0b7 (upgrading my previous pip install),
then (noticing the new installation recommendations) switched to the
conda-forge MFA package in the existing conda env, then tried a fresh
conda env with conda-forge, which the current docs recommend as the
optimal way. The results were the same in all three cases.

I then noticed in the changelog that 2.0.0b4 did a “massive refactor”,
including a Job class “to make it easier to generate and keep track of
information about different processes”. After poking around in the
source for a bit, I think that class is involved in the traceback above
(sorry, I don’t have much time right now to investigate further).
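
One possible reading of the traceback above, offered as a guess rather than a confirmed diagnosis: each to_int call goes through data(), which reads reversed_word_mapping, which in turn loops over words_mapping.items(). If that reverse mapping is rebuilt on every lookup, a single-process loop over all utterances would spend most of its time there. The sketch below uses made-up classes (RebuildingDictionary and CachingDictionary are hypothetical and not part of MFA) only to illustrate the difference between rebuilding and caching such a mapping.

```python
from functools import cached_property


class RebuildingDictionary:
    """Hypothetical stand-in (not MFA code): rebuilds the reverse mapping on every access."""

    def __init__(self, words_mapping):
        self.words_mapping = words_mapping
        self.rebuilds = 0  # counts how often the reverse mapping is constructed

    @property
    def reversed_word_mapping(self):
        self.rebuilds += 1
        # O(vocabulary size) work on each access
        return {v: k for k, v in self.words_mapping.items()}

    def to_int(self, word):
        # Assumption: each lookup touches the reverse mapping, as the
        # traceback's to_int -> data -> reversed_word_mapping chain suggests.
        _ = self.reversed_word_mapping
        return self.words_mapping[word]


class CachingDictionary(RebuildingDictionary):
    """Same lookups, but the reverse mapping is built once per instance."""

    @cached_property
    def reversed_word_mapping(self):
        self.rebuilds += 1
        return {v: k for k, v in self.words_mapping.items()}


if __name__ == "__main__":
    mapping = {f"w{i}": i for i in range(50_000)}
    for cls in (RebuildingDictionary, CachingDictionary):
        d = cls(mapping)
        for word in ("w1", "w2", "w3"):
            d.to_int(word)
        print(cls.__name__, "rebuilds:", d.rebuilds)
```

With a large vocabulary and many utterances, the rebuilding variant does vocabulary-sized work per word, all in the main process, which would be consistent with one CPU running at full throttle. Whether this is what actually happens in 2.0.0b7 would need to be checked against the MFA source.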

So I tried a fresh conda env with a pip-installed 2.0.0b3 (the last
version prior to the refactor), and that one does run multiple jobs in
parallel on my machine, which of course makes a big difference. It
looks like training on 2.0.0b3 might finish faster than validation
(without acoustic features) did on 2.0.0b7, which took a few hours.

So anyway — just a heads up to fellow users who might have been
experiencing a much slower MFA than they’re used to: maybe you’re
running into this issue as well :)

Best,

David