mfa align --speaker_characters results in KeyError

Thea Knowles

Aug 3, 2023, 2:32:13 PM

to MFA Users

Note: I also posted this as an issue on GitHub (https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner/issues/669). Any thoughts are appreciated, and I'm also wondering whether others are running into this. Thanks in advance for any input!

When running mfa align with the --speaker_characters (-s) flag for speaker adaptation, alignment fails with a KeyError whose key is the ID of a single speaker. The same error occurs when -s is passed to other mfa commands (validate, train, etc.).

I have tested this with several different corpora, with several different naming conventions (a directory for each speaker, speaker IDs as the initial 4 characters of the file name, speaker IDs also used as the transcript TextGrid tier names in the input), and on 3 different machines and macOS versions (one older OS X machine and two newer M1 and M2 machines; details are in the GitHub issue). Without the -s flag, all mfa commands run as expected and I'm able to validate/align my data as usual.

I'm using the most recent MFA version (2.2.15), Python 3.11.4, and always run with the --clean flag.

Here's a recent example of my output when running mfa align. The same errors occur with mfa validate etc. For context, my filenames in this run look something like this: oc01_01.wav, where oc01 is a speaker ID followed by other info. The KeyError at the end of the output is oc01.
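To make the naming convention concrete, here is a minimal Python sketch of what "the first 4 characters are the speaker ID" means for file names like oc01_01.wav. It only illustrates the convention described above and is not MFA's implementation; the helper names `speaker_from_filename` and `group_by_speaker`, and the file names other than oc01_01.wav, are made up for the example.

```python
from collections import defaultdict
from pathlib import Path


def speaker_from_filename(filename: str, n_chars: int) -> str:
    """Take the first n_chars characters of the file stem as the speaker ID."""
    return Path(filename).stem[:n_chars]


def group_by_speaker(filenames, n_chars: int) -> dict:
    """Group file names under the speaker ID derived from each name."""
    groups = defaultdict(list)
    for name in filenames:
        groups[speaker_from_filename(name, n_chars)].append(name)
    return dict(groups)


# Hypothetical file list; only oc01_01.wav comes from the post.
files = ["oc01_01.wav", "oc01_02.wav", "oc02_01.wav"]
groups = group_by_speaker(files, 4)
print(groups)
```

With `-s 4`, the expectation is that all files starting with oc01 end up under one speaker, which is the key that shows up in the KeyError.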

Command & Output:

(aligner) thea ~ % mfa align --clean -s 4 /Users/thea/Documents/z_test/0_test english_us_arpa english_us_arpa /Users/thea/Documents/z_test/2_output_aligned_test

INFO Setting up corpus information...

INFO Loading corpus from source files...

1% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1/100 [ 0:00:02 < -:--:-- , ? it/s ]

INFO Stopped parsing early (0.06491400000000169 seconds)

ERROR There was an error in the run, please see the log.

Exception ignored in atexit callback: <bound method ExitHooks.history_save_handler of <montreal_forced_aligner.command_line.mfa.ExitHooks object at 0x189e80e50>>

Traceback (most recent call last):
  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/montreal_forced_aligner/command_line/mfa.py", line 98, in history_save_handler
    raise self.exception
  File "/Users/thea/miniconda3/envs/aligner/bin/mfa", line 10, in <module>
    sys.exit(mfa_cli())
             ^^^^^^^^^
  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/rich_click/rich_group.py", line 21, in main
    rv = super().main(*args, standalone_mode=False, **kwargs)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/montreal_forced_aligner/command_line/align.py", line 113, in align_corpus_cli
    aligner.align()
  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/montreal_forced_aligner/alignment/pretrained.py", line 412, in align
    self.setup()
  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/montreal_forced_aligner/alignment/pretrained.py", line 205, in setup
    self.load_corpus()
  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/montreal_forced_aligner/corpus/acoustic_corpus.py", line 1209, in load_corpus
    self._load_corpus()
  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/montreal_forced_aligner/corpus/base.py", line 1288, in _load_corpus
    self._load_corpus_from_source_mp()
  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/montreal_forced_aligner/corpus/acoustic_corpus.py", line 1023, in _load_corpus_from_source_mp
    import_data.add_objects(self.generate_import_objects(file))
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/montreal_forced_aligner/corpus/base.py", line 1018, in generate_import_objects
    "speaker_id": self._speaker_ids[u.speaker_name],
                  ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^
KeyError: 'oc01'

Thea Knowles

Aug 8, 2023, 4:09:29 PM

to MFA Users

Update: while -s still doesn't work as expected, I did figure out a workaround for correctly triggering speaker adaptation (which is on by default): make sure that the tiers containing your transcripts are named with the appropriate speaker IDs. See the resolution and a Praat script for editing your TextGrids at the original issue: https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner/issues/669#issuecomment-1668621720
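For anyone who would rather not use Praat, the tier renaming can be roughly sketched in Python as below. This is not the script from the GitHub issue. It assumes a long-format (text) TextGrid already read into a string, assumes every tier in the file belongs to the same speaker, and will also touch any label line that happens to start with `name = "..."`. The function name `rename_tiers` is made up for this example.

```python
import re

# Matches a tier-name line in a long-format TextGrid, e.g.:   name = "transcript"
TIER_NAME = re.compile(r'^([ \t]*name[ \t]*=[ \t]*)"[^"]*"', re.MULTILINE)


def rename_tiers(textgrid_text: str, speaker_id: str) -> str:
    """Return the TextGrid text with every tier name replaced by speaker_id."""
    # A function is used as the replacement so speaker_id is inserted literally.
    return TIER_NAME.sub(lambda m: f'{m.group(1)}"{speaker_id}"', textgrid_text)


# Fragment of a long-format TextGrid, shortened for illustration.
sample = '''        class = "IntervalTier"
        name = "transcript"
        xmin = 0
        intervals [1]:
            text = "hello world"
'''
renamed = rename_tiers(sample, "oc01")
print(renamed)
```

To apply it to real files you would read each TextGrid, call `rename_tiers` with the speaker ID for that file, and write the result back. Check the file encoding first, since Praat does not always save TextGrids as UTF-8, and keep a backup of the originals.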

FYI, I think this post is also related to the speaker adaptation issue (namely, your tiers need to be labelled with speaker IDs; directories and file names are not enough): https://groups.google.com/g/mfa-users/c/dXAgCXF0eeI/m/oVQXqg1LCAAJ
