mfa align --speaker_characters results in KeyError

Thea Knowles

Aug 3, 2023, 2:32:13 PM
to MFA Users
Note: I also posted this as an issue on GitHub here. Any thoughts are appreciated, and I'm also wondering whether others are running into this. Thanks in advance for any input!

When running mfa align with the --speaker_characters (-s) flag for speaker adaptation, alignment fails with a KeyError. The key reported at the end of the traceback is the ID of a single speaker. The same error also occurs with other mfa commands (validate, train, etc.).

I have tested this with several different corpora and several different naming conventions (a directory for each speaker, speaker IDs as the initial 4 characters of the file name, speaker IDs also used as the transcript TextGrid tier names in the input) and on 3 different machines/macOS versions (one older OS X machine and two newer ones, M1 and M2; details are in the GitHub issue). All mfa commands run as expected without the -s flag, and I'm able to validate/align my data as usual.

I'm using the most recent MFA version (2.2.15), Python 3.11.4, and always run with the --clean flag.

Here's a recent example of my output when running mfa align. The same errors occur with mfa validate etc. For context, my filenames in this run look something like this: oc01_01.wav, where oc01 is a speaker ID followed by other info. The KeyError at the end of the output is oc01.
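
To make the naming convention concrete, here is a minimal Python sketch of what I would expect -s 4 to do with these file names: take the first 4 characters as the speaker ID. This is only an illustration of the intended behavior, not MFA's actual code, and the function name speaker_from_filename is made up for this example.

```python
from pathlib import Path


def speaker_from_filename(filename: str, speaker_characters: int) -> str:
    """Return the first `speaker_characters` characters of the file's base name.

    Hypothetical helper for illustration only; MFA's real logic may differ.
    """
    stem = Path(filename).stem  # "oc01_01.wav" -> "oc01_01"
    return stem[:speaker_characters]


speaker_id = speaker_from_filename("oc01_01.wav", 4)
print(speaker_id)
```

With the file name above, the derived ID is the same string that shows up in the KeyError.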

Command & Output:

(aligner) thea ~ % mfa align --clean -s 4 /Users/thea/Documents/z_test/0_test english_us_arpa english_us_arpa /Users/thea/Documents/z_test/2_output_aligned_test    

 INFO     Setting up corpus information...                                      

 INFO     Loading corpus from source files...                                   

   1% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1/100  [ 0:00:02 < -:--:-- , ? it/s ]

 INFO     Stopped parsing early (0.06491400000000169 seconds)                   

 ERROR    There was an error in the run, please see the log.                    

Exception ignored in atexit callback: <bound method ExitHooks.history_save_handler of <montreal_forced_aligner.command_line.mfa.ExitHooks object at 0x189e80e50>>

Traceback (most recent call last):

  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/montreal_forced_aligner/command_line/mfa.py", line 98, in history_save_handler

    raise self.exception

  File "/Users/thea/miniconda3/envs/aligner/bin/mfa", line 10, in <module>

    sys.exit(mfa_cli())

             ^^^^^^^^^

  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/click/core.py", line 1157, in __call__

    return self.main(*args, **kwargs)

           ^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/rich_click/rich_group.py", line 21, in main

    rv = super().main(*args, standalone_mode=False, **kwargs)

         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/click/core.py", line 1078, in main

    rv = self.invoke(ctx)

         ^^^^^^^^^^^^^^^^

  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/click/core.py", line 1688, in invoke

    return _process_result(sub_ctx.command.invoke(sub_ctx))

                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/click/core.py", line 1434, in invoke

    return ctx.invoke(self.callback, **ctx.params)

           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/click/core.py", line 783, in invoke

    return __callback(*args, **kwargs)

           ^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/click/decorators.py", line 33, in new_func

    return f(get_current_context(), *args, **kwargs)

           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/montreal_forced_aligner/command_line/align.py", line 113, in align_corpus_cli

    aligner.align()

  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/montreal_forced_aligner/alignment/pretrained.py", line 412, in align

    self.setup()

  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/montreal_forced_aligner/alignment/pretrained.py", line 205, in setup

    self.load_corpus()

  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/montreal_forced_aligner/corpus/acoustic_corpus.py", line 1209, in load_corpus

    self._load_corpus()

  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/montreal_forced_aligner/corpus/base.py", line 1288, in _load_corpus

    self._load_corpus_from_source_mp()

  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/montreal_forced_aligner/corpus/acoustic_corpus.py", line 1023, in _load_corpus_from_source_mp

    import_data.add_objects(self.generate_import_objects(file))

                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/montreal_forced_aligner/corpus/base.py", line 1018, in generate_import_objects

    "speaker_id": self._speaker_ids[u.speaker_name],

                  ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^

KeyError: 'oc01'

Thea Knowles

Aug 8, 2023, 4:09:29 PM
to MFA Users
Update: while -s still doesn't work as expected, I did figure out a workaround for correctly triggering speaker adaptation (which is on by default): make sure that the tiers containing your transcripts are named with the appropriate speaker IDs. See the resolution and a Praat script for editing your TextGrids at the original issue: https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner/issues/669#issuecomment-1668621720
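
For anyone who would rather not use Praat, the same tier renaming can be sketched in plain Python. This is a rough sketch, not the script from the issue: it assumes long-format (non-short) TextGrid text, one speaker per file, and that every tier in the file should get that speaker's ID. The name rename_tiers is made up for this example.

```python
import re

# Matches tier-name lines in a long-format TextGrid, e.g.:    name = "transcript"
# Assumption: interval labels use `text = "..."`, so they are not matched here.
TIER_NAME = re.compile(r'^(\s*name = ")[^"]*(")', flags=re.MULTILINE)


def rename_tiers(textgrid_text: str, speaker_id: str) -> str:
    """Replace every tier name in the TextGrid text with `speaker_id`."""
    return TIER_NAME.sub(
        lambda m: m.group(1) + speaker_id + m.group(2), textgrid_text
    )


# Small hand-written fragment of a long-format TextGrid for demonstration.
sample = '''    item [1]:
        class = "IntervalTier"
        name = "transcript"
        xmin = 0
        xmax = 1.5
        intervals: size = 1
        intervals [1]:
            xmin = 0
            xmax = 1.5
            text = "hello world"
'''

renamed = rename_tiers(sample, "oc01")
print(renamed)
```

To apply it to real files, you would read each TextGrid as text, derive the speaker ID from the file name (the first 4 characters in my case), and write the result back out; please test on a copy of your corpus first.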

FYI, I think this post is also related to the speaker adaptation issue (namely, you need your tiers labelled; directories/file names are not enough): https://groups.google.com/g/mfa-users/c/dXAgCXF0eeI/m/oVQXqg1LCAAJ