Word timings - is there a way of knowing how long the gaps between words are?


Mark Boas

Mar 28, 2019, 14:07:37
– aeneas-forced-alignment
Hi All! 👋

I'm new to this forum, but have been following progress on Aeneas for a while now.

I think it's a great tool and this looks like a great community :)

We're thinking about using Aeneas in a number of projects, which is exciting.

We will need word-level timings, so I'm using version 1.7.3. I split the text file I'm submitting so that each word is on its own line, and I'm passing the --presets-word flag to get more accurate timings.

If I'm understanding the docs correctly, adding --presets-word should handle the MFCC nonspeech masking for me. Perhaps I'm misunderstanding the definition of a multiline text format, or I need to fine-tune the parameters manually.

My text file looks like this:

I'm
used
to
the
idea
of
dying
while
I
have
no
desire
to
die
for
the
like's
of
you.
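
For reference, here is a minimal sketch of how a one-word-per-line file like this could be produced from a plain sentence. The function name `words_per_line` is hypothetical, and splitting on whitespace is my assumption; it leaves punctuation attached to the words, as in the list above.

```python
def words_per_line(text):
    """Return the text with one whitespace-delimited word per line."""
    # str.split() with no arguments splits on any run of whitespace
    # and drops empty strings, so extra spaces or newlines are harmless.
    return "\n".join(text.split())


if __name__ == "__main__":
    sentence = "I'm used to the idea of dying"
    print(words_per_line(sentence))
```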

And I'm trying to align with this file: http://www.barbneal.com/wp-content/uploads/kirk07.mp3

Basically I run something like:

python -m aeneas.tools.execute_task \
    kirk07.mp3 \
    kirk07.txt \
    "task_language=eng|os_task_file_format=json|is_text_type=plain" \
    kirk07map.json --presets-word


The only issue is that I'm not seeing gaps between words in the JSON timings that come out, i.e. the end time of each word matches the start time of the next word. This is especially problematic when there are large pauses in the speech.

Here's an example of the output I get:

  {
   "begin": "1.240",
   "children": [],
   "end": "1.320",
   "id": "f000007",
   "language": "eng",
   "lines": [
    "dying"
   ]
  },
  {
   "begin": "1.320",
   "children": [],
   "end": "1.800",
   "id": "f000008",
   "language": "eng",
   "lines": [
    "while"
   ]
  },
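
To make the problem concrete, here is a rough Python sketch that measures the gap between consecutive fragments. It assumes the fragments sit in a top-level "fragments" list in the JSON file; the helper name `fragment_gaps` is made up for illustration, and the sample data is just the two fragments shown above.

```python
import json
from decimal import Decimal


def fragment_gaps(sync_map):
    """Return (previous word, next word, gap in seconds) for each adjacent pair."""
    fragments = sync_map["fragments"]  # assumed top-level key
    gaps = []
    for prev, nxt in zip(fragments, fragments[1:]):
        # Times are stored as strings; Decimal avoids float rounding noise.
        gap = Decimal(nxt["begin"]) - Decimal(prev["end"])
        gaps.append((prev["lines"][0], nxt["lines"][0], gap))
    return gaps


sample = json.loads("""
{"fragments": [
  {"begin": "1.240", "children": [], "end": "1.320",
   "id": "f000007", "language": "eng", "lines": ["dying"]},
  {"begin": "1.320", "children": [], "end": "1.800",
   "id": "f000008", "language": "eng", "lines": ["while"]}
]}
""")

for left, right, gap in fragment_gaps(sample):
    print(f"{left} -> {right}: {gap}s")
```

For the two fragments above this reports a gap of zero between "dying" and "while", which is exactly what I'm seeing for every pair of words.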


There's a significant pause between "dying" and "while" that is not reflected in the output. In fact, "while" occurs sometime after the 4-second mark, although I wonder if that is a separate issue.

I've looked back through some of the discussions in this forum, but to no avail. Any suggestions would be gratefully received.

Thanks in advance!

Mark

