AI4LAM Speech-to-Text WG, November 25: Community-driven Speech Data with Mozilla Common Voice and Mozilla Data Collective


Owen King

Nov 20, 2025, 11:24:10 AM
to AI4LAM group
The AI4LAM Speech-to-Text Working Group invites you to its next meeting on Tuesday, November 25 at 09:00 US-Pacific | 12:00 US-Eastern | 17:00 UK | 18:00 Central Europe | 04:00 (+1 day) Canberra.

Topic:  Community-driven Speech Data with Mozilla Common Voice and Mozilla Data Collective

The ASR systems we discuss in this group were trained on many thousands of hours of speech data.  Such data have many possible sources.  This week we'll be focusing on community-driven approaches to speech data, with special attention to the Mozilla Common Voice project and the wider Mozilla Data Collective platform.  We'll think about how the LAM community can benefit from these projects and also how we might be able to contribute back to them.  Robert Pugh from Indiana University, who is a Language Community Manager at the Mozilla Data Collective, will be joining us to help us think through these questions!

In advance of the meeting, please take a moment to brainstorm:  What kinds of datasets might benefit the projects you work on?  Do you steward audio collections that could be contributed to a shared dataset?

Note that, because the general AI4LAM Community Call is a week later than usual this month, our Speech-to-Text call will take place immediately following the general call.  Please join us for both!


Cheers,
Owen
(on behalf of the Speech-to-Text WG conveners)

Owen King (he/him)
Metadata Operations Manager
E: owen...@wgbh.org
One Guest Street, Boston, MA 02135
