Training with mp3 data files


Yoav Ramon

May 23, 2019, 4:11:18 AM
to kaldi-help
We have a large dataset of audio files, but the company that stored them encoded all of them as MP3, and we are wondering whether that data is still usable for training an ASR system.
Does anyone have experience training an ASR system on MP3 data, and how much did it degrade performance? Pointers to any articles on the subject would also be welcome.

Thanks,
Yoav

Jan Trmal

May 23, 2019, 4:15:17 AM
to kaldi-help
If you have the chance to train on it and the MP3s were not encoded at a very low bitrate, then it should be fine (I think that for speech at 128 kbit/s or higher you won't see significant degradation, if any).
y.

--
Go to http://kaldi-asr.org/forums.html find out how to join
---
You received this message because you are subscribed to the Google Groups "kaldi-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kaldi-help+...@googlegroups.com.
To post to this group, send email to kaldi...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kaldi-help/96c08b3b-7223-4f87-93e8-ab20d1bdcfa2%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Yoav Ramon

May 23, 2019, 4:28:31 AM
to kaldi-help
Unfortunately, the data is highly compressed: 24 kbps at a 16 kHz sampling rate.


Jan Trmal

May 23, 2019, 4:30:41 AM
to kaldi...@googlegroups.com
Then you will see some impact, but I don't think it will make things completely terrible. I would rather have 500 hours of MP3s than 50 hours of WAVs.

Y.


Jan Trmal

May 23, 2019, 4:42:13 AM
to kaldi...@googlegroups.com
To give you some numbers: I was looking at the performance of some telco codecs and other compression schemes on AMI. I don't have numbers for MP3 in particular, but for Opus at 24 kbit/s the difference is 37.3% vs. 36.5% on clean, and for AMR-WB it's 37.8% vs. 36.5%.
So do not worry. Using MP3 is not uncommon, especially for call centers (I have heard of it several times).
Y.
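
To put the figures quoted above in perspective, here is a minimal Python sketch of the arithmetic. It assumes that 36.5 is the score on uncompressed audio and that higher is worse (for example, an error rate in percent); the message does not state either explicitly, and the function name is only illustrative.

```python
def degradation(baseline, degraded):
    """Return (absolute, relative-percent) increase of `degraded` over `baseline`."""
    absolute = degraded - baseline
    relative = 100.0 * absolute / baseline
    return absolute, relative


# Figures quoted in the thread. Treating 36.5 as the uncompressed
# baseline is an assumption, not something the message confirms.
BASELINE = 36.5
results = {"Opus 24 kbit/s": 37.3, "AMR-WB": 37.8}

for codec, score in results.items():
    absolute, relative = degradation(BASELINE, score)
    print(f"{codec}: +{absolute:.1f} absolute, +{relative:.1f}% relative")
```

Under those assumptions the gap comes to about 0.8 points (roughly 2% relative) for Opus and 1.3 points (roughly 3.6% relative) for AMR-WB.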

Daniel Povey

May 23, 2019, 1:50:50 PM
to kaldi-help
And of course, if you might be testing on data encoded in the same way, training on matched data will help.


Arkadi

Mar 1, 2020, 3:13:52 AM
to kaldi-help
Hi,

I want to combine my MP3 data (32-bit float, 48 kHz, ~200 hours)
with my PCM data (16-bit, 8 kHz, ~100 hours).
First I'll resample the MP3 data to 16-bit, 8 kHz PCM and then combine the two sets.

I'd like to ask for your advice before spending a few days on training:
should it work better?


Arkadi

Mar 1, 2020, 3:16:08 AM
to kaldi-help
* By "better" I mean better than training with only the PCM files. *

Daniel Povey

Mar 1, 2020, 4:19:59 AM
to kaldi-help
That should work.
You don't have to dump to disk as PCM; just find the appropriate sox command to convert the format on the fly.
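
One way to do this is to put a piped sox command in each wav.scp entry, so Kaldi decodes and resamples the MP3 when it reads the file. Below is a minimal Python sketch that generates such entries. The helper names, the example paths, and the choice of the file stem as the recording ID are hypothetical, and it assumes your sox build has MP3 support and that 8 kHz, 16-bit, mono is the target format.

```python
import shlex
from pathlib import Path


def wav_scp_line(rec_id, mp3_path, rate=8000):
    """Build one wav.scp entry that converts an MP3 to WAV on the fly.

    The trailing '|' tells Kaldi to run the command and read its stdout.
    """
    cmd = (
        f"sox {shlex.quote(str(mp3_path))} "
        f"-t wav -r {rate} -b 16 -e signed-integer -c 1 - |"
    )
    return f"{rec_id} {cmd}"


def make_wav_scp(mp3_paths, rate=8000):
    """Return wav.scp lines sorted by recording ID, as Kaldi expects.

    Using the file stem as the recording ID is an assumption made
    for this sketch; use whatever IDs your data directory needs.
    """
    entries = {Path(p).stem: p for p in mp3_paths}
    return [wav_scp_line(rec_id, entries[rec_id], rate) for rec_id in sorted(entries)]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    for line in make_wav_scp(["/data/mp3/call_b.mp3", "/data/mp3/call_a.mp3"]):
        print(line)
```

Each printed line can be written to data/<set>/wav.scp. Nothing is written to disk as PCM; the conversion happens every time the features are extracted, which trades some CPU time for disk space.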

