Problematic files in Spear...

Marc Ferras

Sep 28, 2016, 8:24:13 AM
to bob-devel
Hello,

I am running speaker verification experiments using bob.bio.spear. I
would like to know how Bob handles empty or corrupted audio files
through the processing chain in verify.py. So far I have noticed that
the jobs involved stopped after finding one of these files. Is there a
"best" way of handling these? What about scoring and performance
evaluation? Are dummy scores used for trials that use these files?

Thanks,

- Marc

Tiago Freitas Pereira

Sep 28, 2016, 8:43:27 AM
to bob-...@googlegroups.com
Hey Marc,

The bob.bio framework does not handle these cases; you must provide "good" files for the execution chain.

One thing you can do is raise an exception, or use the logger to print the names of the corrupted files, in the load function of your database package.
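
A minimal sketch of that idea, assuming WAV input and using only the Python standard library. The helper name `load_audio_checked` and its `raise_on_error` switch are hypothetical and not part of any bob package; the same checks would go inside whatever load function your database package defines.

```python
import logging
import os
import wave

logger = logging.getLogger("corrupted_audio_check")


def load_audio_checked(path, raise_on_error=True):
    """Hypothetical helper: return (rate, raw_frames) or flag a bad file."""
    try:
        # An empty file cannot contain a valid WAV header.
        if os.path.getsize(path) == 0:
            raise ValueError("file is empty")
        # wave.open raises wave.Error or EOFError on a broken header.
        with wave.open(path, "rb") as wav:
            n_frames = wav.getnframes()
            if n_frames == 0:
                raise ValueError("file contains no audio frames")
            return wav.getframerate(), wav.readframes(n_frames)
    except (OSError, EOFError, wave.Error, ValueError) as exc:
        # Log the file name so problematic files can be listed afterwards.
        logger.error("Problematic audio file %s: %s", path, exc)
        if raise_on_error:
            raise
        return None
```

With `raise_on_error=False` the helper only logs and returns `None`, which makes it easy to scan a whole database once and collect the names of all bad files before launching an experiment.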

Cheers



--
-- You received this message because you are subscribed to the Google Groups bob-devel group. To post to this group, send email to bob-...@googlegroups.com. To unsubscribe from this group, send email to bob-devel+unsubscribe@googlegroups.com. For more options, visit this group at https://groups.google.com/d/forum/bob-devel or directly the project website at http://idiap.github.com/bob/
--- You received this message because you are subscribed to the Google Groups "bob-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bob-devel+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Tiago

Pavel Korshunov

Sep 28, 2016, 8:43:42 AM
to bob-...@googlegroups.com
Hi Marc,

Corrupted or empty files are handled by the preprocessor. You should check what the preprocessor does when it cannot process a file. If it fails and you do not want processing to stop, you can extend the preprocessor and handle the errors yourself. One possible solution is for the preprocessor to return a fixed-size numpy array filled with zeros. The extractor must then deal with such an array correctly when it encounters one. For instance, the extractor could return a zero array of the size expected by the classifier. That would mean, however, that the classifier may receive an unknown number of zeroed arrays, so you should make sure that is acceptable for you.
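
The zero-array fallback could be sketched roughly as follows. This is only an illustration with plain numpy: the wrapper functions, `PLACEHOLDER_SAMPLES` and `FEATURE_DIM` are hypothetical names and values, and the `preprocess` and `extract` callables stand in for whatever your real preprocessor and extractor do.

```python
import numpy as np

PLACEHOLDER_SAMPLES = 16000  # hypothetical fixed length of the placeholder signal
FEATURE_DIM = 20             # hypothetical feature size expected by the classifier


def preprocess_or_zeros(preprocess, signal):
    """Run `preprocess`; fall back to a fixed-size zero array on failure."""
    try:
        data = preprocess(signal)
    except Exception:
        data = None
    if data is None or np.size(data) == 0:
        return np.zeros(PLACEHOLDER_SAMPLES)
    return np.asarray(data, dtype=float)


def extract_or_zeros(extract, data):
    """Skip real extraction for placeholder input and return a zero feature."""
    if data.shape == (PLACEHOLDER_SAMPLES,) and not data.any():
        return np.zeros(FEATURE_DIM)
    return extract(data)
```

Note that a genuinely silent recording of exactly `PLACEHOLDER_SAMPLES` samples would be treated as a placeholder too, so this detection rule is an assumption you may want to replace with something stricter.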

In any case, it's not a problem of the framework itself but of the components that it calls.

cheers,
-pavel 

--
Dr. Pavel Korshunov
Biometric group
Idiap Research Institute
Rue Marconi 19
CH - 1920 Martigny
Switzerland

Room: 207

Marc Ferras

Sep 28, 2016, 8:46:33 AM
to bob-...@googlegroups.com
Thanks guys for the help. I think the easiest thing to do is to handle this in the load function.

Best,

- Marc

Amir Mohammadi

Sep 28, 2016, 9:12:05 AM
to bob-...@googlegroups.com
verify.py has an option:

  -A, --allow-missing-files
                        If given, missing files will not stop the processing;
                        this is helpful if not all files of the database can
                        be processed; missing scores will be NaN. (default:
                        False)

You can look into that too.
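
Since the help text says missing scores will be NaN, any evaluation code you write yourself has to deal with them. Below is a small numpy sketch; the function name `split_valid_scores` is hypothetical, and it assumes the scores have already been loaded into an array.

```python
import numpy as np


def split_valid_scores(scores):
    """Return (valid_scores, n_missing) for an array that may contain NaN."""
    scores = np.asarray(scores, dtype=float)
    missing = np.isnan(scores)
    return scores[~missing], int(missing.sum())
```

Whether the missing trials should simply be dropped or counted as errors is an evaluation-protocol decision; reporting `n_missing` next to the error rates at least keeps that choice visible.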

Thanks,
Amir


On Wed, Sep 28, 2016 at 2:46 PM Marc Ferras <marc....@idiap.ch> wrote:
Thanks guys for the help. I think the easiest thing to do is handling this in the load function.

Best,

- Marc



On 28/09/16 14:43, Tiago Freitas Pereira wrote:
Hey Marc,

bob.bio framework does not handle these things, you must provide "good" files for the execution chain.

One thing that you can do is to raise an exception or use the logger to print the corrupted file names in the load function of you database package.

Cheers
On Wed, Sep 28, 2016 at 2:24 PM, Marc Ferras <marc....@idiap.ch> wrote:
Hello,

I am running speaker verification experiments using bob.bio.spear. I would like to know how Bob handles empty or corrupted audio files through the processing chain in verify.py. So far I have noticed that the jobs involved stopped after finding one of these files. Is there a "best" way of handling these? What about scoring and performance evaluation? Are dummy scores used for trials that use these files?

Thanks,

- Marc


--
-- You received this message because you are subscribed to the Google Groups bob-devel group. To post to this group, send email to bob-...@googlegroups.com. To unsubscribe from this group, send email to bob-devel+...@googlegroups.com. For more options, visit this group at https://groups.google.com/d/forum/bob-devel or directly the project website at http://idiap.github.com/bob/
--- You received this message because you are subscribed to the Google Groups "bob-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bob-devel+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Tiago
--
-- You received this message because you are subscribed to the Google Groups bob-devel group. To post to this group, send email to bob-...@googlegroups.com. To unsubscribe from this group, send email to bob-devel+...@googlegroups.com. For more options, visit this group at https://groups.google.com/d/forum/bob-devel or directly the project website at http://idiap.github.com/bob/
---
You received this message because you are subscribed to the Google Groups "bob-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bob-devel+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Manuel Günther

Sep 28, 2016, 11:27:52 AM
to bob-devel
Dear Marc,

as Amir pointed out, I have recently implemented an ``--allow-missing-files`` flag. With this flag enabled, all tools in the tool chain will handle missing files.

However, when the files are corrupted, you still need to implement your own function to read the original data. In fact, the way data I/O is handled by the preprocessor has changed between the last stable version of bob.bio.* and the latest one on GitLab (I think we still have that inside a branch). When you use the stable versions, you need to override the ``read_original_data`` function inside your preprocessor. With the new version, you pass the ``read_original_data`` function to the base Preprocessor class.

In any case, when you have corrupted files and the ``--allow-missing-files`` flag enabled, your preprocessor can simply return ``None`` for corrupted files so that they are skipped in later processing steps.
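
The pattern of returning ``None`` might look like the sketch below. It deliberately does not use the bob.bio classes, because the preprocessor interface differs between the versions mentioned above; `SkippingPreprocessor` and its constructor arguments are hypothetical names, and only ``read_original_data`` is borrowed from the discussion.

```python
import logging

logger = logging.getLogger("preprocessor_sketch")


class SkippingPreprocessor:
    """Hypothetical wrapper, not a bob.bio class."""

    def __init__(self, read_original_data, process):
        # read_original_data: callable that loads raw data from a path
        # process: callable that performs the actual preprocessing
        self.read_original_data = read_original_data
        self.process = process

    def __call__(self, path):
        try:
            data = self.read_original_data(path)
        except Exception as exc:
            logger.warning("Skipping unreadable file %s: %s", path, exc)
            return None
        if data is None or len(data) == 0:
            logger.warning("Skipping empty file %s", path)
            return None
        return self.process(data)
```

In a real experiment the same try/except would live in your own ``read_original_data`` or in the preprocessor's call method, depending on which bob.bio version you run.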

Cheers
Manuel

Marc Ferras

Sep 28, 2016, 12:07:14 PM
to bob-...@googlegroups.com
That is even better!

Thanks Amir.

- Marc