[mongoDB] storing audio files (Best way)


Lucas Guimarães

Aug 26, 2016, 6:06:25 PM
to mongodb-user
I have a bunch of audio files (.wav) and I'd like to know: what's the best way to store them in MongoDB?
What I'm doing today is just storing the path of each file (as you can see below).
But I don't think that's good, because I'm creating a "fake reference" to the file. If I happen to delete the file, how could I keep the reference consistent?

{
    "_id" : ObjectId("57c0a06cd92f49222ce2f42d"),
    "eps" : "Veganet",
    "terminal" : 989638523,
    "main_path" : "W:\\Python\\Speech\\audio\\teste\\teste_9",
    "motivo" : "Cancelamento",
    "audio" : [ 
        {
            "path" : "W:\\Python\\Speech\\audio\\teste\\teste_9\\01_audio.wav",
            "confidence" : 0.8332507,
            "transcript" : "Alô bom dia com quem eu falo",
            "sequence" : 1
        }, 
        {
            "path" : "W:\\Python\\Speech\\audio\\teste\\teste_9\\02_audio.wav",
            "confidence" : 0.90813386,
            "transcript" : "Um novo benefício pra minha da senhora, sem impostos e nada mais do que isso",
            "sequence" : 2
        }
    ]
}


Thank you,

Lukas Lehner

Aug 27, 2016, 4:12:15 AM
to mongod...@googlegroups.com
How big are the *.wav files on average?
Have you tried out GridFS?
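
In case it helps, here is a minimal sketch of what storing one .wav through GridFS might look like from Python. It assumes PyMongo's gridfs package; the helper name put_wav and the "speech" database name in the usage comment are hypothetical, not something from your setup.

```python
from pathlib import Path


def put_wav(fs, wav_path, **metadata):
    """Store one .wav file through a GridFS-like object and return its file id.

    `fs` is expected to be a gridfs.GridFS instance; only its put() method
    is used here, so any object with a compatible put() works.
    """
    path = Path(wav_path)
    data = path.read_bytes()
    # GridFS.put() takes the bytes plus keyword arguments; extra keywords
    # (for example sequence=1) are saved as fields on the file document.
    return fs.put(data, filename=path.name, **metadata)


# Hypothetical usage against a running MongoDB (not executed here):
#   from pymongo import MongoClient
#   import gridfs
#   fs = gridfs.GridFS(MongoClient().speech)
#   audio_id = put_wav(fs, "01_audio.wav", sequence=1)
```

The returned id could then replace the "path" field in each element of the "audio" array, so the document references data that lives in the database rather than on disk.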

--
You received this message because you are subscribed to the Google Groups "mongodb-user"
group.
 
For other MongoDB technical support options, see: https://docs.mongodb.com/manual/support/

John Murphy

Sep 16, 2016, 2:13:37 AM
to mongodb-user

Hi Lucas,

> But I think it’s not good because I’m creating a “fake reference” to the file and I wonder If by chance I delete the file, how could I consist it?

The ‘best’ solution is relative: it depends, case by case, on the system environment and requirements.

For example, a solution for tracking local audio files from a desktop application differs from one for tracking audio files kept in network storage behind a server/client application.

A number of suggestions depending on your use case:

  • Store the audio files within your MongoDB database. The maximum BSON document size is 16 megabytes, so if your files exceed this size you would need to consider using GridFS.
  • Use a file watcher, which receives notifications of file deletions. These notifications could then be used as a trigger to keep the audio file paths in MongoDB consistent with the filesystem. Some examples of file watchers are the System.IO.FileSystemWatcher class in the .NET Framework or the watchdog package in Python.
  • If the system is server/client based, you could restrict filesystem access to the audio files, thereby only allowing upload and removal of audio files via your client application.
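
A file watcher only sees deletions that happen while it is running, so a periodic reconciliation pass may be a useful complement. The sketch below is not a watcher: it is a plain check, using only the standard library, of which "path" values in a document shaped like the one in the question no longer exist on disk. The function name missing_audio_paths is hypothetical.

```python
import os


def missing_audio_paths(doc, exists=os.path.exists):
    """Return the "path" values in doc["audio"] that are no longer on disk.

    `exists` is injectable so the check can be pointed at something other
    than the local filesystem; it defaults to os.path.exists.
    """
    return [
        entry["path"]
        for entry in doc.get("audio", [])
        if not exists(entry["path"])
    ]
```

Run over each document returned by a find() on the collection, the result could drive a $pull on the "audio" array or set a flag field; which of those is appropriate depends on your application.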

You may also find the following resources useful:

Regards,
John Murphy
