Copy GridFS files

Octavian Covalschi

Dec 27, 2011, 4:20:46 PM
to mongod...@googlegroups.com
Hello,

I'm looking for an easy way to "clone" a GridFS file without pulling it out of MongoDB and pushing it back in, if that's possible at all. I may have pretty big files, such as movies, and would like to avoid the expensive download and upload.

Thank you in advance.

Richard Kreuter

Dec 27, 2011, 4:38:46 PM
to mongodb-user
This operation isn't really supported, but more importantly, GridFS
files aren't supposed to be writable after they're stored (i.e.,
they're immutable), so I'm not sure why you'd want to clone a GridFS
file. Can you say more about what you're trying to do?

Regards,
Richard

Octavian Covalschi

Dec 27, 2011, 4:48:29 PM
to mongod...@googlegroups.com
The whole application is trying to simulate a regular filesystem with MongoDB and GridFS. My current task is to implement a copy feature for multiple files, so basically I need to duplicate a stored GridFS file. It looks like I need to retrieve it and store it back, which works, but it's expensive, particularly if the file is big.
--
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To post to this group, send email to mongod...@googlegroups.com.
To unsubscribe from this group, send email to mongodb-user...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en.


Richard Kreuter

Dec 27, 2011, 4:57:18 PM
to mongodb-user
Well, since GridFS files are immutable, GridFS may not be a great fit
for simulating a file system directly.

However, if all you're interested in is having two objects in your
file system's namespace that correspond to the same sequence of bits,
you can do this in a sneaky way, by adding new attributes to the
fs.files document for a given file. Then your copy operation would
just be a $push onto an array of names, rather than the creation of a
new GridFS file and all its chunks.
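
A minimal sketch of that idea in Python, for illustration only. The `names` array field and the `link_copy` helper are assumptions made up for this example, not part of the GridFS spec. `FakeFilesCollection` is an in-memory stand-in so the snippet runs without a server; with pymongo, the same `update_one` call would go to `db.fs.files`.

```python
class FakeFilesCollection:
    """In-memory stand-in for fs.files; supports only what this sketch needs."""

    def __init__(self, docs):
        self.docs = {doc["_id"]: doc for doc in docs}

    def update_one(self, filter, update):
        # Handles only {"_id": ...} filters and {"$push": {field: value}} updates.
        doc = self.docs[filter["_id"]]
        for field, value in update["$push"].items():
            doc.setdefault(field, []).append(value)

    def find_one(self, filter):
        # A single {field: value} filter matches when the array field contains
        # the value, which mirrors how MongoDB matches against array fields.
        field, value = next(iter(filter.items()))
        for doc in self.docs.values():
            if value in doc.get(field, []):
                return doc
        return None


def link_copy(files, file_id, new_path):
    """'Copy' a file by recording one more name for the same stored chunks."""
    # "names" is a hypothetical application-level field on the fs.files document.
    files.update_one({"_id": file_id}, {"$push": {"names": new_path}})


files = FakeFilesCollection([
    {"_id": 1, "length": 2_000_000_000, "names": ["/movies/a.avi"]},
])
link_copy(files, 1, "/backup/a.avi")

# Both paths now resolve to the same fs.files document, so no chunk is duplicated.
doc = files.find_one({"names": "/backup/a.avi"})
print(doc["_id"], doc["names"])
```

Since both names point at one set of chunks, deleting the file would need to remove a name first and drop the chunks only when the array is empty; that bookkeeping is left to the application.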

Regards,
Richard

Octavian Covalschi

Dec 27, 2011, 5:11:32 PM
to mongod...@googlegroups.com
Aha, that's one way to do it, though that would be more like a symbolic link than an actual copy.

A complete copy of all chunks would be useful if I want to change a file but also keep the original version, or if it's read-only.

I guess the ideal solution would be to have something like "INSERT ... SELECT" from MySQL http://dev.mysql.com/doc/refman/5.0/en/insert-select.html 

Richard Kreuter

Dec 28, 2011, 10:27:45 AM
to mongodb-user
Again, GridFS files are immutable: you're not allowed to change one
after it's been written. (If any library out there lets you change an
existing GridFS file, that library is broken.) So GridFS is a
somewhat unlikely storage engine for a file system, but if you want to
use it that way, you'll need to do your own work to sort out how to
make it appear as if your files are mutable.

Octavian Covalschi

Dec 28, 2011, 5:14:30 PM
to mongod...@googlegroups.com
Thank you, I'll keep that in mind.

rob

Dec 29, 2011, 12:03:09 PM
to mongod...@googlegroups.com
Not sure how you would do a hard copy without reading and writing the content; even a filesystem copy on the same disk partition does that. You can symlink in fs.files as suggested above, or try a subprocess or cStringIO with gridfs.get(_id) and gridfs.put().
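
A rough sketch of the get/put route in Python. In pymongo, `GridFS.get(file_id)` returns a file-like object and `GridFS.put(data, **kwargs)` accepts one, so the source can be handed straight to `put` without buffering it yourself. `copy_gridfs_file` is a hypothetical helper name, and `FakeGridFS` is only an in-memory stand-in so the example runs without a server; with a real database you would pass `gridfs.GridFS(db)` instead.

```python
import io


class FakeGridFS:
    """In-memory stand-in exposing the get/put subset of pymongo's GridFS."""

    def __init__(self):
        self._files = {}
        self._next_id = 1

    def put(self, data, **kwargs):
        # Like the real put(), accept either bytes or a file-like object.
        content = data if isinstance(data, bytes) else data.read()
        file_id = self._next_id
        self._next_id += 1
        self._files[file_id] = (content, kwargs)
        return file_id

    def get(self, file_id):
        content, _metadata = self._files[file_id]
        return io.BytesIO(content)


def copy_gridfs_file(fs, file_id, new_filename):
    """Read one stored file and write it back as a new, independent file."""
    source = fs.get(file_id)  # file-like handle on the stored file
    return fs.put(source, filename=new_filename)


fs = FakeGridFS()
original_id = fs.put(b"movie bytes", filename="/movies/a.avi")
copy_id = copy_gridfs_file(fs, original_id, "/movies/a-copy.avi")
print(copy_id != original_id, fs.get(copy_id).read() == fs.get(original_id).read())
```

The bytes still travel from the server to the client and back. Running a script like this on the same host as mongod would keep that traffic off the network, but the read and write themselves are unavoidable.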

Octavian Covalschi

Dec 29, 2011, 12:10:02 PM
to mongod...@googlegroups.com
The concern is more about network traffic. If I have a 2GB file, then to copy or clone it I'll have to download it and upload it again (if the mongod instance is on a remote server). That's why I was looking for something server-side that would let me avoid that.

Scott Hernandez

Dec 29, 2011, 3:09:55 PM
to mongod...@googlegroups.com

There is no "query into" command that would let you do that completely on the server.

You could use JavaScript in a db.eval to do that now.

Or, add a feature request in jira.
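
For illustration, here is the document-level logic such a script would need, sketched in Python rather than shell JavaScript. It relies on the GridFS layout: one fs.files document per file, plus fs.chunks documents carrying `files_id`, `n` and `data`. `clone_file`, `new_id` and `FakeCollection` are hypothetical names for this example; the stand-in mimics only the `find`, `find_one` and `insert_one` calls used here. Run from a remote client against real collections, this would still move every chunk over the network, so it only shows the steps a server-side script would perform.

```python
import copy


class FakeCollection:
    """In-memory stand-in for a collection (equality filters only)."""

    def __init__(self):
        self.docs = []

    def insert_one(self, doc):
        self.docs.append(copy.deepcopy(doc))

    def find(self, filter):
        return [copy.deepcopy(d) for d in self.docs
                if all(d.get(k) == v for k, v in filter.items())]

    def find_one(self, filter):
        matches = self.find(filter)
        return matches[0] if matches else None


def clone_file(files, chunks, src_id, new_id, new_filename):
    """Duplicate a GridFS file by copying its documents one by one."""
    meta = files.find_one({"_id": src_id})
    if meta is None:
        raise KeyError(src_id)
    # Copy the chunks first so the new file only becomes visible once complete.
    for chunk in chunks.find({"files_id": src_id}):
        chunk.pop("_id", None)      # a real insert would generate a fresh _id
        chunk["files_id"] = new_id  # re-point the chunk at the new file
        chunks.insert_one(chunk)
    # Other metadata fields (length, chunkSize, ...) are carried over unchanged.
    meta["_id"] = new_id
    meta["filename"] = new_filename
    files.insert_one(meta)


files, chunks = FakeCollection(), FakeCollection()
files.insert_one({"_id": "a", "filename": "/movies/a.avi", "length": 6, "chunkSize": 4})
chunks.insert_one({"_id": 10, "files_id": "a", "n": 0, "data": b"abcd"})
chunks.insert_one({"_id": 11, "files_id": "a", "n": 1, "data": b"ef"})

clone_file(files, chunks, "a", "b", "/movies/a-copy.avi")
ordered = sorted(chunks.find({"files_id": "b"}), key=lambda c: c["n"])
copied = b"".join(c["data"] for c in ordered)
print(copied)
```

Note that db.eval was deprecated and later removed from newer MongoDB server versions, so this route applies only to servers of that era.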

Octavian Covalschi

Dec 29, 2011, 4:07:54 PM
to mongod...@googlegroups.com
Thanks, I'll try with db.eval.

After looking around in JIRA a bit, it looks like https://jira.mongodb.org/browse/SERVER-732 would help in this case, so +1 from me.