Richard
Sigurd Høgsbro <sigurd....@museeka.com> wrote:
> Hello,
>
> We're using GridFS for a batch job to transcode from WMA Lossless to FLAC,
> which requires the use of Windows for the actual conversion (ffmpeg does not
> yet support WMA *lossless* codec).
>
> The model is one of a controller running on a Linux server (Ubuntu 8.04.2),
> talking to mongoDB on the same server. All code is Python (2.5.2 on controller,
> 2.6 on workers), using pymongo 1.8.1. Worker apps run on Windows XP instances
> hosted within VirtualBox on Linux servers (Ubuntu 10.04).
>
> We're seeing some issues that I'd welcome feedback on:
>
> 1. GridFS read performance is pretty good (average 10-20MB/sec), but write
> performance is a very different beast. Writing an 18MB file can take 1 min
> 45 secs, though it can also complete in around 11 secs.
>
> We never see such slow writes on the controller task dumping the WMA file
> into GridFS, so I fear this isn't just triggered by the allocation of
> another datafile for the database.
>
> 2. We sometimes receive exceptions from the server which I'd like to
> understand the cause of:
> command SON([('filemd5', ObjectId('...')), ('root', u'worker_flac')])
> failed: exception: chunks out of order
>
> 3. When a worker crashes we sometimes end up with files left over in the
> GridFS collection. When doing GridFS.delete() on such files, immediately
> followed by a GridFS.new_file() using the same '_id', we get error 11000
> (see below). This is probably caused by the missing safe=True on the call
> to remove the chunks in GridFS.delete().
>
> 13:38:12 [conn532] Caught Assertion in insert , continuing
> 13:38:12 [conn532] insert transcode.worker_flac.chunks exception 11000 E11000 duplicate key error index: transcode.worker_flac.chunks.$files_id_1_n_1 dup key: { : ObjectId('4c636968acb81b35a1073e59'), : 166 } 1ms
>
>
> Ideas?
>
> Sigurd
>
2010/9/6 Sigurd Høgsbro <sigurd....@museeka.com>:
The issue doesn't really have anything to do with safe mode. The real
problem is that we can't do multiple operations in isolation, which is
what a concurrency-safe delete would need. This limitation is noted
prominently in the API docs, IIRC.
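
For point 3 specifically, here is a rough sketch of one possible workaround, not something pymongo does for you. It deletes the file, polls the chunks collection until nothing is left for that files_id, and only then reopens the same '_id' for writing. The helper name replace_gridfs_file and the timeout and poll_interval parameters are made up for illustration; fs is assumed to be a gridfs.GridFS instance and chunks the matching chunks collection (db.worker_flac.chunks in your setup). It only narrows the window for a single writer and does nothing for two workers touching the same '_id' at once.

```python
import time


def replace_gridfs_file(fs, chunks, file_id, timeout=5.0, poll_interval=0.05):
    """Delete file_id, wait until its chunks are gone, then reopen it for writing.

    fs     -- a gridfs.GridFS instance (anything with delete() and new_file())
    chunks -- the matching chunks collection, e.g. db.worker_flac.chunks
    Hypothetical helper: the name and the timeout values are illustrative only.
    """
    fs.delete(file_id)

    deadline = time.time() + timeout
    # Poll for leftover chunks: an unacknowledged remove may not be applied yet.
    while chunks.find_one({"files_id": file_id}) is not None:
        if time.time() > deadline:
            raise RuntimeError(
                "chunks for %r still present after %.1fs" % (file_id, timeout))
        time.sleep(poll_interval)

    # No chunks remain, so inserting chunk n=0.. cannot hit the unique index.
    return fs.new_file(_id=file_id)
```

If reusing the '_id' is not a hard requirement, letting new_file() generate a fresh one avoids the problem without any polling.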
This seems like something that should be integrated into Python's
sendall method, if it's the proper fix for this. Has there been any
discussion / is there a case open for this w/ the Python project? If
there is a case that just hasn't made it in yet, then we could
definitely add it to pymongo for the time being, as well.