How to export the results of my whiz-bang document size query?

Robert Kuhar

Oct 5, 2012, 3:09:49 PM
to mongod...@googlegroups.com
We have a collection in our database that is all schema-less and what not.  The size of these documents is becoming problematic.  The first step to fixing a problem is admitting you have one, right?  To this end, I am trying to craft a query to report on the size of each document in our problem-child collection.  In the mongo shell, the following works like a champ:

  db.profiles.find().forEach( function( doc ) { print( "\"" + doc._id + "\", \"" + doc.projectName + "\", " + Object.bsonsize( doc) ); } );

...it produces output like...

  "XQ531", "Mango#3.5,Mango#WL,Mango,MANGO#WL", 1912
  "MJ8310", "Cheetah#UL", 913655
  "MJ8311", "Cheetah#UL", 503
  "MJ8312", "Cheetah#UL", 213510
  "MJ8313", "Cheetah#UL", 317795

How do I get this output to pop out of some script and go directly into a CSV file?  I plan to put this thing on some cron or something and run it daily.  Could I make mongoexport do this?

As always, any advice is greatly appreciated.

Bob

Robert Kuhar

Oct 8, 2012, 5:33:01 PM
to mongod...@googlegroups.com
Crickets?  Sigh.

Joel Bender

Oct 8, 2012, 9:47:08 PM
to mongod...@googlegroups.com
I put this Python snippet together that does the same thing, or at least something similar.  As I was hunting around I found this, so the real allocated size on disk will be different.


import bson
from pymongo import ReplicaSetConnection, ReadPreference

from collections import OrderedDict

# Connect to the replica set and read from a secondary, so the full
# collection scan stays off the primary.
connection = ReplicaSetConnection(
    "langly,frohike,byers",
    replicaset='gunmen',
    document_class=OrderedDict,
    read_preference=ReadPreference.SECONDARY,
    )

db = connection.mymongo
for doc in db.colors.find():
    # Re-encode each document as BSON and report its length in bytes.
    print("\"%s\", \"%s\", %d" % (doc['_id'], doc['name'], len(bson.BSON.encode(doc))))

connection.close()

William Zola

Oct 9, 2012, 1:52:52 PM
to mongod...@googlegroups.com
If you want to save this to a file, you'll have to redirect the output of the 'mongo' shell.  The shell itself has no facility for writing to files.

Failing that, you can re-write the query in another language and use that language's facilities to write to a file.  Unfortunately, 'mongoexport' has no facility to run arbitrary operations (such as 'Object.bsonsize()')  when performing an export.
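
For the "another language" route, here is a minimal Python sketch of the file-writing half, assuming PyMongo (and the bson package that ships with it) is installed. The helper names write_size_report and bson_size, and the file name in the comment, are made up for illustration. The csv module takes care of the quoting, which matters here because projectName values contain commas.

```python
import csv
import io


def bson_size(doc):
    """Encoded BSON size of a document, in bytes.

    Assumes the bson package that ships with PyMongo is installed. It is
    imported lazily so the CSV logic below can be tried without it.
    """
    import bson
    return len(bson.BSON.encode(doc))


def write_size_report(docs, out, size_of=bson_size):
    """Write one CSV row per document: _id, projectName, size in bytes."""
    # QUOTE_NONNUMERIC quotes the string fields and leaves the size bare.
    writer = csv.writer(out, quoting=csv.QUOTE_NONNUMERIC)
    for doc in docs:
        writer.writerow([str(doc["_id"]), doc.get("projectName", ""), size_of(doc)])


if __name__ == "__main__":
    # Stand-in documents and a stand-in size function, so this runs
    # without a MongoDB server. Against a real collection it might be:
    #   with open("profile_sizes.csv", "w", newline="") as f:
    #       write_size_report(db.profiles.find(), f)
    sample = [
        {"_id": "XQ531", "projectName": "Mango#3.5,Mango#WL,Mango,MANGO#WL"},
        {"_id": "MJ8310", "projectName": "Cheetah#UL"},
    ]
    buf = io.StringIO()
    write_size_report(sample, buf, size_of=lambda d: len(repr(d)))
    print(buf.getvalue(), end="")
```

A script like this can be run from cron as-is, since it writes the file itself and needs no shell redirection.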

 -William 

