How big is a document in MongoDB?


Tream

Jan 27, 2012, 10:36:36 AM
to mongodb-user
Hi,
is it possible to get the size of a single document in MongoDB?

Andreas Jung

Jan 27, 2012, 10:38:54 AM
to mongod...@googlegroups.com
doc.bsonsize()

-aj


Robert Stam

Jan 27, 2012, 10:51:42 AM
to mongod...@googlegroups.com
Object.bsonsize(doc)

doc.bsonsize()

-aj
--
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To post to this group, send email to mongod...@googlegroups.com.
To unsubscribe from this group, send email to mongodb-user...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en.


Tream

Jan 27, 2012, 11:04:52 AM
to mongodb-user
Hi,
I work with the PHP driver, and can't find the method bsonsize().

Robert Stam

Jan 27, 2012, 11:27:35 AM
to mongod...@googlegroups.com
Sorry for being so brief.

The bsonsize method is defined in the mongo shell. So you can either create a document in the shell and test its size, or you can fetch one from the database. For example:

> var doc = { _id : new ObjectId(), x : 1 }
> Object.bsonsize(doc)
33
>

or

> db.test.remove({})
> db.test.insert({ x : 1 })
> var doc = db.test.findOne()
> Object.bsonsize(doc)
33
>
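
For anyone wondering where the 33 comes from, here is a rough sketch in plain JavaScript (runnable in Node.js, no driver needed) that adds up the bytes by hand following the BSON layout. It only handles the two field types in the example above, an ObjectId and a double. The names `sizeOfExampleDoc` and `VALUE_BYTES` are made up for illustration and are not part of any MongoDB API.

```javascript
// Hand count of the BSON bytes for { _id: ObjectId(...), x: 1 }.
// Assumption: the shell stores the literal 1 as a 64-bit double.
const VALUE_BYTES = { objectId: 12, double: 8 };

function sizeOfExampleDoc(fields) {
  let size = 4; // int32 holding the total document length
  for (const { name, type } of fields) {
    size += 1;                                   // type tag
    size += Buffer.byteLength(name, "utf8") + 1; // field name + NUL terminator
    size += VALUE_BYTES[type];                   // the value itself
  }
  return size + 1; // trailing 0x00 that ends the document
}

const total = sizeOfExampleDoc([
  { name: "_id", type: "objectId" },
  { name: "x", type: "double" },
]);
console.log(total); // 33
```

That is 4 + 17 (for _id) + 11 (for x) + 1 = 33 bytes.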

On Fri, Jan 27, 2012 at 11:04 AM, Tream <wadimb...@googlemail.com> wrote:
Hi,
I work with the PHP driver, and can't find the method bsonsize().

Robert Stam

Jan 27, 2012, 11:28:33 AM
to mongod...@googlegroups.com
p.s. I'm hoping someone else can tell you how to do the same thing in PHP.

Sam Millman

Jan 27, 2012, 11:29:49 AM
to mongod...@googlegroups.com
I have been doing some searching around and, except for taking the strlen() of the BSON string after encoding, I cannot find a way of extracting the actual size of the BSON document.

The size calculation does exist inside the driver's code, but it seems it has not been added to the driver's public API.

Derick Rethans

Jan 27, 2012, 11:46:18 AM
to mongodb-user
On Fri, 27 Jan 2012, Tream wrote:

> Hi,
> I work with the PHP driver, and can't find the method bsonsize().

There is a function in the driver called bson_encode:
http://php.net/manual/en/function.bson-encode.php

You can use that to do:

$len = strlen( bson_encode( $yourDocument ) );

cheers,
Derick

--
http://mongodb.org | http://derickrethans.nl
twitter: @derickr and @mongodb

Tream

Jan 27, 2012, 1:00:49 PM
to mongodb-user
Hi,
thank you all very much, that works fine!

greetings,
Wadim

Sam Millman

Jan 27, 2012, 1:04:25 PM
to mongod...@googlegroups.com
Ah, I thought you were looking for the size in MB of a doc, but good to hear you found your answer :)

Tream

Jan 27, 2012, 7:41:07 PM
to mongodb-user
Hi Sam,
the methods return a value of 32.024, so that should be ~0.03 MB.
Or am I overlooking something?
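
As a quick check of the arithmetic, and assuming the 32.024 above is a byte count of 32,024 written with a dot as the thousands separator (an assumption, not something stated in the thread), a small JavaScript sketch gives about 0.03 MB under either definition of a megabyte. The helper name `bytesToMegabytes` is made up for illustration.

```javascript
// Assumption: "32.024" means 32,024 bytes (dot used as thousands separator).
const sizeInBytes = 32024;

function bytesToMegabytes(bytes, bytesPerMegabyte) {
  return bytes / bytesPerMegabyte;
}

const binaryMB = bytesToMegabytes(sizeInBytes, 1024 * 1024);  // 1 MB = 1,048,576 bytes
const decimalMB = bytesToMegabytes(sizeInBytes, 1000 * 1000); // 1 MB = 1,000,000 bytes

console.log(binaryMB.toFixed(2));  // "0.03"
console.log(decimalMB.toFixed(2)); // "0.03"
```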

Sam Millman

Jan 27, 2012, 7:58:21 PM
to mongod...@googlegroups.com
I must be missing something. bson_encode() encodes to a BSON string, much like json_encode() encodes to a JSON string.

So strlen() on that encoding will give the number of characters in that string, right? I'm not sure how it returns 32.024, since you can't have 0.024 of a character.

So according to the rules of PHP, all you should really get back is the number of characters contained in that string, unless the driver's API is somehow able to change the behaviour of a global function, which should be impossible...

Sam Millman

Jan 27, 2012, 8:18:13 PM
to mongod...@googlegroups.com
I mean, I am not sure how you arrive at 0.03 MB either, since characters can be multibyte, which means you can't just assign one storage size to every character in a string. I suppose http://www.php.net/manual/en/function.mb-strlen.php could be used to calculate the storage size of the characters, but I wouldn't think even that can be trusted for something like BSON.

I mean, I may be missing something blatant here, but that doesn't look right to me.
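
The character-versus-byte question can be illustrated outside PHP. PHP's strlen() returns the number of bytes in a string, not the number of characters, so for an encoded string it reports the encoded size directly. The JavaScript sketch below (Node.js) shows the same distinction; the sample string is made up for illustration.

```javascript
// "Grüße": the characters "ü" (U+00FC) and "ß" (U+00DF) each take 2 bytes in UTF-8.
const sample = "Gr\u00fc\u00dfe";

const characterCount = [...sample].length;           // counts code points
const byteCount = Buffer.byteLength(sample, "utf8"); // counts encoded bytes

console.log(characterCount); // 5
console.log(byteCount);      // 7
```

A byte-counting function therefore already gives a storage size, whereas a character-counting one (such as mb_strlen() in PHP) would undercount.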

Mardix

Jan 28, 2012, 2:07:39 AM
to mongod...@googlegroups.com
Well, I use this method in my PHP code to get document size.

It will return the total size in bytes of the document. 

Of course you must change $DBName, $collectionName, etc.


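$DBName = "MyDBName";
// Legacy PHP driver: select the database over a default connection.
$MongoDB = new MongoDB(new Mongo(), $DBName);

// Returns the BSON size in bytes of the first document matching $Criteria.
function documentSize(array $Criteria) {
    global $MongoDB;
    $collectionName = "MyCollection";
    // Runs server-side. The criteria are passed as an argument rather than
    // spliced in via json_encode(), so values such as MongoId keep their BSON type.
    $code = "function(criteria) {
        return Object.bsonsize(db.{$collectionName}.findOne(criteria));
    }";

    $resp = $MongoDB->execute($code, array($Criteria));

    return $resp["retval"];
}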

    example: where _ID is the document id

    $myDocSize = documentSize(array("_id"=>_ID));

I hope it helps!

1!

Sam Millman

Jan 28, 2012, 6:52:51 AM
to mongod...@googlegroups.com
Indeed, that code will work, but this should really be made part of the public API because it is already in the driver (they do calculate the size of the doc there).


Derick Rethans

Jan 28, 2012, 9:47:19 AM
to mongod...@googlegroups.com
Hi!

On Sat, 28 Jan 2012, Sam Millman wrote:

> Indeed, that code will work, but this should really be made part of the
> public API because it is already in the driver (they do calculate the size
> of the doc there).

If you mean the PHP driver with "public API", why do you think that
strlen(bson_encode($document)); is not a good enough approach for this?
Personally, I don't expect that this is a feature that many people will
need, but feel free to (try to) convince me.

Sam Millman

Jan 28, 2012, 11:25:46 AM
to mongod...@googlegroups.com
Knowing the size of a document in MB (or KB) before saving can be very useful for certain restrictions in schema designs.

Well, strlen() wouldn't be very useful, since the size (in MB or KB on disk) cannot be accurately determined from the number of characters in an encoded BSON string.

The thing is, this is provided freely in the JS driver; it just seems a bit weird that it's not in most other drivers.


NPSF3000

Jan 28, 2012, 12:21:59 PM
to mongodb-user
Sorry to state the obvious... but can't you get the binary of the bson
document through the API? [hacked or otherwise]

Sam Millman

Jan 28, 2012, 12:28:58 PM
to mongod...@googlegroups.com
The binary? Do you mean the binary integer groups? What do you mean by "the binary"?

Sam Millman

Jan 28, 2012, 12:39:23 PM
to mongod...@googlegroups.com
Yes, it is true: using something to convert the bson_encode() output into ASCII, then down into base 2, and then calling strlen() on that could work, because every character would be exactly one bit. I didn't think of that...
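
A rough sketch of that idea in JavaScript (Node.js), using a made-up five-byte buffer rather than real bson_encode() output: writing every byte out as eight base-2 digits gives a string whose length is the size in bits, so dividing by 8 simply gets back the byte count that was already available.

```javascript
// Hypothetical stand-in for an encoded document: the empty BSON document {}.
const encoded = Buffer.from([0x05, 0x00, 0x00, 0x00, 0x00]);

// Write each byte as 8 binary digits, so one character equals one bit.
const bits = Array.from(encoded, (byte) => byte.toString(2).padStart(8, "0")).join("");

console.log(bits.length);     // 40 bits
console.log(bits.length / 8); // 5 bytes
console.log(encoded.length);  // 5 bytes, the same answer without the detour
```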

Sam Millman

Jan 28, 2012, 12:42:14 PM
to mongod...@googlegroups.com
Though I am not sure how that would react around ObjectIds and other objects within the BSON structure...

Muharem Hrnjadovic

Jan 28, 2012, 4:00:48 PM
to mongod...@googlegroups.com, NPSF3000
On 01/28/2012 06:21 PM, NPSF3000 wrote:
> Sorry to state the obvious... but can't you get the binary of the bson
> document through the API? [hacked or otherwise]
Have you tried

Object.bsonsize(doc)

in the MongoDB shell?

> On Jan 29, 2:25 am, Sam Millman <sam.mill...@gmail.com> wrote:
>> Knowing the size of a document in MB (or KB) before saving can be very
>> useful for certain restrictions in schema designs.
>>
>> Well, strlen() wouldn't be very useful, since the size (in MB or KB on disk)
>> cannot be accurately determined from the number of characters in an encoded
>> BSON string.
>>
>> The thing is, this is provided freely in the JS driver; it just seems a bit
>> weird that it's not in most other drivers.
>>
>> On 28 January 2012 14:47, Derick Rethans <der...@10gen.com> wrote:
>>
>>> Hi!
>>
>>> On Sat, 28 Jan 2012, Sam Millman wrote:
>>
>>>> Indeed, that code will work, but this should really be made part of the
>>>> public API because it is already in the driver (they do calculate the size
>>>> of the doc there).
>>
>>> If you mean the PHP driver with "public API", why do you think that
>>> strlen(bson_encode($document)); is not a good enough approach for this?
>>> Personally, I don't expect that this is a feature that many people will
>>> need, but feel free to (try to) convince me.

Best regards/Mit freundlichen Grüßen

--
Muharem Hrnjadovic <m...@foldr3.com>
Public key id : B2BBFCFC
Key fingerprint : A5A3 CC67 2B87 D641 103F 5602 219F 6B60 B2BB FCFC


Muharem Hrnjadovic

Jan 29, 2012, 1:39:32 AM
to mongod...@googlegroups.com
On 01/28/2012 10:00 PM, Muharem Hrnjadovic wrote:
> On 01/28/2012 06:21 PM, NPSF3000 wrote:
>> Sorry to state the obvious... but can't you get the binary of the bson
>> document through the API? [hacked or otherwise]
> Have you tried
>
> Object.bsonsize(doc)
>
> in the MongoDB shell?
Oops, hit the "send" button too quick. What I meant is: schema design is
usually a manual process, carried out by a human being.
Storing a number of document variations during the design and checking
the resulting BSON sizes (as described above) should thus be viable.
Or is there something I am missing?

Sam Millman

Jan 29, 2012, 5:55:45 AM
to mongod...@googlegroups.com
Yeah, in PHP it is currently impossible to detect the size totally reliably. The only way is to send off an execute() command to the DB to call Object.bsonsize().

It was thought that converting the encoded string to binary would solve the problem, but I am not sure how objects in PHP (MongoId, MongoDate, etc.) might conflict with the calculation and make it less accurate.

It is so far unsolved; I am going to do more testing today to see if I can find a workaround.

Sam Millman

Jan 29, 2012, 9:47:32 AM
to mongod...@googlegroups.com
Ah, OK, strlen() actually does work (kinda), because it counts the bytes of the encoded BSON string rather than its characters (which I didn't realise it did).

So yes, strlen() does actually work, sorry.