MongoDB and BSON


Martin Ichilevici de Oliveira

Dec 20, 2019, 5:34:47 PM
to mongodb-dev
Hello,

I am trying to understand the structure of a BSON document generated by MongoDB by comparing the hexdump with the formal specification.

I inserted the following into an empty mongo collection (without compression):
{"name": "john", "age": NumberInt(10)}
{"name": "paul", "age": NumberInt(25)}

And here is the hexdump of the collection file:

$ hexdump -C collection-0-1541750112074217368.wt
00000000  41 d8 01 00 01 00 00 00  d8 08 23 b7 00 00 00 00  |A.........#.....|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001000  00 00 00 00 00 00 00 00  26 04 00 00 00 00 00 00  |........&.......|
00001010  8a 00 00 00 04 00 00 00  07 04 00 00 00 10 00 00  |................|
00001020  f0 22 09 6b 01 00 00 00  05 81 bb 2e 00 00 00 07  |.".k............|
00001030  5f 69 64 00 5d fd 43 d4  53 6c 1e 93 e1 fb fb 1d  |_id.].C.Sl......|
00001040  02 6e 61 6d 65 00 05 00  00 00 6a 6f 68 6e 00 10  |.name.....john..|
00001050  61 67 65 00 0a 00 00 00  00 05 82 bb 2e 00 00 00  |age.............|
00001060  07 5f 69 64 00 5d fd 43  dc 53 6c 1e 93 e1 fb fb  |._id.].C.Sl.....|
00001070  1e 02 6e 61 6d 65 00 05  00 00 00 70 61 75 6c 00  |..name.....paul.|
00001080  10 61 67 65 00 19 00 00  00 00 00 00 00 00 00 00  |.age............|
[...]

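I understand the element type bytes (07, 02, 10), the ObjectIds (the 12 bytes after each _id), the names (john, paul) and the ages (0a 00 00 00 and 19 00 00 00).

However, I do not understand what the bytes between the documents represent, for example the 05 82 bb that sits between the terminating 00 of the first document and the 2e 00 00 00 length of the second.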
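For reference, the first document in the dump can be decoded by hand from the BSON specification. The following is a minimal sketch, not a general parser: it handles only the three element types that appear here (0x02 string, 0x07 ObjectId, 0x10 int32), and the name parse_bson is made up for illustration. The bytes are copied from offsets 0x102b through 0x1058 of the hexdump.

```python
import struct

# Bytes copied from the hexdump, offsets 0x102b through 0x1058 (46 bytes).
DOC_HEX = (
    "2e 00 00 00 "                                          # int32 document size (46)
    "07 5f 69 64 00 5d fd 43 d4 53 6c 1e 93 e1 fb fb 1d "   # ObjectId "_id"
    "02 6e 61 6d 65 00 05 00 00 00 6a 6f 68 6e 00 "         # string "name"
    "10 61 67 65 00 0a 00 00 00 "                           # int32 "age"
    "00"                                                    # document terminator
)


def parse_bson(buf):
    """Decode one BSON document; supports only types 0x02, 0x07 and 0x10."""
    size = struct.unpack_from("<i", buf, 0)[0]
    if size != len(buf) or buf[-1] != 0:
        raise ValueError("length prefix or terminator does not match")
    fields = {}
    pos = 4
    while buf[pos] != 0:
        elem_type = buf[pos]
        key_end = buf.index(b"\x00", pos + 1)      # key is a NUL-terminated cstring
        key = buf[pos + 1:key_end].decode("utf-8")
        pos = key_end + 1
        if elem_type == 0x07:                      # ObjectId: 12 raw bytes
            fields[key] = buf[pos:pos + 12].hex()
            pos += 12
        elif elem_type == 0x02:                    # string: int32 length (incl. NUL) + bytes
            n = struct.unpack_from("<i", buf, pos)[0]
            fields[key] = buf[pos + 4:pos + 4 + n - 1].decode("utf-8")
            pos += 4 + n
        elif elem_type == 0x10:                    # int32, little-endian
            fields[key] = struct.unpack_from("<i", buf, pos)[0]
            pos += 4
        else:
            raise ValueError(f"unsupported element type 0x{elem_type:02x}")
    return fields


doc = parse_bson(bytes.fromhex(DOC_HEX))
print(doc)
```

Everything from the 2e 00 00 00 length to the final 00 is accounted for by the specification, so the three bytes in front of each length prefix lie outside the BSON document itself.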

So I tried to read the same data using libbson (using example-client.c), which connects to Mongo and retrieves that collection. Stepping through the parsing in gdb, I realized that those bytes were actually different:

00 03 31 00 2e 00 00 00

Now, that makes a lot more sense: the first 00 is the end of the previous document, "03 31 00" is the beginning of a new element (type 0x03, an embedded document, with key "1"), and "2e 00 00 00" is that document's size (46 bytes).
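One way to check this reading is to build such a byte sequence by hand. The sketch below assumes the two documents are wrapped as elements "0" and "1" of an enclosing BSON document, which is how the specification encodes arrays; where libbson actually gets such a wrapper from is not established here. The helper names embed and wrap are made up for illustration.

```python
import struct

# The two documents as they appear in the hexdump (46 bytes each).
JOHN = bytes.fromhex(
    "2e000000 075f696400 5dfd43d4536c1e93e1fbfb1d "
    "026e616d6500 05000000 6a6f686e00 1061676500 0a000000 00"
)
PAUL = bytes.fromhex(
    "2e000000 075f696400 5dfd43dc536c1e93e1fbfb1e "
    "026e616d6500 05000000 7061756c00 1061676500 19000000 00"
)


def embed(key, doc):
    """Element = type byte 0x03 (embedded document) + cstring key + document."""
    return b"\x03" + key.encode("ascii") + b"\x00" + doc


def wrap(elements):
    """Enclosing document = int32 total size + elements + 0x00 terminator."""
    body = b"".join(elements) + b"\x00"
    return struct.pack("<i", len(body) + 4) + body


outer = wrap([embed("0", JOHN), embed("1", PAUL)])

# Start at the terminator of the first document: skip the outer size (4 bytes)
# and the header of element "0" (3 bytes), then go to the last byte of JOHN.
start = 4 + 3 + len(JOHN) - 1
boundary = outer[start:start + 8]
print(boundary.hex(" "))  # 00 03 31 00 2e 00 00 00
```

Under that assumption the boundary bytes match what gdb showed, which suggests they come from the enclosing structure rather than from the documents stored in the file.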

What is going on here? Why is the hexdump different from what I see with libbson?

Thanks,
Martin