I am doing some spelunking in the Tokyo Cabinet code, with an eye
toward writing some custom dbfile checking and manipulation programs,
and noticed something odd. The file format documentation[1] says that
byte 33 encodes the database type as:
0x01 -> hash db type
0x02 -> b+tree type
0x03 -> fixed-length
0x04 -> table
When I create a few sample databases and then look at byte 33, every
value is one less than documented.
jeremy@fs5:/tmp % cat tt-type-test.sh
#!/bin/sh
for ttype in tch tcb tcf tct
do
dbfile="/tmp/test.${ttype}"
cmd="${ttype}mgr create ${dbfile}"
echo "==> Creating ${ttype} db"
${cmd}
echo "==> First 33 bytes of ${dbfile}"
od -c -N 33 ${dbfile}
done
jeremy@fs5:/tmp % ./tt-type-test.sh
==> Creating tch db
==> First 33 bytes of /tmp/test.tch
0000000 T o K y O C a B i N e T \n 1 .
0000020 0 : 8 2 6 \n \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
0000040 \0
0000041
==> Creating tcb db
==> First 33 bytes of /tmp/test.tcb
0000000 T o K y O C a B i N e T \n 1 .
0000020 0 : 8 2 6 \n \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
0000040 001
0000041
==> Creating tcf db
==> First 33 bytes of /tmp/test.tcf
0000000 T o K y O C a B i N e T \n 1 .
0000020 0 : 8 2 6 \n \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
0000040 002
0000041
==> Creating tct db
==> First 33 bytes of /tmp/test.tct
0000000 T o K y O C a B i N e T \n 1 .
0000020 0 : 8 2 6 \n \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
0000040 003
0000041
And then in tcutil.h we see that these database type bytes
are assigned from enumerated values, which start at 0:
enum { /* enumeration for database type */
TCDBTHASH, /* hash table */
TCDBTBTREE, /* B+ tree */
TCDBTFIXED, /* fixed-length */
TCDBTTABLE /* table */
};
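For what it's worth, here is a minimal sketch of the kind of check I
have in mind. It assumes the type byte sits at offset 32 (the 33rd
byte) and holds the raw enum value, as the od output above suggests.
The function name tc_db_type is made up for illustration and is not
part of Tokyo Cabinet.

```shell
#!/bin/sh
# Print the database type recorded in a Tokyo Cabinet file header.
# Assumption: the type byte is at offset 32 and uses the 0-based
# enum values from tcutil.h, not the 1-based values in the docs.
tc_db_type() {
    # -An: no offset column, -tu1: unsigned decimal bytes,
    # -j32: skip 32 bytes, -N1: read exactly one byte
    byte=$(od -An -tu1 -j32 -N1 "$1" | tr -d ' \n')
    case "${byte}" in
        0) echo "hash" ;;
        1) echo "b+tree" ;;
        2) echo "fixed-length" ;;
        3) echo "table" ;;
        *) echo "unknown (${byte})" ;;
    esac
}

# Report the type of every file named on the command line.
for dbfile in "$@"
do
    printf '%s: %s\n' "${dbfile}" "$(tc_db_type "${dbfile}")"
done
```

Saved as a script, it can be pointed at the sample files from the test
above, e.g. /tmp/test.tch /tmp/test.tcb /tmp/test.tcf /tmp/test.tct.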
So I just want to confirm that the database file format
documentation is incorrect and, if so, make sure it gets
updated appropriately.
enjoy,
-jeremy
[1] http://1978th.net/tokyocabinet/spex-en.html#fileformat
--
========================================================================
Jeremy Hinegardner jer...@hinegardner.org