How to efficiently store protobuf messages in MySQL?


Yegor

Aug 30, 2008, 4:14:06 PM
to Protocol Buffers
Hi everyone,

I am not a DBA, so this question might sound stupid.

My backend storage is MySQL. I have a table with several indexed
columns (you know, those I use for fast searching) and a special non-
indexed column to store my protobuf messages (the data).

Example:

create table T_MESSAGES (
MESSAGE_ID int unsigned not null auto_increment,
MESSAGE_DATE timestamp not null,
MESSAGE_DATA ??????,
primary key (MESSAGE_ID),
index (MESSAGE_DATE)
) engine=?????? default charset=utf8;

In your opinion, what is the most suitable table column type (and
other parameters) that I should use for protobuf messages?

As usual, I am trying to achieve the best performance and minimize
storage requirements. I understand that any discussion of performance
vs. storage involves some give-and-take; I am just trying to avoid
obvious mistakes.

If anyone has ever run any performance tests, etc, could you please
share your experiences?

Assume a basic configuration with a single database on a single hard
drive, with potentially tens of millions of records in the above-
mentioned table. Usage is write-intensive, with occasional reports over
various date ranges.

I am going to run some experiments, so if you have any parameter
combinations that I could try, that would be nice.

Thanks,

Yegor

Lenjoy

Aug 31, 2008, 2:08:32 AM
to Yegor, Protocol Buffers
A protobuf message cannot be searched by field, so the table design is not really affected; the protobuf is only used to carry the data.
DB -> serialize protobuf to string -> ... -> parse string back into protobuf -> do something.

 

Jay Pipes

Aug 31, 2008, 1:16:23 PM
to Yegor, Protocol Buffers
Hi!  I'd just store the message in a TEXT field (or BLOB).  Then, store the message in the field using something like:

string data;
message.SerializeToString(&data);

Then insert the bytes into the MySQL database using the MySQL C API, passing data.data() together with data.size() (the serialized message is binary and may contain NUL bytes, so don't treat it as a C string).

Cheers,

Jay Pipes
Community Relations Manager, North America
MySQL

Mats Kindahl

Sep 1, 2008, 3:55:17 AM
to Yegor, Protocol Buffers
Hi Yegor,

If you are going to handle a large number of rows and you don't need
transactional support, MyISAM is a good candidate engine. It is
used by, e.g., ScienceLogic in write-intensive setups, so it might be
something for you.

http://en.oreilly.com/mysql2008/public/schedule/detail/592

Also, the data should probably be stored as a BLOB since it has an
arbitrary length and is pure binary data (TEXT is like a BLOB but with a
character set and collation attached).
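Putting the BLOB and MyISAM suggestions together, the table from the original question might be declared roughly as follows (a sketch; note that a plain BLOB holds at most 65,535 bytes, so MEDIUMBLOB would be needed if individual messages can be larger):

```sql
create table T_MESSAGES (
  MESSAGE_ID int unsigned not null auto_increment,
  MESSAGE_DATE timestamp not null,
  -- BLOB: raw bytes, no charset/collation; up to 65,535 bytes per row
  MESSAGE_DATA blob not null,
  primary key (MESSAGE_ID),
  index (MESSAGE_DATE)
) engine=MyISAM default charset=utf8;
```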

It is easy to serialize the message to a string using the serialization
functions available in protobuf and then write it to the database in the
normal manner.

Just my few cents,
Mats Kindahl

Yegor

Sep 2, 2008, 10:53:51 AM
to Protocol Buffers
Thank you for your replies and the links. I will take a look at the
ScienceLogic approach.

Yegor