Status of MySQL storage engine?


Vanessa Williams

Jul 5, 2013, 5:53:06 PM
to project-...@googlegroups.com
Hi all, I had been happily using the BDB back-end as recommended (it seems better supported and probably faster.) However, I've just discovered two things that may be of interest to others as well:

1) the BDB license is effectively a GPL license. This is unacceptable because we sell a software product and all the viral nastiness would attach and put us out of business.

2) Oracle very recently made the license even worse, so that even if you don't redistribute code that uses it, the viral nature of the license attaches. See this article for the details: http://www.infoworld.com/d/open-source-software/oracle-switches-berkeley-db-license-222097

So, long story short, we have to ditch BDB immediately, and the only other storage engine is MySQL. However, a search of this group turns up little mention of it, especially recently.

a) Does anyone reading this use it?

b) Is it stable?

c) Are there any caveats, e.g. performance issues or known problems?

Thanks in advance. We really want to keep using Voldemort, but GPL (and now AGPL) is a nightmare for a software vendor (and now it could be a nightmare for everyone, but at least people not redistributing their code can get away with just the number of licenses they need for their own use.)

Regards,
Vanessa

--
Vanessa Williams
ThoughtWire Corporation

Carlos Tasada

Jul 8, 2013, 3:11:04 AM
to project-...@googlegroups.com
How will the BDB license change affect Voldemort? Does anyone know?


--
You received this message because you are subscribed to the Google Groups "project-voldemort" group.
To unsubscribe from this group and stop receiving emails from it, send an email to project-voldem...@googlegroups.com.
Visit this group at http://groups.google.com/group/project-voldemort.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Vinoth Chandar

Jul 8, 2013, 1:48:54 PM
to project-...@googlegroups.com
Good pointer, Vanessa. I think it has been true for a long time that you cannot directly sell Voldemort-based software, due to the BDB license. (Ask Francois, who built his lightweight file storage engine, and to whom I owe a cleanup of diskmap and potentially a merge of his solution :) )

W.r.t. the MySQL storage engine, this is what I know/think:

  • It's functional. (Joongjin Bae from CyberAgent reported he could use rebalancing successfully.)
  • I think its problems would be around JDBC connection management.
  • In the coming months, I will be working on making the MySQL storage engine better. So in fact, thanks for bringing this up. We can team up if you are interested.

Carlos, I am poor at reading licenses. But let me grok it and get back to you.

Thanks
Vinoth

Vanessa Williams

Jul 8, 2013, 2:16:03 PM
to project-...@googlegroups.com
Thanks, Vinoth.

W.r.t. MySQL, apparently it is GPL'ed, so it's of no use to us either (YMMV).

Finding/writing another storage engine seems the only way.

Anyone who'd like to team up on a friendly-license engine, let me know.

Vanessa


Vinoth Chandar

Jul 8, 2013, 3:26:02 PM
to project-...@googlegroups.com
Please talk to Francois.

https://github.com/vinothchandar/diskmap

We both started on this a while ago and got derailed (my fault) by other projects I was working on at the time.

Francois

Jul 9, 2013, 3:32:49 AM
to project-...@googlegroups.com
Hi Vanessa, hello Vinoth,

YobiDrive diskmap was built for two reasons:

1) because of the terrible Oracle license

2) to have a better RAM-to-disk ratio, to fit with cheaper motherboards. DiskMap is our key to achieving a production cost < 0.5 ct / GB / month with 3 replicas, and it works well with our WD Caviar Green disks, which are not really designed for multitasking :)

More info on how it works here (I very much like the title of the article :D):



It's built for Voldemort and would need to be slightly adapted for standard usage, because we have hacked Voldemort by implementing conflict detection at write time to avoid problems before they occur, and our routing is also adapted with some hacks to save keys (e.g. chunks from the same file fall behind the same key so that the drive reads pre-provisioned data). But that's quite easy, and there are no more than 4000 lines of code if I remember well.
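
As a rough illustration of the "chunks from the same file fall behind the same key" idea, here is a minimal Java sketch. It is not the actual diskmap or Voldemort routing code: the class `ChunkRouting`, the `fileId` and `chunkIndex` parameters, the `#` separator, and the hash-modulo partitioner are all assumptions made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch: every chunk of a file is routed by the file's id alone,
// so all chunks of one file land on the same partition.
final class ChunkRouting {
    private ChunkRouting() {}

    // Routing key: derived from the file id only (the chunk index is ignored).
    static byte[] routingKey(String fileId) {
        return fileId.getBytes(StandardCharsets.UTF_8);
    }

    // Storage key: file id plus chunk index, so chunks remain distinguishable.
    static String storageKey(String fileId, int chunkIndex) {
        return fileId + "#" + chunkIndex;
    }

    // Stand-in for a real partitioner (assumption: hash of routing key modulo N).
    static int partitionForChunk(String fileId, int chunkIndex, int numPartitions) {
        int hash = Arrays.hashCode(routingKey(fileId));
        return Math.floorMod(hash, numPartitions);
    }
}
```

With this layout, `partitionForChunk("file-42", 0, 16)` and `partitionForChunk("file-42", 7, 16)` return the same partition, while the storage keys `file-42#0` and `file-42#7` stay distinct.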

Depending on your use case, machines can be built with different specs than ours (which are optimized for cost, cost, and cost).

We can help you integrate it and can offer some support.

To check the speed:

François

Vanessa Williams

Jul 10, 2013, 11:55:44 AM
to project-...@googlegroups.com
Hi Francois and Vinoth,

First, thanks for the pointers.

So, it seems I have two choices:

1) a well-supported key-value store (leveldb) with a possibly not well supported Voldemort integration (I'll have to try to contact the writer/maintainer). I know Vinoth said leveldb did not perform well, but our performance requirements are less...extreme than the usual Voldemort use case.

2) a newly-created key-value store (diskmap) that I may be able to get some help with from folks here.

I'll have to look at them both, but before I do, is there an intention to roll diskmap into the Voldemort distro? That would make my decision easier.

Thanks all,
Vanessa

--
Vanessa Williams
ThoughtWire Corporation

Brian Goad

Jan 2, 2014, 5:31:43 PM
to project-...@googlegroups.com
Hi Vanessa,

Did you ever come up with a solution? Our company is dealing with similar issues and we are looking into other options for a storage engine with Voldemort that use a matching Apache license. Did you find an implementation of leveldb that worked with Voldemort?

Thanks for any input or advice,

Brian

Vinoth C

Jan 2, 2014, 5:47:16 PM
to project-...@googlegroups.com
Brian,

Sorry for interjecting. If you plan to run on large datasets (north of 100 gigs, say), leveldb may not work very well for you.

I am currently perf testing RocksDB for use in Voldemort. It seems to fix some of the issues with leveldb. Testing will tell.

Vanessa Williams

Jan 3, 2014, 9:40:29 AM
to project-...@googlegroups.com
Hi Brian, yes, we did succeed with two things:

- creating a leveldb storage implementation suitable for contribution to Project Voldemort
- creating a version of Voldemort with no references at all to BDB

We plan to make both of these available. Two things are delaying this:

- I'm waiting for the final performance and stress tests to be completed by QA (they turned up problems earlier.)
- I need to find some time to figure out how Github works and how to make this stuff available.

In the meantime, if you need it, let me know. I can probably get the code to you some other way. I don't think Maven is an option because Voldemort isn't mavenized and it's too much work to fix that. Note two other items:

- it will not build on Windows and we have no intention of doing anything about that
- it hasn't been tested on large data sets or on huge numbers of writes (our app is heavy on reads but light on writes)

So it's use at your own risk, but leveldb has a good reputation so...as long as I didn't screw up the implementation it might be fine for you.

Hth,
Vanessa

Brian Goad

Jan 20, 2014, 2:29:07 PM
to project-...@googlegroups.com
Vinoth,

We are currently testing with the integrated form of Krati built into Voldemort. I was curious what your experiences and findings with Krati have been. Have you or anyone else tested it on solid-state drives as well?
We steered away from LevelDB because of similar issues we found reported across the internet. RocksDB also looks like an intriguing prospect, so I would be very interested in your results.

Thanks,

Brian

Brian Goad

Jan 20, 2014, 2:36:35 PM
to project-...@googlegroups.com
Hi Vanessa,

Thanks for the info.

I was just curious: shouldn't the license issues be abated as long as you are not using BDB as the backend engine, i.e. with this config:

bdb.sync.transactions=false
bdb.write.transactions=false
bdb.lock.read_uncommitted=false
enable.bdb.engine=false
storage.configs=voldemort.store.krati.KratiStorageConfiguration
slop.store.engine=krati
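
A minimal Java sketch of how such a config could be sanity-checked before deployment. The property keys are the ones quoted above; the class `StorageConfigCheck`, its method names, and the heuristic of scanning `storage.configs` for the substring "bdb" are assumptions for illustration, not part of Voldemort. Note that it only inspects configuration and says nothing about which jars are shipped.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Locale;
import java.util.Properties;

// Hypothetical helper: checks that a server.properties-style text
// does not enable or reference a BDB storage engine.
final class StorageConfigCheck {
    private StorageConfigCheck() {}

    // Parse key=value text using the standard java.util.Properties format.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // True only if the BDB engine is explicitly disabled and no configured
    // storage configuration class name mentions "bdb" (a heuristic, see above).
    static boolean bdbDisabled(Properties props) {
        boolean engineOff = "false".equals(props.getProperty("enable.bdb.engine"));
        String configs = props.getProperty("storage.configs", "");
        boolean noBdbConfig = !configs.toLowerCase(Locale.ROOT).contains("bdb");
        return engineOff && noBdbConfig;
    }
}
```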


Also we would be interested in examining your contributions, if you are able to share them.

Github is not that hard to use. Check out this bootcamp tutorial for help setting up your first git repo https://help.github.com/categories/54/articles

Thanks for the response!

Brian

Vanessa Williams

Jan 20, 2014, 5:05:36 PM
to project-...@googlegroups.com

Hi Brian,

Re: licensing, the licensing problem is two-fold:

1) you need an alternative to BDB (contrib/leveldb)
2) you cannot even *distribute* the BDB jars without violating the license. Thus the need to have available a build with all references to BDB stripped out. It cannot depend on the BDB jar.
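
One way to verify that a build really has no dependency on the BDB jar is to check at runtime that BDB-JE classes cannot be loaded. The following is only a sketch: `BdbClasspathCheck` is a made-up helper name, and probing the single class `com.sleepycat.je.Environment` is an assumption about what counts as "BDB present". It does not prove that the source tree is free of BDB references.

```java
// Hypothetical helper: reports whether a class can be found on the classpath
// without initializing it.
final class BdbClasspathCheck {
    private BdbClasspathCheck() {}

    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers.
            Class.forName(className, false, BdbClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    // Probe for the main entry-point class of Berkeley DB Java Edition.
    static boolean bdbJePresent() {
        return isOnClasspath("com.sleepycat.je.Environment");
    }
}
```

A stripped build could call `BdbClasspathCheck.bdbJePresent()` in a smoke test and fail if it returns `true`.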

Re: examining the contribution: I can send you a patch for 1.3.0. (I don't think I can attach to posts to the group so I'll forward under separate cover).

One of my colleagues knows his way around Git, so we'll work on that. The complication is that I have to separate what I can contribute to Voldemort (adding leveldb support) from what has to remain part of a separate fork (added leveldb and removed bdb). I didn't keep those things as separate patches from the start (doh!)

- Vanessa


Vinoth C

Jan 21, 2014, 11:03:21 AM
to project-...@googlegroups.com
Hi Brian, 

>> I was curious as to what your experiences and findings in using Krati have been? Have you or anyone tested it on Solid State drives as well?
We did not even get that far with Krati. The project is no longer maintained, and in some cases other people have run into data corruption issues with Krati.

Yeah, I need to clean up my rocksdb-jna repo and get a storage engine moving. The StorageEngine API is more advanced now, so it's taking more time than before. I wish I could find a distraction-free weekend. Anyway, I will keep you all posted on how that goes.

Vanessa, why would the MySQL engine not work for you? The storage engine that is already in probably works (I can't speak to its performance, though), since I have seen posts on it. Also, the AGPL only applies to the C version of BDB as of version 6. We are at version 5 (now on master) of BDB-JE, which is still under the Oracle license. I am sure you have gone through all this info; I'm just clarifying it for the forums.

Vanessa Williams

Jan 21, 2014, 3:31:24 PM
to project-...@googlegroups.com
>> Vanessa, why would the MySQL engine not work for you? The storage engine that is already in probably works (I can't speak to its performance, though), since I have seen posts on it. Also, the AGPL only applies to the C version of BDB as of version 6. We are at version 5 (now on master) of BDB-JE, which is still under the Oracle license. I am sure you have gone through all this info; I'm just clarifying it for the forums.

Hi Vinoth, the MySQL storage engine doesn't appear to have enough usage or maintenance to give me confidence in it, and it isn't likely to be as performant as an in-process key-value store. It may have some other limitations I can't remember right now as well. The Oracle license for BDB-JE does not allow commercial distribution without a paid license. We use Voldemort as a component of a commercial enterprise software product, not to run the backend of a service, so the paid license requirement attaches.

Regards,
Vanessa

Vinoth C

Jan 22, 2014, 2:04:00 AM
to project-...@googlegroups.com
Fair enough. But then again, I remember you mentioning earlier, when I pointed out the shortcomings in leveldb, that your performance needs aren't extreme. So what I was getting at is that you could improve the existing MySQL storage engine and work with something that is proven rock solid and reasonably performant (which will in turn give you all the MySQL tooling goodness).

Anyway, my 2c. You know better.

Vanessa Williams

Jan 22, 2014, 10:34:05 AM
to project-...@googlegroups.com
Hi Vinoth, there may be something to what you say, but another consideration is that we have many "moving parts" already, and MySQL would be another piece with deployment and operational considerations. An in-process solution is easier to manage. This might not be a problem for someone else, especially if they are running a service rather than distributing a product. 

I could not really confirm any performance problems with leveldb and the fact that it is used in Riak was a positive sign. (See their comments on it at http://docs.basho.com/riak/latest/ops/advanced/backends/leveldb/). You are right, though, that I don't have huge performance needs, especially for writes. However, it's not a bad thing to have an alternative in-process store to BDB available. I think there are others with more interest in working on the MySQL storage engine (also a good thing to have.) One final consideration with LevelDB is that it compresses the data. This does not matter to me at all, but some might find this useful (compression can be turned off in server.properties--especially convenient when debugging.)

The trade-offs are different for everyone, so it's a good discussion to have for the sake of others trying to decide what to use. 

Regards,
Vanessa