Deprecation of HSQLDB for PuppetDB

Ken Barber

May 7, 2015, 6:40:18 AM
to Puppet Users, puppe...@googlegroups.com
Hi all,

As a representative of the PuppetDB engineering team, I wanted to let
you all know we are deprecating support for HSQLDB (HyperSQL DataBase)
in the next major release of PuppetDB (version 3.0). We will drop
support in the major release after that (version 4.0).

Note: For those customers using Puppet Enterprise, this email should
not apply to you, since we have never supported HSQLDB there. However,
if you have any questions about this feel free to ask me or open a
support case.

Since PuppetDB was released in 2012 we've supported HSQLDB as an
alternative database store, for both development and smaller use
cases. By design, and in practice, HSQLDB was never meant for medium
to large production use cases. Over the years we've often struggled to
maintain support, given its substantial limitations compared to
PostgreSQL.

Largely this has been fine, but as time goes on and we push the
platform further, HSQLDB’s more limited capabilities have forced us to
make decisions that favor the lowest common denominator. Sometimes
this wasn't a major problem; at other times it has slowed development
and forced us to limit the features, performance, and design for
everyone.

For example:

* For queries that include child data (e.g. catalogs containing
resources) we have an excellent PostgreSQL solution using JSON
aggregation functions in 3.0, but HSQLDB simply can't support this
case efficiently, and so we are forced to perform multiple queries
instead.
* JSONB based storage offers substantial promise for our
document-style data, but since HSQLDB doesn’t support it, we haven't
been able to seriously pursue the benefits.
* Common Table Expressions could simplify and improve the performance
of some of our queries, and although HSQLDB has CTEs, its
implementation is limited enough that we've had to avoid using them.
* HSQLDB lacks some basic operational functions, like online backups
and online querying (for debugging purposes). Instead you must stop
the service entirely before proceeding.
* Performance tuning for HSQLDB is more difficult, since it doesn't
have a powerful query optimizer and since it’s more difficult to
execute explain plans. We've hit various cases where we've had to
completely redesign the way we construct a query because HSQLDB’s
optimizer couldn't handle the work.
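
To make the first point above concrete, here is a minimal sketch of the difference between fetching child data with one query per parent and fetching it with a single aggregated query. It is only an illustration, not PuppetDB's actual code or schema: the `catalogs` and `resources` tables, their columns, and the function names are hypothetical, and SQLite's `group_concat` stands in for PostgreSQL's JSON aggregation functions so that the example runs with nothing but the Python standard library.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with hypothetical catalog/resource tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE catalogs (id INTEGER PRIMARY KEY, certname TEXT);
        CREATE TABLE resources (catalog_id INTEGER, title TEXT);
        INSERT INTO catalogs VALUES (1, 'web01');
        INSERT INTO catalogs VALUES (2, 'db01');
        INSERT INTO resources VALUES (1, 'File[/etc/motd]');
        INSERT INTO resources VALUES (1, 'Service[nginx]');
        INSERT INTO resources VALUES (2, 'Package[postgresql]');
        """
    )
    return conn


def fetch_with_multiple_queries(conn):
    """One query for the parents, then one extra query per parent for its children."""
    catalogs = conn.execute("SELECT id, certname FROM catalogs").fetchall()
    result = {}
    for catalog_id, certname in catalogs:
        rows = conn.execute(
            "SELECT title FROM resources WHERE catalog_id = ?", (catalog_id,)
        ).fetchall()
        result[certname] = sorted(title for (title,) in rows)
    return result


def fetch_with_aggregation(conn):
    """A single query: the database groups each catalog's resources itself."""
    rows = conn.execute(
        """
        SELECT c.certname, group_concat(r.title, '|')
        FROM catalogs c
        JOIN resources r ON r.catalog_id = c.id
        GROUP BY c.id, c.certname
        """
    ).fetchall()
    # group_concat does not guarantee ordering, so sort after splitting.
    return {certname: sorted(titles.split("|")) for certname, titles in rows}


if __name__ == "__main__":
    conn = build_demo_db()
    print(fetch_with_multiple_queries(conn))
    print(fetch_with_aggregation(conn))
```

Both functions return the same data, but the first issues one additional round trip per catalog while the second issues a single query regardless of how many catalogs there are. That is the kind of trade-off being described: when the database cannot aggregate child rows efficiently, the application has to fall back to the multi-query shape.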

Our general opinion is that instead of compromising our overall
solution and making choices for a more ‘development’ or ‘smaller scale
only’ focused solution, we want to remove that from the equation and
to only support the more production-ready case, which is PostgreSQL
today.

What does this mean to you all? Well, we'll start shipping deprecation
messages in the next major release, and the default setup will become
PostgreSQL for new installs.

For the more proactive among you, we recommend not waiting for this
and migrating as soon as possible. Don't worry, though: HSQLDB will
continue to work for the lifetime of the next major release.

For those wanting to migrate, we have supplied tooling and
documentation so that you can export your database from an
HSQLDB-based system to one using PostgreSQL:

http://docs.puppetlabs.com/puppetdb/2.3/migrate.html#exporting-data-from-an-existing-puppetdb-database

For PostgreSQL setup, the PGDG team have made it super simple with
their new package repos to get the latest and greatest PostgreSQL on
most popular distributions:

https://wiki.postgresql.org/wiki/YUM_Installation
https://wiki.postgresql.org/wiki/Apt

In addition, we supply a Puppet module designed for this purpose,
which we highly recommend and which has had contributions from a large
number of good people over the years:

https://forge.puppetlabs.com/puppetlabs/postgresql

And for those of you who are already using our PuppetDB module,
consult the documentation on how to change your configuration to use
PostgreSQL:

https://forge.puppetlabs.com/puppetlabs/puppetdb

In any case, know that this decision hasn’t been made lightly, and
we’ve been weighing it for some time. About a year ago we had almost
45% of our reported users using HSQLDB, but this number is now less
than 20%. So while we understand that people may have legitimate
reasons for using HSQLDB, we don't feel that, given the substantial
disadvantages, there’s sufficient justification to maintain support
when doing so negatively affects the majority of users.

Of course, it goes without saying that if you have any questions or
trouble migrating to PostgreSQL, the puppet-users mailing list and the
#puppet IRC channel are watched by a number of us in the PuppetDB team
(not to mention, by other avid community users who are also helpful),
so we can help where necessary with any problems.

Regards

Ken Barber
PuppetDB Team
Puppet Labs Inc.