
Java UDFs to compress and uncompress BLOBs


Jeremy Rickard

Mar 31, 2023, 11:53:04 PM
Posting this in case it is of interest or use. I've been working on it for a little while. Because I needed to raise a case with IBM it took longer than I imagined, but here it is at last...

Release 0.1, see: https://github.com/easydataservices/db2-compress

Please note...
* An IBM case has confirmed that Java UDFs can return LOB data types. The documentation that says otherwise is out-of-date, and will be fixed later on.

* On my laptop running Db2 11.5.8 on Ubuntu, an uncompressed 25MB JSON document compresses 6.8x in about 2 seconds, and uncompresses in about 1 second.

* If you want to work with BLOBs larger than 64MB you will need to increase JAVA_HEAP_SZ.

* Performance is now perhaps adequate for archive databases of an appropriate design. I would not suggest using this in any database where response time matters.

* Smaller LOBs containing similar data are processed faster, with elapsed time more or less proportional to size.
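
For readers who want a feel for what compressing and uncompressing a BLOB's bytes involves, here is a minimal sketch. It is not the code from db2-compress: the class name BlobGzipSketch, the method names, and the choice of GZIP from java.util.zip are all assumptions made for illustration, and the wiring into a Db2 Java UDF is deliberately left out. It buffers the whole value in memory, which is one reason a heap limit would matter for large values.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration only; not the implementation used by db2-compress.
final class BlobGzipSketch {

    // Compress a byte array with GZIP (an assumed codec for this sketch).
    static byte[] compress(byte[] input) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The GZIP stream is closed at this point, so the trailer is written.
        return buffer.toByteArray();
    }

    // Reverse of compress(): read the GZIP stream back in fixed-size chunks.
    static byte[] uncompress(byte[] compressed) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        // Build a repetitive JSON-like payload; real documents will differ.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 10000; i++) {
            sb.append("{\"id\":").append(i).append(",\"status\":\"archived\"}");
        }
        byte[] original = sb.toString().getBytes(StandardCharsets.UTF_8);

        byte[] packed = compress(original);
        byte[] restored = uncompress(packed);

        System.out.println("original bytes:   " + original.length);
        System.out.println("compressed bytes: " + packed.length);
        System.out.println("ratio:            "
                + (double) original.length / packed.length);
        System.out.println("round trip ok:    " + Arrays.equals(original, restored));
    }
}
```

The ratio printed by main depends entirely on the input, so it says nothing about the 6.8x figure above; it only shows how such a ratio would be measured.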

If you think you have a possible use for these functions, please read the notes in the README, and be sure to test it works well for you before deploying.

Jeremy Rickard

Alexander

Apr 5, 2023, 12:37:20 PM
Jeremy Rickard <jrick...@gmail.com> wrote:
> For interest in case useful. I've been working on this a little while.
> Due to the need to raise a case it took longer than I imagined, but here at last...

It’s an interesting exercise, but generally it’s the wrong approach.
Just don’t store LOBs in the DB if DB size or LOB access performance could
be an issue.

Alexander Veremev.


Jeremy Rickard

Apr 6, 2023, 8:36:22 AM
> Just don’t store LOBs in the DB if DB size or LOB access performance could
> be an issue.
>
> Alexander Veremev.

I already advised caution, but that's a sweeping statement. In the real world, unfortunately, some of these archive databases can grow and grow. With the right kind of data (compresses well, seldom retrieved), there may be worthwhile benefits. For example:
* Cheaper HADR replication (logs are smaller), if needed
* Faster, smaller backups and recovery
* Leverage standard database access controls
* Leverage standard database solutions for encrypting sensitive data

I would generally not compress any LOB in a non-archiving context. By archiving, I mean writing data to a separate store where you don't (generally) expect to retrieve it again.

Jeremy Rickard