asynchbase 1.3.0-rc2 available for download and testing

tsuna Apr 29, 2012 1:19 AM
Posted in group: Async HBase
Hi all,
The second release candidate for the next feature release of asynchbase
is available for download.  The two biggest changes are atomic
increment coalescing and fine-grained statistics collection on the
client's activity.  Other changes include an atomic CAS (Compare-And-Set)
operation and the ability to set a time range on Scanners.

Atomic increment coalescing can easily speed up, by an order of
magnitude, applications that maintain various counters in HBase as they
serve queries.  This code was originally written in Scala for an
application server at StumbleUpon, where it coalesces 40k counter
increments/s down to about 2k/s.  That cuts by 20x the amount of data
that RegionServers have to write to their WAL (Write-Ahead Log), which
significantly reduces the write load on HBase.
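
To illustrate the idea (this is not asynchbase's actual implementation),
here is a minimal sketch of increment coalescing using only the JDK.
The `IncrementCoalescer' class, its method names, and the string key
standing in for a (table, row, column) triple are all hypothetical.
Increments are summed in memory, and each flush yields one combined
delta per counter, which is what would be sent as a single RPC.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of increment coalescing; not asynchbase's code. */
final class IncrementCoalescer {
  /** Pending deltas; the key stands in for (table, row, column). */
  private final ConcurrentHashMap<String, Long> pending =
      new ConcurrentHashMap<>();

  /** Buffers an increment in memory instead of sending an RPC right away. */
  void bufferIncrement(final String counter, final long delta) {
    pending.merge(counter, delta, Long::sum);  // atomic per key
  }

  /** Drains the buffer; each entry would become one increment RPC. */
  Map<String, Long> flush() {
    final Map<String, Long> batch = new HashMap<>();
    for (final String counter : pending.keySet()) {
      // remove() is atomic, so an increment racing with the flush
      // lands either in this batch or in the next one, never nowhere.
      final Long sum = pending.remove(counter);
      if (sum != null) {
        batch.put(counter, sum);
      }
    }
    return batch;
  }

  public static void main(final String[] args) {
    final IncrementCoalescer coalescer = new IncrementCoalescer();
    for (int i = 0; i < 20; i++) {
      coalescer.bufferIncrement("page_views", 1L);
    }
    // 20 buffered increments collapse into a single pending delta.
    System.out.println(coalescer.flush());
  }
}
```

In a real client the flush would be driven by a timer or a size limit,
and a crash before the flush loses the buffered increments, so this
trade-off only suits counters that can tolerate it.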

Please note that v1.3.0 of asynchbase introduces a new dependency on
Google's excellent Guava library.  I don't add dependencies lightly,
and I'm very picky about choosing them, but Guava is a really solid,
beautifully written library, with coding standards far above average.

Here is the relevant excerpt of the NEWS file:

New public APIs:
  - PleaseThrottleException can now give you a Deferred.
  - There is now a `Counter' class that provides a highly concurrent
    replacement for AtomicLong, based on jsr166e's LongAdder.
  - HBaseClient now has a `bufferAtomicIncrement' method used to
    coalesce atomic counter increments.
  - A new method, `stats()', provides detailed statistics on the client's
    activities (number of calls for various RPCs, number of connections
    created, etc.)
  - `HBaseClient' has a new `compareAndSet(PutRequest, byte[])' method
    for atomic Compare-And-Set (CAS) operations.
  - The Scanner has new methods to allow specifying a time range to scan.
  - `PutRequest' has extra constructors so as to be able to affect multiple
    columns in a row.  This is required to be able to atomically CAS more
    than one column at a time on a given row.
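
As a rough illustration of the counter idea above, the sketch below
uses `java.util.concurrent.atomic.LongAdder', the JDK class that the
jsr166e LongAdder became in Java 8.  The `StripedCounter' wrapper and
the `countConcurrently' helper are hypothetical names for this example
and are not asynchbase's `Counter' class.  LongAdder spreads updates
over several internal cells, so threads that increment at the same
time contend less than they would on a single AtomicLong.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical wrapper around the JDK's LongAdder (Java 8+). */
final class StripedCounter {
  private final LongAdder adder = new LongAdder();

  void increment() { adder.increment(); }

  void add(final long delta) { adder.add(delta); }

  /** Returns the current total, exact once all writers have finished. */
  long get() { return adder.sum(); }

  /** Increments one counter from several threads, returns the total. */
  static long countConcurrently(final int threads, final int perThread) {
    final StripedCounter counter = new StripedCounter();
    final ExecutorService pool = Executors.newFixedThreadPool(threads);
    for (int t = 0; t < threads; t++) {
      pool.execute(() -> {
        for (int i = 0; i < perThread; i++) {
          counter.increment();
        }
      });
    }
    pool.shutdown();
    try {
      if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
        throw new IllegalStateException("workers did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting", e);
    }
    return counter.get();
  }

  public static void main(final String[] args) {
    // 4 threads x 100000 increments each.
    System.out.println(countConcurrently(4, 100000));
  }
}
```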

Deprecated public APIs:
  - In HBaseClient, the methods `rootLookupCount()',
    `uncontendedMetaLookupCount()', and `contendedMetaLookupCount()' are
    deprecated in favor of the new `stats()' API.  These methods will be
    removed in the 2.0 release.

Noteworthy changes:
  - Upgraded to Netty 3.3.1, suasync to 1.2.0.
  - asynchbase now depends on Google's Guava library (v11.0.2).

Pre-compiled JAR: (also
available as asynchbase-1.3.0-SNAPSHOT.jar in Maven)

$ git diff --stat v1.2.0.. | tail -n 1
 32 files changed, 2460 insertions(+), 89 deletions(-)

$ git shortlog v1.2.0..
Benoit Sigoure (21):
      Start version 1.2.1.
      Use `:=' more.
      Have distclean remove Maven's target directory.
      Make PleaseThrottleException easier to handle.
      Add jsr166e LongAdder.
      Add a class to use LongAdder from JSR 166e.
      Switch code to JSR 166e atomic counter.
      Upgrade to Netty 3.3.1.
      Update NEWS and set version to 1.3.0-SNAPSHOT.
      Add a dependency on Google's Guava library.
      Update to suasync 1.2.0.
      Atomic increment coalescing.
      Add an integration test for increment coalescing.
      Keep track of usage statistics.
      Clean up any stale temporary file if we already have the right file.
      Add a CAS (Compare And Set) API.
      Allow passing a time range to a scanner.
      Clean up the test .class files too.
      Add a helper function to pretty-print an array of byte arrays.
      Enhance a toString helper for RPCs to also support values.
      Allow PutRequest to affect multiple columns in a row.

Berk D. Demir (1):
      Maven SDL compatibility & plugin updates.

Benoit "tsuna" Sigoure
Software Engineer @