asynchbase 1.3.0-rc2 available for download and testing

tsuna

Apr 29, 2012, 4:19:01 AM
to AsyncHBase
Hi all,
The second release candidate for the next feature release of asynchbase
is available for download.  The biggest changes are essentially atomic
increment coalescing and fine-grained statistics collection of the
client's activity. Other changes include support for atomic CAS
(Compare-And-Set) and the ability to restrict Scanners to a time range.

Atomic increment coalescing can easily speed up, by an order of
magnitude, applications that maintain various counters in HBase as they
serve queries.  This code was originally written in Scala for an
application server at StumbleUpon, where it coalesces 40k counter
increments/s down to about 2k/s.  That is a 20x cut in the amount of
data that RegionServers have to write to their WAL (Write-Ahead Log),
which significantly reduces the write load on HBase.
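
To illustrate the idea of coalescing, here is a minimal in-memory sketch. It is not the asynchbase implementation: the class `IncrementCoalescer` and its methods are hypothetical names, and `flush()` simply returns the combined deltas where a real client would send one increment RPC per counter.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of increment coalescing; not asynchbase code. */
final class IncrementCoalescer {
  // Pending delta per counter key; safe for many threads buffering at once.
  private final ConcurrentHashMap<String, Long> pending = new ConcurrentHashMap<>();
  // Bookkeeping for illustration: increments buffered vs. combined writes flushed.
  private final AtomicLong buffered = new AtomicLong();
  private final AtomicLong flushed = new AtomicLong();

  /** Buffers an increment instead of sending it right away. */
  void bufferIncrement(String key, long delta) {
    pending.merge(key, delta, Long::sum);  // atomically adds delta to the pending sum
    buffered.incrementAndGet();
  }

  /** Drains the buffer, yielding one combined increment per key. */
  Map<String, Long> flush() {
    Map<String, Long> batch = new HashMap<>();
    for (String key : pending.keySet()) {
      // Atomic removal: increments arriving after this start a fresh entry.
      Long sum = pending.remove(key);
      if (sum != null) {
        batch.put(key, sum);
        flushed.incrementAndGet();
      }
    }
    return batch;
  }

  long bufferedCount() { return buffered.get(); }
  long flushedCount() { return flushed.get(); }

  public static void main(String[] args) {
    IncrementCoalescer coalescer = new IncrementCoalescer();
    for (int i = 0; i < 20; i++) {
      coalescer.bufferIncrement("page_views", 1);
      coalescer.bufferIncrement("clicks", 1);
    }
    coalescer.flush();
    // 40 buffered increments collapse into 2 combined writes.
    System.out.println(coalescer.bufferedCount() + " increments buffered, "
        + coalescer.flushedCount() + " writes flushed");
  }
}
```

The trade-off in any scheme like this is that buffered increments are lost if the process dies before a flush, so it suits counters where a small loss is acceptable.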

Please note that v1.3.0 of asynchbase introduces a new dependency on
Google's excellent Guava library.  I don't add dependencies lightly,
and I'm very picky about choosing them, but Guava is a really solid,
beautifully written library, with coding standards far above average.

Here is the relevant excerpt of the NEWS file:

New public APIs:
- PleaseThrottleException can now give you a Deferred.
- There is now a `Counter' class that provides a highly concurrent
  replacement for AtomicLong, based on jsr166e's LongAdder.
- HBaseClient now has a `bufferAtomicIncrement' method used to
  coalesce atomic counter increments.
- A new method, `stats()', provides detailed statistics on the client's
  activities (number of calls for various RPCs, number of connections
  created, etc.).
- `HBaseClient' has a new `compareAndSet(PutRequest, byte[])' method
  for atomic Compare-And-Set (CAS) operations.
- The Scanner has new methods to allow specifying a time range to scan.
- `PutRequest' has extra constructors so as to be able to affect multiple
  columns in a row. This is required to be able to atomically CAS more
  than one column at a time on a given row.
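
As a rough illustration of what a LongAdder-backed counter looks like, here is a small sketch. It is not the asynchbase `Counter' source: the class name `SketchCounter` and its method names are assumptions, and it uses `java.util.concurrent.atomic.LongAdder` from JDK 8 and later as a stand-in for the jsr166e class.

```java
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical LongAdder-backed counter; not the asynchbase source. */
final class SketchCounter {
  // LongAdder spreads updates over several cells to reduce contention
  // when many threads update the same counter.
  private final LongAdder adder = new LongAdder();

  void increment() { adder.increment(); }

  void add(long delta) { adder.add(delta); }

  /** Returns the current sum; exact only when no updates are in flight. */
  long get() { return adder.sum(); }

  public static void main(String[] args) throws InterruptedException {
    SketchCounter counter = new SketchCounter();
    Thread[] threads = new Thread[4];
    for (int i = 0; i < threads.length; i++) {
      threads[i] = new Thread(() -> {
        for (int n = 0; n < 100_000; n++) {
          counter.increment();
        }
      });
      threads[i].start();
    }
    for (Thread t : threads) {
      t.join();  // wait for all writers so the sum below is exact
    }
    System.out.println(counter.get());  // prints 400000
  }
}
```

Compared to AtomicLong, this kind of counter trades a slightly more expensive read for much cheaper writes under contention, which fits statistics that are updated often and read rarely.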

Deprecated public APIs:
- In HBaseClient, the methods `rootLookupCount()',
  `uncontendedMetaLookupCount()', and `contendedMetaLookupCount()' are
  deprecated in favor of the new `stats()' API. These methods will be
  removed in the 2.0 release.

Noteworthy changes:
- Upgraded to Netty 3.3.1, suasync to 1.2.0.
- asynchbase now depends on Google's Guava library (v11.0.2).

Pre-compiled JAR:
http://tsunanet.net/~tsuna/asynchbase/asynchbase-1.3.0-rc2.jar (also
available as asynchbase-1.3.0-SNAPSHOT.jar in Maven)
Source: https://github.com/tsuna/asynchbase
Javadoc: http://tsunanet.net/~tsuna/asynchbase/1.3.0/org/hbase/async/package-summary.html

$ git diff --stat v1.2.0.. | tail -n 1
32 files changed, 2460 insertions(+), 89 deletions(-)

$ git shortlog v1.2.0..
Benoit Sigoure (21):
      Start version 1.2.1.
      Use `:=' more.
      Have distclean remove Maven's target directory.
      Make PleaseThrottleException easier to handle.
      Add jsr166e LongAdder.
      Add a class to use LongAdder from JSR 166e.
      Switch code to JSR 166e atomic counter.
      Upgrade to Netty 3.3.1.
      Update NEWS and set version to 1.3.0-SNAPSHOT.
      Add a dependency on Google's Guava library.
      Update to suasync 1.2.0.
      Atomic increment coalescing.
      Add an integration test for increment coalescing.
      Keep track of usage statistics.
      Clean up any stale temporary file if we already have the right file.
      Add a CAS (Compare And Set) API.
      Allow passing a time range to a scanner.
      Clean up the test .class files too.
      Add a helper function to pretty-print an array of byte arrays.
      Enhance a toString helper for RPCs to also support values.
      Allow PutRequest to affect multiple columns in a row.

Berk D. Demir (1):
      Maven SDL compatibility & plugin updates.

--
Benoit "tsuna" Sigoure
Software Engineer @ www.StumbleUpon.com

tsuna

May 2, 2012, 4:19:11 AM
to AsyncHBase
Just a quick update: I've replaced rc2 with rc3. The only changes are:
- Use Guava 12.0 instead of 11.0.2
- Use Netty 3.4.2 instead of 3.3.1
- One line of code has changed to enable statistics on the Guava
Cache, since Guava 12 disables them by default.

So everything essentially is the same, except that we're now using
newer dependencies. There is no point in releasing asynchbase 1.3.0
with dependencies that are older. I also published a new
1.3.0-SNAPSHOT for Maven users.

If you have feedback, please send it before this Friday. We're
running with this code in production at StumbleUpon and it's working
fine.

Roshan Naik

Nov 25, 2013, 11:48:14 PM
to async...@googlegroups.com
Hi,
Apache Flume uses this library to talk to HBase. I am testing asynchbase with it and seeing unit test failures. Specifically, it seems that the callbacks passed to
HBaseClient.ensureTableFamilyExists(cb, erb) are not getting invoked.

I used rc2, but I see no reason rc3 would be any different.

-roshan

tsuna

Nov 27, 2013, 7:17:53 AM
to Roshan Naik, AsyncHBase

Hi Roshan,
Are you referring to AsyncHBase's unit tests failing, or Flume's?  Do they fail consistently?  Can you tell me more precisely how I could try to reproduce what you're seeing?  Can you pastebin the debug-level log of the failing tests?

Lots of questions, sorry :)

--
Sent from my Android phone

Roshan Naik

Dec 2, 2013, 3:26:06 PM
to async...@googlegroups.com, Roshan Naik
Hi tsuna,
- I am referring to Flume's unit test for its AsyncHBase sink. The sink uses the AsyncHBase library.
- Yes, this is a consistent failure.
- Reproducing:
   + check out Flume trunk from GitHub,
   + change pom.xml to use rc2 of asynchbase,
   + build as follows:
           mvn clean install -DskipTests -Dhadoop.profile=2 -Dhadoop-two.version=2.2.0 -Dhbaseversion=0.96.0-hadoop2
   + test as follows:
           mvn test -Dhadoop.profile=2 -Dhadoop-two.version=2.2.0 -Dhbaseversion=0.96.0-hadoop2 -rf :flume-ng-hbase-sink -Dtest=TestAsyncHBaseSink

The failure is in testTimeOut.


Here are the stack traces of all the threads....


Let me know if you need more info. Sorry for the delayed response. I meant to respond earlier but got sidetracked.


-roshan


 

Roshan Naik

May 2, 2014, 5:55:23 PM
to async...@googlegroups.com, Roshan Naik
Hi All,
  Checking back. Have not heard any updates on this issue. Would like to start using this in Flume soon.
-roshan

tsuna

May 4, 2014, 5:03:58 AM
to Roshan Naik, AsyncHBase
On Fri, May 2, 2014 at 2:55 PM, Roshan Naik <ros...@hortonworks.com> wrote:
> Checking back. Have not heard any updates on this issue. Would like to
> start using this in Flume soon.

No, people using AsyncHBase 1.5 outside of Flume don't seem to have
run into this issue. I'm really sorry, I don't have the bandwidth to
debug Flume.

PS: Let's start a new thread if you wanna keep discussing this.

--
Benoit "tsuna" Sigoure