Missing documentation

Mark Bakker

Jan 31, 2015, 11:23:05 AM

to tephr...@googlegroups.com

Hello,

I am trying out Tephra and have been using it during development for a few days now.

I really like it, but now I want to connect to a real HBase cluster.

I have 2 questions:

1. How do I start Tephra within integration tests and connect to a local HBase minicluster?

private TransactionSystemClient txSystemClient;

2. Second, how do I get a client connection to the Tephra server?
I can't find any documentation on getting a client connection.


I hope you can help me or point me to the correct documentation.


Kind regards,

Mark Bakker

Gary Helmling

Jan 31, 2015, 3:50:54 PM

to Mark Bakker, tephr...@googlegroups.com

Hi Mark,

The easiest way to get a TransactionSystemClient is using Google Guice and the bundled Guice modules. This will allow you to get a fully configured TransactionServiceClient instance (which implements TransactionSystemClient):

    Injector injector = Guice.createInjector(
        new ConfigModule(conf),
        new ZKModule(),
        new DiscoveryModules().getDistributedModules(),
        new TransactionModules().getDistributedModules(),
        new TransactionClientModule()
    );

    ZKClientService zkClient = injector.getInstance(ZKClientService.class);
    zkClient.startAndWait();

    TransactionServiceClient client = injector.getInstance(TransactionServiceClient.class);

You can do this during your application startup and reuse the same TransactionServiceClient instance across all application threads.

Then, within each application thread, you can use a TransactionContext instance along with TransactionAwareHTable instances to interact with HBase:

    Configuration conf = HBaseConfiguration.create();
    HConnection conn = HConnectionManager.createConnection(conf);
    TransactionAwareHTable txTable = new TransactionAwareHTable(conn.getTable("mytable"));
    TransactionContext txContext = new TransactionContext(client, txTable);

    try {
        txContext.start();
        txTable.put(...); // perform normal operations
        ...
        txContext.finish();
    } catch (TransactionFailureException tfe) {
        txContext.abort();
    }

You're right that we are missing documentation on this setup as part of Tephra. I've opened an issue to add this to the docs. You can track it here: https://issues.cask.co/browse/TEPHRA-61

1. How do I start Tephra within integration tests and connect to a local HBase minicluster?
private TransactionSystemClient txSystemClient;

For unit tests, there is an in-memory version of TransactionSystemClient that you can use (InMemoryTxSystemClient). It holds a reference to a TransactionManager instance and calls its methods directly. For an example of how to set this up, take a look at TransactionContextTest in the source code:

https://github.com/caskdata/tephra/blob/develop/tephra-core/src/test/java/co/cask/tephra/TransactionContextTest.java

2. Second, how do I get a client connection to the Tephra server?
I can't find any documentation on getting a client connection.

A TransactionSystemClient instance (such as TransactionServiceClient) is all that you need to connect to the Tephra server from your client, so the description above of how to use the Guice modules should cover it.

Hope this helps. Please let us know if you run into any problems.

Mark Bakker

Feb 16, 2015, 9:42:09 AM

to Gary Helmling, tephr...@googlegroups.com

Hi Gary,

Many thanks for your help last time. We are making good progress with Tephra. However, we have another fundamental problem:

We must be able to read our own writes, and currently Tephra does not give us this behavior. We suspect it has to do with the tx read pointer being smaller than the tx write pointer. Is this true? Why is there such a thing as a read pointer? We thought that the write pointer would be the same as the read pointer, since we also have a list of invalid ids up to the point of the write pointer. What can we do to achieve reading our own writes?

Thanks again.

Kind regards,

Mark Bakker

Gary Helmling

Feb 17, 2015, 10:35:09 PM

to Mark Bakker, tephr...@googlegroups.com

Hi Mark,

Tephra was originally derived from our Cask Data Application Platform (CDAP), which uses an additional client layer for HBase interactions. In that system, the client caches writes for a given transaction and merges those pending writes back into the data that is read, providing the read-your-own-writes guarantee. However, when we extracted Tephra, the client-side caching was not included, so Tephra alone seems to have lost this guarantee.
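To make the client-side caching idea concrete, here is a minimal sketch of a write buffer that shadows committed data on reads. It is only an illustration: the class BufferedTxTable and the plain Map standing in for HBase are hypothetical and are not part of Tephra or CDAP.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Tephra or CDAP.
final class BufferedTxTable {
    // Stand-in for the committed data held in the store.
    private final Map<String, String> committed;
    // Writes made by the current transaction that are not yet committed.
    private final Map<String, String> pending = new HashMap<>();

    BufferedTxTable(Map<String, String> committed) {
        this.committed = committed;
    }

    void put(String row, String value) {
        pending.put(row, value);
    }

    String get(String row) {
        // Pending writes take precedence over committed data,
        // which is what gives read-your-own-writes.
        if (pending.containsKey(row)) {
            return pending.get(row);
        }
        return committed.get(row);
    }

    void commit() {
        committed.putAll(pending);
        pending.clear();
    }

    void abort() {
        // Dropping the buffer discards everything the transaction wrote.
        pending.clear();
    }

    public static void main(String[] args) {
        Map<String, String> store = new HashMap<>();
        store.put("row1", "old");

        BufferedTxTable table = new BufferedTxTable(store);
        table.put("row1", "new");

        System.out.println(table.get("row1")); // the transaction sees its own pending write
        System.out.println(store.get("row1")); // the store is unchanged until commit
        table.commit();
        System.out.println(store.get("row1")); // the write is now in the store
    }
}
```

With a buffer like this, reads never need to see the transaction's own uncommitted cells in the store, which is why the read pointer alone can bound the read range in that design.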

We can fix this by having Tephra set a read time range up to writePointer + 1. We already filter out any writes from transactions that were in progress when the given transaction started, so this will not pose any problems. I've opened a JIRA for fixing this: https://issues.cask.co/browse/TEPHRA-65.

When the client does cache writes from the same transaction, the read pointer could still provide some optimization. In that case, it is safe to set the time range for reads to readPointer + 1, which could limit some of the data read when there are many in-progress transactions.
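The two read ranges described above can be sketched as a small visibility check. This is a simplified model under stated assumptions, not Tephra's implementation: the class TxSnapshot, its fields, and the flag clientCachesOwnWrites are hypothetical names, and the excluded set is assumed to hold every transaction that was in progress or invalid when this transaction started (and not the transaction's own write pointer).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the visibility rule; not Tephra's actual classes.
final class TxSnapshot {
    final long readPointer;    // latest committed write pointer when the tx started
    final long writePointer;   // timestamp this transaction writes its cells with
    final Set<Long> excluded;  // assumed: in-progress or invalid transactions at start

    TxSnapshot(long readPointer, long writePointer, Set<Long> excluded) {
        this.readPointer = readPointer;
        this.writePointer = writePointer;
        this.excluded = excluded;
    }

    // Exclusive upper bound of the read time range.
    long maxTimestampExclusive(boolean clientCachesOwnWrites) {
        // With a client-side write cache, own writes never have to be read from
        // the store, so the tighter readPointer + 1 bound is sufficient.
        return (clientCachesOwnWrites ? readPointer : writePointer) + 1;
    }

    boolean isVisible(long cellVersion, boolean clientCachesOwnWrites) {
        return cellVersion < maxTimestampExclusive(clientCachesOwnWrites)
            && !excluded.contains(cellVersion);
    }

    public static void main(String[] args) {
        Set<Long> inProgress = new HashSet<>(Arrays.asList(11L, 12L));
        TxSnapshot tx = new TxSnapshot(10L, 13L, inProgress);

        System.out.println(tx.isVisible(10L, false)); // committed before start
        System.out.println(tx.isVisible(11L, false)); // in progress at start, filtered out
        System.out.println(tx.isVisible(13L, false)); // own write, inside writePointer + 1
        System.out.println(tx.isVisible(13L, true));  // own write, outside readPointer + 1
    }
}
```

In this model, widening the range to writePointer + 1 only admits the transaction's own cells, because every other version between the read pointer and the write pointer is already in the excluded set.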

Thanks for reporting the problem. If you would like to post a patch for a fix, I would be happy to review it. Otherwise, we'll get this fixed as soon as we can.
