[ISSUE] (TEPHRA-99) Make "long running" transactions usable with TransactionContext

Gary Helmling (JIRA)

May 12, 2015, 4:42:26 PM
to tephr...@googlegroups.com
Gary Helmling created an issue
 
Tephra / Improvement TEPHRA-99
Make "long running" transactions usable with TransactionContext
Issue Type: Improvement
Assignee: Gary Helmling
Components: core
Created: 12/May/15 1:41 PM
Priority: Major
Reporter: Gary Helmling

"Long running" transactions (type == LONG) are supported by the Tephra TransactionManager, but TransactionContext does not expose any way for clients to interact with them. I think this will require a couple changes:

  • add a startLong() method to TransactionContext
  • add a constructor to TransactionContext that takes an existing Transaction instance. Since long running transactions are often used in map reduce processing, the process committing the transaction may be different from the process that started the transaction. In this situation, we need a way to pass the serialized transaction all the way through to the other process.

Regarding map reduce support, we could use additional utilities or support in place to make transactions easier to use with map reduce. But this would at least serve as a first step.
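
To make the two proposed changes concrete, here is a minimal sketch. It does not use the real Tephra classes: SketchTx, SketchTxClient, and SketchTxContext are hypothetical stand-ins, and only the shape of the proposal (a startLong() method plus a constructor that adopts an existing transaction) comes from the description above.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-ins for illustration only; not the real Tephra API.
enum TxType { SHORT, LONG }

final class SketchTx {
  final long id;
  final TxType type;

  SketchTx(long id, TxType type) {
    this.id = id;
    this.type = type;
  }
}

// Minimal stand-in for the client that talks to the transaction manager.
final class SketchTxClient {
  private final AtomicLong nextId = new AtomicLong(1);

  SketchTx startShort() {
    return new SketchTx(nextId.getAndIncrement(), TxType.SHORT);
  }

  SketchTx startLong() {
    return new SketchTx(nextId.getAndIncrement(), TxType.LONG);
  }
}

final class SketchTxContext {
  private final SketchTxClient client;
  private SketchTx current;

  SketchTxContext(SketchTxClient client) {
    this.client = client;
  }

  // Proposed change 2: adopt a transaction that was started elsewhere,
  // for example one deserialized in a different process.
  SketchTxContext(SketchTxClient client, SketchTx existing) {
    this.client = client;
    this.current = existing;
  }

  void start() {
    ensureNoTransaction();
    current = client.startShort();
  }

  // Proposed change 1: start a transaction of type LONG.
  void startLong() {
    ensureNoTransaction();
    current = client.startLong();
  }

  SketchTx getCurrent() {
    return current;
  }

  private void ensureNoTransaction() {
    if (current != null) {
      throw new IllegalStateException("transaction already in progress");
    }
  }
}
```

In this sketch, the process that starts the job would call startLong(), and the process that later commits would build its context from the transaction it received.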

This message was sent by Atlassian JIRA (v6.1.5#6160-sha1:a61a0fc)

Andreas Neumann (JIRA)

May 12, 2015, 10:35:25 PM
to tephr...@googlegroups.com
Andreas Neumann commented on an issue
 
Re: Make "long running" transactions usable with TransactionContext

I am not sure whether I like this. TransactionContext also asks all transaction-awares for their change sets and performs conflict detection and rollback. For long-running transactions, I am not sure whether that makes sense or can even work, yet adding support for long-running txs to this class might suggest that it does.

Actually, reading the javadocs for this class, they document the full behavior only for abort(), not for starting and finishing a transaction. The javadocs should then be updated to explain exactly what happens for short and for long-running transactions.

Terence Yim (JIRA)

May 12, 2015, 10:51:25 PM
to tephr...@googlegroups.com
Terence Yim commented on an issue

The driving use case actually doesn't need a long transaction. It's for the HBase queue debug tool, so that it can read the queue states with the latest read pointer. All it needs is a read-only transaction that never expires (right now it uses TransactionContext, which starts a short tx that then expires).

Gary Helmling (JIRA)

May 12, 2015, 11:04:26 PM
to tephr...@googlegroups.com
Gary Helmling commented on an issue

Read-only transactions are another case we should consider. In that case, the transaction manager shouldn't need to track any state for the transaction.

But I actually filed this issue for Phoenix and map reduce use cases which seem to match with where we've said to use long running transactions in the past. In those cases no conflict detection is performed, but the transaction should still be able to commit, abort or invalidate.

I'm mostly concerned with how we provide a comprehensible client API. If TransactionContext is not the place to handle long running transactions, how should they be managed?

Gary Helmling (JIRA)

May 12, 2015, 11:10:27 PM
to tephr...@googlegroups.com
Gary Helmling commented on an issue

As far as change sets are concerned, this issue would not change how long running transactions operate. I think that long running transactions would enforce a conflict detection level of NONE and no change sets would be tracked for any writes.

Andreas Neumann (JIRA)

May 13, 2015, 2:52:26 AM
to tephr...@googlegroups.com
Andreas Neumann commented on an issue

+1 for read-only transactions.

For how to interact with transactions, it appears that the current TransactionContext is really centered around short transactions, because it does the full can-commit/commit/rollback lifecycle. It also manages TransactionAwares, and you can add another TransactionAware after a transaction has started. So this class really manages the lifecycle of a transaction from start to end.

If we want to add a way to create a transaction context from an existing transaction, what does that mean for the TransactionAwares that participate in the transaction? I think this class assumes that they are controlled by it, too, through the whole life of a transaction. What does it mean to create a TransactionContext from an existing transaction but with different TransactionAwares than when the tx was started? How do we guard against that?

Gary Helmling (JIRA)

May 13, 2015, 12:29:27 PM
to tephr...@googlegroups.com
Gary Helmling commented on an issue

Andreas Neumann, how do you think long transactions should be exposed in Tephra?

Andreas Neumann (JIRA)

May 13, 2015, 3:11:26 PM
to tephr...@googlegroups.com
Andreas Neumann commented on an issue

I think it takes a little more than allowing a TxContext to be created from an existing transaction. This may be a little verbose, but I am just putting down my line of thought:

TransactionContext has a method named start() which:

  • starts a new transaction
  • calls startTx() on TransactionAwares

and finish() which:

  • checks for conflicts
  • calls private method persist() to make all participating TransactionAwares flush their changes
  • attempts to commit
  • calls postTxCommit() on all TransactionAwares

This is really centered around short transactions that run inside a single process and manage the entire lifecycle using the TransactionContext.
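
The call order described above can be sketched as follows. The interface and classes here are hypothetical stand-ins rather than the real Tephra types, and the "manager.*" log entries only mark where calls to the transaction manager would happen.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a TransactionAware; not the real Tephra interface.
interface LifecycleAware {
  void startTx(long txId);
  List<String> getTxChanges();
  boolean commitTx();   // flushes buffered writes, false on failure
  void postTxCommit();
}

// Records every callback so the call order can be inspected.
final class RecordingAware implements LifecycleAware {
  private final String name;
  private final List<String> log;

  RecordingAware(String name, List<String> log) {
    this.name = name;
    this.log = log;
  }

  public void startTx(long txId) { log.add(name + ".startTx"); }

  public List<String> getTxChanges() {
    log.add(name + ".getTxChanges");
    return Arrays.asList(name + "-row1");
  }

  public boolean commitTx() { log.add(name + ".commitTx"); return true; }

  public void postTxCommit() { log.add(name + ".postTxCommit"); }
}

// Sketch of the short-transaction lifecycle: start() and finish().
final class ShortTxLifecycle {
  private final List<LifecycleAware> awares;
  private final List<String> log;

  ShortTxLifecycle(List<LifecycleAware> awares, List<String> log) {
    this.awares = awares;
    this.log = log;
  }

  void start(long txId) {
    log.add("manager.start");                        // starts a new transaction
    for (LifecycleAware a : awares) a.startTx(txId); // then notifies the awares
  }

  void finish() {
    List<String> changes = new ArrayList<>();
    for (LifecycleAware a : awares) changes.addAll(a.getTxChanges());
    log.add("manager.canCommit(" + changes.size() + " changes)"); // conflict check
    for (LifecycleAware a : awares) {                // persist step
      if (!a.commitTx()) {
        throw new IllegalStateException("persist failed; a real context would abort here");
      }
    }
    log.add("manager.commit");                       // attempt to commit
    for (LifecycleAware a : awares) a.postTxCommit();
  }
}
```

Every step runs in one process against one set of awares, which is the assumption that breaks down once several processes share a transaction.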

But for a long-running transaction, it is more likely that multiple separate processes participate in the transaction (for example in M/R, the mappers and reducers). For a mapper to participate in the transaction, it must:

  • call startTx() on all TransactionAwares before joining, and
  • persist all changes in the TransactionAwares by calling commitTx() on each of them - which is actually misnamed, shouldn't it be persist()? - before leaving.

But if persisting fails for any single one of them, the transaction must be aborted across all other processes that participate. That requires coordination across all participants, and I don't see how this can be done with TransactionContext.

We can make TransactionContext support this case by adding a constructor that takes an existing transaction and a set of TransactionAwares, and that calls startTx() on all of them. It will also have to expose a flush() method (or whatever you want to name it) that does a little less than the current persist(): it should call commitTx() on all TxAwares, but not abort the transaction if any of them fails.

You also want to make sure that if the constructor with the existing transaction is used, none of start(), finish(), or abort() may be called, because this TransactionContext does not "own" the transaction - or does it? Would we want to allow a mapper to commit or abort the transaction? More likely we want to prevent that, and rather depend on the coordination layer to communicate the failure to the process that "owns" the transaction, so that it can commit or abort it.

But now we get to a point where a mapper can only use the new constructor and the new flush() method, so the interface used by a mapper is disjoint from the interface used for short transactions. If they don't share any methods, we might as well define a separate class for the mapper to use. Maybe call it TransactionParticipantContext to indicate that this is for someone who participates in the transaction but does not control its lifecycle? Or maybe you know a better name...

Does this make sense?
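
A rough sketch of what such a participant-only class could look like, under the assumptions stated in this comment: the constructor joins an existing transaction by calling startTx() on every aware, and flush() persists changes but only reports failure instead of aborting. All type names are hypothetical and the transaction is reduced to a bare id for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a TransactionAware; not the real Tephra interface.
interface ParticipantAware {
  void startTx(long txId);
  boolean commitTx();   // persists buffered writes, false on failure
}

// Sketch of the suggested participant-only context. It deliberately has
// no start(), finish(), or abort(): it does not own the transaction.
final class ParticipantContextSketch {
  private final List<ParticipantAware> awares;

  // Joins an existing transaction by calling startTx() on every aware.
  ParticipantContextSketch(long existingTxId, List<ParticipantAware> awares) {
    this.awares = new ArrayList<>(awares);
    for (ParticipantAware a : this.awares) {
      a.startTx(existingTxId);
    }
  }

  // Persists the changes of all awares. A failure is reported to the
  // caller, which must relay it to whoever owns the transaction.
  boolean flush() {
    boolean allPersisted = true;
    for (ParticipantAware a : awares) {
      if (!a.commitTx()) {
        allPersisted = false;
      }
    }
    return allPersisted;
  }
}

// Test double whose persist outcome is fixed up front.
final class FixedOutcomeAware implements ParticipantAware {
  private final boolean persistOk;
  long joinedTxId = -1;

  FixedOutcomeAware(boolean persistOk) {
    this.persistOk = persistOk;
  }

  public void startTx(long txId) { joinedTxId = txId; }

  public boolean commitTx() { return persistOk; }
}
```

Returning a boolean from flush() is one possible design choice; an exception would work equally well, as long as the participant itself never commits or aborts.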

Andreas Neumann (JIRA)

May 13, 2015, 3:18:26 PM
to tephr...@googlegroups.com
Andreas Neumann commented on an issue

Coming back to the original proposal: You are right that the transaction may be committed by a different process than the one it was started in. For that case I agree we should add the constructor that you suggested.

James Taylor (JIRA)

May 20, 2015, 10:43:29 PM
to tephr...@googlegroups.com
James Taylor commented on an issue

Let me explain the Phoenix use case for this to see if it helps clarify the functionality we're looking for. Many of our use cases at Salesforce are for write-once/append-only data. I've seen that most users start with this, as these use cases are the lowest risk, are the simplest to implement and reason about, and fit many common big-data use cases. In this case, no conflict detection is required (TEPHRA-92). But as Gary Helmling pointed out, the change set would still need to be tracked on the client in order for the transaction to be aborted. This JIRA is about providing an option that allows the client to choose not to do this. The transaction would then be invalidated rather than aborted. The reasoning is that when a failure occurs, it's most likely due to a write failure (and never due to a conflict). In this case, it's likely that the abort would fail anyway, so why take the memory hit of tracking the change set on the client?

The primary way this would help scaling is in simultaneous transactions on the same client. Each individual transaction would still be expected to fit into memory, committed in a batched manner. A good example would be a monitoring use case that buffers a configurable number of events before committing them. There's no rollback requirement for this, as the commit either succeeds or fails (i.e. it's not like a MR job that we want to be treated as one giant transaction).

Andreas Neumann (JIRA)

May 21, 2015, 4:10:26 PM
to tephr...@googlegroups.com
Andreas Neumann commented on an issue

James Taylor, thanks for the explanation. I think when you say:

The primary way this would help scaling is in simultaneous transactions on the same client.

you really mean simultaneous clients using the same transaction, right? Then it makes a lot of sense, and it is actually almost the same as the MapReduce case. The simultaneous participants of the transaction only use the transaction to get a write version for their appends. But none of the participants should be able to commit or abort the transaction, because that requires coordination between all participants. So there will be a "master" or "coordinator" that starts the transaction, communicates it to all participants, then monitors all participants for success or failure, and finally either commits or invalidates the transaction.

That is why I suggested introducing a "ParticipantContext" that has all the methods (probably only two) needed by participants. For the coordinator, I have already agreed with Gary's proposed changes.
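
The coordinator role described here can be sketched in a few lines. This is a simplification under stated assumptions: participants are modeled as in-process callbacks that receive the transaction id and return true on success, whereas in a real deployment they would be separate processes monitored asynchronously. CoordinatorSketch and Outcome are hypothetical names.

```java
import java.util.List;
import java.util.function.LongPredicate;

// Sketch of the "coordinator": it owns the transaction, hands it to every
// participant, and decides the outcome once all of them have reported.
final class CoordinatorSketch {
  enum Outcome { COMMITTED, INVALIDATED }

  static Outcome run(long txId, List<LongPredicate> participants) {
    boolean allSucceeded = true;
    for (LongPredicate participant : participants) {
      // Each participant only writes under the given transaction id;
      // it never commits or aborts on its own.
      if (!participant.test(txId)) {
        allSucceeded = false;
      }
    }
    // No change set is tracked, so a failure leads to invalidation
    // rather than an abort with rollback.
    return allSucceeded ? Outcome.COMMITTED : Outcome.INVALIDATED;
  }
}
```

For example, calling run() with two participants that both return true yields COMMITTED, and a single failing participant yields INVALIDATED.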

Andreas Neumann (JIRA)

May 21, 2015, 4:23:27 PM
to tephr...@googlegroups.com
Andreas Neumann edited a comment on an issue

Andreas Neumann (JIRA)

Oct 12, 2015, 2:37:21 PM
to tephr...@googlegroups.com

Andreas Neumann (JIRA)

Oct 12, 2015, 3:02:22 PM
to tephr...@googlegroups.com
Andreas Neumann commented on an issue
 
Re: Make "long running" transactions usable with TransactionContext

Looking at this again after a while. So, we need two things:

  • ability to create a tx context from an existing transaction
  • ability to participate in a transaction without "owning" it, that is, without starting, committing, or aborting it.

One way to do that is to introduce:

  • a new enum Mode { LEADER, PARTICIPANT }
  • a constructor that takes an existing transaction and a Mode.

If a transaction is given, then the new tx context will not start a transaction, but use the provided one.
If the Mode is PARTICIPANT, then

  • start() will not start a transaction, but only call startTx on all TxAwares
  • finish() will only persist the changes of all tx-awares (but not commit the tx)
  • abort() will only roll back the changes of all tx-awares (but not abort or invalidate the tx)
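
The Mode proposal can be illustrated with a small sketch. ModeContextSketch is a hypothetical class, not the real TransactionContext: it only records which operations would be issued to the transaction manager ("manager.*") and to the tx-awares ("awares.*") in each mode. Rejecting PARTICIPANT mode without a provided transaction is an added assumption, not something stated in the proposal.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a context whose behavior depends on the proposed Mode enum.
final class ModeContextSketch {
  enum Mode { LEADER, PARTICIPANT }

  final List<String> calls = new ArrayList<>();
  private final Mode mode;
  private final boolean txProvided;

  ModeContextSketch(Mode mode, boolean txProvided) {
    if (mode == Mode.PARTICIPANT && !txProvided) {
      // Assumption: a participant cannot exist without a transaction to join.
      throw new IllegalArgumentException("a participant needs an existing transaction");
    }
    this.mode = mode;
    this.txProvided = txProvided;
  }

  void start() {
    // Only a leader without a provided transaction starts a new one.
    if (mode == Mode.LEADER && !txProvided) {
      calls.add("manager.start");
    }
    calls.add("awares.startTx");
  }

  void finish() {
    if (mode == Mode.LEADER) {
      calls.add("manager.canCommit");
    }
    calls.add("awares.persist");
    if (mode == Mode.LEADER) {
      calls.add("manager.commit");
    }
  }

  void abort() {
    calls.add("awares.rollback");
    if (mode == Mode.LEADER) {
      calls.add("manager.abort");
    }
  }
}
```

A participant that calls start() and then finish() touches only its tx-awares, while a leader additionally drives the transaction manager.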

James Taylor (JIRA)

Oct 13, 2015, 5:28:22 PM
to tephr...@googlegroups.com
James Taylor commented on an issue

This sounds good, Andreas Neumann. This might be covered by the above, but it'd be good to have the ability to pass a transaction ID through a client such that it could participate in an existing, in-progress transaction. Perhaps the actual state of the transaction could be returned from the transaction manager when the client joins the transaction? For example, you could pass the transaction ID as a connection property to Phoenix and you'd have the ability to use Phoenix as a transaction participant in the CDAP platform.

Poorna Chandra (JIRA)

Oct 16, 2015, 4:29:33 PM
to tephr...@googlegroups.com
Poorna Chandra updated an issue
Change By: Poorna Chandra
Fix Version/s: 0.6.4
Fix Version/s: 0.6.3

Poorna Chandra (JIRA)

Nov 12, 2015, 9:02:22 PM
to tephr...@googlegroups.com
Poorna Chandra commented on an issue
 
Re: Make "long running" transactions usable with TransactionContext

James Taylor We typically serialize the whole transaction object and pass it on to callees. The callee can then use the same transaction object for its operations. The callee is expected to flush all changes before returning.

However, checkpointing (TEPHRA-96) complicates this a little bit, since transaction objects are now no longer immutable. Also, we may have to rethink checkpoint visibility, since we can now have multiple simultaneous checkpoints by different clients.
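
The serialize-and-pass approach mentioned in this comment could look roughly like the sketch below. WireTx is a hypothetical, heavily reduced stand-in for a transaction object, and the two fields and the byte layout are illustrative assumptions; this is not the encoding Tephra actually uses.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical, reduced transaction object that can cross process boundaries.
final class WireTx {
  final long writePointer;
  final long readPointer;

  WireTx(long writePointer, long readPointer) {
    this.writePointer = writePointer;
    this.readPointer = readPointer;
  }

  // Caller side: turn the transaction into bytes to hand to the callee.
  byte[] encode() {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      out.writeLong(writePointer);
      out.writeLong(readPointer);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  // Callee side: rebuild the same transaction before doing any work.
  static WireTx decode(byte[] data) {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
      long write = in.readLong();
      long read = in.readLong();
      return new WireTx(write, read);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Because the callee works on a copy, any state that changes after encoding (such as a new checkpoint) is invisible to the other side, which is the complication raised above.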

Poorna Chandra (JIRA)

Jan 12, 2016, 2:34:36 PM
to tephr...@googlegroups.com
Poorna Chandra updated an issue
Change By: Poorna Chandra
Fix Version/s: 0.6.5
Fix Version/s: 0.6.4

Priyanka Nambiar (JIRA)

Feb 19, 2016, 4:42:37 PM
to tephr...@googlegroups.com
Priyanka Nambiar updated an issue
Change By: Priyanka Nambiar
Fix Version/s: 0.7.1
Fix Version/s: 0.7.0