[ISSUE] (TEPHRA-35) Prune invalid transaction set once all data for a given invalid transaction has been dropped

James Taylor (JIRA)

Jan 9, 2015, 8:23:47 PM

to tephr...@googlegroups.com

James Taylor commented on an issue

Re: Prune invalid transaction set once all data for a given invalid transaction has been dropped

This would seem to inherently limit the scalability of Tephra. Any insight on when this will be implemented?

Tephra / TEPHRA-35: Prune invalid transaction set once all data for a given invalid transaction has been dropped

In addition to dropping the data from invalid transactions, we need to be able to prune from the invalid set any transaction whose data cleanup has been fully completed. Without this, the invalid set will grow indefinitely and impose an ever greater cost on in-progress transactions over time.

To do this correctly, the TransactionDataJanitor copr...
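To make the cost mentioned in the description concrete, here is a minimal sketch of a visibility check, assuming each in-progress transaction carries a sorted copy of the invalid set and consults it on every read. The names `Snapshot`, `read_pointer` and `is_visible` are hypothetical and are not taken from the Tephra code base.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical view of the transaction state handed to a reader."""
    read_pointer: int              # highest transaction ID the reader may see
    invalid: Tuple[int, ...]       # sorted IDs of invalid transactions
    in_progress: Tuple[int, ...]   # sorted IDs of transactions still running


def _contains(sorted_ids: Sequence[int], tx_id: int) -> bool:
    # Binary search, so each lookup costs O(log n) in the size of the set.
    i = bisect_left(sorted_ids, tx_id)
    return i < len(sorted_ids) and sorted_ids[i] == tx_id


def is_visible(snapshot: Snapshot, write_tx_id: int) -> bool:
    """Return True if data written by write_tx_id may be read."""
    if write_tx_id > snapshot.read_pointer:
        return False
    if _contains(snapshot.invalid, write_tx_id):
        return False
    return not _contains(snapshot.in_progress, write_tx_id)
```

Under this assumption the invalid set is copied into every snapshot and searched on every read, so a set that is never pruned makes each new transaction more expensive to start and to run.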

This message was sent by Atlassian JIRA (v6.1.5#6160-sha1:a61a0fc)

Alex Baranau (JIRA)

Jan 9, 2015, 10:04:47 PM

to tephr...@googlegroups.com

Alex Baranau commented on an issue

James Taylor There's no target fix version or date set for it.

Note that in a healthy system it is very rare for a transaction to become invalid. It usually happens only if there is a client process crash or a datastore crash. In the normal case, even if a commit fails, the transaction gets rolled back and is not put into the invalid list. The invalid list is highly unlikely to grow large. Though it can happen, hence the priority of fixing it is one of the highest.

James Taylor (JIRA)

Jan 9, 2015, 10:25:47 PM

to tephr...@googlegroups.com

James Taylor commented on an issue

Hmm. That's like saying transactions aren't important because usually everything works correctly. Unhealthy systems are one of the big reasons people want transactions.

Gary Helmling (JIRA)

Jan 9, 2015, 10:31:48 PM

to tephr...@googlegroups.com

Gary Helmling commented on an issue

This is definitely a scalability concern and we are working on a plan for addressing it. However, we don't yet have a target date.

There are two approaches we can take to mitigate this. The first is an operational approach, where Tephra would provide the ability for an administrator to truncate the invalid list up to a given point. The idea is that, as part of a normal operational policy for handling major compactions, an admin would know up to what time all tables in the cluster have been major compacted. Since Tephra transaction IDs are time based, you could then manually issue a command to truncate the invalid list up to this time, since you know, by virtue of the major compactions completing, that any data from invalid transactions prior to that point has been purged. This isn't ideal, as it requires some operational coordination, but it is doable.
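As a rough illustration of the operational approach, the sketch below truncates a sorted invalid list up to a given time. It assumes a transaction ID is the start time in milliseconds multiplied by a fixed factor; the constant `TX_PER_MS` and the function names are hypothetical and do not refer to an existing Tephra command.

```python
from bisect import bisect_left
from typing import List

# Assumption for illustration: ID = start time in ms * TX_PER_MS,
# which leaves room for many transaction IDs within one millisecond.
TX_PER_MS = 1_000_000


def tx_id_lower_bound(time_ms: int) -> int:
    """Smallest transaction ID that could have been issued at time_ms."""
    return time_ms * TX_PER_MS


def truncate_invalid_before(invalid_sorted: List[int],
                            compacted_up_to_ms: int) -> List[int]:
    """Drop every invalid ID issued before compacted_up_to_ms.

    The caller asserts that all tables were major compacted up to that
    time, so data written by the dropped transactions is already gone.
    """
    cutoff = tx_id_lower_bound(compacted_up_to_ms)
    keep_from = bisect_left(invalid_sorted, cutoff)
    return invalid_sorted[keep_from:]
```

The safety of the truncation rests entirely on the time the administrator supplies, which is the operational coordination the comment refers to.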

The second approach would build this processing and tracking into Tephra itself, so that it could make the determination to automatically truncate the invalid list. This will require quite a bit more complexity to do the tracking, and needs a detailed design around it.
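One possible shape for the tracking, offered only as a sketch and not as the planned design: record for each table the time up to which it has been major compacted, and treat the minimum across all tables as the point up to which the invalid list could be truncated. The function name and the dictionary layout are hypothetical. The sketch also ignores harder cases, such as a transaction that starts before a compaction but is marked invalid only afterwards, which is part of why a detailed design is needed.

```python
from typing import Dict, Optional


def safe_prune_time_ms(
    compacted_up_to_ms: Dict[str, Optional[int]]
) -> Optional[int]:
    """Return the time up to which every table has been major compacted.

    Keys are table names; a value of None means the table has never
    been major compacted. Returns None when no safe point exists yet.
    """
    if not compacted_up_to_ms:
        return None
    times = list(compacted_up_to_ms.values())
    if any(t is None for t in times):
        # One uncompacted table may still hold data from any invalid
        # transaction, so nothing can be pruned.
        return None
    return min(times)
```

Taking the minimum is deliberately conservative: the slowest table decides how far the invalid list can shrink.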

Gary Helmling (JIRA)

Jan 9, 2015, 10:35:48 PM

to tephr...@googlegroups.com

Gary Helmling commented on an issue

James Taylor I don't think anyone is saying this issue isn't important. It is high priority.

As I mentioned, there are two approaches with differing levels of complexity. We will likely implement this in those two stages.

Alex Baranau (JIRA)

Jan 10, 2015, 4:30:47 PM

to tephr...@googlegroups.com

Alex Baranau commented on an issue

> James Taylor I don't think anyone is saying this issue isn't important. It is high priority.

Yes! Exactly:

> Though it can happen, hence the priority of fixing it is one of the highest.

Gary Helmling (JIRA)

Jan 30, 2015, 3:44:50 PM

to tephr...@googlegroups.com

Gary Helmling updated an issue

Tephra / TEPHRA-35: Prune invalid transaction set once all data for a given invalid transaction has been dropped

Change By:	Gary Helmling
Issue Type:	Bug -> Story
Priority:	Blocker
