[ISSUE] (TEPHRA-89) Allow converting existing tables to use transactions


Poorna Chandra (JIRA), 2015-09-28 16:26:24
Poorna Chandra assigned an issue to Poorna Chandra

Tephra / Improvement TEPHRA-89: Allow converting existing tables to use transactions
Change By: Poorna Chandra
Assignee: Gary Helmling → Poorna Chandra

Poorna Chandra (JIRA), 2015-10-05 18:58:21
We should provide a way to convert existing tables to be transactional. The main problem is the existing data in the tables, which will have been written with normal millisecond timestamps, instead of transaction IDs. If TTL is enabled on the table, for example, all of the existing data could be erroneously TTL'd due to the multiplier used for the times...

Poorna Chandra (JIRA), 2015-10-06 19:23:22
Poorna Chandra commented on an issue

Add to documentation -

  • The delete markers of the table before and after enabling Tephra should be the same.
  • Only tables that use real timestamps in their cells can be converted to Tephra tables.
  • Non-transactional data cannot be written into a table after Tephra is enabled on it.

Poorna Chandra (JIRA), 2015-10-07 19:51:23
Poorna Chandra commented on an issue

James Taylor - During the review of PR https://github.com/caskdata/tephra/pull/85, Andreas Neumann raised a point about relaxing the time range restriction on scans and gets to include existing data. This might lead to purely transactional tables paying a performance penalty while doing scans and gets.

To avoid the performance penalty, we are planning to have a flag like "tephra.existing.data.readable=true" on the table descriptor of tables that have existing data. Only on tables having this flag will we relax the time range restriction during scans and gets, so purely transactional tables will not pay a performance penalty.

Note that even if a table does not have the "tephra.existing.data.readable=true" flag set, any existing data in the table will still have the right TTL applied, because we don't restrict time ranges during compactions and flushes. Hence there won't be any data loss; existing data will just not be visible. Once you enable the flag, existing data will become visible again.
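For illustration, a minimal sketch of how such a flag could be read in the region co-processor (the flag name is from this comment; env is assumed to be the RegionCoprocessorEnvironment, and this is a sketch, not the actual implementation):

    // Check the table-level flag once, from the descriptor already available on the region server.
    HTableDescriptor tableDesc = env.getRegion().getTableDesc();
    boolean readExisting = "true".equals(tableDesc.getValue("tephra.existing.data.readable"));
    // Relax the scan/get time range only when readExisting is true, so purely
    // transactional tables keep the tighter, TTL-based lower bound.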

Let me know if this sounds good.

James Taylor (JIRA), 2015-10-08 13:02:26
James Taylor commented on an issue

Poorna Chandra - yes, this is fine. This JIRA is important for folks who have an existing table and would like to make it transactional - I think it'll help adoption quite a bit.

Under what circumstances would you expect a performance penalty? In the typical case, your scans/gets need to see all of your data, even data written a long time ago. Is it the case where you've overwritten and/or deleted a lot of earlier data and compaction hasn't run yet? Or something else?

Poorna Chandra (JIRA), 2015-10-08 16:53:21
Poorna Chandra commented on an issue

Today, while doing scans in the co-processor, we restrict the scan time range to -
(TxUtils.getOldestVisibleTimestamp(ttlByFamily, tx), TxUtils.getMaxVisibleTimestamp(tx))

To consider any pre-existing data, we will have to relax this time range to -
(current time - ttl, TxUtils.getMaxVisibleTimestamp(tx))

The performance penalty would be paid by purely transactional tables that don't have any pre-existing data, since they would have to do extra scanning due to the relaxed restriction on the time range. If we know which tables have pre-existing data, then we can relax the time range only on those tables.
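In HBase terms, a sketch of how the two ranges above would map onto a scan (the TxUtils calls are the ones named above; ttlMillis stands in for the family's TTL in milliseconds, and the exact bounds in the real code may differ):

    // Current restriction: only timestamps that can still be visible to the transaction.
    scan.setTimeRange(TxUtils.getOldestVisibleTimestamp(ttlByFamily, tx),
                      TxUtils.getMaxVisibleTimestamp(tx));

    // Relaxed restriction: also admit pre-existing cells written with plain
    // millisecond timestamps, back to (current time - ttl).
    scan.setTimeRange(System.currentTimeMillis() - ttlMillis,
                      TxUtils.getMaxVisibleTimestamp(tx));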

James Taylor (JIRA), 2015-10-08 17:06:22
James Taylor commented on an issue

When would there be data that would be filtered by restricting the min time to TxUtils.getOldestVisibleTimestamp(ttlByFamily, tx)?

Poorna Chandra (JIRA), 2015-10-08 17:24:22
Poorna Chandra commented on an issue

While doing a scan, any cell that is older than (current txn time - ttl) need not be read, as those cells have expired, right? Even if such cells are read, they will be rejected by the TTL filter and will not be returned. Hence, as an optimization, we restrict the time range during a scan to read only non-expired cells.

Note that transactional cells have timestamps that are 10⁶ times larger than regular millisecond timestamps. But once we mix in pre-existing data with regular timestamps, the time range will need to change to start from regular time, i.e. (current time - ttl). This will lead to extra scanning. If we know which tables have pre-existing data, then we can relax the restriction for only those tables.
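To make the scale difference concrete, a small illustration (the 10⁶ multiplier is from the comment above; the numbers are just examples):

    // Regular HBase cell timestamp, milliseconds since epoch (e.g. October 2015):
    long cellTsMillis = 1444262400000L;              // ~1.4 * 10^12
    // Transactional timestamp at roughly the same wall-clock time:
    long txTimestamp = cellTsMillis * 1_000_000L;    // ~1.4 * 10^18
    // A lower bound computed on the transactional scale therefore sits far above
    // every pre-existing millisecond timestamp, so those cells are skipped unless
    // the lower bound is relaxed to (current time - ttl) on the millisecond scale.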

Poorna Chandra (JIRA), 2015-10-09 14:13:24
Poorna Chandra commented on an issue

James Taylor - If you don't have any more questions on this, I'll start making the changes.

James Taylor (JIRA), 2015-10-09 14:59:22
James Taylor commented on an issue

So the only time you'd see a performance degradation is when the table has a TTL, there's expired data, and a major compaction hasn't run yet, right? If an existing table has a TTL, then it'd be a TTL specified in the way HBase needs it. Is that the same way you specify a TTL in Tephra, or do you have your own metadata attribute? If you have your own way, then no existing table would ever have this attribute, so it's really a non-issue.

Also, would an existing table's TTL need to be converted in some way? Or could it be left as-is and cooperate with Tephra's mechanism? If this is problematic, we could always disallow an existing table from being declared as transactional if it has a TTL.

Poorna Chandra (JIRA), 2015-10-09 18:36:22
Poorna Chandra commented on an issue

So the only time you'd see a performance degradation is when the table has a TTL, there's expired data, and a major compaction hasn't run yet, right? If an existing table has a TTL, then it'd be a TTL specified in the way HBase needs it. Is that the same way you specify a TTL in Tephra, or do you have your own metadata attribute? If you have your own way, then no existing table would ever have this attribute, so it's really a non-issue.

Tephra has its own column family attribute to specify TTL. The application of TTL is not an issue: after this PR, Tephra will always apply the right TTL to both pre-existing cells and transactional cells. The co-processor determines how to apply the TTL by looking at the timestamp of a cell. If the cell's timestamp is in the current-time-in-millis range, then the cell is considered a pre-existing cell. Otherwise, the cell is transactional.
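A rough sketch of that kind of check (the threshold constant and method name are illustrative, not the actual Tephra code):

    // Plain millisecond timestamps (~10^12) are far below transactional timestamps
    // (~10^18, i.e. ~10^6 times larger), so a simple threshold separates the two.
    private static final long MAX_NON_TX_TIMESTAMP = 10_000_000_000_000L; // well above any ms timestamp

    static boolean isPreExisting(Cell cell) {
      return cell.getTimestamp() <= MAX_NON_TX_TIMESTAMP;
    }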

The issue is with user scans. In user scans, we apply an extra filter to exclude expired cells. Since the timestamps of pre-existing cells are in a much smaller range than those of transactional cells, any change to this filter to include pre-existing cells' timestamps will render the filter ineffective for purely transactional tables. Also, since this filter is applied before a cell reaches the co-processor, we cannot adjust this filter based on the cell's time range the way we do for TTL.

Having a table attribute like "data.tx.read.pre.existing=true" will allow us to determine whether or not to add a scan filter that allows pre-existing cells. Not having to read expired cells during user scans is the performance optimization I was talking about.

Also, would an existing table's TTL need to be converted in some way? Or could it be left as-is and cooperate with Tephra's mechanism? If this is problematic, we could always disallow an existing table from being declared as transactional if it has a TTL.

An existing table will need no conversion other than specifying an additional table attribute like "data.tx.read.pre.existing=true". Tephra will always apply the right Tephra TTL, as stated above.
Note that if an HBase table already had a TTL that is different from Tephra's TTL, the minimum of the two TTLs will apply to pre-existing cells.

So to make an existing table transactional, the following steps need to be taken (see the sketch after this list) -

  • Add the Tephra co-processor to the table.
  • Set "data.tx.read.pre.existing=true" as a table attribute.

James Taylor (JIRA), 2015-10-09 19:41:22
James Taylor commented on an issue

Thanks for the explanation. Just want to make sure I understand what conditions would have to occur for the data.tx.read.pre.existing=true flag to help. One other consideration is that there'll be an extra RPC for getting the HTableDescriptor based on this solution, right?

So the scan optimization will occur when someone:
1. switches an existing table from non-transactional to transactional
2. adds a TTL to the now transactional table
3. a scan is done prior to a major compaction

Is that correct? At step (3), you can still set the min time range if data.tx.read.pre.existing != true.

Poorna Chandra (JIRA), 2015-10-09 20:22:21
Poorna Chandra commented on an issue

Thanks for the explanation. Just want to make sure I understand what conditions would have to occur for the data.tx.read.pre.existing=true flag to help.

Scan optimization is already being done today for transactional tables that have TTL defined:
1. A transactional table is newly created.
2. A TTL is applied to this table.
3. A scan is done prior to a major compaction.

The scan filter is applied in step 3 above to exclude expired cells from being read.

In the case that you are talking about -

1. switches an existing table from non-transactional to transactional
2. adds a TTL to the now transactional table
3. a scan is done prior to a major compaction

The scan in step 3 will not return any pre-existing cells, as the minimum transactional time in the scan filter will be much higher than the pre-existing cells' timestamps.

To include pre-existing cells in the scan result, we will have to modify the scan filter to include lower-range timestamps. This will lead to extra scanning for purely transactional tables, as the lower time range will now allow expired cells to pass through the scan filter. If we know which tables have pre-existing data, we can relax the scan filter only on those tables, thus not affecting the performance of purely transactional tables.

One other consideration is that there'll be an extra RPC for getting the HTableDescriptor based on this solution, right?

The HTableDescriptor is read in the start method of the co-processor, which we already do to figure out TTLs. So there will be no extra RPC to read the flag.
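For context, a sketch of what that looks like in a region observer (field and attribute names are illustrative; the point is that the descriptor is already available locally in start(), so reading one more attribute costs no additional RPC):

    @Override
    public void start(CoprocessorEnvironment e) throws IOException {
      if (e instanceof RegionCoprocessorEnvironment) {
        HTableDescriptor desc = ((RegionCoprocessorEnvironment) e).getRegion().getTableDesc();
        // TTLs are already parsed from this descriptor; the flag is one more local lookup.
        this.readExisting = "true".equals(desc.getValue("data.tx.read.pre.existing"));
      }
    }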

James Taylor (JIRA), 2015-10-10 00:37:21
James Taylor commented on an issue

Good to know there's no extra overhead for getting HTableDescriptor.

Just to confirm: the optimization you're talking about is only for a table with a TTL, and it's only a factor until a major compaction occurs, at which time the TTL'd cells will not be rewritten.

Is that correct?

Poorna Chandra (JIRA), 2015-10-10 00:56:21
Poorna Chandra commented on an issue

Yes, that is correct.

James Taylor (JIRA), 2015-10-10 01:36:25
James Taylor commented on an issue

Great - thanks again for the explanation. Sounds like a good plan.

Poorna Chandra (JIRA), 2015-10-16 16:29:36
Poorna Chandra updated an issue
Change By: Poorna Chandra
Fix Version/s: 0.6.4 → 0.6.3

Poorna Chandra (JIRA), 2015-10-21 14:56:28
Poorna Chandra updated an issue
Change By: Poorna Chandra
Fix Version/s: 0.6.3 → 0.6.4

James Taylor (JIRA), 2015-10-28 16:52:22
James Taylor commented on an issue

Based on our testing, this seems to work well, Poorna Chandra. Nice work! Our JIRA to allow this is PHOENIX-1821.

One question, for confirmation: for existing data, you don't interpret empty cell values as delete markers, right? This could be problematic for Phoenix, because we use an empty cell value for rows in non-transactional tables. Once a table is switched to be transactional, newer data would not use an empty cell value, though.

Poorna Chandra (JIRA), 2015-10-28 18:02:22
Poorna Chandra commented on an issue

Good point, James Taylor! I missed it. We'll need to allow delete markers to go through for existing data.

James Taylor (JIRA), 2015-10-28 18:13:21
James Taylor commented on an issue

Do you want a separate JIRA for this or is this one ok to use? I'll try to come up with a test in Phoenix where this is needed.

Poorna Chandra (JIRA), 2015-10-28 18:31:21
Poorna Chandra commented on an issue

Created TEPHRA-143 for the deletes case. Let's close this JIRA.

Poorna Chandra (JIRA), 2015-10-28 21:42:22
Poorna Chandra resolved an issue as Fixed
Change By: Poorna Chandra
Status: In Progress → Resolved
Resolution: Fixed

Poorna Chandra (JIRA), 2015-11-06 20:36:22
Poorna Chandra updated an issue
Change By: Poorna Chandra
Issue Type: Improvement → New Feature