Question about tephra delete


jinge...@gmail.com

Mar 26, 2016, 4:08:43 AM
to Tephra User
Hello, I'm new to Tephra :)

I have a question about the Tephra deletion process.

As far as I know, a delete operation is translated into a put (with an empty byte array) by TransactionProcessor.

And after a major compaction, that delete marker (the put) should be gone.

But when I tested it, the delete markers still remained in HBase after the major compaction.

Below is my test environment.

hbase : 1.0.0-cdh5.4.7
tephra : 0.7.0(tephra-hbase-compat-1.0-cdh)
cf description : {NAME => 'd', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}

Please tell me what went wrong.
Thank you!

Poorna Chandra

Mar 26, 2016, 8:49:55 PM
to jinge...@gmail.com, Tephra User
Hi,

During a major compaction, Tephra removes a cell version only when there are no in-progress transactions for which the cell version might be visible. So it looks like there were some in-progress transactions for which the delete marker was visible, and hence the delete was not removed by the major compaction.

If you run the major compaction again after some time (longer than the long transaction timeout, which is one day by default), the delete marker should also get removed.

Thanks,
Poorna.
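
The rule described above can be pictured with a small model. This is only a sketch under stated assumptions, not Tephra's actual compaction filter: the names `PruneModel`, `isDeleteMarker`, and `canPrune` are hypothetical, and the only facts taken from the thread are that a delete is stored as a put with an empty value and that a version is kept while an in-progress transaction might still need it.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class PruneModel {
    /** In this model, a delete marker is a cell whose value is an empty byte array. */
    static boolean isDeleteMarker(byte[] value) {
        return value != null && value.length == 0;
    }

    /**
     * Simplified assumption: a delete marker written by transaction cellTxId may be
     * dropped only if every in-progress transaction has a larger id, so no running
     * transaction still depends on the state from before the delete.
     */
    static boolean canPrune(byte[] value, long cellTxId, List<Long> inProgressTxIds) {
        if (!isDeleteMarker(value)) {
            return false;
        }
        for (long txId : inProgressTxIds) {
            if (txId <= cellTxId) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] marker = new byte[0];
        long deleteTx = 100L;
        // No in-progress transactions: the marker is eligible for removal.
        System.out.println(canPrune(marker, deleteTx, Collections.<Long>emptyList())); // true
        // An older transaction is still running: the marker has to stay.
        System.out.println(canPrune(marker, deleteTx, Arrays.asList(50L))); // false
    }
}
```

In this model an empty in-progress list always allows pruning, so a marker that survives compaction with no running transactions points at something else, such as the transaction state the region server is working from.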


--
You received this message because you are subscribed to the Google Groups "Tephra User" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tephra-user...@googlegroups.com.
To post to this group, send email to tephr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tephra-user/c3c18ecd-2cb6-4593-8d65-5126e66d3d1d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

jinge...@gmail.com

Mar 27, 2016, 1:37:57 AM
to Tephra User, jinge...@gmail.com
Poorna, thank you for your reply :D
Currently I have no in-progress transactions (I stopped the service instance).
I ran "major_compact 't1'" in the hbase shell,
but the deleted cells still remain in HBase.

hbase(main):003:0> scan 't1'
 rr2                  column=d:a, timestamp=1458360288251000000, value=sample
 rr2                  column=d:x, timestamp=1458360073954000000, value=
 rr2                  column=d:xx, timestamp=1458360073954000000, value=
 rr2                  column=d:xxx, timestamp=1458360073954000000, value=
 rr2                  column=d:xxxx, timestamp=1458360073954000000, value=
 rr2                  column=d:xxxxx, timestamp=1458360073954000000, value=

]$ date -d@1458360073
Sat Mar 19 13:01:13 KST 2016
The delete timestamp is already more than one day old.

Thanks, 
jingene 
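
The `date` command above works because the first ten digits of the cell timestamp are epoch seconds. A minimal sketch of the same conversion in Java follows. It assumes that a Tephra cell timestamp is epoch milliseconds multiplied by 1,000,000, which is consistent with the 19-digit values in the scan output; the class and constant names are hypothetical.

```java
import java.time.Instant;

class TephraTimestamp {
    // Assumption: transaction ids are epoch milliseconds scaled by 1,000,000,
    // matching the 19-digit timestamps shown in the scan output.
    static final long TX_IDS_PER_MILLI = 1_000_000L;

    /** Converts a Tephra cell timestamp to wall-clock time (UTC). */
    static Instant toInstant(long cellTimestamp) {
        return Instant.ofEpochMilli(cellTimestamp / TX_IDS_PER_MILLI);
    }

    public static void main(String[] args) {
        // Timestamp of the empty-valued cells in the scan above.
        System.out.println(toInstant(1458360073954000000L)); // 2016-03-19T04:01:13.954Z
    }
}
```

04:01:13 UTC is 13:01:13 KST, the same moment the `date` command reports.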

jinge...@gmail.com

Mar 29, 2016, 1:35:07 AM
to Tephra User, jinge...@gmail.com
I have tested many times, but the deleted cells are not cleaned up.
I would like to solve this problem.

Thanks, 
jingene.


Terence Yim

Mar 29, 2016, 2:20:30 AM
to jinge...@gmail.com, tephr...@googlegroups.com
Hi,

Would you mind sharing your test program with us so that we can run it and verify the behavior you are seeing? Also, judging from the “cf description”, no coprocessor is configured on the table. Was that HBase table created through Tephra?

Terence


jinge...@gmail.com

Mar 29, 2016, 5:18:08 AM
to Tephra User
Here are my test steps.

0. The Tephra instance is running on the HBase master's box.

1. setting hbase-site.xml
<!-- tephra -->
<property>
<name>data.tx.snapshot.dir</name>
<value>/snapshot</value>
</property>
<property>
<name>hbase.coprocessor.region.classes</name>
<value>co.cask.tephra.hbase10cdh.coprocessor.TransactionProcessor</value>
</property>

2. copy some jars to ${HBASE_HOME}/lib
tephra-core-0.7.0.jar
tephra-api-0.7.0.jar
tephra-hbase-compat-1.0-cdh-0.7.0.jar
twill-zookeeper-0.6.0-incubating.jar
twill-core-0.6.0-incubating.jar
twill-discovery-api-0.6.0-incubating.jar
twill-discovery-core-0.6.0-incubating.jar
twill-api-0.6.0-incubating.jar
twill-common-0.6.0-incubating.jar

3. create a table
HBaseAdmin hBaseAdmin = new HBaseAdmin(hTable.getConfiguration());
HTableDescriptor table = new HTableDescriptor("testtable");
HColumnDescriptor family = new HColumnDescriptor("d");
table.addFamily(family);
hBaseAdmin.createTable(table);
hBaseAdmin.close();

4. TransactionServiceBean
@PostConstruct
public void setup() throws IOException {
    System.setProperty("hadoop.home.dir", "/");
    conf = new ConfigurationFactory().get();
    conf.set(TxConstants.Manager.CFG_TX_HDFS_USER, "hadoop");
    conf.setInt(TxConstants.Manager.CFG_TX_LONG_TIMEOUT, 2);
    Injector injector = Guice.createInjector(
        new ConfigModule(conf),
        new ZKModule(),
        new DiscoveryModules().getDistributedModules(),
        new TransactionModules().getDistributedModules(),
        new TransactionClientModule()
    );

    // Start the ZooKeeper client used to discover the transaction service
    ZKClientService zkClient = injector.getInstance(ZKClientService.class);
    zkClient.startAndWait();
    provider = new PooledClientProvider(conf, injector.getInstance(DiscoveryServiceClient.class));
}

// Wrap the HBase table so that reads and writes take part in transactions
HTable hTable = new HTable(conf, tableName.getBytes());
TransactionAwareHTable txTable = new TransactionAwareHTable(hTable, TxConstants.ConflictDetection.COLUMN, false);
txContext = new TransactionContext(new TransactionServiceClient(conf, provider), txTable);

5. put a cell
// Write inside a transaction; txTable is the TransactionAwareHTable from step 4
txContext.start();
Put put = new Put(row.getBytes());
values.forEach(value -> put.addColumn(value.family(), value.qualifier(), value.value()));
txTable.put(put);
txContext.finish();

6. delete a cell
// Delete inside a transaction
txContext.start();
Delete delete = new Delete(row.getBytes());
delete.addColumns(family.getBytes(), qualifier.getBytes());
txTable.delete(delete);
txContext.finish();

7. run major_compact in hbase-shell
hbase(main):005:0> major_compact 'testtable'
0 row(s) in 0.1630 seconds

8. scan
hbase(main):006:0> scan 'testtable'
ROW COLUMN+CELL
row row column=d:a, timestamp=1459242025562000000, value=


Thanks,
jingene.


Terence Yim

Mar 29, 2016, 5:50:30 AM
to jinge...@gmail.com, tephr...@googlegroups.com
Hi jingene,

Did you run the major compaction right after the deletion completed? The coprocessor does its cleanup based on the transaction snapshot produced by the TransactionManager. By default, it only produces a snapshot every 300 seconds (5 minutes). This means the deleted cell won’t get cleaned up until the next snapshot has been written after the delete happened. You can change the snapshot interval by setting “data.tx.snapshot.interval” in hbase-site.xml and restarting the Tephra transaction server.

Terence
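
The timing described above can be sketched as a small calculation. It is only an illustration: the class and method names are hypothetical, the 300-second default is the value quoted in the message, and the model ignores any additional delay before the coprocessor picks up a newly written snapshot.

```java
import java.time.Duration;
import java.time.Instant;

class SnapshotWait {
    // Default snapshot interval quoted in the thread: 300 seconds.
    static final Duration DEFAULT_SNAPSHOT_INTERVAL = Duration.ofSeconds(300);

    /**
     * Worst case: a snapshot was written just before the delete committed,
     * so the next snapshot arrives one full interval later.
     */
    static Instant earliestSafeCompaction(Instant deleteCommitted, Duration snapshotInterval) {
        return deleteCommitted.plus(snapshotInterval);
    }

    /** True once at least one snapshot interval has passed since the delete. */
    static boolean snapshotLikelyWritten(Instant deleteCommitted, Instant now, Duration snapshotInterval) {
        return !now.isBefore(earliestSafeCompaction(deleteCommitted, snapshotInterval));
    }

    public static void main(String[] args) {
        Instant deleted = Instant.parse("2016-03-29T09:00:00Z"); // hypothetical delete time
        Instant oneMinuteLater = deleted.plusSeconds(60);
        Instant sixMinutesLater = deleted.plusSeconds(360);
        System.out.println(snapshotLikelyWritten(deleted, oneMinuteLater, DEFAULT_SNAPSHOT_INTERVAL));  // false
        System.out.println(snapshotLikelyWritten(deleted, sixMinutesLater, DEFAULT_SNAPSHOT_INTERVAL)); // true
    }
}
```

A major compaction started one minute after the delete would therefore be expected to keep the marker, while one started after a full interval has a chance to remove it.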


coprocessor>
 INFO  [RS_OPEN_REGION-dev-keephd004:60020-1] coprocessor.CoprocessorHost: System coprocessor co.cask.tephra.hbase10cdh.coprocessor.TransactionProcessor was loaded successfully with priority (536870911).


Thanks,
jingene.

jinge...@gmail.com

Mar 29, 2016, 6:22:27 AM
to Tephra User
Hi, Terence Yim.
It worked! Thank you for your support :D
I have another question: can I create a transactional table directly in the hbase shell instead of using HBaseAdmin?
Have a nice day!

Terence Yim

Mar 29, 2016, 6:31:42 AM
to jinge...@gmail.com, Tephra User
Hi jingene,

I think so, as long as you have set “hbase.coprocessor.region.classes” in hbase-site.xml.

Terence