Version numbering for re-created records


Michael Kurze

Nov 2, 2010, 1:04:56 PM
to lily-discuss, deins...@mozilla.com
Hi,

First, congratulations on the lily-0.2 release! The binary
distribution already looks much more mature and easier to use than
0.1. The -- understandable -- package-rename and the namespace-rename
for the vtags is likely to cause some work. Hopefully the new name
sticks :).

I have a question on deleted records: When I remove a record that has
several versions and later recreate it, it appears that the version
numbering is not reset, but instead continues to count up from the
latest deleted version number (observed using cdh3b3+patches and lily-
r4393). I guess this is due to all versions being stored under the
same hbase row key.
Is this intended? And can I work around this somehow? IMO a record
recreation would best be idempotent, as otherwise I would have to
maintain two version identifiers, one internal for lily and one
natural for SOLR.

I notice that the latest API-documentation contains a placeholder for
ID reuse issues and I now think it would be a good thing to mention
possible effects on numbering :)

Best regards,
Michael

Bruno Dumon

Nov 3, 2010, 12:24:17 PM
to lily-d...@googlegroups.com, deins...@mozilla.com
Hi,

On Tue, Nov 2, 2010 at 6:04 PM, Michael Kurze <mr.micha...@googlemail.com> wrote:
> Hi,
>
> First, congratulations on the lily-0.2 release! The binary
> distribution already looks much more mature and easier to use than
> 0.1.

Thanks!
 
> The -- understandable -- package-rename and the namespace-rename
> for the vtags is likely to cause some work. Hopefully the new name
> sticks :).

Yeah, sorry about that, especially since we do not offer upgrade documentation, though we'll do that from now on.
 

> I have a question on deleted records: When I remove a record that has
> several versions and later recreate it, it appears that the version
> numbering is not reset, but instead continues to count up from the
> latest deleted version number (observed using cdh3b3+patches and lily-
> r4393). I guess this is due to all versions being stored under the
> same hbase row key.
> Is this intended? And can I work around this somehow? IMO a record
> recreation would best be idempotent, as otherwise I would have to
> maintain two version identifiers, one internal for lily and one
> natural for SOLR.

I agree it would be better if the version numbering always started from 1 (in fact, IIRC the index updater even makes the assumption it does -- need to check on that). The reason the numbering continues has to do with HBase and how its version-dimension works.

When a delete is performed on HBase, it writes a delete marker (a 'tombstone') which indicates that everything lower than a certain version number is deleted. As long as the tombstone is there, it is impossible to write anything with a version number lower than the one stored in the tombstone. Only after the tombstone has been processed by a major compaction can those version numbers be used again.
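To make the masking behaviour described above concrete, here is a toy Python model of a single cell's version dimension. It is only a sketch: ToyCell and its methods are made-up names, not the HBase API, and it assumes a delete marker hides every version at or below the number it stores.

```python
class ToyCell:
    """Toy model of one cell's version dimension (illustrative, not HBase code)."""

    def __init__(self):
        self.versions = {}     # version number -> value
        self.tombstone = None  # highest version number covered by a delete marker

    def put(self, version, value):
        # The write is always accepted; it may simply stay invisible on read.
        self.versions[version] = value

    def delete_up_to(self, version):
        # A delete only records a marker; nothing is physically removed yet.
        if self.tombstone is None or version > self.tombstone:
            self.tombstone = version

    def get(self, version):
        # Assumption: versions at or below the marker are masked.
        if self.tombstone is not None and version <= self.tombstone:
            return None
        return self.versions.get(version)

    def major_compact(self):
        # Compaction drops the masked data together with the marker.
        if self.tombstone is not None:
            self.versions = {
                v: x for v, x in self.versions.items() if v > self.tombstone
            }
            self.tombstone = None


cell = ToyCell()
cell.put(1, "first")
cell.put(2, "second")
cell.delete_up_to(2)            # delete the record: marker covers versions 1 and 2

cell.put(1, "re-created")       # re-using version 1 ...
assert cell.get(1) is None      # ... stays hidden behind the marker

cell.put(3, "re-created")       # continuing the numbering works
assert cell.get(3) == "re-created"

cell.major_compact()            # marker and masked data are gone
cell.put(1, "fresh")
assert cell.get(1) == "fresh"   # low version numbers are usable again
```

In this model the write of version 1 before the compaction is silently lost, which is why continuing the numbering (or refusing the re-create) is the safer behaviour while the marker exists.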

So that left us with two options:
 * do not allow a record to be re-created at all (as long as the delete has not been processed by a major compaction)
 * or continue the version numbering

Another possibility would be to translate the version numbers in Lily but that would complicate reading a record.

In fact since record IDs are often UUIDs we thought the situation would not occur very often. Is it a common thing in your situation to delete and then recreate a record?

--
Bruno Dumon
Outerthought
http://outerthought.org/

Michael Kurze

Nov 8, 2010, 12:01:40 PM
to lily-discuss
Hi Bruno,

thanks for the explanation.

On Nov 3, 5:24 pm, Bruno Dumon <br...@outerthought.org> wrote:
> When a delete is performed on HBase, it writes a delete marker (a
> 'tombstone') which tells that everything lower than a certain version
> number is deleted. As long as the tombstone is there, it is impossible to
> write anything with a version number lower than the one stored in the
> tombstone. It is only after the tombstone is processed by a major
> compaction, that those version numbers can be used again.
>
> So that left us with some options:
>  * do not allow to re-create a record at all (as long as the delete has not
> been processed by a major compaction)
>  * or continue the version numbering

I can only guess as to how other applications work with record IDs,
but I imagine that users will often not be aware of the implications.
This alone might warrant raising an error whenever a record ID is
re-used before the deleted record has been compacted away (your first
suggestion).

>
> Another possibility would be to translate the version numbers in Lily but
> that would complicate reading a record.
>
> In fact since record IDs are often UUIDs we thought the situation would not
> occur very often. Is it a common thing in your situation to delete and then
> recreate a record?

For us, the re-creation of a record does not happen during normal
processing. It is usually only done during development / testing or
when an error happened during import. So I guess that I am fine with
manual major compactions (I’ll integrate it into our delete script).

Best,
Michael

Michael Kurze

Nov 8, 2010, 6:56:29 PM
to lily-discuss
Hello again,

while everything in my previous message remains true, I need to
add that I am actually not able to delete records on the hbase level
using lily (tested with the REST interface).

After I HTTP DELETE a record (200 OK), that delete is reflected by
lily (I will receive a 404 trying to GET the same record). But on the
hbase-level, the record does not seem to be deleted. A major_compact
has no effect either. Instead, a "system-nonversioned:deleted" cell is
present with ts=1, value=\xFF (which is respected by lily I think, but
not by hbase). Only after I actually enter the hbase shell and execute
a deleteall with the record key (sans leading "USER.") will the record
disappear.

It follows that I cannot reset the version numbering by deleting a
record from lily (even if followed by a major compaction).

Is this maybe a soft-delete feature (if so, can I turn it off)?
And if not, can I manually use hbase deleteall as a drop-in
replacement (what about linked records)?


Thanks in advance,
Michael


Evert Arckens

Nov 9, 2010, 4:54:02 AM
to lily-d...@googlegroups.com
Hi,

We decided to go for the second option (cf. Bruno's mail), i.e. continue with the version numbering. When deleting a record in Lily we write a marker to indicate that it has been deleted, but we don't actually delete the row from HBase.
The row in HBase also contains information we need to update the (link-)indexes. If you removed the row from HBase before the indexes had been updated, we would lose that information.
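As a rough illustration of the marker approach described above, here is a toy Python store that flags a record as deleted instead of dropping its row, and continues the version numbering when the same ID is re-created. All names (ToyRecordStore, the "deleted" flag, the dict layout) are invented for the sketch and do not reflect Lily's actual storage format.

```python
class ToyRecordStore:
    """Toy soft-delete store (illustrative only, not Lily's implementation)."""

    def __init__(self):
        # record id -> {"deleted": bool, "versions": {number: fields}}
        self.rows = {}

    def create(self, record_id, fields):
        # A brand-new row starts out flagged as deleted with no versions,
        # so first creation and re-creation follow the same path.
        row = self.rows.setdefault(record_id, {"deleted": True, "versions": {}})
        if not row["deleted"]:
            raise ValueError(f"record {record_id!r} already exists")
        row["deleted"] = False
        return self._append(row, fields)

    def update(self, record_id, fields):
        return self._append(self._live(record_id), fields)

    def delete(self, record_id):
        # Only flip the marker; the row stays so index updaters can still read it.
        self._live(record_id)["deleted"] = True

    def latest_version(self, record_id):
        return max(self._live(record_id)["versions"])

    def _live(self, record_id):
        row = self.rows.get(record_id)
        if row is None or row["deleted"]:
            raise KeyError(record_id)  # comparable to the 404 seen over REST
        return row

    def _append(self, row, fields):
        # Numbering continues from the highest version ever written to the row.
        version = max(row["versions"], default=0) + 1
        row["versions"][version] = fields
        return version


store = ToyRecordStore()
assert store.create("doc1", {"title": "a"}) == 1
assert store.update("doc1", {"title": "b"}) == 2
store.delete("doc1")
assert "doc1" in store.rows                        # the row is still there
assert store.create("doc1", {"title": "c"}) == 3   # numbering continues
```

A real implementation would also have to hide the pre-delete versions from readers after a re-create; the sketch leaves that out to keep the numbering logic visible.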

If it is just for development or test purposes a delete at hbase level would be an option, but keep in mind that there could be stale information left in the indexes.

Regards,
Evert Arckens.
--
Evert Arckens
http://outerthought.org/
Open Source Content Applications
Makers of Kauri, Daisy CMS and Lily

Bruno Dumon

Nov 10, 2010, 5:11:01 AM
to lily-d...@googlegroups.com
On Mon, Nov 8, 2010 at 6:01 PM, Michael Kurze <mr.micha...@googlemail.com> wrote:
> Hi Bruno,
>
> thanks for the explanation.
>
> On Nov 3, 5:24 pm, Bruno Dumon <br...@outerthought.org> wrote:
> > When a delete is performed on HBase, it writes a delete marker (a
> > 'tombstone') which tells that everything lower than a certain version
> > number is deleted. As long as the tombstone is there, it is impossible to
> > write anything with a version number lower than the one stored in the
> > tombstone. It is only after the tombstone is processed by a major
> > compaction, that those version numbers can be used again.
> >
> > So that left us with some options:
> >  * do not allow to re-create a record at all (as long as the delete has not
> > been processed by a major compaction)
> >  * or continue the version numbering
>
> I can only guess as to how other applications work with record IDs,
> but I imagine that users will often not be aware of the implications.
> This alone might warrant to error out whenever re-using a record that
> has not been compacted away yet (your first suggestion).

Personally I don't like either of the two choices, but currently it is impossible to make it work as desired.

To give some more background, there are again two options if we want to avoid the above:

 * wait for it to be fixed in HBase (see issues HBASE-2847/HBASE-2256/HBASE-2856). This is pretty unlikely to happen soon, as it seems the fix would require changing HBase's HFile format by adding an extra sequence number to each key-value pair (= each cell value). So besides being a backwards-incompatible change, the solution would also require extra storage space, which would also be a disadvantage for Lily, since its cell values are typically very fine-grained.

 * change Lily so that no application meaning (i.e. anything exposed to the Lily user) is attached to the HBase timestamps. Timestamps could then be always increasing (hence avoiding the delete problem), but offering the logical version numbering (whether it be the current version numbers or user-visible timestamps) would require a separate mapping, which would make reading and writing records more complex (and slower).
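The second option above can be pictured with a small Python sketch of such a mapping. It is purely hypothetical: VersionMapper, its in-memory dict, and the global counter are assumptions made for illustration, and nothing like this exists in Lily according to this thread.

```python
import itertools


class VersionMapper:
    """Maps user-visible version numbers to ever-increasing storage timestamps."""

    def __init__(self):
        # Physical timestamps never repeat and never go backwards.
        self._clock = itertools.count(1)
        # record id -> list of physical timestamps; index i holds logical version i + 1
        self._chains = {}

    def new_version(self, record_id):
        """Allocate the next logical version and its physical timestamp."""
        physical = next(self._clock)
        chain = self._chains.setdefault(record_id, [])
        chain.append(physical)
        return len(chain), physical

    def reset(self, record_id):
        """On delete: logical numbering restarts, physical timestamps do not."""
        self._chains.pop(record_id, None)

    def physical(self, record_id, logical):
        """Extra lookup needed on every versioned read."""
        return self._chains[record_id][logical - 1]


mapper = VersionMapper()
assert mapper.new_version("doc1") == (1, 1)
assert mapper.new_version("doc1") == (2, 2)
mapper.reset("doc1")                          # record deleted
assert mapper.new_version("doc1") == (1, 3)   # logical 1 again, physical keeps rising
assert mapper.physical("doc1", 1) == 3
```

The `physical()` lookup is the cost mentioned above: every read or write of a specific version has to go through the mapping first, and the mapping itself has to be stored and kept consistent somewhere.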


> >
> > Another possibility would be to translate the version numbers in Lily but
> > that would complicate reading a record.
> >
> > In fact since record IDs are often UUIDs we thought the situation would not
> > occur very often. Is it a common thing in your situation to delete and then
> > recreate a record?
>
> For us, the recreation of record does not happen during default
> processing. It is usually only done during development / testing or
> when an error happened during import. So I guess that I am fine with
> manual major compactions (I’ll integrate it into our delete script).

I think this counts as an unusual situation. If you want to start from a blank slate (all records removed), I'd suggest cleaning all tables directly on the HBase level (it will be much faster), triggering a major compaction, waiting for it to finish, and then starting to insert content in Lily again. We do this in the Lily testcases: see HBaseProxy.cleanTables() (see svn trunk; I saw it still used the old table/cf names, so I fixed that).