modified bitmap optimization patch and some performance data

Jiaying Zhang

unread,

Apr 25, 2008, 6:43:05 PM4/25/08

to Zumastor

I tested Daniel's bitmap optimization patch on a couple of different
hardwares. I found the optimization provides around 10~20% speedup
compared with the original ddsnap implementation. I attached two
performance graphs in this email. One is measured on a Dell 2950
that has six-spindle SATA disks. The other is measured on a HP
x4100 workstation that has a single-spindle ATA disk.

I need to take out some cleanup code from the original patch in order
to apply it. There is also a bug in the original patch. With the optimization,
we also need to check if the chunk is in the deferred_alloc list besides
getting the unused bitmap bit in the alloc_chunk_from_range() function.
The original patch includes this checking but does not cover the case
when the deferred allocated chunk crosses the byte boundary. I.e.,

+ for (int i = 0, bit = 1; ; i++, bit <<= 1)

should be changed to

+ for (int i = 0, bit = 1; i <= 7; i++, bit <<= 1)

Otherwise, 'ddsnap server' would die at the later assertion
"assert(!get_bitmap_bit(buffer->data, chunk & bitmap_mask));"

The modified patch is attached. I think the patch looks good overall.
I am going to review it more carefully this weekend.

Jiaying

bitmap_dell2950.pdf

bitmap_hpx4100.pdf

optimize.bitmap.update.patch3

Jiaying Zhang

unread,

Apr 28, 2008, 8:57:15 PM4/28/08

to Zumastor

I looked at the journal replay code and also tested the code more
by turning on freespace checking at the end of commit_transaction().
Here are some other bugs I found.

In init_super, we call new_leaf and new_node that indirectly calls
alloc_chunk_from_range(). With deferred allocation, those chunks
are not marked allocated in the bitmap block. I think we need to call
commit_deferred_allocs at the end of init_snapstore so that the
chunks allocated during initialization are written to disk.

The current limit_deferred_allocs calls flush_deferred_allocs that
does not set barrier in the commit block. This does not look right.
As I understand, we should always call commit_transaction(sb, 1)
after flush_deferred_allocs. I.e., why not merge flush_deferred_allocs
and commit_deferred_allocs() into one function?

The commit_deferred_allocs() function writes all of chunks recorded
in the deferred_alloc to disk. It then sets sb->deferred_allocs to zero
and then calls commit_transaction with barrier parameter set to 1. The
problem is that in this case, sb->defer.count can be non-zero. So the
commit_deferred_allocs() function does not flush all of the deferred
allocated chunks to disk. This can cause problems in journal replay.
I guess one way to fix the problem is to add a commit_transactions(sb, 0)
at the beginning of commit_deferred_allocs() so that there is no pending
deferred alloc during the flush.

Another problem is that we clear SB_BUSY at the beginning of cleanup()
without flushing dirty buffers and deferred allocs to disk. We will NOT call
replay_journal if the server dies right after this. I think we should add
commit_deferred_allocs before clearing the SB_BUSY flag in cleanup().

I am going to do tests and post the modified patch later.

Jiaying

Jiaying Zhang

unread,

Apr 28, 2008, 11:18:54 PM4/28/08

to Zumastor

The new patch is attached. The current alloc_chunk_from_range()
and replay_journal() functions are a little hard to understand. Maybe we
should change them a little bit to be more readable. We may also
want to polish the comments on 'deferred allocation'. And more tests.

Jiaying

optimize.bitmap.update.3.patch

Daniel Phillips

unread,

Apr 29, 2008, 3:45:23 AM4/29/08

to zuma...@googlegroups.com, Jiaying Zhang

On Monday 28 April 2008 17:57, Jiaying Zhang wrote:
> I looked at the journal replay code and also tested the code more
> by turning on freespace checking at the end of commit_transaction().
> Here are some other bugs I found.
>
> In init_super, we call new_leaf and new_node that indirectly calls
> alloc_chunk_from_range(). With deferred allocation, those chunks
> are not marked allocated in the bitmap block. I think we need to call
> commit_deferred_allocs at the end of init_snapstore so that the
> chunks allocated during initialization are written to disk.

Correct, good catch.

> The current limit_deferred_allocs calls flush_deferred_allocs that
> does not set barrier in the commit block. This does not look right.
> As I understand, we should always call commit_transaction(sb, 1)
> after flush_deferred_allocs. I.e., why not merge flush_deferred_allocs
> and commit_deferred_allocs() into one function?

You are right, this was fuzzy thinking on my part. I think the
limit_deferred_allocs logic needs to move into commit_transaction.
I will apply your version of the patch tomorrow, fix that and post
it.

> The commit_deferred_allocs() function writes all of chunks recorded
> in the deferred_alloc to disk. It then sets sb->deferred_allocs to zero
> and then calls commit_transaction with barrier parameter set to 1. The
> problem is that in this case, sb->defer.count can be non-zero. So the
> commit_deferred_allocs() function does not flush all of the deferred
> allocated chunks to disk. This can cause problems in journal replay.
> I guess one way to fix the problem is to add a commit_transactions(sb, 0)
> at the beginning of commit_deferred_allocs() so that there is no pending
> deferred alloc during the flush.

That was intentional: the transaction that flushes deferred allocs
can itself defer a new allocation. I saw no reason not to allow this.
But the logic is subtle indeed and I could easily have made a mistake
in the implementation. It seems ok, but let me fix up the other thing
first so we can discuss this additional "feature" in isolation.

(Once again, good spotting.)

> Another problem is that we clear SB_BUSY at the beginning of cleanup()
> without flushing dirty buffers and deferred allocs to disk. We will NOT call
> replay_journal if the server dies right after this. I think we should add
> commit_deferred_allocs before clearing the SB_BUSY flag in cleanup().

That was all messed up as we discussed earlier today. In fact,
flush_buffers should never be used any more, it is always wrong. So we
need a cleanup patch to remove and remaining uses and delete it.

Daniel

Daniel Phillips

unread,

Apr 29, 2008, 3:58:27 AM4/29/08

to zuma...@googlegroups.com, Jiaying Zhang

On Monday 28 April 2008 20:18, Jiaying Zhang wrote:
> The new patch is attached. The current alloc_chunk_from_range()
> and replay_journal() functions are a little hard to understand. Maybe we
> should change them a little bit to be more readable. We may also
> want to polish the comments on 'deferred allocation'. And more tests.

It is not the most obscure code I have seen, or even the most
obscure code in that file (find_victim might qualify for that).

One excessively obscure bit:

"for (length -= n; ..."

I decremented length there just because n will be destroyed by the
loop. A little too terse.

One boneheaded thing I did was, I should not have called _from_range
twice, I should have had just one allocation function that wraps back
to the beginning of the bitmap when necessary. I don't know what I
was thinking when I did that, but of course it is easiest to leave it
as is for now. Maybe add a FIXME.

Daniel

Reply all

Reply to author

Forward