DuraCloud-Akubra update and a couple questions

Chris Wilper

Jan 30, 2010, 12:14:33 PM
to duracl...@googlegroups.com
Hi guys,

I've now got the Akubra plugin updated to use the new exceptions, id
verification, and paging. While testing I noticed a couple things and
wanted to get clarification:

1) Is ami-3333d05a (v0.2) no longer a good AMI to test with?

I tried it and it didn't seem to auto-deploy the duracloud webapps,
but I discovered ami-6c58ba05 (v0.3), and that worked as expected. I
assume this is the best one to go against, but just wanted to
verify...

2) Are uploads intended to be synchronous?

One of my integration tests uploads, then immediately downloads a
content item. I noticed it succeeds most of the time, but sometimes
fails to immediately detect the content as existing. This could
easily be a problem on my end, but I wanted to verify the intended
behavior with durastore to be sure. After I finish an update, the
content should be immediately available, right?

Thanks,
Chris

Andrew Woods

Jan 30, 2010, 1:16:09 PM
to duracl...@googlegroups.com
Hello Chris,
It is great that the Akubra/DuraCloud integration is advancing.

Regarding question 1: yes, ami-6c58ba05 is the best one to use
right now. It raises a good point, however, about having a
consistent and visible strategy for keeping track of AMIs: what is on
them, how they should be used, etc.

Regarding question 2: the upload of content is a synchronous call.
However, because of the underlying storage provider's load balancers,
an addition or deletion of content on one slave server is sometimes
not propagated to another slave (which may be servicing a subsequent
download request) before the next query takes place. In theory, the
update should be visible immediately after the DuraCloud call returns.
In practice, this is not always the case.

Andrew

Chris Wilper

Jan 31, 2010, 4:41:34 AM
to duracl...@googlegroups.com
Hi Andrew,

On Sat, Jan 30, 2010 at 1:16 PM, Andrew Woods <awo...@fedora-commons.org> wrote:
> Hello Chris,
> It is great that the Akubra/DuraCloud integration is advancing.
>
> Regarding question 1: yes, ami-6c58ba05 is the best one to use
> right now. It raises a good point, however, about having a
> consistent and visible strategy for keeping track of AMIs: what is on
> them, how they should be used, etc.

So far I've just been keeping notes in a README. As long as I can
ping you for the latest and greatest info, I'm fine.

> Regarding question 2: the upload of content is a synchronous call.
> However, because of the underlying storage provider's load balancers,
> an addition or deletion of content on one slave server is sometimes
> not propagated to another slave (which may be servicing a subsequent
> download request) before the next query takes place. In theory, the
> update should be visible immediately after the DuraCloud call returns.
> In practice, this is not always the case.

Thanks for the clarification. I didn't realize that, but indeed it's
considered normal behavior with S3.

http://docs.amazonwebservices.com/AmazonS3/latest/index.html?ConsistencyModel.html

There are good engineering reasons for a looser model of "eventual
consistency". But so far, the assumption with Akubra (and certainly
Fedora) has been that stored content is immediately read-after-write
consistent.

Since DuraCloud is agnostic about this, I think anything written to
work against it in the general case must assume the looser guarantee.

And since Fedora currently depends on the stronger guarantee, it seems
like the logical place to "bridge the gap" is from within the
Akubra-DuraCloud plugin: Some sort of delay/poll logic for writes,
which can be turned off if you don't need it.

- Chris

Chris Wilper

Feb 1, 2010, 2:48:15 PM
to duracl...@googlegroups.com
Just an update:

On Sun, Jan 31, 2010 at 4:41 AM, Chris Wilper <cwi...@duraspace.org> wrote:
> [..]


> Thanks for the clarification.  I didn't realize that, but indeed it's
> considered normal behavior with S3.
> http://docs.amazonwebservices.com/AmazonS3/latest/index.html?ConsistencyModel.html

> [..]


> And since Fedora currently depends on the stronger guarantee, it seems
> like the logical place to "bridge the gap" is from within the
> Akubra-DuraCloud plugin: Some sort of delay/poll logic for writes,
> which can be turned off if you don't need it.

I went ahead and added a READ_AFTER_WRITE option to do this: After
any write, it immediately polls to ensure the write is reflected, and
returns if successful. If not, it sleeps 200ms and polls again,
continuing until successful (and increasing the sleep time by a factor
of 3 between polls) for up to a total of ~2.6 seconds. After that, it
logs a warning that, while the content has been committed, it may not
have been propagated yet.
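
Roughly, the retry logic looks like the sketch below. It is written in
Python for brevity rather than taken from the plugin, so the names
(sleep_schedule_ms, wait_until_visible, is_visible) are hypothetical,
and the cap of three sleeps is inferred from the figures above
(200 + 600 + 1800 ms = 2.6 s).

```python
import logging
import time
from typing import Callable, Iterator

log = logging.getLogger("read_after_write")

INITIAL_SLEEP_MS = 200  # first sleep, per the description above
BACKOFF_FACTOR = 3      # each sleep is 3x the previous one
MAX_SLEEPS = 3          # assumption: 200 + 600 + 1800 ms = ~2.6 s total


def sleep_schedule_ms(initial: int = INITIAL_SLEEP_MS,
                      factor: int = BACKOFF_FACTOR,
                      count: int = MAX_SLEEPS) -> Iterator[int]:
    """Yield the sleep durations (in ms) used between polls."""
    delay = initial
    for _ in range(count):
        yield delay
        delay *= factor


def wait_until_visible(is_visible: Callable[[], bool],
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll until a completed write is visible, or give up after ~2.6 s.

    is_visible is a caller-supplied check (hypothetical here), for example
    "does the store now report this content id as existing?".
    """
    # Poll once right away; no sleep is needed when this succeeds.
    if is_visible():
        return True
    # Otherwise sleep, poll again, and back off between attempts.
    for delay_ms in sleep_schedule_ms():
        sleep(delay_ms / 1000)
        if is_visible():
            return True
    # The write itself succeeded; only its visibility is unconfirmed.
    log.warning("content committed, but may not have propagated yet")
    return False
```

Passing sleep in as a parameter lets a test exercise the schedule
without real delays; is_visible would wrap whatever existence check
the store offers.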

In my tests so far (small files only), about 80% of the time, an
immediate poll is successful. The remainder of the time, the second
poll (after ~200ms) is successful.

- Chris

Andrew Woods

Feb 1, 2010, 3:06:53 PM
to duracl...@googlegroups.com
Thanks for the implementation example, Chris.
I am glad it does the trick. We had considered a similar approach on
top of the storageprovider implementations, but ultimately decided
against it because of the latency overhead of making a second, third,
... call after every create/update/delete operation.
Fundamentally, cloud storage behaves differently from a filesystem.
From the Akubra perspective it makes sense to reconcile those
differences. From the DuraCloud perspective the answer is less clear.
At the very least, where the differences are not reconciled, we need
to make them clear to the user.
Andrew
