SyncStorage Proposal: change X-Timestamp to X-Last-Modified


Ryan Kelly

Mar 30, 2012, 9:03:11 PM
to servic...@mozilla.org

A disclaimer: this is probably too ambitious and risky at this point in
the Sync2.0 timeline. Still, I'm raising it because it's been driving
me mad.


TL;DR: Using "the current time on the server" to drive coordination
between clients leads to race conditions and corner-cases that
can't be resolved without brutal session-level locking. If we
instead used "the time this resource was last modified" then we
wouldn't have this problem.


Consider this two-client scenario, which has been discussed in various
forms on this list before:

* Client A requests to write some new records at timestamp T1
* This takes a long time, so we handle other requests concurrently
* Client B requests to read the latest records at timestamp T2
* Client B's request completes with X-Timestamp = T2
* Client A's request continues, writing more records with timestamp T1
* Since T1 < T2, Client B will never fetch the records written by A


The problem arises because when Client B receives X-Timestamp = T2, it
assumes that it now has all records that have been created before this
time. In the current world, that's just not true.
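The race can be made concrete with a toy simulation (illustrative Python only, not the real server code): the server stamps A's records with the time A's write began (T1) but hands reader B the current clock (T2 > T1), so B's next "newer than T2" fetch can never see A's late-arriving records.

```python
class ToyServer:
    def __init__(self):
        self.records = []          # committed (timestamp, payload) pairs
        self.clock = 0

    def now(self):
        self.clock += 1
        return self.clock

server = ToyServer()

t1 = server.now()                          # Client A's write begins at T1
server.records.append((t1, "a-first"))     # first record of the batch lands

t2 = server.now()                          # Client B reads: X-Timestamp = T2
seen = [p for ts, p in server.records if ts <= t2]
high_water = t2                            # B assumes it has everything <= T2

server.records.append((t1, "a-second"))    # A's slow write finishes, still T1

# B's next incremental fetch asks for records newer than T2 -- and since
# "a-second" carries timestamp T1 < T2, it is invisible forever.
missed_later = [p for ts, p in server.records if ts > high_water]
assert "a-second" not in seen and "a-second" not in missed_later
```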

To avoid the race condition we have two options:

1) Locking. This could be either client-driven or server-driven,
but it has to ensure that only a single client is accessing a collection
at any one time. Note that it must block both readers and writers for
this to be effective, which is suboptimal.


2) Lie to Client B. Instead of returning the actual time on the
server, return a timestamp that is strictly before all
currently-executing writes. That's just plain yuck.


If we changed things so that Client B explicitly received the
last-modified time of the collection, we would not have to lie and we
would not have this problem:

* Client A requests to write some new records at timestamp T1
* This takes a long time, so we handle other requests concurrently
* Client B requests to read the latest records at timestamp T2
* Since A's request hasn't completed, last-modified time is T0 < T1
* Client B's request completes with X-Last-Modified = T0
* Client A's request continues, writing more records with timestamp T1
* The last-modified time is now T1
* Client B's next request for updates will fetch records written by A
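The same toy server illustrates the fix (again a sketch, not real server code): the read response carries the collection's last-modified time, i.e. the newest *committed* record, rather than the wall clock.

```python
class ToyServer:
    def __init__(self):
        self.records = []          # committed (timestamp, payload) pairs
        self.clock = 0

    def now(self):
        self.clock += 1
        return self.clock

    def last_modified(self):
        # Newest committed record; 0 for an empty collection.
        return max((ts for ts, _ in self.records), default=0)

server = ToyServer()
server.records.append((server.now(), "old"))   # committed earlier: T0
t0 = server.last_modified()

t1 = server.now()                              # A's write begins at T1,
                                               # but commits nothing yet

high_water = server.last_modified()            # B reads: X-Last-Modified = T0
assert high_water == t0

server.records.append((t1, "a-record"))        # A's write completes

# B's next fetch asks for records newer than T0 and picks up A's data.
fetched = [p for ts, p in server.records if ts > high_water]
assert fetched == ["a-record"]
```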


As further evidence that X-Last-Modified is the semantic that we're
actually interested in, consider the Sync2.0 addition of the
X-If-Modified-Since header. Surely it makes more sense to pair this
with X-Last-Modified than with X-Timestamp...


Thoughts?

Ryan
_______________________________________________
Services-dev mailing list
Servic...@mozilla.org
https://mail.mozilla.org/listinfo/services-dev

Toby Elliott

Apr 1, 2012, 8:06:45 PM
to Ryan Kelly, servic...@mozilla.org
There's a gap in the logic there. X-Timestamp is sent as a convenience, but it's not what the client should be using there.

The client should be using the maximum timestamp of the records they have downloaded in the collection. If the GET request comes back with no results, it should issue the same query later, not update to use the new timestamp. This ensures that they won't miss any records in the scenario you describe below.

There is a race condition, because if B does a write, it should update its internal timestamp to the new value, and that could cause it to miss a couple of client A updates. However, internal locking should preclude this problem.

I'm not actually sure what scenario further session-based locking would prevent.

Toby

Richard Newman

Apr 1, 2012, 9:26:09 PM
to Toby Elliott, servic...@mozilla.org
I'm going to sidestep a little bit of what Ryan wrote, because I think there's a little more to this.

Apologies for the length, and all the ASCII diagrams.


> There's a gap in the logic there. X-Timestamp is sent as a convenience, but it's not what the client should be using there.
>
> The client should be using the maximum timestamp of the records they have downloaded in the collection.

The client doesn't care about the maximum timestamp; it cares about the minimum timestamp of its next fetch, so as to achieve continuity.

If X-Timestamp is computed pseudo-atomically (that is, as if concurrent requests are strictly ordered in timestamp) when a request is received (it should be, and used as an upper bound on the query), then two things are true:

* Any subsequent write will have a higher timestamp, even if there are two clients making simultaneous requests
* All returned records will have an earlier timestamp (unless the server is lazy or non-ACID).

That's exactly what the client wants: a number that it can hand back to the server to get "records that have changed since my last query".
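The client loop this implies can be sketched as follows (function names are invented stand-ins for the real HTTP layer): hand the returned timestamp straight back as the lower bound of the next fetch.

```python
# Toy server state: committed (timestamp, payload) pairs and a clock.
server_records = [(1, "a"), (2, "b"), (3, "c")]
server_clock = [3]

def get_collection(newer):
    """Stand-in for GET on a collection with ?newer=...; returns the
    matching records plus the X-Timestamp the server would report."""
    recs = [p for ts, p in server_records if ts > newer]
    return recs, server_clock[0]

high_water = 0
recs, high_water = get_collection(newer=high_water)   # first full fetch
assert recs == ["a", "b", "c"]

server_records.append((4, "d"))                       # a later write
server_clock[0] = 4

recs, high_water = get_collection(newer=high_water)   # incremental fetch
assert recs == ["d"]                                  # continuity: no gaps
```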

This fits really nicely in the modern Sync client architecture ("repositories" on desktop, and Android Sync). Firefox Sync tracks record timestamps as you describe because it's legacy code, which only uses timestamps on the server leg of the sync (it uses the tracker locally), but Android Sync uses X-Weave-Timestamp in the GET response.

So that's the sunshine world.


The issue that Ryan is describing (albeit not mentioning all of the timestamps) is one where this is true:

Write lifespan:
|-------------------------------------------|
W1                                          W2

Read lifespan:
      |---------------------------|
      R1                          R2

Now, if all of the records being written have the same timestamp W1, there is *no* value that can be returned to the reader that will allow it to retrieve just the records it missed, because the write started but did not complete before the reader went away. No matter what kind of isolation is being used, there will be records written between R2 and W2 which have timestamp W1, and thus will never be seen.

This even applies if the reader tracks timestamps of the records it sees, regardless of isolation. If a parallel write completes between W1 and R1, then the reader would see it, advance to that time and miss every subsequent record written with modified = W1.

That is: if we're writing records with modified = W1, we are screwing up. Diagram of that second case:


Write lifespan:
|-------------------------------------------|
W1                                          W2

Write lifespan:
   |--------|
   W3       W4

Read lifespan:
               |-----------------|
               R1                R2


Oops! The reader will advance its timestamp to W3, and will never see records marked as W1, regardless of whether it looks at the header or the records.


If the records are being written with the current timestamp at each time, then using R1 as the X-Timestamp will return the missing records in the subsequent fetch (apart from any that were missed due to isolation). Nearly there.

If the records are written with timestamp W2, then returning either R1 or R2 -- or even W4! -- for the read timestamp will work. (And if the write is isolated, the reader won't even see duplicate records!)


That's what I'd like to see happen: atomic writes with timestamp = W2/W4, and the reader receiving X-Timestamp = R1. (Not R2, so we can catch the third case of a write starting after the read has begun.)

It is also acceptable to return any value earlier than R1, if it won't cause excessive duplicate record fetches. For example, one could return W4.

Here's the worst case:


Write lifespan:
|-------------------------------------------|
W1                                          W2

Write lifespan:
   |--------|
   W3       W4

Read lifespan:
               |-----------------|
               R1                R2

Write lifespan:
                    |--------|
                    W5       W6


The only correct solution for this is to use W2/W4/W6 as record timestamps, and return R1 as the reader's X-Timestamp.

I advise caution in using the last modified time of the collection, here: that would be W6, which -- depending on whether you're using a database with correct isolation -- might cause all of the records in the third write to be skipped. (Isolated = the read never sees them, but we use the timestamp as if we did.)

If you can guarantee that W5-W6 are returned in the query R1-R2, then by all means use W6 as the X-Timestamp.

(Yes, looking at the record timestamps would solve the W5-W6 case. But I think I could draw a bunch more lines and find a way to screw that up, too.)

The safest answer is, IMO, to peg writes to their completion times and reads to their start times, always. It'll work with or without isolation, but more neatly with.
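That rule can be sketched with a toy in-memory store (invented names, and a logical clock standing in for server time): a write stamps its records with the time it *finishes*, while a read reports the time it *started*.

```python
import threading
from itertools import count

clock = count(1)                        # monotonically increasing "time"
lock = threading.Lock()
store = []                              # (modified, payload) pairs

def write(payloads):
    with lock:
        start = next(clock)             # write begins...
        rows = [p for p in payloads]    # ...the (slow) insert work...
        completed = next(clock)         # ...and finishes later
        store.extend((completed, p) for p in rows)
        return completed                # records carry the completion time

def read(since):
    started = next(clock)               # read is pegged to its start time
    with lock:
        items = [p for ts, p in store if ts > since]
    return started, items

w_ts = write(["x", "y"])
r_ts, items = read(since=0)
assert sorted(items) == ["x", "y"]
assert r_ts > w_ts      # any write the read missed must complete later still
```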

> If the GET request comes back with no results, it should issue the same query later, not update to use the new timestamp. This ensures that they won't miss any records in the scenario you describe below.

I don't think it really matters; the timestamp should be computed as a snapshot at the very start of a request handler, prior to any DB operation, so it should always be prior to the modified timestamp of any concurrent write: either that write has completed, and thus its timestamp is earlier and the records will be retrieved, or the write hasn't completed and thus will receive a later timestamp. This is a trivial attempt at isolation.

And in most cases there'll be a subsequent upload anyway, with a race window you could drive a bus through. :/


> There is a race condition, because if B does a write, it should update its internal timestamp to the new value, and that could cause it to miss a couple of client A updates. However, internal locking should preclude this problem.

Clients update their timestamps after an upload simply to avoid getting their own records back. They sacrifice correctness in the face of race conditions as a result. Yes, locking will shrink or eliminate that hole.


> I'm not actually sure what scenario further session-based locking would prevent.

By allowing the client to determine what constitutes a write that requires exclusion.

If you're trying to guess, by using HTTP or database lifetimes, then everything breaks when a client has to batch POSTs (which we do all the time, because there's a limit of 1MB per request). Now you have what the client considers to be a single upload, but which the server considers to be separate writes. And that introduces the opportunity for really obvious races due to interleaving.

Ryan Kelly

Apr 1, 2012, 10:19:32 PM
to Richard Newman, servic...@mozilla.org
On 02/04/12 11:26, Richard Newman wrote:
>
> The issue that Ryan is describing (albeit not mentioning all of the timestamps) is one where this is true:
>
> Write lifespan:
> |-------------------------------------------|
> W1                                          W2
>
> Read lifespan:
>       |---------------------------|
>       R1                          R2
>
> Now, if all of the records being written have the same timestamp W1, there is *no* value that can be returned to the reader that will allow it to retrieve just the records it missed, because the write started but did not complete before the reader went away. No matter what kind of isolation is being used, there will be records written between R2 and W2 which have timestamp W1, and thus will never be seen.

The server currently *does* write all records with timestamp W1.

In the model I imagined, the reader here would be isolated so it did not
see *any* of the concurrently-performed writes. Thus it would see an
X-Last-Modified value of less-than-or-equal-to W1, and a subsequent read
would retrieve all the new data.

Of course, this kind of isolation may not be practical at the database
level, but it bears thinking about.

> This even applies if the reader tracks timestamps of the records it sees, regardless of isolation. If a parallel write completes between W1 and R1, then the reader would see it, advance to that time and miss every subsequent record written with modified = W1.
>
> That is: if we're writing records with modified = W1, we are screwing up.

We are, and we are. Hence "driving me mad" in the initial email ;-)

> Diagram of that second case:
>
>
> Write lifespan:
> |-------------------------------------------|
> W1                                          W2
>
> Write lifespan:
>    |--------|
>    W3       W4
>
> Read lifespan:
>                |-----------------|
>                R1                R2
>
>
> Oops! The reader will advance its timestamp to W3, and will never see records marked as W1, regardless of whether it looks at the header or the records.

Indeed. I suppose this could be solved by enforcing serialization of
the write requests - for example by giving the second writer a "409
Conflict" response rather than processing overlapping writes.

> If the records are being written with the current timestamp at each time, then using R1 as the X-Timestamp will return the missing records in the subsequent fetch (apart from any that were missed due to isolation). Nearly there.

The spec currently states explicitly that all writes in a batch will
have the same timestamp. I would be quite happy for this requirement to
go away if we win something in return.

> If the records are written with timestamp W2, then returning either R1 or R2 -- or even W4! -- for the read timestamp will work. (And if the write is isolated, the reader won't even see duplicate records!)
>
> That's what I'd like to see happen: atomic writes with timestamp = W2/W4,

OK. But assume per the spec that these are actual server-side
timestamps, so that W2 is the time at which the write completes. How do
we perform such a write, short of time travel?

We could probably construct some sort of artificial timestamp to
simulate this, but I'm not convinced it could be done with any more
efficiency than just serializing all the write requests.

> and the reader receiving X-Timestamp = R1. (Not R2, so we can catch the third case of a write starting after the read has begun.)
>
> It is also acceptable to return any value earlier than R1, if it won't cause excessive duplicate record fetches. For example, one could return W4.

Which in this case would be equal to X-Last-Modified for the collection...

> Here's the worst case:
>
>
> Write lifespan:
> |-------------------------------------------|
> W1                                          W2
>
> Write lifespan:
>    |--------|
>    W3       W4
>
> Read lifespan:
>                |-----------------|
>                R1                R2
>
> Write lifespan:
>                     |--------|
>                     W5       W6
>
>
> The only correct solution for this is to use W2/W4/W6 as record timestamps, and return R1 as the reader's X-Timestamp.
>
> I advise caution in using the last modified time of the collection, here: that would be W6, which -- depending on whether you're using a database with correct isolation -- might cause all of the records in the third write to be skipped. (Isolated = the read never sees them, but we use the timestamp as if we did.)
>
> If you can guarantee that W5-W6 are returned in the query R1-R2, then by all means use W6 as the X-Timestamp.

This sounds like it could work, and even if there's some corner-case
that's missing I'm sure we could come up with something that did work.

But part of my point is, X-Timestamp is currently defined as "the
current time on the server". It's clear that this is pretty yuck.

If we do the above then we're re-defining X-Timestamp to be something
more complicated and less obvious, something that's driven by the
particular behavior of a particular client. And we're potentially
calling two related-but-different concepts (completion time of write,
start time of read) by the same name.

I'd prefer if we figured out precisely what characteristic of these
timestamps makes them the right choice for coordination, and then called
them by a more descriptive name. X-Last-Modified is one possibility but
perhaps not the best one.

>> I'm not actually sure what scenario further session-based locking would prevent.
>
> By allowing the client to determine what constitutes a write that requires exclusion.
>
> If you're trying to guess, by using HTTP or database lifetimes, then everything breaks when a client has to batch POSTs (which we do all the time, because there's a limit of 1MB per request). Now you have what the client considers to be a single upload, but which the server considers to be separate writes. And that introduces the opportunity for really obvious races due to interleaving.

Fair enough. And assuming everyone does locking and respects the locks
of others, none of these issues will really matter.

I'd still prefer to resolve them if we can - particularly because we
have Apps-in-the-Cloud using the same underlying semantics with no
intention of doing client-driven locking.


Cheers,

Ryan

Richard Newman

Apr 1, 2012, 10:52:08 PM
to Ryan Kelly, servic...@mozilla.org
> In the model I imagined, the reader here would be isolated so it did not see *any* of the concurrently-performed writes. Thus it would see an X-Last-Modified value of less-than-or-equal-to W1, and a subsequent read would retrieve all the new data.

And that's fine, apart from the more complex case I outlined. Now you somehow have to coordinate writes across different webheads so as to exclude multiple writes, or find the earliest timestamp that a reader might want.

(Feel free to draw your own absurdly complicated diagram with overlapping lines :D)


> Of course, this kind of isolation may not be practical at the database level, but it bears thinking about.
>
>> This even applies if the reader tracks timestamps of the records it sees, regardless of isolation. If a parallel write completes between W1 and R1, then the reader would see it, advance to that time and miss every subsequent record written with modified = W1.
>>
>> That is: if we're writing records with modified = W1, we are screwing up.
>
> We are, and we are. Hence "driving me mad" in the initial email ;-)

:)


> Indeed. I suppose this could be solved by enforcing serialization of the write requests - for example by giving the second writer a "409 Conflict" response rather than processing overlapping writes.

But if we can do that, we can do it for reads, too, and avoid this whole problem -- serialize *all* requests whenever there's a writer. But I doubt most clients would be happy routinely receiving a 409 at any point in a sync. And serializing could be expensive.


>> If the records are being written with the current timestamp at each time, then using R1 as the X-Timestamp will return the missing records in the subsequent fetch (apart from any that were missed due to isolation). Nearly there.
>
> The spec currently states explicitly that all writes in a batch will have the same timestamp. I would be quite happy for this requirement to go away if we win something in return.

Nah, was just pre-emptively addressing an alternative.


>> If the records are written with timestamp W2, then returning either R1 or R2 -- or even W4! -- for the read timestamp will work. (And if the write is isolated, the reader won't even see duplicate records!)
>>
>> That's what I'd like to see happen: atomic writes with timestamp = W2/W4,
>
> OK. But assume per the spec that these are actual server-side timestamps, so that W2 is the time at which the write completes. How do we perform such a write, short of time travel?

By performing two writes: an INSERT, then an UPDATE with the timestamp at which the INSERT completed. (Or pick an arbitrary default pad, and only update if the insert takes longer than that -- cover the 95% case by using now() + 200msec.)
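One way that two-step trick might look (a sketch only; table and column names are invented, not the real schema): INSERT the batch with a provisional stamp, then UPDATE it to the time the INSERT actually finished, all inside one transaction so readers never observe the provisional value.

```python
import sqlite3
import time

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE bso (id TEXT PRIMARY KEY, modified REAL, payload TEXT)")

def write_batch(rows):
    provisional = time.time()
    with db:                                   # single transaction
        db.executemany(
            "INSERT INTO bso (id, modified, payload) VALUES (?, ?, ?)",
            [(rid, provisional, body) for rid, body in rows])
        completed = time.time()                # the INSERT has finished
        db.execute("UPDATE bso SET modified = ? WHERE modified = ?",
                   (completed, provisional))   # re-stamp the whole batch
    return completed

ts = write_batch([("abc", "hello"), ("def", "world")])
stamps = {row[0] for row in db.execute("SELECT modified FROM bso")}
assert stamps == {ts}                          # every record carries W2
```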

There's probably a better way to do it, like hanging on to some cursor and re-walking it. Dunno.


> We could probably construct some sort of artificial timestamp to simulate this, but I'm not convinced it could be done with any more efficiency than just serializing all the write requests.

Perhaps not!

Vector clocks ;)

>> and the reader receiving X-Timestamp = R1. (Not R2, so we can catch the third case of a write starting after the read has begun.)
>>
>> It is also acceptable to return any value earlier than R1, if it won't cause excessive duplicate record fetches. For example, one could return W4.
>
> Which in this case would be equal to X-Last-Modified for the collection...

Yup, that's what I was implying.


> But part of my point is, X-Timestamp is currently defined as "the current time on the server". It's clear that this is pretty yuck.

I believe it should be more precisely defined as "the snapshot instant in time on the server against which this operation ran", or something else, yes. Everything is dandy if we can adequately achieve something vaguely ACID here.


> If we do the above then we're re-defining X-Timestamp to be something more complicated and less obvious, something that's driven by the particular behavior of a particular client. And we're potentially calling two related-but-different concepts (completion time of write, start time of read) by the same name.

Even worse: potentially four different things. But I don't think it's too complicated:

* For read requests, X-Timestamp is the instant in time at which the request was received. Queries apply to a snapshot at that instant.
* For write requests, X-Timestamp is the modified time of the modified records, which will be the instant in time at which the operation completed.

or

* For write requests, X-Timestamp is the last modified time of the collection at the conclusion of the processing of this request.

These can be given different names if desired, so write requests would get *two* response headers.

X-Request-Processed
X-Write-Completed
X-Last-Modified
?

> I'd prefer if we figured out precisely what characteristic of these timestamps makes them the right choice for coordination, and then called them by a more descriptive name. X-Last-Modified is one possibility but perhaps not the best one.

Agreed. Consider that big fat email my input :D

> Fair enough. And assuming everyone does locking and respects the locks of others, none of these issues will really matter.

Yup. This is belt-and-braces, but more importantly it's well-defined semantics.

> I'd still prefer to resolve them if we can - particularly because we have Apps-in-the-Cloud using the same underlying semantics with no intention of doing client-driven locking.

Agreed.

Toby Elliott

Apr 2, 2012, 12:35:08 AM
to Richard Newman, servic...@mozilla.org

On Apr 1, 2012, at 7:52 PM, Richard Newman wrote:

>> In the model I imagined, the reader here would be isolated so it did not see *any* of the concurrently-performed writes. Thus it would see an X-Last-Modified value of less-than-or-equal-to W1, and a subsequent read would retrieve all the new data.
>
> And that's fine, apart from the more complex case I outlined. Now you somehow have to coordinate writes across different webheads so as to exclude multiple writes, or find the earliest timestamp that a reader might want.
>
> (Feel free to draw your own absurdly complicated diagram with overlapping lines :D)
>

Yeah, this is brutally hard to do without a whiteboard.

>
>> Of course, this kind of isolation may not be practical at the database level, but it bears thinking about.
>>
>>> This even applies if the reader tracks timestamps of the records it sees, regardless of isolation. If a parallel write completes between W1 and R1, then the reader would see it, advance to that time and miss every subsequent record written with modified = W1.
>>>
>>> That is: if we're writing records with modified = W1, we are screwing up.
>>
>> We are, and we are. Hence "driving me mad" in the initial email ;-)
>
> :)

I'm still not convinced, which may be me being stupid, or may be me making bad assumptions. With one possible exception, I believe that if we do a server-side write-lock, we're in good shape.

Assume the max(modified) of the collection at the start is W0. Assume we write-lock when the request comes in from C1 at W1. That means no other writes can happen while this thread holds the lock.

Also assume that the writes are being done in a single transaction. That means the batch size is less than 100, or whatever we end up with it set to. (yes, there's a hole here if we're doing multiple batches. That can't be solved without more complex locking and we probably shouldn't do them)

In this situation, C1 is writing. Any request C2 does will only return records up to W0 - the other transaction hasn't finished being added to the table. Until the write lock on W1 is released, C2 cannot see that data. Therefore, the next time C2 comes in they'll issue a since=W0 request and get back all of the W1 records. In the meantime, C1 gets a timestamp back from the server and is assured that nothing got written between W1 and the present (since they had the lock), therefore, they can update their timestamp to W1.

What am I missing here?


>
>> But part of my point is, X-Timestamp is currently defined as "the current time on the server". It's clear that this is pretty yuck.
>
> I believe it should be more precisely defined as "the snapshot instant in time on the server against which this operation ran", or something else, yes. Everything is dandy if we can adequately achieve something vaguely ACID here.
>

I believe that this result is only "relevant" for certain queries. We've added it to a bunch of other calls because it was free and has some uses in logging and reconciliation, but if we're trying to give it new semantic value, we're probably making a mistake.

>> Fair enough. And assuming everyone does locking and respects the locks of others, none of these issues will really matter.
>
> Yup. This is belt-and-braces, but more importantly it's well-defined semantics.
>

And the advantage of doing all the locking at the server level is that we don't have to assume clients will respect other locks - they don't have a choice.

Toby

Mike Connor

Apr 2, 2012, 12:27:58 PM
to Richard Newman, servic...@mozilla.org

On 2012-04-01, at 10:52 PM, Richard Newman wrote:
>
>> Indeed. I suppose this could be solved by enforcing serialization of the write requests - for example by giving the second writer a "409 Conflict" response rather than processing overlapping writes.
>
> But if we can do that, we can do it for reads, too, and avoid this whole problem -- serialize *all* requests whenever there's a writer. But I doubt most clients would be happy routinely receiving a 409 at any point in a sync. And serializing could be expensive.

If it's a documented API response, I would expect clients to handle this. We already expect clients to handle 401/503, and in 2.0 we're already exposing 412 as a possible/likely failure state, so what's one more mid-sync error to handle?

-- Mike

Richard Newman

Apr 2, 2012, 2:11:28 PM
to Mike Connor, servic...@mozilla.org
> If it's a documented API response, I would expect clients to handle this. We already expect clients to handle 401/503, and in 2.0 we're already exposing 412 as a possible/likely failure state, so what's one more mid-sync error to handle?

Yeah, I'm thinking about recoverability. 409 in this case is tantamount to "hold on a minute, someone else is working on this"; 401/503/412 are firmer errors.

If we routinely hit race conditions, it would seem desirable to pause and retry, which is more complicated.

(That is, right now a non-200 is ZOMG abort the sync.)

Mike Connor

Apr 2, 2012, 2:20:58 PM
to Richard Newman, servic...@mozilla.org
On 2012-04-02, at 2:11 PM, Richard Newman wrote:

>> If it's a documented API response, I would expect clients to handle this. We already expect clients to handle 401/503, and in 2.0 we're already exposing 412 as a possible/likely failure state, so what's one more mid-sync error to handle?
>
> Yeah, I'm thinking about recoverability. 409 in this case is tantamount to "hold on a minute, someone else is working on this"; 401/503/412 are firmer errors.
>
> If we routinely hit race conditions, it would seem desirable to pause and retry, which is more complicated.
>
> (That is, right now a non-200 is ZOMG abort the sync.)

So… why wouldn't a client abort and retry from scratch once the 409s are done? Is there something I'm missing?

-- Mike

Richard Newman

Apr 2, 2012, 2:48:49 PM
to Mike Connor, servic...@mozilla.org
>> If we routinely hit race conditions, it would seem desirable to pause and retry, which is more complicated.
>>
>> (That is, right now a non-200 is ZOMG abort the sync.)
>
> So… why wouldn't a client abort and retry from scratch once the 409s are done? Is there something I'm missing?

409 can mean as little as "someone else is reading from this collection" (if we do pure locking), or "someone is writing to this collection" (if we do write-only exclusion).

It seems very heavyweight to completely abort a sync when waiting 250ms and retrying would probably work just fine.

We have no idea how frequent these races are right now. We might double our traffic for all we know.
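The pause-and-retry handling this suggests on the client might look something like the following sketch (function names invented): back off briefly on a 409 and reissue the request, rather than treating it as a fatal, sync-aborting error.

```python
import time

def request_with_retry(do_request, attempts=5, base_delay=0.25):
    """Retry a request while the server answers 409 (collection busy)."""
    for attempt in range(attempts):
        status, body = do_request()
        if status != 409:
            return status, body
        time.sleep(base_delay * (2 ** attempt))   # simple exponential backoff
    return status, body                           # give up: still conflicted

# Toy endpoint that is write-locked for the first two attempts.
calls = {"n": 0}
def fake_request():
    calls["n"] += 1
    return (409, None) if calls["n"] <= 2 else (200, "ok")

status, body = request_with_retry(fake_request, base_delay=0.001)
assert (status, body) == (200, "ok")
assert calls["n"] == 3                            # two backoffs, then success
```

A real client would presumably also honor a Retry-After header if the server supplied one.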

Richard Newman

Apr 2, 2012, 3:37:39 PM
to Toby Elliott, servic...@mozilla.org
>> Really depends on the size of the writes. If it's a set of large batch uploads, 250ms will be race condition hell.
>
> There's no way we're going to let a lock persist for more than 30s, so a retry-after set to that would seem to be fine. My reading suggests that 409 expects you to provide information on how to resolve the conflict, and a Retry-After seems to meet that.

I think we're spinning off a little bit here.

The issue, six replies deep, was "should we completely serialize all storage operations when there's a writer?". Doing that would cause 409s to occur at any point in a sync, but also might carry additional expense (and impact the instant-ness of syncing).

If we want to achieve a workable set of semantics without locking or serialization (and I think rfkelly and I both agree that we do), then what advantage would such serialization of _readers_ bring?

Can we get to a point where we can get safe behavior out of clients without any kind of locking, or by only serializing writers?

Ryan Kelly

Apr 2, 2012, 5:52:17 PM
to Toby Elliott, servic...@mozilla.org
On 02/04/12 14:35, Toby Elliott wrote:
>>>> This even applies if the reader tracks timestamps of the records it sees, regardless of isolation. If a parallel write completes between W1 and R1, then the reader would see it, advance to that time and miss every subsequent record written with modified = W1.
>>>>
>>>> That is: if we're writing records with modified = W1, we are screwing up.
>>>
>>> We are, and we are. Hence "driving me mad" in the initial email ;-)
>>
>> :)
>
> I'm still not convinced, which may be me being stupid, or may be me making bad assumptions. With one possible exception, I believe that if we do a server-side write-lock, we're in good shape.
>
> Assume the max(modified) of the collection at the start is W0. Assume we write-lock when the request comes in from C1 at W1. That means no other writes can happen while this thread holds the lock.
>
> Also assume that the writes are being done in a single transaction. That means the batch size is less than 100, or whatever we end up with it set to. (yes, there's a hole here if we're doing multiple batches. That can't be solved without more complex locking and we probably shouldn't do them)

We could solve it by removing the requirement that all items in a POST
receive the same timestamp. Each batch of 100 could be committed
atomically and become visible to readers in timestamp order.

The writer would have to hold the write lock for the whole POST though,
otherwise they might get confused by other people's writes interleaved
with their own.
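A toy sketch of that scheme (illustrative only): split the upload into batches, give each batch its own strictly increasing timestamp committed atomically, and hold one write lock across the whole POST so no other writer can interleave.

```python
import threading

write_lock = threading.Lock()
store = []                                 # (modified, payload) pairs
clock = [0]                                # logical clock standing in for time

def upload(payloads, batch_size=100):
    stamps = []
    with write_lock:                       # held for the entire upload
        for i in range(0, len(payloads), batch_size):
            clock[0] += 1                  # each batch gets its own timestamp
            ts = clock[0]
            # Commit the batch "atomically": readers see all of it or none.
            store.extend((ts, p) for p in payloads[i:i + batch_size])
            stamps.append(ts)
    return stamps

stamps = upload(["rec%d" % i for i in range(250)])
assert stamps == [1, 2, 3]                 # three batches, in timestamp order
assert len(store) == 250
```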

> In this situation, C1 is writing. Any request C2 does will only return records up to W0 - the other transaction hasn't finished being added to the table. Until the write lock on W1 is released, C2 cannot see that data. Therefore, the next time C2 comes in they'll issue a since=W0 and get back all of W1s. In the meantime, C1 gets a timestamp back from the server and is assured that nothing got written between W1 and the present (since they had the lock), therefore, they can update their timestamp to W1.
>
> What am I missing here?

Assuming the client behaves as you describe, I think you're right given
the two properties of the server we have discussed:

1) Serialised writers
2) Isolated transactions

I think we're both arguing for roughly the same thing - using the
last-modified time of the collection. Whether it's something the server
sends back or something the client calculates from the returned data
doesn't make much difference IMHO.

(They could be different if there was the possibility of returning
partial data to the client, so that max(modified) over the returned data
could be less than X-Last-Modified. But that hasn't made it into the
spec, yet.)
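To make the equivalence concrete, here is a rough sketch of how a client could derive its next sync timestamp either way (the function name and record shape are hypothetical). The two sources agree exactly when the response is not partial:

```python
def next_sync_timestamp(records, x_last_modified=None):
    """Sketch: the client can either trust a server-sent
    X-Last-Modified value or compute the equivalent from the
    returned records -- identical unless the response is partial."""
    if x_last_modified is not None:
        return x_last_modified
    return max(r["modified"] for r in records) if records else 0
```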

But my understanding is that this is what the X-Timestamp header is
*for*. The wording of our spec encourages you to think of it as a
coordination mechanism, and that's certainly how it's being used in
practice right now. Hence my temptation to provide another header that
does a better job of it.


Cheers,

Ryan

Ryan Kelly

unread,
Apr 2, 2012, 6:01:47 PM4/2/12
to Richard Newman, servic...@mozilla.org
On 03/04/12 04:11, Richard Newman wrote:
>> If it's an documented API response, I would expect clients to handle this. We already expect clients to handle 401/503, and in 2.0 we're already exposing 412 as a possible/likely failure state, so what's one more mid-sync error to handle?
>
> Yeah, I'm thinking about recoverability. 409 in this case is tantamount to "hold on a minute, someone else is working on this"; 401/503/412 are firmer errors.
>
> If we routinely hit race conditions, it would seem desirable to pause and retry, which is more complicated.
>
> (That is, right now a non-200 is ZOMG abort the sync.)

But if "someone else is working on this", shouldn't you try to read and
reconcile their changes before pushing ahead with your own? Honest
question - I've no idea whether or not that would be necessary.

One possible downside to the ZOMG-abort-the-sync approach is the
possibility for live-locking between two clients. Suppose client A is
most of the way through its sync while client B is just starting one.
Client A gets a 409-conflict and aborts, client B continues. When
client B is almost finished *its* sync, client A starts again and this
time B gets the 409-conflict and aborts. Dance dance dance. They might
never complete a successful sync session.

There are ways to deal with that of course, but they might be more
complicated than we're willing to force upon third-party clients.
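One standard way to break that dance is randomized backoff, so colliding clients fall out of lock-step. A sketch (the `do_sync` callable and its True/False contract are hypothetical, standing in for "run a sync session, abort on 409"):

```python
import random
import time

def sync_with_backoff(do_sync, max_attempts=5, base_delay=1.0):
    """Retry a sync session that aborts on 409-conflict.

    Sketch only: `do_sync` returns True on success, False on a
    409-conflict abort. Randomized, growing delays make it unlikely
    that two colliding clients keep retrying in lock-step.
    """
    for attempt in range(max_attempts):
        if do_sync():
            return True
        # random delay that grows with each failed attempt
        time.sleep(random.uniform(0, base_delay * (2 ** attempt)))
    return False
```

Whether third-party clients can be expected to implement even this much is the open question.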

Ryan

Toby Elliott

unread,
Apr 2, 2012, 6:06:33 PM4/2/12
to Ryan Kelly, servic...@mozilla.org

On Apr 2, 2012, at 2:52 PM, Ryan Kelly wrote:
> understanding is that this is what the X-Timestamp header is *for*. The wording of our spec encourages you to think of it as a coordination mechanism, and that's certainly how it's being used in practice right now. Hence my temptation to provide another header that does a better job of it.
>
A ha. Now we get to the source of the issue.

X-Timestamp is essential for a couple of calls, notably PUT. At some point along the way, we said "hey, since this is being generated by the controller and always available, why don't we just always return it." Apparently that has given the header a lot more weight than I think anyone had ever intended, and if it is causing this much confusion, maybe we should remove it from nonessential responses.

Toby

Ryan Kelly

unread,
Apr 2, 2012, 6:15:25 PM4/2/12
to Toby Elliott, servic...@mozilla.org
On 03/04/12 08:06, Toby Elliott wrote:
>
> On Apr 2, 2012, at 2:52 PM, Ryan Kelly wrote:
>> understanding is that this is what the X-Timestamp header is *for*. The wording of our spec encourages you to think of it as a coordination mechanism, and that's certainly how it's being used in practice right now. Hence my temptation to provide another header that does a better job of it.
>>
> A ha. Now we get to the source of the issue.
>
> X-Timestamp is essential for a couple of calls, notably PUT. At some point along the way, we said "hey, since this is being generated by the controller and always available, why don't we just always return it." Apparently that has given the header a lot more weight than I think anyone had ever intended, and if it is causing this much confusion, maybe we should remove it from nonessential responses.

For added context, this is an excerpt from my extensive archive of
rnewman's-explanations-of-all-things-sync:

"""
A Sync client, overly simplified, can broadly be considered a loop:

x = 0; // A timestamp in server time.
every hour:
    fetch and apply records since x
    update x

You can see the consequence of the first point: once we advance
`x`, we'll never refetch in a way that we'll get older records, so
we absolutely can't skip any.

Secondly, there's a worry here about the tracking timestamp.

If anything goes wrong in the fetch-and-apply stage, we will abort
and won't advance the timestamp. In the success case we use X-Weave-
Timestamp as the new value of `x`; "the server has given us every
record up until that point, so start there next time".

(Why don't we use a high-watermark of the modified time of processed
records? Because they're not in modified time order.)
"""

The last point is particularly relevant to your suggestion. Richard,
would having the server send such a high-watermark make things
substantially easier for the client? If so, why is this so much better
than doing to work client-side?

Also, quoting from the AITC wiki spec page:

https://wiki.mozilla.org/Apps/AITC
"""
GET {endpoint}/apps/?full=1&after={timestamp}

Returns:

{
  apps: [apps added or updated since after]
}

The value of the X-Timestamp header should be used for the next
value of after. You may use after=0 or no after argument to get
all apps.
"""

Given the discussion so far, this seems to be clearly The Wrong Thing To
Do. We should either have the server provide X-Last-Modified or some
other value suitable for this purpose, or we should explicitly require
the client to figure out such a value for itself.


Ryan

Richard Newman

unread,
Apr 3, 2012, 1:31:26 PM4/3/12
to Ryan Kelly, servic...@mozilla.org
> If anything goes wrong in the fetch-and-apply stage, we will abort
> and won't advance the timestamp. In the success case we use X-Weave-
> Timestamp as the new value of `x`; "the server has given us every
> record up until that point, so start there next time".
>
> (Why don't we use a high-watermark of the modified time of processed
> records? Because they're not in modified time order.)
> """
>
> The last point is particularly relevant to your suggestion. Richard, would having the server send such a high-watermark make things substantially easier for the client? If so, why is this so much better than doing to work client-side?

I might have misled you slightly here.

Desktop Sync does compute the high watermark. Android Sync (and the next internal implementation of Firefox Sync) does/will not.

Under time-based ORDER BY, they're vaguely equivalent. But we don't do that; (almost?) all of our queries are sortindex-based.


The reason for not using the watermark is to avoid the edge case of:

T1: query received, ordering by sortindex, with limit
T2: writer adds a new record with low sortindex, R1
T3: writer adds a new record with high sortindex, R2
T4: query returns R2 amongst results, either by starting late or being non-isolated.

If X-Weave-Timestamp is T1, R1 will be retrieved on the next query, and all is well.

If the last sync timestamp is advanced to the high watermark of retrieved records, it'll be T3, and R1 will never be retrieved. That's bad.

(Note that this can still occur with limit queries *without* simultaneous writes -- you just never fetch R1 because it's not important -- but it's more pernicious if it's a current write that got skipped rather than unimportant record 5,123 from six months ago.)
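The T1..T4 scenario can be made concrete with a toy simulation (all data and names hypothetical; the reader queries by sortindex with a limit, as described above):

```python
# Records as (guid, modified, sortindex). R1 was written at T2=2
# with a low sortindex; R2 at T3=3 with a high sortindex.
records = [
    ("R1", 2, 10),
    ("R2", 3, 99),
]

def query(after, limit):
    """Return up to `limit` records modified after `after`,
    ordered by sortindex descending (not by modified time)."""
    matching = [r for r in records if r[1] > after]
    matching.sort(key=lambda r: -r[2])
    return matching[:limit]

# The limited query returns only R2 (highest sortindex).
got = query(after=0, limit=1)
assert [g[0] for g in got] == ["R2"]

# Advancing to the high watermark of *retrieved* records skips R1 forever:
high_watermark = max(g[1] for g in got)  # == 3, R2's modified time
assert query(after=high_watermark, limit=1) == []  # R1 is lost

# Advancing only to T1 (when the query was received, here 1) means
# R1 is still retrieved on a later query.
assert "R1" in [g[0] for g in query(after=1, limit=2)]
```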

Returning the high watermark saves the client some effort if that's the approach it uses.

If you achieve total ordering of requests under writes (that is, a simultaneous read cannot occur alongside a write), or can promise total isolation, then using the high watermark is safe, and I support that idea.

I still think it's simpler and more consistent for the server to treat these operations as notionally atomic: that is, take a precise timestamp before doing the work, both use that as the upper bound on the read and return it as X-Timestamp, and ensure that no write completing after another operation can write modified timestamps earlier than the other operation began.

(Wow, that's a mouthful.)
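Unpacked into code, the mouthful looks roughly like this (a sketch, using a single coarse lock as a stand-in for whatever serialization the real server would use):

```python
import threading
import time

class SnapshotReads:
    """Sketch of 'notionally atomic' operations: take a precise
    timestamp before doing the work, use it as the upper bound on
    the read, and return it as X-Timestamp. The shared lock is a
    simplification ensuring no write completing after a read began
    can carry a modified time earlier than the read's snapshot.
    """

    def __init__(self):
        self.lock = threading.Lock()
        self.items = {}  # id -> (payload, modified)

    def write(self, item_id, payload):
        with self.lock:
            ts = time.time()
            self.items[item_id] = (payload, ts)
            return ts

    def read_since(self, after):
        with self.lock:
            snapshot_ts = time.time()  # taken *before* scanning
            results = [
                (i, p) for i, (p, m) in self.items.items()
                if after < m <= snapshot_ts
            ]
            # snapshot_ts is what the X-Timestamp header would carry
            return results, snapshot_ts
```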


> The value of the X-Timestamp header should be used for the next
> value of after. You may use after=0 or no after argument to get
> all apps.
> """
>
> Given the discussion so far, this seems to be clearly The Wrong Thing To Do. We should either have the server provide X-Last-Modified or some other value suitable for this purpose, or we should explicitly require the client to figure out such a value for itself.

This is still a valid/consistent approach if one can guarantee some ordering of reads and writes. That is, if the previous X-Timestamp is guaranteed to be the earliest point for any writes that finished after you began reading, you can use it for timeslicing through the record list without missing anything or hitting duplicates.
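Under that guarantee, the timeslicing loop is straightforward. A sketch, where `server.get_since(collection, after)` is a hypothetical call returning the matching records plus the server-supplied upper bound (the proposed X-Last-Modified value) to use as the next `after`:

```python
def fetch_all_changes(server, collection):
    """Page through a collection in timestamp slices.

    Sketch only: relies on the server guaranteeing that no write
    finishing after the read began can carry a modified time at or
    below the returned bound, so no records are missed or duplicated.
    """
    after = 0
    while True:
        records, last_modified = server.get_since(collection, after)
        if not records:
            return
        for record in records:
            yield record
        after = last_modified  # safe as the next `after` value
```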
