Syncing: How should conflicts be handled?

Jannik Schürg

Jan 9, 2016, 4:36:33 AM

to Budget First

I'd like to split off the conflict discussion, since the topic about syncing in general is already big and people might have stopped following.

Consider these cases: You have two or more devices.

1. On at least two devices you change the same thing (but to different values), in this case the memo/income/outcome/category/… of a transaction. The devices sync afterwards. How should we decide which value stays?
2. This time you change a category value (this might only be possible on non-mobile devices!). Same question.

There seem to be two different strategies.

1. We make a (deterministic) choice in software (call it 'automatic').
2. We present some UI which asks the user to decide (call it 'UI').

Of course combinations are possible, too. We could start with strategy 2 and provide a checkbox 'don't ask me again'. Or do strategy 1 for case 1 and strategy 2 for case 2.

A question which also came up is whether changes to the category value should be interpreted as 'setting a value' or 'adding money' (my position is 'setting', which fits better into the spreadsheet idea). This would only matter if we choose 'automatic'.

Jannik Schürg

Jan 9, 2016, 5:22:08 AM

to Budget First

I forgot to mention: This issue only appears if the first change has not been synced to the second device before the second change is made. For example: Both devices have no internet connection for a while.

The automatic strategy is easier to implement and more user-friendly. Consider this scenario: I change a value (any value) on device D1 at home, producing change C1. Soon afterwards I switch to a mobile device and change it again (change C2). Meanwhile a second person in my household (or me again) changes it, too (arguably a rare scenario).

If syncing goes badly, we might first get a conflict on the mobile device and resolve it there. Later, the third change might lead to another conflict, which has to be resolved again. Both could happen on multiple devices, with new conflicts emerging from these possibly conflicting conflict decisions.

This is definitely quite unlikely, I agree. But having the conflict UI appear on two devices at the same time is not unlikely if they sync. Furthermore, if I see on the mobile device that my prior change is not synced yet but still decide to overwrite it, I know at that moment that I will have to resolve it later, which is annoying.

Therefore my proposed strategy: Do it automatically, based on timestamps. Admittedly, timestamps are not 100 percent reliable. But they provide a deterministic strategy and in most cases should be accurate enough to lead to the 'right' decision. It's an easy solution for a quite rare problem.
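
A minimal sketch of the timestamp-based strategy, not code from the project. The `Change` record, its field names and the key format are hypothetical. Breaking ties on `device_id` is an added assumption so that every device picks the same winner even when two timestamps are equal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    key: str          # hypothetical key format, e.g. "transaction-17.memo"
    value: str
    timestamp: float  # seconds since epoch, taken from the device clock
    device_id: str    # assumption: only used to break timestamp ties


def last_wins(a: Change, b: Change) -> Change:
    """Pick the later change; equal timestamps fall back to device_id."""
    return max(a, b, key=lambda c: (c.timestamp, c.device_id))


c1 = Change("transaction-17.memo", "Groceries", 1452331000.0, "D1")
c2 = Change("transaction-17.memo", "Food", 1452331060.0, "D2")
winner = last_wins(c1, c2)  # c2, because it carries the later timestamp
```

The result does not depend on argument order, which is what makes the choice deterministic across devices. It is only as trustworthy as the device clocks.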

Of course, if you decide on two unsynced devices to add money to a category and set the cell value, one of the changes will be lost. Alternatively we could choose to interpret changes as 'adding money', but I think this leads to unexpected behavior: If I change a cell value to something else, I don't think of this as 'adding' just by looking at the UI. The UI communicates 'set'.

Request for comments.

urbanhusky

Jan 9, 2016, 6:29:08 AM

to Budget First

Personally, I'm still not convinced that last wins is a strategy that would not surprise me as a user, especially if parts of my budget that I just touched suddenly change without my interaction - possibly causing a negative budget or something else that needs fixing afterwards.

We could give the user the choice, that is a good idea - but I would put this under the options too.

When the conflict resolution UI is shown, provide an option like "Don't ask me again. Always resolve automatically (last change wins)"

Synchronisation conflicts (?): (Can happen whenever you make a change on two different devices that affects the same thing. For example: changing a note, adjusting the budget of a category etc. Only happens if both devices are offline or otherwise did not sync before you made the second change.)

- Ask me how to resolve conflicts: This works best if the device you're working on is online or otherwise syncing. It gives you full control over what happens.
- Automatically resolve conflicts (last wins): The last change that was made will be the one that is applied, which might be the one from the other device and could overrule what you did on this device. "Last" is the change that was made at the later time, so please note that the clock and timezone of your devices must be correct.

Please make sure that your device is online and capable of synchronising changes before modifying this setting.

Regarding the commands of add vs set: I'd use set generally, except for dedicated actions like "adjust budget by taking from other categories" or the reverse (e.g. category groceries is at -25€, so take 5€ from spending money and 20€ from clothing).

We would have to keep the preconditions in those commands/operations even for these actions.

dominik...@gmail.com

Jan 9, 2016, 7:23:51 AM

to Budget First

+1 that "last transaction wins" is not suitable in all cases.

But I think the possibility of multiple conflicts emerging is worth a thought. So we have two devices D1, D2, with associated states S1, S2.

Let us assume that S1 and S2 are identical, and that we now make a change on both devices so that the resulting states S1' and S2' are no longer identical.

Now device D1 has a stable internet connection and can save its state to the synchronized folder.

Device D2 gets an internet connection later on and now sees the mismatching data while pulling new data.

It asks the user to resolve the conflict, and the user decides by picking the change from D2.

D2 now writes a special merge transaction (the UIDs of the transactions on D1 and D2, plus the solution) to its journal.

D1 pulls the updates from D2 and gets the conflict as well as the solution for it. No need to show anything to the user.
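
One way to picture the merge transaction, as a rough sketch under stated assumptions: the `Op` and `Merge` records, the UID strings and the `apply_journal` helper are all hypothetical. The idea is that a merge entry names the conflicting operations and carries the chosen value, so any device replaying the journal reaches the same state without asking again.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    uid: str
    key: str
    value: int


@dataclass(frozen=True)
class Merge:
    """Hypothetical journal entry: which ops conflicted and what was chosen."""
    conflicting_uids: frozenset
    key: str
    value: int


def apply_journal(entries):
    """Replay a journal; ops named by a Merge entry are skipped."""
    settled = set()
    for entry in entries:
        if isinstance(entry, Merge):
            settled |= entry.conflicting_uids
    state = {}
    for entry in entries:
        if isinstance(entry, Merge) or entry.uid not in settled:
            state[entry.key] = entry.value
    return state


on_d1 = Op("D1-1", "budget.C1", 160)
on_d2 = Op("D2-1", "budget.C1", 120)
# The user on D2 picked D2's change; D2 records that decision.
merge = Merge(frozenset({"D1-1", "D2-1"}), "budget.C1", 120)

d1_journal = [on_d1, on_d2, merge]  # D1 after pulling from D2
d2_journal = [on_d2, on_d1, merge]  # D2 after resolving locally
```

Both journals replay to the same budget even though the two devices saw the conflicting operations in a different order.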

Btw. possible solutions might be (in a shared budget):

- first one wins

- last one wins

- merge of both wins

- undecidable for now

urbanhusky

Jan 9, 2016, 9:22:39 AM

to Budget First

For single operations/conflicts, this is comparatively easy to solve (first/last/user).

But consider the following scenario, where two people are working on the same budget at the same time, adjusting categories etc.

Now we have not only one operation from D1 that conflicts with another operation from D2, but a full chain of operations that might cause conflicts.

Even worse: if we resolve one conflict, we might create another (because the preconditions of the succeeding operations are no longer met).

This might be an argument against letting the user resolve it.

dominik...@gmail.com

Jan 9, 2016, 10:15:48 AM

to Budget First

This would only be true for additional operations on the same data set.

E.g. - Create, Modify, (optional Delete)

In such a scenario I would suggest we fold all operations into one single local and one single remote operation.

The user should then resolve the conflict on that folded operation.

The only way this might result in additional conflicts is if we actually delete a category.

Assume that on D2 (mobile device) we enter a sequence of transactions T1, ..., Tn, all of them falling into category C.

If another user now deletes that category on D1, we surely must resolve all conflicts T1, ..., Tn, and there is no way to do this automatically.

On the other hand, suppose we have a new transaction T1 on D2, which we modify (add 10), modify again (change name) and again (change name), and sort into a category C.

If another user now deletes that category on D1, we only need to resolve one conflict.

I do not see any reason to capture preconditions on an operation besides the existence and value of that category or transaction.

However I would say, show me the code. This way we actually have something to improve in one way or in the other.

urbanhusky

Jan 9, 2016, 10:20:23 AM

to Budget First

The reason to capture the preconditions is to detect conflicts. If the precondition of an operation is no longer met, then we have a conflict.

This is also the way a user might see this: "I have 150€ saved in this category, that means I can remove 10€ from it without causing a huge impact". vs. "I have 10€ saved in this category, removing 10€ would destroy all effort made so far so I don't want that."

Deleting a category would remove all saved money from it and make it available for budgeting. If a category is deleted, then any operations on it could safely be ignored?

dominik...@gmail.com

Jan 9, 2016, 11:10:23 AM

to Budget First

The precondition is the previous value that we change in this operation, not the complete state. If we agree on this, everything is fine, and we can indeed fold every following operation that has the new value as its precondition into this operation as well.

150 € saved O1(-10) o O2(-10) o O3(-20) o O4(+10) = O1234(-30) for the conflict, right?
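
The folding in that line can be checked with a few lines of code. This is only a worked example, assuming the four operations are relative adjustments in euros; `fold_deltas` is a hypothetical helper name.

```python
def fold_deltas(deltas):
    """Compose a run of relative adjustments into one net adjustment."""
    return sum(deltas)


saved = 150
ops = [-10, -10, -20, +10]  # O1 .. O4 from the line above
net = fold_deltas(ops)      # the single folded operation O1234

# Applying the operations one by one gives the same balance
# as applying the folded operation once.
step_by_step = saved
for delta in ops:
    step_by_step += delta
```

Here `net` is -30 and both ways of applying the changes end at 120 €.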

I assume every transaction needs to be associated to exactly one category (I see splits as multiple transactions).

In that case every transaction referencing a deleted category needs to be categorized again. But everything else can be ignored :).

urbanhusky

Jan 9, 2016, 11:15:27 AM

to Budget First

I'm not sure how to read your notation.

Example of how I'm currently seeing it:

Set budget of C1 to 150€. Precondition: budget of C1 is 0€
Set budget of C1 to 160€. Precondition: budget of C1 is 150€
Set budget of C1 to 160€. Precondition: budget of C1 is 150€ - this is a conflict, but it can automatically be resolved because it has the same effect: setting the budget to 160€.
Set budget of C1 to 170€. Precondition: budget of C1 is 150€ - this is a conflict, which cannot be automatically resolved (unless first/last wins).
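
The four cases above can be expressed as a small check. This is a sketch under assumptions, not project code: `SetBudget`, `Outcome` and `check` are hypothetical names, and amounts are whole euros.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    APPLY = "apply"                  # precondition met
    AUTO_RESOLVED = "auto-resolved"  # precondition failed, same effect
    CONFLICT = "conflict"            # precondition failed, different effect


@dataclass(frozen=True)
class SetBudget:
    category: str
    new_value: int
    expected: int  # the precondition: budget before this operation


def check(op: SetBudget, current: int) -> Outcome:
    if current == op.expected:
        return Outcome.APPLY
    if current == op.new_value:
        return Outcome.AUTO_RESOLVED
    return Outcome.CONFLICT


budget = 0
outcomes = []
for op in [
    SetBudget("C1", 150, 0),
    SetBudget("C1", 160, 150),
    SetBudget("C1", 160, 150),
    SetBudget("C1", 170, 150),
]:
    outcome = check(op, budget)
    outcomes.append(outcome)
    if outcome is Outcome.APPLY:
        budget = op.new_value
```

The third operation fails its precondition but asks for the value that is already there, so it needs no user decision. The fourth is left unapplied until first/last wins or the user decides.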

Splitting transactions could be seen as adding child-transactions, which are separate transactions with slightly different constraints.

dominik...@gmail.com

Jan 9, 2016, 1:57:43 PM

to Budget First

Yes, that pretty much sums it up, except that for the synchronization I would fold some operations together.

My notation (ID, action, new value, previous value)

So for D1:

- (1, Set budget C1, 150, was 0) (last state both devices observed)

- (3, Set budget C1, 160, was 150)

- (5, Set budget C1, 100, was 160)

and for D2:

- (1, Set budget C1, 150, was 0) (last state both devices observed)

- (2, Set budget C1, 120, was 150)

Now D2 syncs by pulling fresh data from D1

- (3, Set budget C1, 160, was 150)

- (5, Set budget C1, 100, was 160)

Which imho should reduce to the following conflict:

Data changed on two sides for budget C1 (120 and 100), pick one.

Committing a new (3,5,2, Set budget C1, 110, was 150) operation.

With that information, anybody else who does not have a change of their own for that particular data could simply apply that resolve operation as well.
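
A sketch of this reduction, using the (ID, action, new value, previous value) notation from this post. `SetOp`, `fold_chain` and `find_conflict` are hypothetical names, and the resolved value 110 is simply the one used in the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SetOp:
    ids: tuple      # one id, or several ids once operations are folded
    new_value: int
    previous: int


def fold_chain(chain):
    """Collapse consecutive sets on one key into a single operation."""
    for earlier, later in zip(chain, chain[1:]):
        if later.previous != earlier.new_value:
            raise ValueError("chain is not contiguous")
    ids = tuple(i for op in chain for i in op.ids)
    return SetOp(ids, chain[-1].new_value, chain[0].previous)


def find_conflict(local, remote):
    """Folded ops conflict if they start equal but end differently."""
    if local.previous == remote.previous and local.new_value != remote.new_value:
        return (local.new_value, remote.new_value)
    return None


remote = fold_chain([SetOp((3,), 160, 150), SetOp((5,), 100, 160)])
local = SetOp((2,), 120, 150)
choices = find_conflict(local, remote)  # the values offered to the user

# The resolution is committed as one more operation naming everything it settles.
resolved = SetOp(remote.ids + local.ids, 110, local.previous)
```

The user sees one conflict (120 versus 100) instead of one per remote operation, and the committed record carries the ids 3, 5 and 2.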

urbanhusky

Jan 9, 2016, 4:29:05 PM

to Budget First

I'd also put the device id into the record, because generating unique IDs is not that trivial - but generating incrementing IDs for each device would yield unique ids when combined with the device id.

(logical clock, device, operation)

...and also adding the UTC timestamp to that: (logical clock, device, utc timestamp, operation)

The operation is the serialized form of an object of type Command. Each operation (set budget, add transaction etc.) has its own Command (i.e. SetBudgetCommand inherits from Command; AddTransactionCommand inherits from Command).

These commands can be executed on a command bus, which then performs the corresponding actions (after checking the precondition that must be met for each command). They also contain properties/fields for all the required data (essentially they wrap both the arguments of a method call, as well as describing the method call).
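
A rough sketch of how such records and a command bus could fit together. Only the names `Command` and `SetBudgetCommand` come from the post; `Record`, `CommandBus`, the method names and the in-memory `budgets` dictionary are assumptions, and the command is kept as an object here rather than in serialized form.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


class Command:
    """Base class; subclasses carry the arguments of one operation."""

    def precondition_met(self, budgets):
        raise NotImplementedError

    def execute(self, budgets):
        raise NotImplementedError


@dataclass(frozen=True)
class SetBudgetCommand(Command):
    category: str
    new_value: int
    previous_value: int

    def precondition_met(self, budgets):
        return budgets.get(self.category, 0) == self.previous_value

    def execute(self, budgets):
        budgets[self.category] = self.new_value


@dataclass(frozen=True)
class Record:
    logical_clock: int  # increments per device
    device: str         # (logical_clock, device) together form a unique id
    utc_timestamp: datetime
    operation: Command


class CommandBus:
    def __init__(self):
        self.budgets = {}
        self.rejected = []

    def dispatch(self, record):
        """Execute the command if its precondition holds, else keep it aside."""
        if record.operation.precondition_met(self.budgets):
            record.operation.execute(self.budgets)
            return True
        self.rejected.append(record)
        return False


bus = CommandBus()
when = datetime(2016, 1, 9, tzinfo=timezone.utc)
first = Record(1, "D1", when, SetBudgetCommand("C1", 150, 0))
stale = Record(1, "D2", when, SetBudgetCommand("C1", 120, 0))
applied = [bus.dispatch(first), bus.dispatch(stale)]
```

The second record is rejected because it was made against a budget of 0, which no longer holds once the first has run.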

The following is mostly my thought process when looking at the example:

In your third step, D2 would pull the entire history from D1 (i.e. IDs 1, 3 and 5). It then sees that it has ID1 already, but not ID 3 and 5.

So first D2 would have to determine when 3 and 5 happened. They must have happened after 1, because otherwise D2 would have seen them already.

If we can rely on the timestamps, we might see that ts(2) < ts(3) < ts(5). Therefore the order must be: 1, 2, 3, 5.

In that order, we check the preconditions.

1: Still valid

2: still valid

3: not valid, conflict

5: not valid, conflict
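
The check above can be reproduced with a short replay. This is a sketch only: the tuple layout and the small integer timestamps are made up for illustration, and a failed operation is simply left unapplied.

```python
def replay(ops, initial=0):
    """Apply set-operations in timestamp order, checking each precondition.

    Each op is (id, timestamp, new_value, expected_previous_value).
    """
    value = initial
    report = {}
    for op_id, _ts, new_value, expected in sorted(ops, key=lambda op: op[1]):
        if value == expected:
            value = new_value
            report[op_id] = "valid"
        else:
            report[op_id] = "conflict"
    return value, report


# Operations 1, 3 and 5 come from D1, operation 2 from D2.
merged = [
    (1, 10, 150, 0),
    (3, 30, 160, 150),
    (5, 50, 100, 160),
    (2, 20, 120, 150),
]
final_value, report = replay(merged)
```

With these inputs, operations 1 and 2 apply, 3 and 5 are reported as conflicts, and the budget is left at 120.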

If we resolve the conflict on 3 by letting 3 win, then we somehow have to note that (mark 2 invalid or add a compensating operation before 3 to tell 3 "hey, your precondition should be X" or some other way?)

If we resolve the conflict on 3 by letting 2 win, then we have to mark 3 as invalid... but what about 5? That would yield yet another conflict...

When resolving conflicts, we would have to look at the last resulting state (i.e. total budget in that category) for each device and possibly present those to the user. Then we have to figure out how to reach that state.

Trevor Phillips

Jan 9, 2016, 10:18:41 PM

to Budget First

How about storing a full history of all changes (or at least, the latest change from each device) - that is for example, if a transaction has values changed, then all versions of the transaction are stored.

If you compare the sync time and update time of each row, you can determine if there is a conflict. Then you can default to picking the "latest" update, but highlight the entry with a colour/symbol indicator as a warning, and then let the user optionally review and select the correct row.
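
A sketch of that idea, with several assumptions: `RowVersion` and `pick` are hypothetical names, times are plain numbers, and a single shared "last sync" time is used for all devices. A row is flagged for review when more than one device changed it since that sync.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RowVersion:
    device: str
    value: str
    updated_at: float


def pick(versions, last_sync):
    """Return (latest version, needs_review) for one row."""
    chosen = max(versions, key=lambda v: v.updated_at)
    changed_since_sync = {v.device for v in versions if v.updated_at > last_sync}
    return chosen, len(changed_since_sync) > 1


versions = [
    RowVersion("D1", "Groceries", 100.0),
    RowVersion("D2", "Food", 110.0),
]
chosen, needs_review = pick(versions, last_sync=50.0)
```

Both versions were written after the last sync, so the later one is shown by default and the row gets the warning indicator.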

Jannik Schürg

Jan 9, 2016, 10:50:41 PM

to Budget First

On Saturday, January 9, 2016 at 10:29:05 PM UTC+1, urbanhusky wrote:

> When resolving conflicts, we would have to look at the last resulting state (i.e. total budget in that category) for each device and possibly present those to the user. Then we have to figure out how to reach that state.

I think this is the right way. If we do it like this, we don't need preconditions, do we? Our algorithm would basically look like this:

1. Get the changes from other devices.
2. Order them using vector clocks.
3. Look at the last state change for every key and every entity separately. If there are concurrent changes (as defined by vector clocks), these are the conflicting ones. The result can be communicated by just adding another change, which will now be at the end of our order (again, vector clocks).

- First apply the diffs which create entities.
- Then look at deletes (tombstone flag).
- Then apply the ones which change a key-value property.

Possibly this last conflict-solution-change might conflict again, of course.
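
A minimal sketch of the vector clock comparison this relies on. Clocks are plain dictionaries from device id to counter, a missing device counts as 0, and `compare` and `frontier` are hypothetical helper names. The example reuses the operations from the earlier D1/D2 example, with clock values assumed for illustration.

```python
def compare(a, b):
    """Relate two vector clocks: 'before', 'after', 'equal' or 'concurrent'."""
    devices = set(a) | set(b)
    a_le_b = all(a.get(d, 0) <= b.get(d, 0) for d in devices)
    b_le_a = all(b.get(d, 0) <= a.get(d, 0) for d in devices)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


def frontier(changes):
    """Changes to one key that no other change supersedes."""
    return [
        (name, clock)
        for name, clock in changes
        if not any(compare(clock, other) == "before" for _, other in changes)
    ]


changes = [
    ("op1", {"D1": 1}),
    ("op3", {"D1": 2}),
    ("op5", {"D1": 3}),
    ("op2", {"D1": 1, "D2": 1}),  # made on D2 after it had seen op1
]
conflicting = frontier(changes)

# A resolution made on D2 after seeing everything supersedes both.
resolution = ("resolve", {"D1": 3, "D2": 2})
after_resolution = frontier(changes + [resolution])
```

Before the resolution, op5 and op2 remain and are concurrent, so they are the conflicting pair. Afterwards only the resolution remains, which is why adding it at the end of the order is enough.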

I still think that a user choice is not worth the trouble. Btw, if the user clicks on 'don't ask me again', how is this decision synced? ;-) Funny scenario: If one device is in automatic mode and another asks the user, but both sync constantly, the following might happen: The device in automatic mode resolves a conflict, and the user chooses differently on the other device. After the sync we are in the same situation as before. The user gets the same dialog again and might not understand why. I highly recommend picking the easy solution; it saves time and its mistakes are less annoying. It's a complicated system with many possible scenarios, and I don't think a multi-master non-automatic conflict strategy can be stable, i.e. free of weird/annoying scenarios.

Maybe I will implement a prototype after sleeping.

urbanhusky

Jan 10, 2016, 5:06:58 AM

to Budget First

You cannot just add a compensating/conflict-resolving action at the end because the conflict happens before that.

I also realised why last wins is a more solid strategy than first wins: with first wins you mess up your preconditions.

I also see preconditions as an integrity check - because you made that operation based on that state. If the state is different, you might not have made that exact decision.

With preconditions, we would have an easier algorithm than having to map exactly which entity an operation modifies and if any imported operation we insert before messes with that.

urbanhusky

Jan 10, 2016, 5:08:29 AM

to Budget First

That has always been the idea - we want to store the entire history of operations (create account, create transaction, create budget category, set budget category value for month...)

I'm using month as an example, the actual budgeting cycle might be different.

Jannik Schürg

Jan 10, 2016, 5:30:20 AM

to Budget First

On Sunday, January 10, 2016 at 11:06:58 AM UTC+1, urbanhusky wrote:

> You cannot just add a compensating/conflict-resolving action at the end because the conflict happens before that.

If you see in the log, that both conflicting diffs 'happen before', the former conflict would be ignored. This works, because we only look at changes key-value based and then on last state.

> I also see preconditions as an integrity check - because you made that operation based on that state. If the state is different, you might not have made that exact decision.

If we have the log, this won't be necessary I think. The 'diff' propagation routine must guarantee, that every 'diff' which was sent to D1 is then also synced to any device D2, when syncing with D1 (which is easy for Dropbox, for example).

> With preconditions, we would have an easier algorithm than having to map exactly which entity an operation modifies and if any imported operation we insert before messes with that.

Going with last-wins it becomes irrelevant if there are operations before (?). Only the last concurrent diffs matter, anytime. But maybe you can give an example for transaction/category updates? At the moment I don't see how it helps :-(

urbanhusky

Jan 10, 2016, 6:09:29 AM

to Budget First

On Sunday, 10 January 2016 11:30:20 UTC+1, Jannik Schürg wrote:

> On Sunday, January 10, 2016 at 11:06:58 AM UTC+1, urbanhusky wrote:
>> You cannot just add a compensating/conflict-resolving action at the end because the conflict happens before that.
>
> If you see in the log, that both conflicting diffs 'happen before', the former conflict would be ignored. This works, because we only look at changes key-value based and then on last state.

I'm not following what you mean by that. Especially conflicting diffs, happen before, former conflict, last state, key-value.

>> I also see preconditions as an integrity check - because you made that operation based on that state. If the state is different, you might not have made that exact decision.
>
> If we have the log, this won't be necessary I think. The 'diff' propagation routine must guarantee, that every 'diff' which was sent to D1 is then also synced to any device D2, when syncing with D1 (which is easy for Dropbox, for example).

Things get tricky when you introduce devices D3 and onwards. I would still argue that preconditions are the easiest way to identify conflicts. Precondition not met = something conflicting happened.

Also, the term diff is a bit fuzzy in our discussion right now.

>> With preconditions, we would have an easier algorithm than having to map exactly which entity an operation modifies and if any imported operation we insert before messes with that.
>
> Going with last-wins it becomes irrelevant if there are operations before (?). Only the last concurrent diffs matter, anytime. But maybe you can give an example for transaction/category updates? At the moment I don't see how it helps :-(

Again, what exactly do you mean by diffs and how do they help solving the problems?

Nancy Pickering

Jan 16, 2016, 2:26:06 PM

to Budget First

I think this discussion is around handling conflicts in changes both to transactions (by which I mean real world financial transactions) and budgets (decisions made inside the software alone).

Do we need to apply some real world pragmatism to the problem? How many of us share a budget with someone else at all, let alone with someone else who independently makes changes to the budget? This tends not to be the way relationships work! Could we have the budget locked unless you explicitly choose to edit it, in which case it remains editable only on that device until you release the lock? (Or until, say, 2 hours have passed, given that surely no-one spends more time modifying their budget than that?!)

For transactions, I can see it's more likely you might enter a transaction on one device and then want to change it on another (and a third given it is D3 that makes this a head-spinner of a problem). To be editable on the second device the transaction must first have synced with the budget (or the second device wouldn't be able to see it). If the change presents the version number of the transaction to which the change applies, the application can detect changes which are not being made to the current version of the transaction, i.e. version number in master is 3, transaction is presenting a change to be applied to version 1. Change should not be applied without user intervention.
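
The version-number check can be sketched in a few lines. This is an illustration under assumptions, not a proposal for the actual data model: `Transaction`, `StaleChange` and `apply_change` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    memo: str
    version: int = 1


class StaleChange(Exception):
    """Raised when a change was made against an outdated version."""


def apply_change(master, new_memo, based_on_version):
    """Apply a change only if it targets the current version of the master copy."""
    if based_on_version != master.version:
        raise StaleChange(
            f"change targets version {based_on_version}, master is at {master.version}"
        )
    master.memo = new_memo
    master.version += 1


master = Transaction("Groceries", version=3)
try:
    apply_change(master, "Food", based_on_version=1)
    needs_user = False
except StaleChange:
    needs_user = True  # hand the change to the user instead of applying it
```

A change based on version 3 would be applied and bump the master to version 4. The change based on version 1 is held back for user intervention.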

urbanhusky

Jan 16, 2016, 3:00:08 PM

to Budget First

This is actually a far more technical topic than it might initially let on. Even such simple things as entering transactions on another device (e.g. smartphone) must be handled correctly.

There are enough people that share the same budget with their partner, possibly on multiple devices. Keeping the correct order of actions done, handling conflicts (Partner changes budget to 50€, I change budget to 45€) etc. is essential.

We already have a pretty good idea on how to handle this and discuss this on Slack and Taiga (keywords: optimistic replication, vector clocks, replicating operations)

Nancy Pickering

Jan 16, 2016, 3:08:08 PM

to Budget First

Yep - appreciate that this is a complex technical area and thought I had requested an invite to Taiga some days ago so have some conversations to catch up on.

I'd rate stability over flexibility here - I need a piece of software I can rely on to behave in a certain way and have an attitude approaching 'zero data loss' to what I need from the system. For me personally I would put limits on what conflict conditions the software had to handle if it kept the conflict resolution logic simpler.

I'll have a read through on Taiga, though, and contribute there.
