Once all blocks have run (or not if they couldn't), and the computation doesn't move forward (no blocks have anything new to say), then the net changes to the database are packaged as a "transaction" for the next round of computation.
--
You received this message because you are subscribed to the Google Groups "Eve talk" group.
To unsubscribe from this group and stop receiving emails from it, send an email to eve-talk+u...@googlegroups.com.
To post to this group, send email to eve-...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/eve-talk/60ad8878-faf0-4a19-8084-2a98d8052926%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Unlike Dedalus, our goal isn't to create a language to create distributed systems in, but instead to make Eve be a distributed system by default. So while it could ostensibly be good at writing a replication protocol, we don't want you to ever have to do that. By leveraging the semantics that make it simple to reason about building distributed systems, we can make a platform that is itself automatically distributed based on a set of constraints that you set. All of that will come much later though as part of the world scale computer milestone.
Once we've reached a stable point, we consider that the end of the "transaction". At that point, we'd check if there were any errors and assuming there are none, we atomically notify the world of the complete set of changes we accumulated in all of our time steps.
We don't currently have "deductive rules", which allow you talk about changes in the current time step. Because of that decision, we might go through time steps where we're in the middle of a computation and the results don't yet make any sense.
Once we've reached a stable point, we consider that the end of the "transaction". At that point [...] we atomically notify the world of the complete set of changes we accumulated in all of our time steps.
Much of the pain in traditional distributed programming comes from this mismatch: programmers are expected to bridge from an ordered programming model into a disordered reality that executes their code.
To view this discussion on the web visit https://groups.google.com/d/msgid/eve-talk/41b2ac0e-17c8-4cae-beeb-6d52e9e7de27%40googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/eve-talk/f08cee89-0d01-4d30-b092-3ffe2706e44e%40googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/eve-talk/1603cc35-ac8c-472f-8f80-f9d13a1eae77%40googlegroups.com.
> But isn't this basically identical to the code we'd write on a LAMP stack? ... doesn't the mainstream implementation (SQL transactions) already offer this?
Let's explore that assumption. Since you brought up the LAMP stack, I'll look at using MySQL and PHP to accomplish a sharded, transactional bank transfer. The clustered implementation of MySQL does appear to support distributed transactions, but with a whole list of caveats. Here are a couple choice excerpts from that document:
> when a transaction modifying a number of rows commits concurrently with a transaction reading the same rows, the transaction performing the read can observe “before” values, “after” values, or both, for different rows among these, due to the fact that a given row read request can be processed either before or after the commit of the other transaction.
That's pretty terrifying (we'll get to that in a second), but I thought this one was even better - basic operations like count can't be trusted:
> it is not possible to guarantee the transactional consistency of the COUNT() function on the slave. In other words, when performing on the master a series of statements (INSERT, DELETE, or both) that changes the number of rows in a table within a single transaction, executing SELECT COUNT(*) FROM table queries on the slave may yield intermediate results.
So this boils down to the first couple of paragraphs in that document which say that the clustered implementation of MySQL only supports transactions at the "read committed" level. The only guarantee that isolation level provides is that if something is committed you will read it - but it makes no guarantees about *when* you will read it with respect to other transactions. This means that while our transfer is happening, a select can read rows from entirely different snapshots of the database. In the case of a bank, that means I can read values during my transaction that by the end are no longer correct. That alone is sufficient for the ATM attack. This means that distributed transactions aren't enough. Fortunately, MySQL also provides row level locking, so we could lock the row until the entire distributed transaction is complete. The problem with that approach is that the distributed transaction is already at least 10x slower than a normal one and because we are locking the row to prevent any other updates, we're now queuing transactions behind us. With just one slow network trip and a little bit of contention, your entire banking system can trivially come to a halt and there's a good chance it may never catch back up. To top all of this off, MVCC isn't enabled in the clustered implementation so your per machine throughput is down and the increase in memory usage per transaction dramatically decreases the size of data we can work with.
After digging into the MySQL docs for a bit, it's not looking good. Distributed transactions in this context are basically useless and the associated performance issues negate the scaling we were after to being with. The queuing and locking will also open us up to really straightforward attacks (e.g. a bad actor constantly attempts to transfer $1 from one account to another, causing contention which cascades into the whole system stalling). We could ostensibly use a cache in front of the database to address some of that, but then we're back to non-transactional stale reads and updates. So where do we go from here?
I guess we could not use the clustered MySQL and instead use innodb. That pushes sharding and distributed transaction management up to the application level, so we'd have to do a lot of this in our PHP. The naive approach would be to write something like this pseudo code:
$amount = 10;
$from = "chris's account";
$to = "joe's account";
// queryShard will find the right database to query from based on our account
$balance = queryShard($from, "
BEGIN
SELECT balance FROM account WHERE account.id = $from AND account.balance > $amount
UPDATE account WHERE account.id = $from SET account.balance = balance - $amount
COMMIT
");
if(!$balance) {
log("insufficient balance for transfer");
return;
}
// these can't be in one db transaction because they can be from different shards, we could potentially add
// some more complex batching logic to try and take advantage of that when we can, but it'll make the rest
// of the code more complicated
$balance2 = queryShard($to, "
BEGIN
SELECT balance FROM account WHERE account.id = $to
UPDATE account WHERE account.id = $to SET account.balance = balance + $amount
COMMIT
");
// it's possible that the $to account doesn't exist
if(!balance2) {
// we need to roll our transfer back
queryShard($from, "
BEGIN
SELECT balance FROM account WHERE account.id = $from
UPDATE account WHERE account.id = $from SET account.balance = balance + $amount
COMMIT
");
}
$startingTotal = $balance + $balance2;
// ensure total balance remained the same
$updatedBalance = queryShard($from, "SELECT balance FROM account WHERE account.id = $from");
$updatedBalance2 = queryShard($to, "SELECT balance FROM account WHERE account.id = $to");
if($updatedBalance + $updatedBalance2 != $startingTotal) {
// reset the balances
queryShard($from, "
BEGIN
SELECT balance FROM account WHERE account.id = $from
UPDATE account WHERE account.id = $from SET account.balance = balance + $amount
COMMIT
");
queryShard($to, "
BEGIN
SELECT balance FROM account WHERE account.id = $to
UPDATE account WHERE account.id = $to SET account.balance = balance - $amount
COMMIT
");
log("balance transfer failed for $from $to $amount");
}
Note that this code is magically doing a bunch of stuff that it couldn't do in reality - it'd probably be several times longer if we wrote out all the helpers and things that I'm glossing over. Disregarding that though, it's also fraught with issues. Every time we call queryShard, we're exposing ourselves to inconsistency. Even if we checked previously that an account exists, there's no guarantee by the time we attempt to make the transfer that the account does still exist or that its balance is high enough. So to counter that we check the balance and update it in one transaction, but in between the time we withdraw the funds and attempt to deposit them into the $to account, that account may have disappeared. If it did, we need to undo the withdrawal, which we can't really guarantee will succeed either since more time has passed. We could try to address this by writing a bunch of business logic that locks down what operations are valid while we're going through this process, but we'd need to make sure we accounted for that at every single place we make an update. We'd also need to add some sort of queue to handle the updates we intend to make, but can't yet because we're in our custom transaction process.
Next, when we attempt to check that our total balance has remained constant throughout this process we again run into the possibility that other transactions have taken place and our balances have been changed. If we then attempt to reset, it may be the case that the funds have already been withdrawn from the to account and we've just had a lot of money stolen. Nowhere in any of this are we handling the case where one of our queries just fails because of a partition or the machine disappearing on us. We also have no guarantees about what would happen if the machine that is running this code goes down in the middle of the process. We'd need to implement mechanisms to make sure that if we do experience a failure, money hasn't just disappeared.
To do that correctly, we'd at least need to implement project-wide account locking and even then, for something dealing with money, we shouldn't rely solely on us being diligent enough to check the lock everywhere in our code. Realistically we probably need to implement a two-phase commit or some other consistency protocol here to ensure that we all agree what the values are at the start and what they should be at the end. Now that we're implementing our own consistency protocol, our very simple code for a relatively simple problem requires us to be experts in an intensely difficult field - one that even the experts constantly mess up.
I'm not even going to attempt to show what that code would start to look like, because a) I'd get it wrong and b) it's... a lot. So back to the question at hand: "But isn't this basically identical to the code we'd write on a LAMP stack?" Not even close.
> I believe this property of @account should be expressed somewhere in the example code.
Fair enough, that would be an attribute on the database record:
```
commit @database
[#database name: "account" shared: true secure: true]
```
Also, if we're including that as part of the cost of the implementation, it seems like we should include all of the configs and schemas that the LAMP stack would entail as part of the comparison too. Those are going to be quite a lot more than two lines.
> In other words, I think Eve has to distinguish when it's "making an API call" in this example.
Nope. :) A database is a database. It doesn't matter if it's on the server or somewhere else. It doesn't matter if it needs to be protected or if it can be optimistically shared. Those are all properties of the system, not of the code. Your program doesn't need to change even if how the system is laid out does.
All of that being said, if there's an alternative approach, I'd love to hear it. While MySQL is probably an extreme case here, most databases will have similar but different caveats in their implementations. If anyone knows of other solutions that would allow you to write the simple code we started with, we'd certainly love to hear about them.
To view this discussion on the web visit https://groups.google.com/d/msgid/eve-talk/e2cb20ff-40ce-4bb1-beb7-938ceeebeac4%40googlegroups.com.
...
To view this discussion on the web visit https://groups.google.com/d/msgid/eve-talk/d24323b4-3d38-4003-88b1-80824a2fd789%40googlegroups.com.
...