Newbie Query: ActiveRecord multi-thread/process safe and auto-reload how-to

Chris Mayan

Nov 18, 2010, 1:47:47 AM
to rails-...@googlegroups.com
Hello,

I've done a lot of bing searching on this (perhaps I should try googling instead...) with no luck, so perhaps the oceania crew can help as I seem to have a complete misunderstanding of how Rails Object-Relational Mapping is supposed to work:

Question: How on earth do you make Active Record model objects multi-thread / multi-process safe, so that when an attribute is changed in one process (such as a delayed job), another process that has loaded the same db row as an AR model object (such as a user / web server process) automatically sees the new value when it accesses the attribute... _without_ having to call model.reload manually?
(i.e. I would like Rails to call model.reload automatically in the background when it detects that another process has modified any part of the model's underlying DB row, which is more desirable than calling model.reload programmatically before every access, as that seems just insane...)

Is this possible or have I misunderstood how rails does ORM?

So a concrete example:
Assume you have an AR model "ARModel" with tables ar_model and a string attribute "attr" which has a default of "unchanged".

0. Open 2 terminals
1. Start ./script/console in each terminal
2. Terminal_1: >> tst_ar_model = ARModel.find(1)
3. Terminal_2: >> tst_ar_model = ARModel.find(1)
4. Terminal_1: >> tst_ar_model.attr
    "unchanged"
5. Terminal_2: >> tst_ar_model.attr
    "unchanged"
6. Terminal_1: >> tst_ar_model.attr = "something different"
7. Terminal_1: >> tst_ar_model.save!
8. Terminal_1: >> tst_ar_model.attr
    "something different"
9. Terminal_2: >> tst_ar_model.attr
    "unchanged"                                 
10. Terminal_2: >> tst_ar_model.attr_changed?
     false
    
    # What would it take for this to reflect the updated attribute value "something different" automagically?

Of course it works if you call in Terminal 2 tst_ar_model.reload; tst_ar_model.attr... but it seems rather nuts to have to call this on every attribute access everywhere!
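
A rough way to picture what the two consoles are doing, using plain Ruby rather than ActiveRecord: each loaded object is a private copy of the row taken at load time, so nothing changes in it until it is re-read. `FakeTable` and `Snapshot` are made-up names for this sketch, and the hash only stands in for the database.

```ruby
# In-memory stand-in for a database table: id => row hash.
# FakeTable and Snapshot are hypothetical names, not ActiveRecord classes.
FakeTable = { 1 => { "attr" => "unchanged" } }

class Snapshot
  def initialize(id)
    @id = id
    reload
  end

  # Copy the row's current values into this object, like model.reload.
  def reload
    @attributes = FakeTable.fetch(@id).dup
    self
  end

  def attr
    @attributes["attr"]
  end

  def attr=(value)
    @attributes["attr"] = value
  end

  # Write this object's copy back to the table, like model.save!.
  def save!
    FakeTable[@id] = @attributes.dup
    true
  end
end

terminal_1 = Snapshot.new(1)
terminal_2 = Snapshot.new(1)

terminal_1.attr = "something different"
terminal_1.save!

stale_value = terminal_2.attr          # still the value copied at load time
fresh_value = terminal_2.reload.attr   # re-read from the table
```

Here `stale_value` is "unchanged" and `fresh_value` is "something different", which matches steps 9 and 10 above: the second object has no connection back to the row after it was loaded.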

I have tried config.threadsafe! (and the deprecated config.action_controller.allow_concurrency = true).
I've even tried, on a long shot, attr_will_change! ...

I am working with Rails 2.3.5

Is there something else I need to set or have I completely misunderstood something? I understand that the 2 AR model objects are 2 separate objects, but being mapped to the same underlying db table entry, should update concurrently I would have thought, or, at least know that the underlying table object has changed and is dirty via _changed? ...

Thus shouldn't Rails ORM when you declare threadsafe mode detect that when an underlying DB table entry has changed, ensure that every invocation of an attribute access will know that it has been changed or is dirty and force a reload (or a partial one if it is smart enough to detect only a certain field change?)

Thanks for your help :)

-Chris

Simon Russell

Nov 18, 2010, 1:54:36 AM
to rails-...@googlegroups.com
I'd say you probably don't want to do this; even if it did work, which
it doesn't (basically by design) ... what's the problem you're trying
to solve?

> --
> You received this message because you are subscribed to the Google Groups
> "Ruby or Rails Oceania" group.
> To post to this group, send email to rails-...@googlegroups.com.
> To unsubscribe from this group, send email to
> rails-oceani...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/rails-oceania?hl=en.
>

Chris Mayan

Nov 18, 2010, 2:05:17 AM
to rails-...@googlegroups.com
Well suppose 2 people are collaborating on an item.

The first person through a web client sees a value as "pending".
The 2nd person, through a different non-web client, posts a change that is only picked up by delayed_job, which modifies that value from "pending" to "done".
before_filters are also involved, and they trump the second person's status change, because the first person's request never learns about the new status that has been written underneath it.

What exactly does "config.threadsafe!" do?

And what did "ActiveRecord::Base.allow_concurrency = true" ever do or was intended to do?

Thanks
-Chris

Mikel Lindsaar

Nov 18, 2010, 2:05:22 AM
to rails-...@googlegroups.com
On 18/11/2010, at 5:47 PM, Chris Mayan wrote:
> Question: How on earth do you make Active Record model objects be multi thread / multi process safe, so that when an attribute is changed in one process (such as from a delayed job) whilst another process (such as a user / web server process which has loaded / mapped the same db row in question as an AR model object), access the attribute, will automatically uses the new modified attribute value... _without_ having to call model.reload manually?

You can't... well... you can... but you have to call reload somewhere.

This is best handled with record locking in conjunction with database transactions, which give you access to row and table locking.

Then while you are doing the update, check to make sure that the record has not already been updated before performing your action, all within a transaction.

Hope that nudges you in the right direction.


Mikel Lindsaar
http://rubyx.com/
http://lindsaar.net/


Dmytrii Nagirniak

Nov 18, 2010, 2:20:44 AM
to rails-...@googlegroups.com
Hi Chris,

There is only one way (that I am aware of) of keeping things in sync between different processes (and, as a result, even different machines). It is MagLev.

But as others have pointed out, you'd better reconsider your design, as there is no single programming language (maybe apart from Smalltalk) that does what you want.

Regards,
Dima
http://ApproachE.com



Chris Mayan

Nov 18, 2010, 2:29:05 AM
to rails-...@googlegroups.com
SOLVED!

Thanks Mikel for sending me in the right direction - I ended up finding a pure rails way of doing it... (i'm switching search engines I think).

The solution for anyone else interested is "Optimistic Locking", which is turned on by default for any model whose table has an integer column called "lock_version" (which you initialise to 0 and leave alone thereafter, as AR will manage it from then on).

Then when 2 different processes try to update the same row, AR checks the lock version sequence number, realises that one of them is working with a stale version (because another process has updated the row) and throws an exception (ActiveRecord::StaleObjectError). You can then call .reload before you try to save again, which is more efficient than reloading before every access.
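
To make the mechanism concrete, here is a small in-memory model of the version check in plain Ruby. It is only a sketch of the idea, not ActiveRecord's code: `VersionedTable` and `StaleRowError` are hypothetical names standing in for the table and for ActiveRecord::StaleObjectError.

```ruby
# Hypothetical in-memory model of optimistic locking; not ActiveRecord itself.
class StaleRowError < StandardError; end

class VersionedTable
  def initialize
    @rows = {}
  end

  # New rows start at lock_version 0.
  def insert(id, attributes)
    @rows[id] = attributes.merge("lock_version" => 0)
  end

  # Return a private copy, as a finder would.
  def find(id)
    @rows.fetch(id).dup
  end

  # Write only if the caller's lock_version still matches the stored one,
  # then bump the version. Roughly the effect of an
  # "UPDATE ... WHERE id = ? AND lock_version = ?" that must touch one row.
  def update(id, copy)
    current = @rows.fetch(id)
    unless current["lock_version"] == copy["lock_version"]
      raise StaleRowError, "row #{id} was changed by someone else"
    end
    @rows[id] = copy.merge("lock_version" => copy["lock_version"] + 1)
  end
end

table = VersionedTable.new
table.insert(1, { "status" => "pending" })

web_copy = table.find(1)   # loaded by the web process
job_copy = table.find(1)   # loaded by the delayed job

job_copy["status"] = "done"
table.update(1, job_copy)  # succeeds, stored lock_version becomes 1

web_copy["status"] = "cancelled"
conflict =
  begin
    table.update(1, web_copy)  # still carries lock_version 0
    false
  rescue StaleRowError
    true
  end
```

The second write is rejected instead of silently overwriting the first, and the stored row keeps the "done" status. What to do after the rejection is left to the caller.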

Brilliant... thanks :)

Cheers
Chris







Korny Sietsma

Nov 18, 2010, 4:47:38 PM
to rails-...@googlegroups.com
You should still think carefully about what it is you are trying to do.
If you just catch the optimistic locking exception, and automatically reload and save, then you are ensuring the second user's change overwrites the first.  The second user will never see the first user's change.  That's not really optimistic locking at all.  Even if you merge the two sets of changes, merging is tricky in all but the most trivial of cases.

More usually, optimistic locking involves *informing* the second user : "Another user has changed this record - please review their changes and try again" - this guarantees that you won't get merge collisions or other surprises.
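
As an illustration of "informing" rather than retrying, a submit handler can hand the current row back to the user when the versions differ. This is a plain-Ruby sketch with hashes standing in for records; `submit_edit` and the shape of its return value are assumptions made up for the example.

```ruby
# Hypothetical handler. `stored` is the row as it is now; `form` is what the
# user submitted, including the lock_version they originally loaded.
def submit_edit(stored, form)
  if stored["lock_version"] != form["lock_version"]
    # Do not retry automatically: return the current data for the user to review.
    return {
      status: :conflict,
      message: "Another user has changed this record - please review their changes and try again",
      current: stored.dup
    }
  end

  { status: :saved, row: form.merge("lock_version" => form["lock_version"] + 1) }
end

stored = { "status" => "done", "lock_version" => 1 }

late_form  = { "status" => "cancelled", "lock_version" => 0 }  # loaded before the other edit
fresh_form = { "status" => "archived",  "lock_version" => 1 }  # loaded after it

late_result  = submit_edit(stored, late_form)
fresh_result = submit_edit(stored, fresh_form)
```

`late_result` carries the conflict message and the current row, so the user sees the other person's change before deciding what to do; `fresh_result` is saved with the version bumped to 2.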

Be warned - much pain and trauma has been experienced in the past by people trying to automatically manage simultaneous changes in an automated way.  The best answer is usually to say "why do you really want to do this?"

- Korny

--
Kornelis Sietsma  korny at my surname dot com http://korny.info
"Every jumbled pile of person has a thinking part
that wonders what the part that isn't thinking
isn't thinking of"

Dmytrii Nagirniak

Nov 18, 2010, 5:04:45 PM
to rails-...@googlegroups.com
On 19 November 2010 08:47, Korny Sietsma <ko...@sietsma.com> wrote:
> You should still think carefully about what it is you are trying to do.
> If you just catch the optimistic locking exception, and automatically reload and save, then you are ensuring the second user's change overwrites the first.  The second user will never see the first user's change.  That's not really optimistic locking at all.

Yep. And this is how AR (basically most of ORMs) works by default - last wins. Nothing to do with the locking.
Wondering why not just use a proper database transaction if atomicity is a requirement?
 

Mikel Lindsaar

Nov 18, 2010, 6:09:11 PM
to rails-...@googlegroups.com
Well as Korny was saying transactions and row locking won't help you if both data sets are valid.

Simplistic model, but you could do this with a person record.  First user is updating the phone from invalid data to something, second user is updating the address from blank to something.  However, the second user sees there is invalid data in the phone number field and decides to be a good citizen and remove the invalid data, making it blank.

With transactions you guarantee that both actions run, completely.

So if the transactions hit the database as:

First User
Second User

The result would be that the phone number is blank and the address has been updated.

If they run:

Second User
First User

The result is what we want, both phone number and address have been updated.
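
The two orderings can be replayed in a few lines of plain Ruby. This sketch assumes each save writes only the fields that user changed (which is what makes the second ordering come out as described); the phone and address values are invented for illustration.

```ruby
original = { "phone" => "not a number", "address" => "" }

# Only the fields each user changed are written (illustrative values).
first_user_changes  = { "phone" => "555-0100" }                    # fixes the phone
second_user_changes = { "phone" => "", "address" => "1 Example St" } # blanks phone, adds address

# Apply each committed change on top of whatever is stored at that moment.
def commit_in_order(row, *changes)
  changes.reduce(row) { |stored, change| stored.merge(change) }
end

first_then_second = commit_in_order(original, first_user_changes, second_user_changes)
second_then_first = commit_in_order(original, second_user_changes, first_user_changes)
```

`first_then_second` ends with a blank phone, while `second_then_first` keeps both the new phone and the new address. Both runs are perfectly valid as transactions; only the outcome differs.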

This is handling merge conflicts, and a good way to do this is row version locking and reloading data before committing:

When the first user reads the record, the person object has a version of "1"

The second user reads the record at the same time and the person object also has a version of "1"

The first user submits his changes to the phone number, and the object before save, inside a transaction, reads the row and makes sure the version number is still 1.  It is, so it commits the changes,  increments the version number to "2" and then commits the transaction.

The second user then submits his updates, the person model does the same thing, opens a transaction, reloads the row, finds the version number has changed, and then aborts and shows the new data to the user at which point the process can repeat.

Fun stuff if you don't catch it.

Julio Cesar Ody

Nov 18, 2010, 6:13:28 PM
to rails-...@googlegroups.com
I find that using versioning for records that are shared among many
users always turns out to be a good choice from a usability
perspective. Display somewhere on the interface something like:

- Update by John Doe at <time>
- Update by Jane Doe at <time>
- etc...

Make those changes viewable by the users, so everyone knows where
changes are coming from.

If you're concerned about generating too many records, save up to,
say, 20 versions maximum.
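
A capped change history like this can be sketched in plain Ruby. `ChangeLog` and its methods are hypothetical names; in a real app the entries would presumably live in their own table rather than in memory.

```ruby
# Hypothetical capped history of who changed a record and when.
class ChangeLog
  MAX_VERSIONS = 20
  Entry = Struct.new(:user, :at, :changes)

  def initialize
    @entries = []
  end

  # Append an entry and drop the oldest ones beyond the cap.
  def record(user, changes, at = Time.now)
    @entries << Entry.new(user, at, changes)
    @entries.shift while @entries.size > MAX_VERSIONS
    self
  end

  # Lines suitable for showing next to the record in the interface.
  def lines
    @entries.map { |e| "Update by #{e.user} at #{e.at.utc.strftime('%Y-%m-%d %H:%M')}" }
  end

  def size
    @entries.size
  end
end

log = ChangeLog.new
25.times do |i|
  log.record("user #{i}", { "status" => "v#{i}" }, Time.utc(2010, 11, 18, 0, i))
end
```

After 25 updates only the 20 most recent remain, so `log.lines` starts at "Update by user 5 at 2010-11-18 00:05".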


--
http://awesomebydesign.com

Eaden McKee

Nov 18, 2010, 7:01:38 PM
to rails-...@googlegroups.com
What I do is when making an edit form, I have a hidden field - last
changed at - this is the last updated at field from rails.

On edit submit, I load the last updated time of the object I'm saving.
If it has changed then I know someone has edited the object since.
Could use some ajax to poll for this change and warn the user before
they submit too.
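
A plain-Ruby sketch of that check, with a hash standing in for the record. `form_token` and `edited_since?` are made-up helper names, and the token is simply the timestamp serialised to a string for the hidden field.

```ruby
require "time"

# Value to embed in the hidden form field when the edit form is rendered.
def form_token(record)
  record["updated_at"].utc.iso8601
end

# On submit: has the record's timestamp moved since the form was rendered?
def edited_since?(record, token)
  form_token(record) != token
end

record = { "name" => "Ada", "updated_at" => Time.utc(2010, 11, 18, 7, 0, 0) }
token  = form_token(record)

unchanged = edited_since?(record, token)

# Someone else saves the record while the form is open.
record["updated_at"] = Time.utc(2010, 11, 18, 7, 5, 0)
changed = edited_since?(record, token)
```

Two caveats with this sketch: the token has one-second resolution, so two edits within the same second look identical, and the check and the save are separate steps, so two simultaneous submissions could both pass it.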

Eaden

Rufus Post

Nov 18, 2010, 7:07:36 PM
to rails-...@googlegroups.com
Nice, you could have an updated by as well then you are nearly at a google doc style ui (ok quite a way off).

Cheers Eaden.



Simon Russell

Nov 18, 2010, 7:08:44 PM
to rails-...@googlegroups.com
What do you do if both people submit the form at the same time?
Wouldn't lock_version do a better job?

Chris Mayan

Nov 18, 2010, 7:11:30 PM
to rails-...@googlegroups.com
Hello,

What I really wanted I now believe has to be done and provided by the native DB or implemented via some plumbing connection between rails and the db (which I don't think is provided by any vendors…).

Something akin to a subscriber/publisher design pattern, where I can subscribe to the DB from rails and say, "Notify me when anything to do with this row" has changed.. and then when something does change the DB is the one that communicates back through the db connection signalling to call .reload…
Like I don't understand why this is not a standard in any ORM implementation… surely it's never a good situation to be dealing with stale objects at any time
That's what I expect from an ORM I guess which probably doesn't exist.
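
For what it's worth, the idea being described looks roughly like this when everything lives in one Ruby process. `NotifyingTable` and `LiveCopy` are invented for this sketch; nothing here crosses a process boundary or talks to a real database, which is exactly the part that is missing in practice.

```ruby
# Hypothetical single-process model of "notify me when this row changes".
class NotifyingTable
  def initialize
    @rows = {}
    @subscribers = Hash.new { |hash, id| hash[id] = [] }
  end

  def subscribe(id, &callback)
    @subscribers[id] << callback
  end

  # Store the row, then push a copy of it to everyone watching that id.
  def write(id, attributes)
    @rows[id] = attributes.dup
    @subscribers[id].each { |callback| callback.call(@rows[id].dup) }
  end

  def read(id)
    @rows.fetch(id).dup
  end
end

# An object that replaces its attributes whenever the table reports a change.
class LiveCopy
  attr_reader :attributes

  def initialize(table, id)
    @attributes = table.read(id)
    table.subscribe(id) { |fresh| @attributes = fresh }
  end
end

table = NotifyingTable.new
table.write(1, { "status" => "pending" })

copy = LiveCopy.new(table, 1)
before = copy.attributes["status"]

table.write(1, { "status" => "done" })   # e.g. the delayed job's update
after = copy.attributes["status"]
```

Here `before` is "pending" and `after` is "done" without any explicit reload, but only because the writer and the reader share the same memory.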

To me the idea of encapsulating a complete DB row as an object is meaningless if the underlying data can change at any time, and I can't tell until I try to save it. Like once I create that ActiveRecord object.. it's stale. So all calculations that I perform on those attributes of that object could well and truly be meaningless.
Wouldn't it be great if rails automatically knew that the underlying db actually has changed either via an attribute on the AR model (e.g. model.attribute_db_value_changed?), and or a config setting that allows you to always work with updated db values in real time if you choose to, all without polling the DB, but Rails being informed by exception.

The current situation which I am now in is to always assume that I am the only one in the world, and that everything is ok, until I try to save, and then only then do I realise my object I have been dealing with is stale, so go back, reload, and re-work out everything again (doing any necessary merges of state) that you were trying to do with the updated object, and then try to save again (with the possibility of just doing it all over again if someone trumps you again).

I would rather have it so that I am midway calculating the state of an object, and realise halfway that the underlying model has changed, so even before I attempt to save it I can rectify my calculations on the fly.

I'm assuming that when you say row locking that's pessimistic locking? Which is the other alternative which I will give a go as well, although I think on initial inspection it has the possibility of deadlocking in this particular scenario..

But the other thing that I understand is that you can't have a read lock (where you block a select statement from even coming back with values because someone else is updating it), only a read lock on "select for update" scenarios…
So that's actually my problem - I read a value and depend on that value, and so during the time of me calculating things with that value if it has changed, I need to know, but I can't until I have done my calculations, and find that value has changed when I try to save - but some instances, you don't even need to save.. you just do the calculation and pass that along.. which by that stage is incorrect.

I think everyone is right in that the best way is to re-think what it really is I am trying to do… perhaps I have just fallen in love with Rails automagically doing everything for me, so that my expectations are now beyond reasonable :)

Cheers
-Chris

Mark Wotton

Nov 18, 2010, 7:12:04 PM
to rails-...@googlegroups.com
I think you'd want a database transaction around the whole action too,
otherwise there's a race condition either way.

mark

--
A UNIX signature isn't a return address, it's the ASCII equivalent of a
black velvet clown painting. It's a rectangle of carets surrounding a
quote from a literary giant of weeniedom like Heinlein or Dr. Who.
        -- Chris Maeda

Warren Seen

Nov 18, 2010, 7:19:38 PM
to rails-...@googlegroups.com

On 19/11/2010, at 11:11 AM, Chris Mayan wrote:

> Something akin to a subscriber/publisher design pattern, where I can subscribe to the DB from rails and say, "Notify me when anything to do with this row" has changed.. and then when something does change the DB is the one that communicates back through the db connection signalling to call .reload…
> Like I don't understand why this is not a standard in any ORM implementation… surely it's never a good situation to be dealing with stale objects at any time

Don't take this the wrong way, but you seem to be misunderstanding the lifecycle of a rails request, and the models it loads during a request?

James Healy

Nov 18, 2010, 7:21:41 PM
to rails-...@googlegroups.com
Chris Mayan wrote:
> Something akin to a subscriber/publisher design pattern, where I can
> subscribe to the DB from rails and say, "Notify me when anything to do with
> this row" has changed.. and then when something does change the DB is the
> one that communicates back through the db connection signalling to call
> .reload…

The thing is, you're working in a stateless web world.

Even if there was some way for the DB to notify your AR instance that
the underlying data has changed, that instance may have been garbage
collected 30 minutes ago when the original HTTP request was completed.
Meanwhile the user has been staring at their web form for 30 minutes and
has only just decided to hit submit.

Given the nature of HTTP, your realistic options are:

* accept occasional data loss when changes are over-ridden
* optimistic locking like that provided by lock_version
* big chunky application layer locks. Create an AR model called Lock,
have your edit action create a lock, and subsequent requests to the
same action are denied until the user submits the form. Messy, but
sometimes required.
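
The third option might look something like the following, sketched in plain Ruby with an in-memory hash in place of a Lock model. `EditLocks` is a hypothetical name, and the 30-minute expiry is an added assumption so that an abandoned form does not hold the lock forever; a real multi-process version would need the lock rows in the database.

```ruby
# Hypothetical application-level edit locks, keyed by record id.
class EditLocks
  def initialize(ttl_seconds = 1800)
    @ttl = ttl_seconds
    @locks = {}   # record id => [owner, acquired_at]
  end

  # Returns true if `owner` now holds the lock for `record_id`.
  def acquire(record_id, owner, now = Time.now)
    holder, acquired_at = @locks[record_id]
    expired = acquired_at && (now - acquired_at) > @ttl
    if holder.nil? || holder == owner || expired
      @locks[record_id] = [owner, now]
      true
    else
      false
    end
  end

  # Only the current holder can release the lock (e.g. on form submit).
  def release(record_id, owner)
    @locks.delete(record_id) if @locks.fetch(record_id, []).first == owner
  end
end

locks = EditLocks.new
t0 = Time.utc(2010, 11, 19, 0, 0, 0)

alice_got = locks.acquire(42, "alice", t0)        # edit action opens the form
bob_got   = locks.acquire(42, "bob", t0 + 60)     # denied while alice holds it
locks.release(42, "alice")                        # alice submits
bob_retry = locks.acquire(42, "bob", t0 + 120)    # now allowed

carol_got = locks.acquire(42, "carol", t0 + 120 + 1801)  # bob's lock has expired
```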

-- James Healy <ji...@deefa.com> Fri, 19 Nov 2010 11:18:50 +1100

Dmytrii Nagirniak

Nov 18, 2010, 7:22:32 PM
to rails-...@googlegroups.com
On 19 November 2010 11:11, Chris Mayan <chris...@gmail.com> wrote:
> Hello,
>
> What I really wanted I now believe has to be done and provided by the native DB or implemented via some plumbing connection between rails and the db (which I don't think is provided by any vendors…).
>
> Something akin to a subscriber/publisher design pattern, where I can subscribe to the DB from rails and say, "Notify me when anything to do with this row" has changed..

While I don't believe it is a good idea, MSSQL 2005 and later definitely provide that. They can even invalidate the application cache when something has changed in the DB.
 
> Like I don't understand why this is not a standard in any ORM implementation…

It seems you are looking back to the MS Access days, when all the connections were stateful. And that was a problem too :)
 
> To me the idea of encapsulating a complete DB row as an object is meaningless if the underlying data can change at any time, and I can't tell until I try to save it. Like once I create that ActiveRecord object.. it's stale. So all calculations that I perform on those attributes of that object could well and truly be meaningless.

I do not think that is very accurate. In *most* cases the data is up-to-date because you connect to the database for a short period of time (UnitOfWork), do the work and disconnect. It is only a problem when you hold an object for a long period of time (which you never do in a web app).
 

> I would rather have it so that I am midway calculating the state of an object, and realise halfway that the underlying model has changed, so even before I attempt to save it I can rectify my calculations on the fly.

Did you have a look at MagLev?

> I'm assuming that when you say row locking thats pessimistic locking? Which is the other alternative which I will give a go as well, although I think on initial inspection has the possibility of deadlocking this particular scenario..

You do not want to have overoptimistic locking in a web app. Deadlock is almost guaranteed :)


> So that's actually my problem - I read a value and depend on that value, and so during the time of me calculating things with that value if it has changed, I need to know, but I can't until I have done my calculations, and find that value has changed when I try to save - but some instances, you don't even need to save.. you just do the calculation and pass that along.. which by that stage is incorrect.

If you wrap all your calculations within a transaction, you are guaranteed to have correct values. Doesn't that solve the problem?
 

Dmytrii Nagirniak

Nov 18, 2010, 7:27:52 PM
to rails-...@googlegroups.com
On 19 November 2010 10:09, Mikel Lindsaar <raas...@gmail.com> wrote:
> On 19/11/2010, at 9:04 AM, Dmytrii Nagirniak wrote:
>> On 19 November 2010 08:47, Korny Sietsma <ko...@sietsma.com> wrote:
>>> You should still think carefully about what it is you are trying to do.
>>> If you just catch the optimistic locking exception, and automatically reload and save, then you are ensuring the second user's change overwrites the first.  The second user will never see the first user's change.  That's not really optimistic locking at all.
>> Yep. And this is how AR (basically most of ORMs) works by default - last wins. Nothing to do with the locking.
>> Wondering why not just use a proper database transaction if atomicity is a requirement?
>
> Well as Korny was saying transactions and row locking won't help you if both data sets are valid.

But the main problem for Chris is that the data is being updated while calculations are in progress. To avoid having stale objects, the transactions can be used.

Of course, merging is a totally different thing and is way more complicated. But as far as I understand, Chris needs to ensure that there are no stale objects in the middle of an action (not a user's one but a system's).

Clifford Heath

Nov 18, 2010, 8:09:59 PM
to rails-...@googlegroups.com
On 19/11/2010, at 11:27 AM, Dmytrii Nagirniak wrote:
> But as far as I understand Chris needs to ensure that there are no
> stale objects in the middle of an action (not user's one but a
> system's).

That's not even theoretically possible in a parallel system. All you can do is code each transaction semantically (add 1 to X rather than set X to old-X+1) and then rely on database serialisation to make it work. In order to reduce deadlocks, you can use intent-update locks (which AR's :lock gives you) or just ensure that multi-record updates are ordered (i.e. if you update A then B in one transaction, you never update B then A in another).
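
The difference between "add 1 to X" and "set X to old-X+1" can be shown with a small plain-Ruby counter, where a Mutex plays the role of database serialisation. `Counter` is a made-up class for this sketch; the SQL analogue of `add` would be something like `UPDATE ... SET x = x + 1`.

```ruby
# Hypothetical counter; the Mutex stands in for the database serialising writes.
class Counter
  def initialize(value = 0)
    @value = value
    @mutex = Mutex.new
  end

  def read
    @mutex.synchronize { @value }
  end

  # "Set X to this value": the caller computed it from an earlier read.
  def set(value)
    @mutex.synchronize { @value = value }
  end

  # "Add n to X": the read and the write happen together under the lock.
  def add(n)
    @mutex.synchronize { @value += n }
  end
end

# Semantic updates: ten threads each add 1 a thousand times.
counter = Counter.new
threads = 10.times.map do
  Thread.new { 1_000.times { counter.add(1) } }
end
threads.each(&:join)
total = counter.read

# Non-semantic updates: two callers read the same old value, then both write old + 1.
stale = Counter.new
seen_by_a = stale.read
seen_by_b = stale.read
stale.set(seen_by_a + 1)
stale.set(seen_by_b + 1)
lost = stale.read
```

`total` comes out as 10000 because no increment depends on a value read earlier, while `lost` is 1 rather than 2: each individual write was serialised correctly, but one of the two updates was computed from stale data.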

With long-polling or websockets, you could potentially notify the user that the record they're seeing has been updated, but you'd have to think that through from a XP point of view.

However, I don't think that's what Chris was concerned about. It seems to me he was worried that while I'm looking at some data and perhaps changing it, but haven't yet submitted my changes, someone else will submit a change that makes my changes invalid. The correct answer is to detect that using versioning or row-value checksums, where the check value is hidden in the form I will submit, and must match if my changes are to be accepted.

The nice thing about doing this with checksums (as opposed to versions) is that if the record is large, and I'm only looking at a subset of the fields, I can have a checksum over just those fields, rather than the whole record. When I submit, the record is re-fetched and those fields re-checksummed. As long as *nothing I was looking at* has changed, my update is still valid. In most applications. YMMV. You might not have encountered a situation where this was even possible... but for example, the Claims table in an insurance app might well have a hundred fields, and those will never all be shown at the same time. You could break the record into separate records and use row versions for each, but then your UI is leaking into your schema.
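
A minimal plain-Ruby sketch of a checksum over only the displayed fields. `field_checksum` and `still_valid?` are hypothetical helpers, the claim fields are invented, and SHA-256 is just one reasonable choice of digest.

```ruby
require "digest/sha2"

# Checksum over just the named fields; this value goes into the hidden form field.
def field_checksum(record, fields)
  payload = fields.sort.map { |name| "#{name}=#{record[name].inspect}" }.join("\n")
  Digest::SHA256.hexdigest(payload)
end

# On submit: re-fetch the record and re-checksum the same fields.
def still_valid?(record, fields, submitted_checksum)
  field_checksum(record, fields) == submitted_checksum
end

claim = { "status" => "open", "amount" => 1200, "assessor" => "Lee", "notes" => "" }
shown = ["status", "amount"]            # the fields this form displays
token = field_checksum(claim, shown)

claim["notes"] = "called customer"      # a field the user was not looking at
unrelated_change_ok = still_valid?(claim, shown, token)

claim["amount"] = 1500                  # a field the user was looking at
related_change_ok = still_valid?(claim, shown, token)
```

The edit to "notes" leaves the checksum intact, so the submission would still be accepted; the edit to "amount" changes it, so the submission would be rejected.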

Clifford Heath, Data Constellation, http://dataconstellation.com
Agile Information Management and Design


Chris Mayan

Nov 18, 2010, 8:26:19 PM
to rails-...@googlegroups.com
> But the main problem for Chris is that the data is being updated while calculations are in progress. To avoid having stale objects, the transactions can be used.
>
> Of course, merging is a totally different thing and is way more complicated. But as far as I understand Chris needs to ensure that there are no stale objects in the middle of an action (not user's one but a system's).

Yes, that's right.

Also that MagLev is really cool! (http://www.vimeo.com/1147409)

So the concept there is like having a singleton-like ActiveRecord model object that persists across processes, so that even when the underlying contents change, 2 different processes still have the same AR values (all because of their MagLev cache).
*video just froze for me 3/4 of the way in* but I must say that is very impressive.

I'll also take on board that concept of UnitOfWork, and perhaps the way I think about this should be different

Thanks guys for all your input and feedback, I really appreciate all of your comments and it's made me think differently this morning.

Cheers
-Chris

Dmytrii Nagirniak

Nov 18, 2010, 8:41:06 PM
to rails-...@googlegroups.com
On 19 November 2010 12:09, Clifford Heath <cliffor...@gmail.com> wrote:
> On 19/11/2010, at 11:27 AM, Dmytrii Nagirniak wrote:
>> But as far as I understand Chris needs to ensure that there are no stale objects in the middle of an action (not user's one but a system's).
>
> That's not even theoretically possible in a parallel system.

I guess we are talking about different things here.
But I believe we all agree that you cannot guarantee consistency without locking.

The database transaction with serialisable isolation level does provide locking, thus removes parallelism.

So user2 has to wait until user1 has finished the transaction, even to read data.

As a result we can say that using transactions it is possible to achieve atomicity, isolation and non-stale data within that transaction.

I guess, this is how most payments are being done.




