Patch Testers Sought

13 views
Skip to first unread message

Michael Koziarski

unread,
Aug 11, 2007, 12:55:39 AM8/11/07
to rubyonra...@googlegroups.com
Hey guys,

I've posted a fairly large refactoring of rails' attribute
related-methods to trac at

http://dev.rubyonrails.org/ticket/9241

As it makes a few non-minor changes to the core of activerecord, I
thought I'd post it for review here. I'm especially interested in any
breakage that you notice related to either the NoMethodError or
@attributes_cache changes.

The patch itself isn't done yet, so don't worry about any
'peculiarities' of the implementation, what I'm concerned about is
whether the subtly different semantics and caching introduce any bugs
in your applications.

== description follows ==

This patch refactors the attribute-generation methods in
ActiveRecord::Base in the following ways:

* Move generation from instance methods to class methods
* In addition to accessors, generate mutators
* Always generate accessors and mutators, the overhead in
development mode is minimal.

It includes one functional change, results of typecasted attributes
are cached in a new hash @attributes_cache. The rationale for this
change is that repeated access to columns requiring typecasting (such
as Times) can be a significant overhead. Code like the following will
repeatedly call the Time.parse functions.

50.times { object.created_at }

This can contribute to performance degradation in datetime-heavy applications.

Finally the behaviour of the respond_to? and method_missing methods
has changed when you use find with :select.

# OLD
@customer = Customer.find(1, :select=>"first_name").
@customer.respond_to?("last_name") # => false
@customer.has_attribute?("last_name") # => false
@customer.last_name # raises NoMethodError

# NEW
@customer = Customer.find(1, :select=>"first_name").
@customer.respond_to?("last_name") # => true
@customer.has_attribute?("last_name") # => false
@customer.last_name # raises MissingAttributeError

This change simplifies the code somewhat, and also makes it easier to
track down bugs in this area. A simple hook is provided so plugin
authors can experiment with lazily fetching the missing columns. While
this could be useful for large TEXT or BLOB attributes, I don't intend
to include functionality like that without battle-testing it in some
way.

--
Cheers

Koz

cfis

unread,
Aug 11, 2007, 10:37:13 PM8/11/07
to Ruby on Rails: Core
Hi Koz,

Your proposal looks sound to me, however I think it only goes half
way. In my view there are two main issues with active record:

* The mixing of Ruby objects and raw database values (strings) in the
attributes hash table

* The lack of proper column objects

For more details, I wrote up some of my thoughts at
http://cfis.savagexi.com/articles/2007/08/11/making-rails-better-fixing-architecture-flaws-in-active-record.

But in a nutshell, in think much of the grunginess of ActiveRecord
could be eliminated by introducing a column object per data type. So
FloatColumn, StringColumn, BooleanColumn, etc. Each column would
provide an api for serializing/deserializing Ruby objects to and from
the database's string representation. That would clean up ActiveRecord
by removing a large number of case statements in the connection
adapters and schema_definitions.rb. And more importantly, it would
make it much easier for developers to add their own custom data types
when needed. To see what I mean, try to add your own custom data type
(say Postgresql PostGIS data types) to ActiveRecord and see how
difficult it is (without of course modifying ActiveRecord itself).

Adding a column type directly influences your proposal, because the
serialization methods would be used to convert to and from the
attributes hash (sounds like you are using that for the raw value from
the database) and the attributes_cache hash (which stores the
serialized form as a Ruby object). So something like:

attributes_cache['foo'] =
self.columns['foo'].serialize(attributes['foo'])

And of course vice-versa.

Last, it would be really useful to add a "changed" flag for each field
in each record. Then when ActiveRecord creates an UPDATE query it
would only modify the values that actually changed. Right now,
ActiveRecord updates all columns, which turns out to be a pain because
it has a tendency to botch values for columns that have a default
expression or have a type that it doesn't know about (postgis fields
for example).

So just my two cents - happy to discuss more if these seem like useful
ideas.

Charlie

Michael Koziarski

unread,
Aug 12, 2007, 1:35:47 AM8/12/07
to rubyonra...@googlegroups.com
> Your proposal looks sound to me, however I think it only goes half
> way. In my view there are two main issues with active record:
>
> * The mixing of Ruby objects and raw database values (strings) in the
> attributes hash table
>
> * The lack of proper column objects

The column objects we currently have are certainly anemic, and I agree
it'd be nice if the type casting code was pushed out of Base and into
the adapters. All things in good time, assuming the caching approach
works, I'd definitely like to tidy the implementation a little more.
I definitely like the general shape of:

attributes_cache['foo'] = self.columns['foo'].serialize(attributes['foo'])

So assuming we don't see any breakage, I'll investigate that kind of a change.

> Last, it would be really useful to add a "changed" flag for each field
> in each record. Then when ActiveRecord creates an UPDATE query it
> would only modify the values that actually changed. Right now,
> ActiveRecord updates all columns, which turns out to be a pain because
> it has a tendency to botch values for columns that have a default
> expression or have a type that it doesn't know about (postgis fields
> for example).

We've discussed partial-row updates in the past here, and they're
incompatible with model validations. Concurrent updates can each
see a valid state, issue an update, and leave the record invalid. So
it's unlikely we'll support them within the 2.0 timeframe, however it
could definitely be put into a plugin, and I'm happy to roll in any
hooks needed to simplify that.

As for change tracking, jeremy has had a plugin around for ages
(http://code.bitsweat.net/svn/dirty/) which does the job nicely the
few times I've used it. The fact it hasn't gathered a large number of
users indicates to me that perhaps it's a bit of a niche feature.


> So just my two cents - happy to discuss more if these seem like useful
> ideas.
>
> Charlie
>
>
> >
>


--
Cheers

Koz

Jack Christensen

unread,
Aug 12, 2007, 3:18:02 PM8/12/07
to rubyonra...@googlegroups.com
What about requiring optimistic locking for partial row updates? That would seem to allow it to work without greatly contorting the rest of ActiveRecord.

As for change tracking,  jeremy has had a plugin around for ages
(http://code.bitsweat.net/svn/dirty/) which does the job nicely the
few times I've used it.  The fact it hasn't gathered a large number of
users indicates to me that perhaps it's a bit of a niche feature.


  
So just my two cents - happy to discuss more if these seem like useful
ideas.

Charlie


    

  
Jack

cfis

unread,
Aug 12, 2007, 4:33:51 PM8/12/07
to Ruby on Rails: Core
> The column objects we currently have are certainly anemic, and I agree
> it'd be nice if the type casting code was pushed out of Base and into
> the adapters. All things in good time, assuming the caching approach
> works, I'd definitely like to tidy the implementation a little more.
> I definitely like the general shape of:
>
> attributes_cache['foo'] = self.columns['foo'].serialize(attributes['foo'])
>
> So assuming we don't see any breakage, I'll investigate that kind of a change.

Excellent - that would be great.

> We've discussed partial-row updates in the past here, and they're
> incompatible with model validations. Concurrent updates can each
> see a valid state, issue an update, and leave the record invalid.

Hmm. Thinking out loud, the validations would only check values that
have changed, so that seems ok. Then when you do an update, you'd
send the key fields and the changed fields.

Now if 2 people tried to update the same record, and ActiveRecord was
doing partial updates, then I could see how might end up in an invalid
state on the database side. Of course that's easily solved on the
database side - but there would be no way to solve it on the rails
side.

> could definitely be put into a plugin, and I'm happy to roll in any
> hooks needed to simplify that.

A possible hook would be a setter method you could override, or for a
cleaner implementation, fire a changed event when a field is updated.
Then you could roll your own "changed" hash table.

> The fact it hasn't gathered a large number of
> users indicates to me that perhaps it's a bit of a niche feature.

Interesting - ok.

Anyway, I'd agree keeping track of changes is not nearly as important
as separating out serialized/unserialized values in the attributes
hash table and having a good column implementation.

I'm excited you're looking at this, this will be a big improvement for
ActiveRecord.

Charlie

Michael Koziarski

unread,
Aug 12, 2007, 5:55:36 PM8/12/07
to rubyonra...@googlegroups.com
> What about requiring optimistic locking for partial row updates? That would
> seem to allow it to work without greatly contorting the rest of
> ActiveRecord.

Yeah, if someone wanted to implement this kind of behaviour with a
plugin, they'd definitely need to be checking locks_optimistically? or
whatever that boolean is. Starting with Jeremy's dirty plugin, I
imagine it won't be a huge amount of work.

--
Cheers

Koz

Reply all
Reply to author
Forward
0 new messages