coercions of blank strings to boolean and integer values


Josh Susser

Aug 19, 2008, 1:40:34 AM
to rubyonra...@googlegroups.com
This may seem a minor issue, but I think it's worth a bit of
discussion to get it right. Let me start with the use case that led
me smack into the issue.

Start with some model that has a boolean field that can be nil/null.
Create a view with a form that uses a select box to set the field's
value, and set :include_blank => true. The obvious assumption is that
if you select blank (as opposed to Yes/1/true or No/0/false), the
field value will be saved as nil/null. Not so - it is saved as
0/false. This has the weird behavior of the select box showing the
false choice when you select blank and get bounced back for a second
try at the form.

create_table :widgets do |t|
  t.string :name
  t.boolean :fancy
end

w = Widget.create(:name => "weather", :fancy => true)
w.fancy # => true
w.update_attributes(:fancy => nil)
w.fancy # => nil
w.update_attributes(:fancy => "")
w.fancy # => false

It's not too much trouble to patch ActiveRecord to change it so that
setting a boolean field to a blank string gets saved as NULL in the
database instead of 0. The question is: what is the desired
behavior? More specifically, when should the value be coerced, with
special consideration to the case when the field doesn't allow a nil/
NULL value?

I think the answer is that blank values should always be coerced to
nil in memory, even when the field is created with :null => false.
This is consistent with how integer fields are handled. You can have
a nil value in memory, and the db will barf when you try and save it
as NULL. AR expects you to use a validation to catch that and give
the user another shot at the form. However... there is what looks to
me like a bug in AR where assigning a blank-but-not-empty string to an
integer field has a different result from assigning an empty string or
nil. An empty string is coerced to nil, but a string of one or more
whitespace characters is coerced to 0. It's also easy to fix it so
that all blank strings are coerced to nil, and I think that would be the
best thing to do. But I wonder if anyone out there might be relying
on this behavior. There are only 2 tests that break when this change
is made (in validations_test.rb).
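
A minimal plain-Ruby sketch of the inconsistency described above. This is not ActiveRecord's actual code: `current_integer_cast` and `proposed_integer_cast` are hypothetical stand-ins that only mimic the behavior reported in this post and the change being suggested.

```ruby
# Hypothetical stand-ins, plain Ruby, no Rails required.

# Mimics the reported behavior: only nil and the truly empty string
# become nil; everything else goes through to_i.
def current_integer_cast(value)
  return nil if value.nil? || value == ""
  value.to_i
end

# Suggested change: an all-whitespace string is treated like "".
def proposed_integer_cast(value)
  return nil if value.nil?
  return nil if value.is_a?(String) && value.strip.empty?
  value.to_i
end

current_integer_cast("")      # nil
current_integer_cast("   ")   # 0, because "   ".to_i is 0
proposed_integer_cast("   ")  # nil
proposed_integer_cast("42")   # 42
```

The only difference between the two methods is the `strip.empty?` check, which is why the change is small.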

By the way, strings representing integers are coerced to their correct
values, but non-numeric objects like arrays and hashes are coerced to
1. WTF? Why is that useful?

I was going to create a ticket with a patch for this stuff to refer
to, but the code is sitting on a computer at work so that won't happen
until the morning (sorry, Pratik).

To summarize:

I propose that all blank strings should be coerced to nil, for both
boolean and integer fields. Any issues with that? Anyone know if
they are relying on that behavior?
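
For the boolean half of the proposal, a rough sketch in plain Ruby. `proposed_boolean_cast` is a hypothetical name, and the list of strings counted as true (`true`, `t`, `1`) is an assumption made for illustration, not a statement of what ActiveRecord accepts.

```ruby
# Hypothetical sketch of the proposed boolean coercion.
def proposed_boolean_cast(value)
  return nil if value.nil?
  # Blank strings (empty or all whitespace) become nil, not false.
  return nil if value.is_a?(String) && value.strip.empty?
  # Real booleans pass through untouched.
  return value if value == true || value == false
  # Assumed set of truthy spellings; anything else is false.
  %w(true t 1).include?(value.to_s.downcase)
end

proposed_boolean_cast("")    # nil
proposed_boolean_cast(" ")   # nil
proposed_boolean_cast("1")   # true
proposed_boolean_cast("0")   # false
```

With this in place, a select box with a blank option would round-trip as blank, and a :null => false column would still need a validation to reject the nil.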

--
Josh Susser
http://blog.hasmanythrough.com


Chris Cruft

Aug 19, 2008, 7:58:23 AM
to Ruby on Rails: Core
The scenario that you mention is a classic one of empty strings being
returned from forms. And the problem is
bigger than just booleans. It also applies to strings, numbers and
lists.

In the scenario Josh mentions, a boolean field should have the empty
string or a blank string coerced to false (or nil) long before it gets
saved by ActiveRecord.

Putting the burden on ActiveRecord to massage the crap it is handed
into something meaningful seems out
of place. Why not fix the problem at the source and get
ActionController to return meaningful values from empty form fields?

-1 for getting ActiveRecord to bail out ActionController with coercion
of empty strings and blanks.

As for the coercion of non-numeric objects, I agree that "1" seems
totally outrageous. I would hope for the coercion to use to_i/to_f
conventions and raise an exception when they fail.

+1 for fixing/removing the coercion of non-numeric objects beyond a
simple to_i/to_f depending on the field type.

-Chris

PS - Here is some code I have in ApplicationController that attacks
the empty string problem at the source. It could be cleaned up with
"returning" and other recent goodness:

# Coerce empty string values in a hash - particularly useful for HTTP
# POSTed forms. Values are coerced based on the attrs mapping of params
# keys (attrs keys) to coerced values (attrs values).
# See http://dev.rubyonrails.org/ticket/5694 for background
def coerce_empty_strings(params_hash, attrs = {})
  return unless params_hash
  params_hash.inject(HashWithIndifferentAccess.new) do |h,(k,v)|
    h[k] = (v.is_a?(String) && v.empty?) ? attrs[k.to_sym] : v
    h
  end
end
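
A usage sketch for the helper above. So that it runs without Rails, this variant swaps HashWithIndifferentAccess for a regular Hash (an assumption of the sketch, so keys stay strings), and the `position` parameter is a made-up field added only to show a non-nil replacement.

```ruby
# Plain-Ruby variant of the helper: a regular Hash stands in for
# HashWithIndifferentAccess so the example is self-contained.
def coerce_empty_strings(params_hash, attrs = {})
  return unless params_hash
  params_hash.each_with_object({}) do |(k, v), h|
    # Empty strings are replaced by the value mapped in attrs (nil if unmapped).
    h[k] = (v.is_a?(String) && v.empty?) ? attrs[k.to_sym] : v
  end
end

# Hypothetical form submission for the widgets example.
params = { "name" => "weather", "fancy" => "", "position" => "" }
widget_params = coerce_empty_strings(params, :position => 0)
# "fancy" has no entry in attrs, so it becomes nil; "position" becomes 0.
```

Note that only truly empty strings are replaced here; a string of spaces passes through unchanged.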

Ryan Bates

Aug 19, 2008, 10:52:07 AM
to Ruby on Rails: Core
I don't think the fault is in ActionController. The params hash should
echo whatever parameters are being passed from the browser. The
browser sends an empty string, it's ActiveRecord's job to coerce the
strings into more suitable objects depending on the attribute you're
assigning to.

I'm okay with an empty string being coerced into nil for boolean
attributes (for consistency). But I do think the :null => false
setting is much more common in boolean attributes than any other type.
Otherwise it's not a true boolean attribute (it can hold 3 states).
Too often I've been bitten by a bug where null slipped into a boolean
field and it didn't match a "false" search.
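
The three-state problem can be shown with an in-memory analogy. This is plain Ruby rather than a database query, and `Row` is a hypothetical stand-in for a table row; the point is only that an equality test against false skips nil.

```ruby
# In-memory stand-in for table rows; Row and its fields are hypothetical.
Row = Struct.new(:name, :fancy)
rows = [Row.new("a", true), Row.new("b", false), Row.new("c", nil)]

# Analogous to searching for fancy = false: the nil row is skipped.
exactly_false = rows.select { |r| r.fancy == false }.map(&:name)

# Treating nil as false has to be spelled out by every caller.
false_or_nil = rows.reject { |r| r.fancy }.map(&:name)
```

Here `exactly_false` contains only "b", while `false_or_nil` contains "b" and "c".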

Regards,

Ryan

Chad Woolley

Aug 19, 2008, 11:07:48 AM
to rubyonra...@googlegroups.com
On Tue, Aug 19, 2008 at 4:58 AM, Chris Cruft <cc...@hapgoods.com> wrote:
> Putting the burden on ActiveRecord to massage the crap it is handed
> into something meaningful seems out
> of place. Why not fix the problem at the source and get
> ActionController to return meaningful values from empty form fields?

I agree. This follows the guideline of "validate early", which is
generally a good one.

Also, it makes sense to handle it early if you were to use another
persistence layer. This is a UI-layer problem, solve it close to the
UI.

-- Chad

Ian White

Aug 19, 2008, 11:38:43 AM
to rubyonra...@googlegroups.com
> I agree. This follows the guideline of "validate early", which is
> generally a good one.
I also agree - ActiveRecord should not need to know about how browsers
handle params hashes. For example, if you're talking to ActiveRecord
from somewhere else (the console, a drb script, or whatever) you
shouldn't have to pretend you're an html client.

Ian

--
Argument from Design--We build web applications
Western House 239 Western Road Sheffield S10 1LE
Mob: +44 (0)797 4678409 | Office: +44 (0)114 2667712
<http://www.ardes.com/> | <http://blog.ardes.com/ian>


Ryan Bates

Aug 19, 2008, 12:01:55 PM
to Ruby on Rails: Core


On Aug 19, 8:38 am, Ian White <ian.w.wh...@gmail.com> wrote:
> I also agree - ActiveRecord should not need to know about how browsers  
> handle params hashes.  For example, If you're talking to ActiveRecord  
> from somewhere else (the console, a drb script, or whatever) you  
> shouldn't have to pretend you're an html client

Interesting, to me it is just the opposite. This is not specific to
HTML clients and Action Controller. We're talking about how Active
Record handles empty strings. This applies to console and other
scripts too. If the client is passing empty strings in any of those
environments, Active Record should be able to coerce appropriately.

Perhaps both Action Controller and Active Record need fixing? But I
look at it as two different problems because they are completely
separate modules.

Regards,

Ryan

Josh Susser

Aug 19, 2008, 12:23:55 PM
to rubyonra...@googlegroups.com

On Aug 19, 2008, at 4:58 AM, Chris Cruft wrote:
> The scenario that you mention is a classic one of empty strings being
> returned from forms. And the problem is
> bigger than just booleans. It also applies to strings, numbers and
> lists.
>
> In the scenario Josh mentions, a boolean field should have the empty
> string or a blank string coerced to false (or nil) long before it gets
> saved by ActiveRecord.
>
> Putting the burden on ActiveRecord to massage the crap it is handed
> into something meaningful seems out
> of place. Why not fix the problem at the source and get
> ActionController to return meaningful values from empty form fields?

I think the problem with that approach stems from the fact that form
data is submitted as untyped strings. There's no way to look at the
string "1,100" and guess if that means the string itself, the integer
1100, or the list [1, 100]. Currently ActiveRecord does the best it
can to convert string data from forms into appropriate values for
fields, and sometimes it falls over (bug or flawed design?).

I see three paths we can take to improve things:

1) incrementally improve ActiveRecord to more sensibly process string
inputs and convert to the correct data type for fields, i.e. blank
string handling

2) significantly alter ActiveRecord for more flexible and targeted
processing of string inputs

3) create some kind of middle-man object to assist in converting form
input strings to correct data types

Path 1 seems like a good approach in the short term, and there seems
to be little reason not to fix obvious errors in how ActiveRecord
operates. Even if we do something else, it doesn't seem like a good
idea to remove this functionality from AR, since that would break
virtually *every* Rails app in existence.

Path 2 could be interesting as a generic approach. I've done exactly
this in specific situations often before - e.g. I fake up a tags_list
accessor on the model to allow user input of a list of tags like
"rails, ruby, sighting". You can of course do this without any
special support, but perhaps a bit of syntactic sugar could improve
things.
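
A sketch of the kind of accessor Path 2 refers to, written as a plain Ruby class so it runs on its own. `Sighting` and its in-memory `tags` array are hypothetical; in a real app the class would be an ActiveRecord model and the tags would live in an association.

```ruby
# Plain Ruby stand-in for a model with a virtual tags_list attribute.
class Sighting
  attr_reader :tags

  def initialize
    @tags = []
  end

  # Form input side: "rails, ruby, sighting" in, array of names out.
  # Blank input yields an empty list rather than a list with one "" entry.
  def tags_list=(string)
    @tags = string.to_s.split(",").map(&:strip).reject(&:empty?).uniq
  end

  # Form display side: array back to a comma-separated string.
  def tags_list
    @tags.join(", ")
  end
end

sighting = Sighting.new
sighting.tags_list = "rails, ruby, sighting"
sighting.tags  # ["rails", "ruby", "sighting"]
```

The string-to-value conversion lives next to the attribute it serves, which is the targeted processing that Path 2 would generalize.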

Path 3 sounds great in theory. It's like a presenter that runs in
reverse too. But I wonder if separating the processing of form input
data into a separate object is going to be worth the effort. I'd be
interested in seeing someone's proposal for what that might look like
(unfortunately I have a few other science experiments higher in my own
priority queue right now).

In the meantime, I propose Path 1 as the simplest thing that could
possibly work to fix the use case where submitting forms with blank
values gives a non-nil value in the field.


> -1 for getting ActiveRecord to bail out ActionController with coercion
> of empty strings and blanks.
>
> As for the coercion of non-numerics objects, I agree that "1" seems
> totally outrageous. I would hope for the coercion to use to_i/to_f
> conventions and raise an exception when they fail.

That's just what it does (schema_definitions.rb:65):

when :integer then value.to_i rescue value ? 1 : 0

I'm still puzzling over that one, especially since `value` will never
be nil (thanks to a test a few lines above).
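
The effect of that line can be checked in isolation. The sketch below wraps the same expression in a method; `integer_cast` is a hypothetical name, and the surrounding case statement and earlier nil test are left out.

```ruby
# Same expression as the quoted line, wrapped in a method for testing.
# The rescue modifier binds looser than the ternary, so this reads as:
#   value.to_i rescue (value ? 1 : 0)
def integer_cast(value)
  value.to_i rescue value ? 1 : 0
end

integer_cast("42")        # 42
integer_cast("   ")       # 0, String#to_i returns 0 for non-numeric text
integer_cast([1, 2])      # 1, Array has no to_i so the NoMethodError is rescued
integer_cast({ :a => 1 }) # 1, same for Hash
integer_cast(false)       # 0, false has no to_i and is falsy
```

So the 1 comes from the rescue branch for any truthy object without a to_i method, and the 0 branch appears reachable only for false, since nil responds to to_i and never raises.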

Chris Cruft

Aug 20, 2008, 11:34:31 AM
to Ruby on Rails: Core
+1 for "when :integer then value.to_i"

And if I get up on the wrong side of the bed, then I might propose
"when :integer then value"

Honestly, having the persistence layer guess at what is "intended" to
be stored seems like a losing proposition. What's next? Guessing
that for a boolean field "Nein" means false and "Oui" means true and
that for an integer field that "Two" means 2 and "a lot" means 3?

ActiveRecord is doing two jobs: modelling and persisting. You could
make a case for AR modelling being a good place to address the
coercion problem... but not generically. If you want to solve the
problem in AR, then let's give the user the ability to model coercion
rules as well.

But ultimately, WHEREVER the modelling of coercion is performed, it
needs to be available right from the start of processing in the
controller action. And coercion rules/modelling needs to be available
for input data that is not persisted (or at least not by an
ActiveRecord model).

A good solution might be a reverse presenter (to use Josh's term) that
can be modelled independently, with AR automatically generating a
default reverse presenter bound to each model class. More complicated
Reverse Presenters could accept multiple AR models and their
associations as well as manually managed attributes. A
ReversePresenter would be instantiated with the request's params at
the start of a controller action. It's all downhill from here...

Of course it's worth considering combining the Reverse Presenter and
the classic Presenter into one class. That might help with building
some enthusiasm for the idea and it would probably reduce the amount
of modeling required.
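
One possible shape for such a reverse presenter, as a rough sketch only. Nothing here exists in Rails: the `ReversePresenter` class, its `call` method, and the per-key rule lambdas are all hypothetical, and the rules for `fancy` and `count` are invented for the example.

```ruby
# Hypothetical reverse presenter: turns raw form strings into typed values
# before any model sees them.
class ReversePresenter
  # rules maps a param name (as a symbol) to a callable that converts
  # the raw string for that param.
  def initialize(rules)
    @rules = rules
  end

  # Returns a new hash with symbol keys; params without a rule pass through.
  def call(params)
    params.each_with_object({}) do |(key, raw), out|
      rule = @rules[key.to_sym]
      out[key.to_sym] = rule ? rule.call(raw) : raw
    end
  end
end

blank = ->(s) { s.nil? || s.strip.empty? }

widget_input = ReversePresenter.new(
  :fancy => ->(s) { blank.call(s) ? nil : %w(1 t true).include?(s.downcase) },
  # Integer() raises ArgumentError on garbage instead of guessing.
  :count => ->(s) { blank.call(s) ? nil : Integer(s) }
)

attrs = widget_input.call("name" => "weather", "fancy" => "", "count" => "7")
# attrs is { :name => "weather", :fancy => nil, :count => 7 }
```

Because the rules are plain callables, the same object could serve input that is never persisted, and a default set could in principle be generated from a model's columns.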

Damian Janowski

Aug 20, 2008, 6:47:29 PM
to rubyonra...@googlegroups.com
On Wed, Aug 20, 2008 at 12:34 PM, Chris Cruft <cc...@hapgoods.com> wrote:
> Honestly, having the persistence layer guess at what is "intended" to
> be stored seems like a losing proposition. What's next? Guessing
> that for a boolean field "Nein" means false and "Oui" means true and
> that for an integer field that "Two" means 2 and "a lot" means 3?

Quick thought: I don't think calling #to_i is guessing. I see it as "I
expect an integer, please provide me with whatever you can to satisfy
that". I think most Ruby APIs are like that. So I always call #to_s or
#to_i in methods that expect such types.

Just like AR will call #to_i on #find when passed a string. It's ok,
and it lets you do some more magic and makes your code look a lot
better.
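
A small illustration of that convention, using a hypothetical in-memory `Registry` class rather than ActiveRecord's own finder code.

```ruby
# Hypothetical in-memory finder that follows the "call to_i on input" style.
class Registry
  def initialize(records)
    @records = records # Hash keyed by integer id
  end

  # Accepts anything that responds to to_i: 7, "7", or "7-weather".
  def find(id)
    @records.fetch(id.to_i)
  end
end

registry = Registry.new(7 => "weather")
registry.find(7)            # "weather"
registry.find("7")          # "weather"
registry.find("7-weather")  # "weather", String#to_i stops at the first non-digit
```

The caller decides what to pass, and the receiver only asks the object to convert itself.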

Just my $0.02

Chris Cruft

Aug 21, 2008, 8:01:11 AM
to Ruby on Rails: Core
Damian,
I think we are saying the same thing: using the object's own methods
(to_i, to_a, to_f, etc.) to typecast/coerce is absolutely the right
thing to do. Unfortunately, Rails currently goes beyond that level of
coercion and Josh's original proposal on this post was to go even
further to coerce booleans using external logic. It's the use of
external logic that I characterized as "guessing."

-Chris


Damian Janowski

Aug 21, 2008, 12:47:55 PM
to rubyonra...@googlegroups.com
On Thu, Aug 21, 2008 at 9:01 AM, Chris Cruft <cc...@hapgoods.com> wrote:
>
> Damian,
> I think we are saying the same thing: using the object's own methods
> (to_i, to_a, to_f, etc.) to typecast/coerce is absolutely the right
> thing to do. Unfortunately, Rails currently goes beyond that level of
> coercion and Josh's original proposal on this post was to go even
> further to coerce booleans using external logic. It's the use of
> external logic that I characterized as "guessing."

Agreed :-)

Michael Koziarski

Aug 21, 2008, 3:39:25 PM
to rubyonra...@googlegroups.com
>> I think we are saying the same thing: using the object's own methods
>> (to_i, to_a, to_f, etc.) to typecast/coerce is absolutely the right
>> thing to do. Unfortunately, Rails currently goes beyond that level of
>> coercion and Josh's original proposal on this post was to go even
>> further to coerce booleans using external logic. It's the use of
>> external logic that I characterized as "guessing."
>
> Agreed :-)

I'd definitely love to see some work done on tidying up the conversion
code and making it more consistent and testable. In the meantime the
patch is a nice pragmatic solution until someone has the time and the
inclination to spend time doing that rework.

If you're that person, drop us a line :)


--
Cheers

Koz
