Declaring indexes with hobofields

21 views
Skip to first unread message

Helder Ribeiro

unread,
Aug 11, 2009, 3:34:45 AM8/11/09
to Hobo Users
Is there any way of declaring indexes in the models themselves, like
with fields?

Thanks,

Helder

http://helderribeiro.net

Tiago Franco

unread,
Aug 11, 2009, 6:02:18 AM8/11/09
to Hobo Users
I've been declaring indexes on the migrations. This feature could come
handy.

TF

Owen

unread,
Aug 11, 2009, 10:11:42 AM8/11/09
to Hobo Users
I've been pushing for that in Hobo. How would you like the
declaration to look?

Owen

unread,
Aug 11, 2009, 10:13:43 AM8/11/09
to Hobo Users
A great request for Hobo 1.1

Matt Jones

unread,
Aug 11, 2009, 10:18:30 AM8/11/09
to hobo...@googlegroups.com
The most straightforward would be a direct translation of the
parameters of add_index inside of fields:

fields do
# ... blah blah blah

index :column_name
index :other_column, :unique => true
index :a_whole, :bunch_of, :columns_to_index, :name => 'foobar'
end

The only question mark is the level of support for getting these
settings back out of the DB, so the migration generator can figure out
what's changed. That needs some investigation.

--Matt Jones

Owen

unread,
Aug 11, 2009, 3:46:19 PM8/11/09
to Hobo Users, Dmotola Awofolu
Thanks, Matt. I like what you outlined. Yes, syncing with the DB is
the trickiest.

Tom Locke

unread,
Aug 13, 2009, 5:34:51 AM8/13/09
to hobo...@googlegroups.com
> The only question mark is the level of support for getting these
> settings back out of the DB, so the migration generator can figure out
> what's changed. That needs some investigation.

Does anyone know if this information shows up in schema.rb after a
rake db:schema:dump ?

If not, that's a whole new level of complication to get this working -
so far we have relied on the Rails schema dumper.

Tom

Scott Bronson

unread,
Aug 13, 2009, 6:19:16 AM8/13/09
to hobo...@googlegroups.com
On Thu, Aug 13, 2009 at 2:34 AM, Tom Locke <t...@tomlocke.com> wrote:
Does anyone know if this information shows up in schema.rb after a
rake db:schema:dump ?

Yes, it does.  A section of my schema.rb created by db:schema:dump:


  create_table "favorite_shows", :force => true do |t|
    t.datetime "created_at"
    t.integer  "show_id"
    t.integer  "user_id"
  end

  add_index "favorite_shows", ["show_id"], :name => "index_favorite_shows_on_show_id"
  add_index "favorite_shows", ["user_id"], :name => "index_favorite_shows_on_user_id"

Owen

unread,
Aug 13, 2009, 1:16:30 PM8/13/09
to Hobo Users
Great. That makes it easier...

Matt Jones

unread,
Aug 15, 2009, 3:15:14 PM8/15/09
to hobo...@googlegroups.com
It looks like the major DB adapters can get at the index info, so that
hurdle is passed.

This post got me thinking more on syntax:
http://robots.thoughtbot.com/post/163627511/a-grand-piano-for-your-violin

The syntax described below is good for generic indexes (multiple
fields, etc), but decidedly un-DRY for basic indexes. What about
adding an :index option to the fields DSL? So it might look more like:

fields do
some_field :integer
indexed_field, :integer, :index => true
end

Some of the options to add_index can be determined automatically - for
instance, a field declared with :unique => true would seem likely to
want a unique index. Passing a string instead of true would allow a
custom name for the index.

It could even be taken a step farther, adding an implicit :index to
belongs_to, so foreign keys are always indexed.

Thoughts? If people like this, I can work on rolling it into the other
changes I've got waiting in the queue for the migration generator
(better behavior of HABTM, adding fields in STI subclasses, better
ignoring of non-Hobo tables).

--Matt Jones

Tiago Franco

unread,
Aug 15, 2009, 5:11:38 PM8/15/09
to Hobo Users
Hi Matt,

I link the DSL but two issues should be supported: multiple columns
and index_name.

The following command can create an index with Rails:

add_index :suppliers, [:location, :name], :name => 'by_location'

TF

On Aug 15, 8:15 pm, Matt Jones <al2o...@gmail.com> wrote:
> It looks like the major DB adapters can get at the index info, so that  
> hurdle is passed.
>
> This post got me thinking more on syntax:http://robots.thoughtbot.com/post/163627511/a-grand-piano-for-your-vi...

Tom Locke

unread,
Aug 17, 2009, 5:08:13 AM8/17/09
to hobo...@googlegroups.com
> fields do
> some_field :integer
> indexed_field, :integer, :index => true
> end

Looks good, or even

indexed_field :integer, :indexed

We do this already for flag-like options like :required and :unique

Then again, I like you're idea of :index => "my_index_name" so your
way is probably better.

> a field declared with :unique => true would seem likely to
> want a unique index.

Naturally : )

> It could even be taken a step farther, adding an implicit :index to
> belongs_to, so foreign keys are always indexed.

Someone once explained to me that depending on the density of the
data, this is not always the right thing to do - i.e. scanning the
index can become more work than searching the unindexed table. Someone
who knows more about databases than me better chime in : )

> Thoughts? If people like this, I can work on rolling it into the other
> changes I've got waiting in the queue for the migration generator
> (better behavior of HABTM, adding fields in STI subclasses, better
> ignoring of non-Hobo tables).

That would be excellent : )

Tom

Owen

unread,
Aug 17, 2009, 3:04:29 PM8/17/09
to Hobo Users
Adding these features to the migration generator soon will be a big
win...

In my experience it is always better to have indexes on foreign keys,
as most modern database optimizers test if using an index will be
better than a full table scan before actual running.

Also, having the belongs_to indexes provide the information that data
modeling tools need to buid ERD relationships.

I agree with Tiago about supporting multiple fields in one index, but
that would probably need to be declared on a separate line, as there
are times when one field could be part of more than one index. Would
it by un-DRY to require a separate line when you want multiple field
indexes?


fields do
#-----
email :string, :unique, :indexed
name :string, :index => "my_name_index"
state :string, :required, :indexed
legislative_district :string, :required, :indexed

multiple_field_index => "state_legislative_district_index",:unique,
[:state, :legislative_district]
multiple_field_index => "name_by_location_index", :unique,
[:location, :name]

end

Owen

unread,
Aug 17, 2009, 3:21:05 PM8/17/09
to Hobo Users
Another way I have used in the past is to declare which part of the
index a field would be

fields do
#-----
email :string, :unique, :index ### -- the DB default name would
be used here ---###
name :string, :index => "my_name_index"
state :string, :required, :index =>
"state_legislative_district_index", :part_1
legislative_district :string, :required, :index =>
"state_legislative_district_index", :part_2
end

Perhaps it is OK to assume a field can't be part of more than one
index within FIELDS DO in order to make it DRY for most cases...

kevinpfromnm

unread,
Aug 17, 2009, 5:52:18 PM8/17/09
to Hobo Users
Instead of name as a hash parameter, how about match up with existing
fields syntax? Something like this:

index_name :index, fields => [:field1, :field2]

Tom Locke

unread,
Aug 18, 2009, 7:12:44 AM8/18/09
to hobo...@googlegroups.com
I don't think these should go in the fields block at all, since they
are not fields. The :index => true shorthand is OK, but I would but
the index declarations outside, like

fields do
...
end

index :my_nice_index, :fields => [:a, :b, :c]

or maybe

index :a, :b, :c, :as => :my_nice_index


Tom

Tom Locke

unread,
Aug 18, 2009, 7:14:02 AM8/18/09
to hobo...@googlegroups.com
> Perhaps it is OK to assume a field can't be part of more than one
> index within FIELDS DO in order to make it DRY for most cases...

Yeah I think the :index => true is just a shorthand for the common
case, much as :required and :unique are shorthands already.

Tom

Owen

unread,
Aug 18, 2009, 8:50:51 AM8/18/09
to Hobo Users
Yes, Ton, I like having "index" as a separate declare.

Owen

unread,
Aug 18, 2009, 8:51:50 AM8/18/09
to Hobo Users
Yes, Tom, that is DRY.

Henry Baragar

unread,
Aug 18, 2009, 12:27:08 PM8/18/09
to hobo...@googlegroups.com
Tom,

>I don't think these should go in the fields block at all, since they
>are not fields.
>

Since "belongs_to" (which generates a field/column) is not is the the fields
block, your argument is even stronger.


>The :index => true shorthand is OK, but I would but
>the index declarations outside, like
>
> fields do
> ...
> end
>
> index :my_nice_index, :fields => [:a, :b, :c]
>
>or maybe
>
> index :a, :b, :c, :as => :my_nice_index

I like this second version, since "as" could have a reasonable default.

>Tom

Regards,
Henry
--

Henry Baragar
Instantiated Software

Matt Jones

unread,
Aug 18, 2009, 4:06:03 PM8/18/09
to hobo...@googlegroups.com
The only objection to that is that the underlying AR functionality
(add_index) will take an array of fields and generate the name
automatically, so we shouldn't require the name to be specified. Thus
the 'index [:field1, :field2, :field3], :name => 'foo'' syntax.

To summarize the syntax, as the thread's gotten a little long:

In the fields block:
- :index => true on a field will establish an index, with the Rails
default name
- :index => 'foo' on a field will establish an index, named 'foo'
- indexed fields which are marked unique will generate a unique index

Outside of fields:
- index :foo will create an index on :foo
- index [:foo, :bar, :baz] will create an index on the three fields
together
- pass :name => 'foo' to name the index
- pass :unique => true to get a unique index
Note: the [] on the multi-field index is important to distinguish from
code that's expecting an attr_accessor-like behavior (pass N fields,
get N declarations). We could support this behavior as well, if desired.

Automatic indexing:
- belongs_to will automatically add an index for their foreign key
(and type, if polymorphic); pass :index => false to turn off
- STI bases should automatically add an index for the type field
- lifecycles will automatically add an index to the state field,
pass :index => false to the lifecycle block to turn off
- should timestamps auto-add indexes? It can apparently help with
sorting...
- should the attribute marked 'login' be indexed? It would certainly
make sense in user models.

Thoughts? I'm particularly interested in any other cases where
automatic indexing could be useful.

--Matt Jones

kevinpfromnm

unread,
Aug 18, 2009, 5:12:13 PM8/18/09
to Hobo Users
Timestamps no, though maybe look at default sort column and put an
index on that?

I hadn't thought about the state field but yeah good catch.

Everything looks good to me but I'm only a hack when it comes to
databases.

Question, will it remove indexes in absence of index call?

Tom Locke

unread,
Aug 19, 2009, 5:19:21 AM8/19/09
to hobo...@googlegroups.com
> In the fields block:
> - :index => true on a field will establish an index, with the Rails
> default name
> - :index => 'foo' on a field will establish an index, named 'foo'
> - indexed fields which are marked unique will generate a unique index

Yep all looks good.

> Outside of fields:
> - index :foo will create an index on :foo
> - index [:foo, :bar, :baz] will create an index on the three fields
> together
> - pass :name => 'foo' to name the index
> - pass :unique => true to get a unique index
> Note: the [] on the multi-field index is important to distinguish from
> code that's expecting an attr_accessor-like behavior (pass N fields,
> get N declarations). We could support this behavior as well, if
> desired.

I think I would rather see one index generated per declaration, and
drop the [...]. There are examples of both styles in Rails, e.g. you
can't do "belongs_to :user, :project, :group"

I also kinda like :as => 'foo' rather than :name => 'foo', just
because I'm a sucker for APIs that read a little like English, but I
don't feel strongly about that one. Maybe :name is more obvious.

> Automatic indexing:
> - belongs_to will automatically add an index for their foreign key
> (and type, if polymorphic); pass :index => false to turn off
> - STI bases should automatically add an index for the type field
> - lifecycles will automatically add an index to the state field,
> pass :index => false to the lifecycle block to turn off

All good

> - should timestamps auto-add indexes? It can apparently help with
> sorting...

I'd vote no on this one. But how about supporting

timestamps :index => true

in the fields block

> - should the attribute marked 'login' be indexed? It would certainly
> make sense in user models.

Yeah I think this makes sense.

> Thoughts? I'm particularly interested in any other cases where
> automatic indexing could be useful.

I'd warn against going too far on this (we've made that mistake in the
past with aspects of Hobo). For the obscure cases I think it's much
better to put a tiny bit more work on the user, than to do things that
were not expected.

Tom

Helder Ribeiro

unread,
Aug 19, 2009, 11:28:06 AM8/19/09
to hobo...@googlegroups.com
How would this default be turned off? Using

index :type => :false

outside the fields block looks (kind of) ok, but it's not obvious if
it should go in the base or the child class.

>> - lifecycles will automatically add an index to the state field,
>> pass :index => false to the lifecycle block to turn off
>
> All good
>
>> - should timestamps auto-add indexes? It can apparently help with
>> sorting...
>
> I'd vote no on this one. But how about supporting
>
>     timestamps :index => true
>
> in the fields block
>
>> - should the attribute marked 'login' be indexed? It would certainly
>> make sense in user models.
>
> Yeah I think this makes sense.
>
>> Thoughts? I'm particularly interested in any other cases where
>> automatic indexing could be useful.
>
> I'd warn against going too far on this (we've made that mistake in the
> past with aspects of Hobo). For the obscure cases I think it's much
> better to put a tiny bit more work on the user, than to do things that
> were not expected.

Second that.

>
> Tom
>
>
> >
>

Brett Nelson

unread,
Aug 19, 2009, 11:33:07 AM8/19/09
to Hobo Users
>I think I would rather see one index generated per declaration, and
>drop the [...]. There are examples of both styles in Rails, e.g. you
>can't do "belongs_to :user, :project, :group"

This is not declaring multiple indexes but declaring a single "multi-
field" index. The fields are probably concatenated first and then
indexed by the db engine.

You also need the brackets or some way of indexing a field that might
be named "as"

Brett

Matt Jones

unread,
Aug 19, 2009, 1:48:37 PM8/19/09
to Hobo Users
On Aug 19, 2009, at 5:19 AM, Tom Locke wrote:
>>
>> Note: the [] on the multi-field index is important to distinguish
>> from
>> code that's expecting an attr_accessor-like behavior (pass N fields,
>> get N declarations). We could support this behavior as well, if
>> desired.
>
> I think I would rather see one index generated per declaration, and
> drop the [...]. There are examples of both styles in Rails, e.g. you
> can't do "belongs_to :user, :project, :group"
>

Right, but the belongs_to example will blow up messily - the syntax
isn't supported at all. A user who doesn't pay attention to the
documentation and writes:

index :field1, :field2, :field3

won't get an error, just an index that they quite possibly didn't
intend. I know we'd all like to believe that users read
documentation,
but anybody who follows rails-talk will recognize that's not
entirely
true.... :)

> I also kinda like :as => 'foo' rather than :name => 'foo', just
> because I'm a sucker for APIs that read a little like English, but I
> don't feel strongly about that one. Maybe :name is more obvious.
>

I've got no preference for the :name / :as thing - I used :name
because that's what the underlying API uses.

>> - should timestamps auto-add indexes? It can apparently help with
>> sorting...
>
> I'd vote no on this one. But how about supporting
>
> timestamps :index => true
>
> in the fields block

Makes sense.

The one remaining issue is what to do about existing indexes that
aren't declared - it's actually a subset of the larger "what should
the migration generator do with undeclared fields" issue. But that's
really a better subject for another post.

--Matt Jones

kevinpfromnm

unread,
Aug 19, 2009, 7:12:31 PM8/19/09
to Hobo Users
Yeah, that's what I read it as well.

I think the brackets are a good way of visually indicating the
multiple columns are part of one index. It's the same as the
migration too. If the brackets aren't there it becomes more confusing
to figure out which parts are columns in the index and which parts are
other parameters.

kevinpfromnm

unread,
Aug 19, 2009, 7:19:42 PM8/19/09
to Hobo Users
Oops, didn't see the second page of posts...

In terms of what should the migration generator do with undeclared
fields question, I'd suggest giving 3 more options:

add to ignore (so it doesn't prompt again). It'd be nice to have a
flag to ignore the ignores on the generator

remove from DB. Well, this is really already there but be there for
index as well.

add definition to model (add in the appropriate line in fields or
index call)

This should probably be another thread as this one is pretty long in
the tooth now.

Owen

unread,
Aug 19, 2009, 10:11:44 PM8/19/09
to Hobo Users
Great input, guys. This is the kind of stuff that will make Hobo
great for the "Enterprise"

Tom Locke

unread,
Aug 20, 2009, 4:59:12 AM8/20/09
to hobo...@googlegroups.com
> A user who doesn't pay attention to the
> documentation and writes:
>
> index :field1, :field2, :field3
>
> won't get an error

I don't really like the idea that we make the syntax (a little)
messier on behalf of people who don't know how it works. There's so
many things we can do with class-level declarations like this, surely
we can't apply the rule of "attr_accessor style" to all of them for
ever : )

> I've got no preference for the :name / :as thing - I used :name
> because that's what the underlying API uses.

OK let's go with .... um ..... :name (obvious wins the day over
slickness ; )

> The one remaining issue is what to do about existing indexes that
> aren't declared - it's actually a subset of the larger "what should
> the migration generator do with undeclared fields" issue. But that's
> really a better subject for another post.

Agreed - for now you have to declare them.

Tom

Matt Jones

unread,
Sep 6, 2009, 5:30:45 AM9/6/09
to hobo...@googlegroups.com
For anybody brave enough to try it, there is an implementation of this
up on Github - http://github.com/tablatom/hobo/tree/indexgen .
To try it out, clone the repo into vendor/plugins/hobo and check out
the 'indexgen' branch.

I haven't written extensive tests yet, and there's a good chance that
things will be messed up on renaming models - but the essential
operations are all there. I'd love to hear some feedback. There are a
few other relevant changes sprinkled in that branch; see the history
for details.

--Matt Jones
Reply all
Reply to author
Forward
0 new messages