[pyDAL] Designing index support

126 views
Skip to first unread message

Giovanni Barillari

unread,
Sep 7, 2015, 11:28:55 AM9/7/15
to web2py-developers
Hi all,
I'm gonna start working on indexing support on pydal for the next release (expected this month 15.09).

Now, I'm gonna ask developers feedbacks about the syntax. What I've imagined is something like this:

db.define_table('my_table',
   
Field('a_field', indexed=True),
   
Field('b_field', 'int')
)

with 'indexed' parameter default to False.

I'm would like also to support this:

db.define_table('my_table',
   
Field('a_field', 'int'),
   
Field('b_field'),
   
Field('c_field'),
    indexes
=['a_field', ('b_field', 'c_field')]
)

since would be the only way to create combined indexes (in the g.ex. b_field and c_field is a single index).
The two syntax can obviously co-exist in the same table definition.

What do you think? Any other proposals/ideas/concerns?


/Giovanni

Massimo DiPierro

unread,
Sep 7, 2015, 12:41:56 PM9/7/15
to web2py-d...@googlegroups.com
+1

--
-- mail from:GoogleGroups "web2py-developers" mailing list
make speech: web2py-d...@googlegroups.com
unsubscribe: web2py-develop...@googlegroups.com
details : http://groups.google.com/group/web2py-developers
the project: http://code.google.com/p/web2py/
official : http://www.web2py.com/
---
You received this message because you are subscribed to the Google Groups "web2py-developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to web2py-develop...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Michele Comitini

unread,
Sep 7, 2015, 1:12:57 PM9/7/15
to web2py-developers
Giovanni,
Both are easy to read and understand, so +2.
SQL DDL allows indexed to be created outside table definition especially useful when indexing with expressions.


db.define_index(db(db.field_a.lower()))

naming and migrations are an issue?

mic

Giovanni Barillari

unread,
Sep 8, 2015, 4:01:43 AM9/8/15
to web2py-developers
I think the index support will need a bit of modifications to the migration engine.

Anyway, I can also plan a third syntax to do the following:

db.my_table.add_index('field_a')
db
.my_table.drop_index('field_a')

What do you think?

Paolo Valleri

unread,
Sep 8, 2015, 9:47:18 AM9/8/15
to web2py-d...@googlegroups.com
What happen if you run .add_index() more than once?

Mind that creating an index on an already populated table can require some time, which can lead to the bad result of a connection Timeout. As a result, the same code that works in a development env (where tables has few rows) can fail in production. 
In the design phase you should consider this situations and prefer to create the index in async mode if the underlying driver/back-end support it.

Are spatial indexes involved in this first implementation?

 Paolo

Giovanni Barillari

unread,
Sep 8, 2015, 11:41:39 AM9/8/15
to web2py-developers
@paolo I think that index support and migrations engine are quite interconnected. The problem of run .add_index multiple times is the same of trying to execute "CREATE INDEX" on the driver. The developer should be aware of it. As a precaution, we can let the 'add_index' and 'drop_index' methods to check the actual migrated schema and ignore the commands, as it happens for .define_table which checks if a 'CREATE TABLE', migration operations or nothing should be done.

Regarding spatial indexes: I will start with 'standard' indexes. We can plan an upgrade for 15.10 or next version on the indexing code to add them as soon as the 'basic' code is tested and working without issues.

/Giovanni

Vinicius Assef

unread,
Sep 8, 2015, 3:05:48 PM9/8/15
to web2py-d...@googlegroups.com
As a first approach, I’d choose the really minimum to make this idea viable. I see add_index() and drop_index() as this minimum.

It delivers:
1) Clarity;
2) Power and flexibility: create composite indexes, create indexes based on a function (as noted by Niphlod);
3) Simplicity: “there should be only one obvious way to do something”.

I wouldn’t add the `index` option in a Field() because:
1) Additional syntax are the proper way to confuse users.
2) Additional syntax are bad to search for code to maintain.
3) It’s more code to trace for bugs.
4) It’s less flexible.

So, I’d ask for really good reasons before implementing another way to create/drop indexes. And having them, I’d wait some time before implementing it. It could be only a good idea, nothing more than that.

Vinicius Assef.

Richard Vézina

unread,
Sep 8, 2015, 3:13:41 PM9/8/15
to web2py-d...@googlegroups.com
@Vinicius, as far as I know, there were no syntax for creating indexes from web2py... I am not sure I follow you with adding index option 5 points...

Richard

Vinicius Assef

unread,
Sep 8, 2015, 3:25:50 PM9/8/15
to web2py-d...@googlegroups.com
I mentioned that because one of Giovanni’s ideas was to create an `index` option in Field() constructor.


Vinicius.



Anthony

unread,
Sep 8, 2015, 3:48:25 PM9/8/15
to web2py-developers
+1

details : http://groups.google.com/group/web2py-developers
the project: http://code.google.com/p/web2py/
official : http://www.web2py.com/
---
You received this message because you are subscribed to the Google Groups "web2py-developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to web2py-developers+unsubscribe@googlegroups.com.

Richard Vézina

unread,
Sep 8, 2015, 3:48:56 PM9/8/15
to web2py-d...@googlegroups.com
You would he leave it at the table creation level with indexes('...')?

Vinicius Assef

unread,
Sep 8, 2015, 4:23:19 PM9/8/15
to web2py-d...@googlegroups.com
No.

I would go only with Table.create_index() and Table.drop_index().

BTW, I also wouldn't change the standard `create_index` by `add_index`.

Richard Vézina

unread,
Sep 8, 2015, 4:40:57 PM9/8/15
to web2py-d...@googlegroups.com
Ok these methods where existing, I guess they will stay for backward compatibility but they will be replace under the hood by add_ and drop_ once time has come...

Richard

Giovanni Barillari

unread,
Sep 9, 2015, 4:12:13 AM9/9/15
to web2py-developers
@vinicius I see your point.

We can start development on just these 2 commands, but I think that, sooner or later, we should deal with the migrations engine.
pyDAL provides automatic migrations by design. I wouldn't propose the index feature inside table definition IF migrations weren't automatic. If migrations on pyDAL were supposed to work with specific commands, then having indexes ONLY with specific commands make sense to me. but IMHO, since pyDAL make migrations for me, why is not supposed to migrate indexes as well?

/Giovanni

Vinicius Assef

unread,
Sep 9, 2015, 9:13:38 AM9/9/15
to web2py-d...@googlegroups.com
OK. I got your point. :-)

On 9 September 2015 at 05:12, Giovanni Barillari

Anthony

unread,
Sep 9, 2015, 11:29:44 AM9/9/15
to web2py-developers
We can start development on just these 2 commands, but I think that, sooner or later, we should deal with the migrations engine.
pyDAL provides automatic migrations by design. I wouldn't propose the index feature inside table definition IF migrations weren't automatic. If migrations on pyDAL were supposed to work with specific commands, then having indexes ONLY with specific commands make sense to me. but IMHO, since pyDAL make migrations for me, why is not supposed to migrate indexes as well?

I'm not sure I follow. Are you suggesting that db.mytable.create_index() would/should not trigger a migration (how could it not)? How are migrations a rationale for including the proposed index syntax within db.define_table()?

Anthony

Giovanni Barillari

unread,
Sep 10, 2015, 9:15:19 AM9/10/15
to web2py-developers
Anthony you are right, but consider this:
- I issue a db.mytable.create_index() which create a migration
- table definition doesn't have any info regarding indexes
- when I start application with the created index, how will the migration engine knows if it should keep or drop the index since the table definition says: no index at all?

Rationally speaking, I think everything regarding migrations should be kept inside table definition. Any eventual command like table.create_index() and table.drop_index() should not alter the migrations, or the risk is to have two different systems acting differently. How are we supposed to let them speak?

/Giovanni

Anthony

unread,
Sep 10, 2015, 10:25:04 AM9/10/15
to web2py-developers
On Thursday, September 10, 2015 at 9:15:19 AM UTC-4, Giovanni Barillari wrote:
Anthony you are right, but consider this:
- I issue a db.mytable.create_index() which create a migration
- table definition doesn't have any info regarding indexes
- when I start application with the created index, how will the migration engine knows if it should keep or drop the index since the table definition says: no index at all?

Are you saying the *.table file will include metadata indicating an index exists, but the define_table call doesn't specify the index, so the migration engine doesn't know whether to drop the index (because the table definition may have changed) or ignore the situation (because the index was created via create_index, which define_table doesn't know about)?

Well, I suppose the *.table file could include some additional metadata with each index indicating how it was created (i.e., via create_index or define_table) -- define_table could then safely ignore any indexes that were created via create_index.

Alternatively, we could allow index creation via create_index only. In that case, indexes would be explicitly created and dropped exactly when requested, so no confusion for the migration engine. As already noted, the advantage of creating indexes with a separate method called after the table definition is that we can easily specify more complex indexes based on expressions (without having to create a whole new syntax just for that purpose). If there is an easy way to do that within define_table, then that would be fine too.

Anthony

Michele Comitini

unread,
Sep 10, 2015, 11:04:37 AM9/10/15
to web2py-developers
Is the order of definitions going to be important?
 AFAIK in SQL the order is respected always.  But for instance with DAL references are deferred.  
IMHO it should be better to would avoid deferral with indexes thus allowing to use expressions (on already defined entities).

db.create_index(db.table1.field2)

Just a curiosity about syntax: why "create_index" when do not have "create_table"?


--

Anthony

unread,
Sep 10, 2015, 11:36:53 AM9/10/15
to web2py-developers


On Thursday, September 10, 2015 at 11:04:37 AM UTC-4, mcm wrote:
Is the order of definitions going to be important?
 AFAIK in SQL the order is respected always.  But for instance with DAL references are deferred.  
IMHO it should be better to would avoid deferral with indexes thus allowing to use expressions (on already defined entities).

db.create_index(db.table1.field2)

Just a curiosity about syntax: why "create_index" when do not have "create_table"?

I think the suggestion was:

db.table1.create_index(...)

So, more like db.table1.drop() than db.define_table('table1').

Note, making it a method of the table means you could use string names rather than having to pass in full Field objects (unless you need to generate expressions).

Maybe instead it should be table.create_indexes() to make it easier to define multiple indexes at once.

Anthony
Reply all
Reply to author
Forward
0 new messages