Hierarchical Database Selection

112 views
Skip to first unread message

Dave Cenker

unread,
Jan 11, 2013, 6:05:02 PM1/11/13
to web...@googlegroups.com
I am putting together a website for my son so that he can keep track of trains that he has seen in our area. Here are the applicable models:

db.define_table('company',
  Field('name'),
  Field('symbol'))

db.define_table('train',
  Field('user', 'reference auth_user', default=auth.user_id),
  Field('company', db.company),
  Field('date', 'date'),
  Field('numCars'))

I would like to be able to select all the trains that a particular user has seen. I believe I can accomplish this with the following:

currUserQuery = db.train.user == auth.user_id
allTrainsSeenByUser = db(currUserQuery).select(orderby='date desc')

What I would like to do is create an index page for each user that splits out the trains of each company that have been seen. For example, assume that there are three companies in the database: Amtrak, CSX, and BNSF. Of these three companies the user has seen trains from Amtrak and CSX. On their index page, they would see all the Amtrak trains that they have seen grouped together and all the CSX trains that they have seen grouped together. Therefore, I would like to perform a database selection that finds all the trains for a particular user and a particular company.

I have found a way to determine which companies a particular user has seen using the following:

companies = db(currUserQuery).select('train.company', distinct=True).as_list()

However, this returns a list of company IDs instead of names and when I try to do something like this:

db(train.company.id == id) it doesn't work.

What I'd really like to do is get a list of the names of the companies that a user has seen and then make a database selection based upon those names in a manner like this:

db(currUserQuery & (db.train.company.name == companyNameText)).select()

However, the db.train.company.name doesn't work. Is there a way to progress down the hierarchy (if you will) to access the company name on a particular train row given the database structure above. With Django (which I have used in the past), this would be accomplished similar to this (train__company__name=companyNameText), but I can't figure out the corresponding way to do so in web2py.

Thanks for any help you can provide!

Dave

Bruno Rocha

unread,
Jan 11, 2013, 6:40:13 PM1/11/13
to web...@googlegroups.com
On Fri, Jan 11, 2013 at 9:05 PM, Dave Cenker <dave....@gmail.com> wrote:
However, this returns a list of company IDs instead of names and when I try to do something like this:

db(train.company.id == id) it doesn't work.

should be

db(db.train.company == is)

What I'd really like to do is get a list of the names of the companies that a user has seen and then make a database selection based upon those names in a manner like this:

db(currUserQuery & (db.train.company.name == companyNameText)).select()

should be

& (db.company.name == comanyNameText)

Massimo Di Pierro

unread,
Jan 11, 2013, 7:13:36 PM1/11/13
to web...@googlegroups.com
I think you want (I am changing some field names because user and date are resrved keywords):

db.define_table('company',
  Field('name'),
  Field('symbol'))

db.define_table('train',
  Field('user_id', 'reference auth_user', default=auth.user_id),
  Field('company', db.company),
  Field('date_event', 'date'),
  Field('num_cars'))


companies = db(db.company.id.belongs(db(db.train.user_id==id)._select(db.train.company,distinct=True)).select(orderby=db.company.name)

Dave Cenker

unread,
Jan 13, 2013, 8:40:48 AM1/13/13
to web...@googlegroups.com
Excellent! This works just as advertised for obtaining the companies and the solution works in my application. I very much appreciate the solution.
 
I was wondering if this was analogous to the following syntax in Django:
 
train.objects.filter(company__name='Amtrak')
 
Please, no offense, because I absolutely love web2py, but the Django syntax in this case is much more understandable. Also, if there are more relationships to track such as:
 
blog.objects.filter(entry__author__name='Dave')
 
you are able to arbitrarily track any relationship with this syntax. If this isn't possible, or if I have misinterpreted the syntax and functionality, please let me know.
 
Thank you very much for (a) creating such a great framework and (b) for being so committed to helping here on the forums!
 
Dave

Vasile Ermicioi

unread,
Jan 13, 2013, 9:07:10 AM1/13/13
to web...@googlegroups.com
django
rows = train.objects.filter(company__name='Amtrak')

web2py
rows = db((db.trains.company==db.company.id) & (db.company.name=='Amtrak')).select(*db.trains.fields)

for me web2py way is better because it mimics real sql query

but I agree that it could look better, for example 

db(db.trains.company.name == 'Amtrak').select() #not yet a valid query for web2py :)

orientdb uses such syntax 

what do you think Massimo, is it possible to add such feature for reference fields?

Niphlod

unread,
Jan 13, 2013, 2:11:30 PM1/13/13
to web...@googlegroups.com
I won't argue with the difference between an ORM or a DAL, but just for the sake of discussion for future references, I'll tackle this from another angle......

We (I'll include myself in the group) that work on databases daily have a bad attitude towards super-normalizing things that maybe don't need normalization at all. That being said, I'm well aware of SQL syntax to retrieve what I want and DAL seems more natural because it feels more "controlled" (i.e. I know what it is doing under the hood).

as long as all "Amtrack" trains doesn't need to become "Amtrack2" on future day, references are just a bad way to deal with your model and your queries.....

A train with a company referenced in another table by an id is a train that can (should) switch companies often, even better, is part of a group of trains that somewhat needs to change company often. That model allows you to change a single "Amtrack" record to "Amtrack2" and have all the trains "belonging" to "Amtrack" to be "Amtrack2". If that' not the case, why don't you simply use

Field('company', requires=IS_IN_DB(db, 'company.name'))

?

That way you can issue queries like db.trains.company == 'amtrack' that are - usually - faster than filtering records through joins.

Vasile Ermicioi

unread,
Jan 13, 2013, 2:51:55 PM1/13/13
to web...@googlegroups.com
That way you can issue queries like db.trains.company == 'amtrack' that are - usually - faster than filtering records through joins.

you can't do db.trains.company == 'amtrack' because company is a reference field


Niphlod

unread,
Jan 14, 2013, 12:10:34 PM1/14/13
to web...@googlegroups.com


On Sunday, January 13, 2013 8:51:55 PM UTC+1, Vasile Ermicioi wrote:
That way you can issue queries like db.trains.company == 'amtrack' that are - usually - faster than filtering records through joins.

you can't do db.trains.company == 'amtrack' because company is a reference field

 
I know, did you read the entire post ?
I'm not saying that he can with the original model, I'm saying that changing the model "un-normalizing" the reference field often simplifies both code and the underlying queries ( improving performances as a nice side-effect ).

Derek

unread,
Jan 14, 2013, 2:44:24 PM1/14/13
to web...@googlegroups.com
Yes, I'm all for de-normalizing when it makes sense to do so. There's no reason for this data to be normalized. I'm going to give the thread starter the benefit of the doubt and assume it's not actually about trains.
Reply all
Reply to author
Forward
0 new messages