gae - specifying choices to StringProperty


sudhakar m

Jan 1, 2009, 6:18:27 PM
to web...@googlegroups.com
In GAE, we can specify the allowed choices when creating a StringProperty:

  phone_type = db.StringProperty(
      choices=('home', 'work', 'fax', 'mobile', 'other'))

This prevents any other value from being stored in the field. I am finding it very useful for my current design.
But how do I represent this in web2py's model? Is there a helper available for this?

I am perfectly OK with using the GAE API (portability is not an issue). But I am a bit stuck finding the right place to define this (model/controller). Also, how do I use it with SQLFORM? How well does it play with the rest of the web2py code?

I also came across the following thread,
http://groups.google.com/group/web2py/browse_thread/thread/13f3fb91f90ef3b0?hl=en&tvc=2&q=gae+model

where creating new types like 'list_of_integers', 'list_of_floats', 'list_of_strings' is discussed. But I couldn't find further information on GAE-specific datatypes (particularly ListProperty).

Is anybody using these GAE data types in web2py? Can somebody share some thoughts on this?

Sorry if I am missing something obvious here.

Sudhakar.M
A mind once stretched by a new idea never regains its original dimension. - Oliver Wendell Holmes

Fran

Jan 1, 2009, 7:07:50 PM
to web2py Web Framework
On Jan 1, 11:18 pm, "sudhakar m" <sudhakar...@gmail.com> wrote:
> In Gae, we can specify the choices when creating the stringproperty.
>   phone_type = db.StringProperty(
>     choices=('home', 'work', 'fax', 'mobile', 'other'))

db.table.field.requires=IS_IN_SET(['home', 'work', 'fax', 'mobile',
'other'])

Produces a dropdown in forms & does back-end validation as well (if
using SQLFORM or T2)
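
To make the idea concrete, here is a minimal plain-Python sketch of what a set-membership validator does. It is not web2py's implementation: the class name InSet, its options() method and the error text are made up for illustration. The only thing borrowed from web2py is the convention that a validator returns a (value, error) pair.

```python
# Hypothetical stand-in for a set-membership validator; not web2py's code.
class InSet:
    def __init__(self, allowed, error_message="value not allowed"):
        self.allowed = list(allowed)
        self.error_message = error_message

    def __call__(self, value):
        # Follow the (value, error) convention: error is None on success.
        if value in self.allowed:
            return value, None
        return value, self.error_message

    def options(self):
        # A form builder could use the same list for the dropdown entries.
        return [(item, item) for item in self.allowed]


phone_type = InSet(['home', 'work', 'fax', 'mobile', 'other'])
accepted = phone_type('work')
rejected = phone_type('pager')
```

The same list drives both the dropdown and the back-end check, which is why a single requires= line covers both.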

F

sudhakar m

Jan 1, 2009, 7:28:21 PM
to web...@googlegroups.com
Thanks Fran. This suffices for my current requirement. But is there a way to define a StringListProperty or ListProperty in the model?

Sudhakar.M

Robin B

Jan 1, 2009, 7:49:41 PM
to web2py Web Framework
The StringListProperty and ListProperty are not implemented in
web2py. They could be implemented, but they would be incompatible
with SQL databases. ListProperties are used because they can save you
from performing an extra db query in a denormalized database, so the
work around is to have another table, and perform an additional query
for each ListProperty.

Robin

sudhakar m

Jan 2, 2009, 8:28:22 AM
to web...@googlegroups.com
Hi Robin,

Thanks for the explanation. But I still couldn't quite get it right. This is what I have.

For example in case of cricket match,
  1. Each match has many teams(say 2)
  2. Each team has many players
  3. Each Player belongs to many teams
So we have many-to-many relationship between match & team, team & player

If I can use ListProperty, I can define them as

db.define_table("Match",
      SQLField("Team_ids", "list_of_integers"),
      SQLField("ground", "string"),
      SQLField("result", "string"))

db.define_table("Team",
      SQLField("name", "string"),
      SQLField("player_ids", "list_of_integers"),
      SQLField("Match_ids", "list_of_integers"))

db.define_table("Player",
      SQLField("name", "string"),
      SQLField("Team_ids", "list_of_integers"))

db.Team.Match_ids.requires=IS_IN_DB(db, 'Match.id')
db.Player.Team_ids.requires=IS_IN_DB(db, 'Team.id')

If I have to do the same the relational db way, I have to:

  1. create a join table Match_Team & define a HABTM between match & team
  2. create a join table Player_Team with HABTM again.

Although the second approach works perfectly in a relational db, I am wondering whether it is the right way to represent the data in a flat database like BigTable.

Portability is not an issue for me, but I would like to make use of App Engine to the full extent.

Sorry for my ignorance. I'm learning all three (Python, GAE & web2py) at the same time ;)

mdipierro

Jan 2, 2009, 9:39:41 AM
to web2py Web Framework
Oliver,

mind that JOINs do not work on GAE. Even if portability is not an
issue for you, it is for web2py. This means web2py does not support
APIs that work on GAE but not elsewhere.

You can mimic the GAE list property in a portable way using:

db.define_table("Player",
    SQLField("name", "string"))

db.define_table("Team",
    SQLField("name", "string"),
    SQLField("player_ids", "text"))

db.define_table("Match",
    SQLField("ground", "string"),
    SQLField("result", "string"),
    SQLField("team_ids", "text"))

db.Team.player_ids.requires=IS_IN_DB(db,'Player.id','%(name)s',multiple=True)
db.Match.team_ids.requires=IS_IN_DB(db,'Team.id','%(name)s',multiple=True)

The multiple=True makes the text field work like a list_of_integers in
a portable way and will render the field with a "select multiple".
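
As a rough illustration of how a list of ids can live in a single text field, here is a plain-Python sketch. The thread does not say which storage format web2py uses, so the pipe-delimited layout and the helper names encode_ids, decode_ids and has_id are assumptions made only for this example.

```python
def encode_ids(ids):
    """Pack integer ids into one delimited string suitable for a text field."""
    if not ids:
        return ''
    # Leading and trailing delimiters make every id look like '|<id>|'.
    return '|' + '|'.join(str(i) for i in ids) + '|'


def decode_ids(text):
    """Unpack the delimited string back into a list of integers."""
    return [int(part) for part in text.split('|') if part]


def has_id(text, wanted):
    # Thanks to the surrounding delimiters, 1 does not match inside 12.
    return '|%d|' % wanted in text


stored = encode_ids([1, 12, 3])
```

Note that has_id runs in Python after the record has been fetched; it does not turn the field into something the datastore can query on.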

You still have a problem because this (like list_of_integers) is
closer to tagging than many to many. This means that even if a Match
refers to a Team, the Team does not refer to the Match. You can mimic
this as you suggested but you will run into trouble because the
different fields would not be linked by a Join. GAE does not support
JOINs and it will do so in the future.

Mind that this requires 1.55rc3 in

http://mdp.cti.depaul.edu/examples/static/155rc2/web2py_src.zip
http://mdp.cti.depaul.edu/examples/static/155rc2/web2py_win.zip
http://mdp.cti.depaul.edu/examples/static/155rc2/web2py_osx.zip

Massimo




sudhakar m

Jan 2, 2009, 10:10:01 AM
to web...@googlegroups.com
Thanks Massimo.

db.define_table("Player",
     SQLField("name", "string"))

db.define_table("Team",
     SQLField("name", "string"),
     SQLField("player_ids", "text"))

db.define_table("Match",
      SQLField("ground", "string"),
      SQLField("result", "string"),
      SQLField("team_ids", "text"))

This looks very close to what I want. I guess even Google will be using something like this internally. I will take 1.55rc3 & give it a try. BTW, will it be fast enough if I have a large number of records in the match & player tables? (Yes, I cache them, & the read/write ratio will be around 80:1.)

Sudhakar.M

mdipierro

Jan 2, 2009, 10:21:05 AM
to web2py Web Framework
I do not know, but the slowdown is not in storing and retrieving the
list. The slowdown will be in populating the dropdown.
Even worse, GAE does not allow retrieving more than
1000 records at a time. This means that if you have more than 1000
players and/or teams you are in trouble.

You cannot even use ajax autocomplete because GAE does not allow
searching for substrings.

Assuming one player plays only for one team, you can store the team
with the player in the Player table and not use a many2many between team
and player.

Massimo

sudhakar m

Jan 2, 2009, 10:42:05 AM
to web...@googlegroups.com
> Assuming one player plays only for one team you can store the team with the player in the Player table and not use a many2many between team and player.

In my case a player plays for more than one team.

> The problem is GAE does not allow retrieving more than 1000 records at a time.

I am investigating a workaround for this as well, as I certainly have a few thousand records in both the player & match tables.

I will play with this & post back once I find a solution. I guess I will end up using quite a bunch of many-to-many relationships, as my data is quite complex.

I will be bugging this list for some more time, till I finalize my models ;)
Thanks Massimo for creating a nice and lean web2py.

BTW the above link didn't work. I could find only rc2 @ http://mdp.cti.depaul.edu/examples/static/1.55rc2/web2py_src.zip . I will check out from trunk instead.

Thanks,
Sudhakar.M

mdipierro

Jan 2, 2009, 10:43:40 AM
to web2py Web Framework
the folder is called rc2. Actually it contains rc4 now. Sorry for the
confusion.

Massimo

sudhakar m

Jan 2, 2009, 10:49:37 AM
to web...@googlegroups.com
Yep got the source from
http://mdp.cti.depaul.edu/examples/static/1.55rc2/web2py_src.zip

I will do some trials and post back.

Sudhakar.M

Robin B

Jan 2, 2009, 11:10:55 AM
to web2py Web Framework
Here is some additional info about ListProperty vs HABTM.

The same schema adding Membership tables instead of ListProperties:

db.define_table("Match",
    SQLField("ground", "string"),
    SQLField("result", "string"))

db.define_table("Team",
    SQLField("name", "string"))

db.define_table("Player",
    SQLField("name", "string"))

db.define_table("TeamMembership",
    SQLField("team_id", "integer"),
    SQLField("player_id", "integer"))

db.define_table("MatchMembership",
    SQLField("team_id", "integer"),
    SQLField("match_id", "integer"))

This is mostly equivalent to using a ListProperty. However, you need 1
extra query on a Membership table to get the 'list' of ids, and
to add/delete N items in a Membership you need N writes.
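
The cost model can be sketched with a small in-memory stand-in that simply counts reads and writes. FakeTable is a hypothetical class written for this illustration; it is neither the web2py DAL nor the GAE datastore.

```python
class FakeTable:
    """In-memory stand-in for a table that counts queries and writes."""

    def __init__(self):
        self.rows = []
        self.queries = 0
        self.writes = 0

    def insert(self, **fields):
        self.writes += 1
        self.rows.append(dict(fields))

    def select(self, **where):
        self.queries += 1
        return [row for row in self.rows
                if all(row.get(key) == value for key, value in where.items())]


team_membership = FakeTable()

# Adding N players to team 1 costs N separate writes.
for player_id in (10, 11, 12):
    team_membership.insert(team_id=1, player_id=player_id)

# One extra query recovers the "list" a ListProperty would hold directly.
player_ids = [row['player_id'] for row in team_membership.select(team_id=1)]
```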

Having a normalized (3NF) schema (with either ListProperty or HABTM)
without joins requires N+1 queries to fetch things like all the player
names on a team (which is poor performance and you only get ~30
queries/req). To get the performance back, you could selectively de-
normalize (store data redundantly), based on the queries that you
expect to perform. So if your app needs all the player names for a
given team or all team names for a given player without a join in 1
query, you could also store the names in the TeamMembership:

db.define_table("TeamMembership",
    SQLField("team_id", "integer"),
    SQLField("team_name"),
    SQLField("player_id", "integer"),
    SQLField("player_name"))

In this example, N+1 queries is now 1 query to get the player names
from a given team or team names for a given player. The catch is that
if you need to change a name, you need to update the name for each
TeamMembership matching that id, which is N writes again...
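
To see the N+1 versus 1 difference, here is a small counting sketch in plain Python. All names and data below (CountingStore, the player names, the two membership lists) are hypothetical and exist only to count simulated queries.

```python
class CountingStore:
    """Counts how many simulated queries are issued."""

    def __init__(self):
        self.queries = 0

    def select(self, rows, **where):
        self.queries += 1
        return [row for row in rows
                if all(row[key] == value for key, value in where.items())]


PLAYERS = [{'id': 10, 'name': 'Asha'},
           {'id': 11, 'name': 'Ben'},
           {'id': 12, 'name': 'Chen'}]
MEMBERSHIP = [{'team_id': 1, 'player_id': p['id']} for p in PLAYERS]
MEMBERSHIP_DENORM = [{'team_id': 1, 'player_id': p['id'], 'player_name': p['name']}
                     for p in PLAYERS]


def names_normalized(store, team_id):
    links = store.select(MEMBERSHIP, team_id=team_id)      # 1 query
    return [store.select(PLAYERS, id=link['player_id'])[0]['name']
            for link in links]                              # N more queries


def names_denormalized(store, team_id):
    rows = store.select(MEMBERSHIP_DENORM, team_id=team_id)  # 1 query in total
    return [row['player_name'] for row in rows]


normalized_store = CountingStore()
denormalized_store = CountingStore()
normalized_names = names_normalized(normalized_store, 1)
denormalized_names = names_denormalized(denormalized_store, 1)
```

Both calls return the same names; only the query counts differ (4 versus 1 for a team of three).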

The main benefit of ListProperty is that adding/removing N items from
the list is 1 write at your end, but BigTable still needs to build/
drop the N indexes, and since it's a synchronous call, it does not
return until it finishes. So using ListProperty in that one use case
should be cheaper than N writes. However, batch writes across entity
groups were added to GAE in August
( http://googleappengine.blogspot.com/2008/08/couple-datastore-updates.html ),
so now you could add/remove N items from a membership in parallel
with one batched write from your end (web2py does not support this).
How big is the N add/remove performance difference between
ListProperty and Membership batch writes? That would be an
interesting benchmark to perform. Another interesting point is that
ListProperties are limited to ~5000 items.

Robin


sudhakar m

Jan 2, 2009, 12:16:50 PM
to web...@googlegroups.com
Thanks Robin for a detailed explanation on HABTM vs ListProperty.

db.define_table("Match",
     SQLField("ground", "string"),
     SQLField("result", "string"))

db.define_table("Team",
     SQLField("name", "string"))

db.define_table("Player",
     SQLField("name", "string"))

db.define_table("TeamMembership",
     SQLField("team_id", "integer"),
     SQLField("player_id", "integer"))

db.define_table("MatchMembership",
     SQLField("team_id", "integer"),
     SQLField("match_id", "integer"))

I will probably stick with this model for now, as the ListProperty restriction of 5000 items might cause some problems in the near future.

> db.define_table("TeamMembership",
>      SQLField("team_id", "integer"),
>      SQLField("team_name"),
>      SQLField("player_id", "integer"),
>      SQLField("player_name"))

I certainly agree on this. De-normalization is very much required in my case.

> But, added to GAE in August, batch
> writes across entity groups ( http://googleappengine.blogspot.com/2008/08/couple-datastore-updates.html
> ) so now you could add/remove N items from a membership in parallel
> with one batched write from your end (web2py does not support this).

To start with, there will be very few writes in my case. But I have a scenario where a single record in the player table gets updated frequently (say one update/min for a record). So at any point in time there won't be more than, say, 5-6 records updated per minute. I guess this shouldn't be a problem.

One more thing that concerns me: in gql.py, I could only find 'reference':google_db.IntegerProperty defined in SQL_DIALECTS. So aren't we using GAE's ReferenceProperty? (My apologies, I still don't fully understand the code yet ;) )

Once I have a full grasp of gql.py, I am planning to modify it so that it makes use of all of GAE's special features. I am certain that it will require many changes that play nicely with the rest of web2py. This way we will have a web framework that fully supports GAE. Others who are not using GAE for deployment can safely ignore it. The only catch is that portability will be lost for such applications.

Is there any way to find out what code web2py is generating for GAE? (Yep, I am looking into the code already.)
OK, let me dig a little deeper to find out how it actually works.

Thanks,
Sudhakar.M

Robin B

Jan 2, 2009, 1:53:47 PM
to web2py Web Framework
You are right, the driver does not use ReferenceProperty, but uses
IntegerProperty for all references. ReferenceProperty is a
convenience for the AppEngine ORM and is not needed in web2py DAL, and
would be incompatible with the SQL tables.

Originally, gql.py generated GQL and passed it to GQLQuery, which
parsed it and generated the Query object. The GQL was stored in
_last_sql, and _select() could be used to view the GQL of the last
query. Now gql.py uses the Query object directly to
programmatically generate the query, so there is no GQL generated or
parsed. It would be possible to add more debugging information to
_last_sql since the information is available (filters, orderby,
limitby), but it's not formatted or presented like GQL.
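
For what it's worth, turning those parts into a readable debug string could look roughly like the sketch below. It is hypothetical: describe_query is not part of gql.py, the thread does not show how gql.py holds filters internally (tuples of name, operator and value are assumed here), and the output is only GQL-like text for humans, not something meant to be parsed.

```python
def describe_query(table, filters=(), orderby=(), limitby=None):
    """Render query parts as a GQL-like string, for debugging only."""
    text = 'SELECT * FROM %s' % table
    if filters:
        # Each filter is assumed to be a (field, operator, value) tuple.
        text += ' WHERE ' + ' AND '.join(
            '%s %s %r' % (field, op, value) for field, op, value in filters)
    if orderby:
        text += ' ORDER BY ' + ', '.join(orderby)
    if limitby:
        start, stop = limitby  # (min, max) pair, as in web2py's limitby
        text += ' LIMIT %d OFFSET %d' % (stop - start, start)
    return text


debug_text = describe_query('Player',
                            filters=[('name', '=', 'Asha')],
                            orderby=['name'],
                            limitby=(0, 10))
```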

Robin
> >http://googleappengine.blogspot.com/2008/08/couple-datastore-updates....
> > ) so now you could add/remove N items from a membership in parallel
> > with one batched write from your end (web2py does not support this).
>
> To start with there will be very few writes in my case. But I have a
> scenario where a single record in player table gets updated frequently (say
> one update/min for a record). So at any point in time there wont be more
> than say 5-6 records updated per min. I guess this shouldnt be a problem
>
> One more thing that concerns me is in gql.py, I could only find
> 'reference':google_db.IntegerProperty, defined for SQL_DIALECTS. So arent we
> using the gae's *ReferenceProperty*? (my apologies, I still dont fully

James Ashley

Jan 3, 2009, 3:09:49 PM
to web2py Web Framework


On Jan 2, 9:21 am, mdipierro <mdipie...@cs.depaul.edu> wrote:
> I do not know. but the slowdown is not storing and retrieving the
> list. the slowdown will be in populating the dropbox.

The GAE datastore is pretty slow. From what I've gathered, depending
on your data model, if you're doing more than 2 read/writes (with
writes being *much* slower) per request, you'll probably have issues.

> Even wrose. The problem is GAE does not allow retrieving more than
> 1000 records at the time. This means that if you have more than 1000
> players and/or teams you are in trouble.

This turns out to be much less problematic than it sounds. Have
something unique that you're ordering the query by. Timestamp, ID,
whatever. Do a query, get 1000 records, do a second query and filter
it by ("orderField >", lastValue). (You might want to spread those
queries over multiple requests, though, because of the hefty CPU
limitations).
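
A minimal sketch of that paging loop, using an in-memory list in place of the datastore. The names fetch, fetch_all and FETCH_LIMIT are made up for this illustration; fetch only imitates one capped query ordered by a unique id.

```python
FETCH_LIMIT = 1000  # the per-query cap discussed in this thread


def fetch(rows, after_id=None, limit=FETCH_LIMIT):
    """Imitate one query: order by id, keep id > after_id, cap the result."""
    ordered = sorted(rows, key=lambda row: row['id'])
    if after_id is not None:
        ordered = [row for row in ordered if row['id'] > after_id]
    return ordered[:limit]


def fetch_all(rows):
    """Repeat the capped query, resuming after the last id seen."""
    results = []
    last_id = None
    while True:
        batch = fetch(rows, after_id=last_id)
        if not batch:
            break
        results.extend(batch)
        last_id = batch[-1]['id']
    return results


players = [{'id': i} for i in range(1, 2501)]
everyone = fetch_all(players)
```

On GAE itself each call to fetch would be a real query, so spreading the batches over several requests, as suggested above, still applies.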

> You cannot even use ajax autocomplete because gae does not allow
> search of substrings.

"Cannot" is such an ugly word. There are always ways to do things,
but sometimes you have to get *really* creative.
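
One example of such a trick (my own illustration, not something spelled out in this thread): substring search is out, but a prefix match can be written as a range over sorted values, which is the kind of inequality filter an ordered index can serve. The sketch below imitates that with a sorted Python 3 list and bisect; prefix_matches and the sample names are hypothetical.

```python
import bisect


def prefix_matches(sorted_names, prefix, limit=10):
    """Return names starting with prefix via a range over sorted values."""
    low = bisect.bisect_left(sorted_names, prefix)
    # '\ufffd' sorts after ordinary characters, so it closes the range.
    high = bisect.bisect_left(sorted_names, prefix + '\ufffd')
    return sorted_names[low:high][:limit]


names = sorted(['sachin', 'sanath', 'saurav', 'shane', 'steve'])
suggestions = prefix_matches(names, 'sa')
```

That covers "starts with" autocomplete only; matching in the middle of a string would still need something else, such as storing extra searchable tokens.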

sudhakar m

Jan 3, 2009, 3:54:59 PM
to web...@googlegroups.com
> The GAE datastore is pretty slow.  From what I've gathered, depending
> on your data model, if you're doing more than 2 read/writes (with
> writes being *much* slower) per request, you'll probably have issues.

Yes, I did a small test with a simple model. It was taking around 1 sec/request for a single write on an entity. I am not sure if that is acceptable in real-world apps. As Robin mentioned earlier, we can use batch writes to speed up writes. Reads should probably be much faster & they can be cached for even better response times.

My only concern is that web2py is database-neutral & because of this many of GQL's specific features aren't directly supported. Make no mistake: web2py is a great framework with a very lively community. In fact I feel there are certain things it does better than Rails. But for my current requirement, I think I have to stay close to GAE's API to extract every bit of performance. So I have started to dive into GAE to find out how much I will be missing if I choose web2py for Google apps.

Is anyone running a production application on GAE using web2py? If so, how is the performance?

James Ashley

Jan 3, 2009, 3:55:32 PM
to web2py Web Framework


On Jan 1, 5:18 pm, "sudhakar m" <sudhakar...@gmail.com> wrote:
> In Gae, we can specify the choices when creating the stringproperty.
>
> *  phone_type = db.StringProperty(
>     choices=('home', 'work', 'fax', 'mobile', 'other'))
> *
>
> This will prevent other value's to this field. I am finding it very much
> useful for my current design.
> But how do i represent this in web2py's model. Is there any helper available
> for same.
>
> I am perfectly ok with using gae api (portability is not an issue).

I wound up switching to that when I started running into pretty much
the same issues you are.

Like several others have pointed out in this thread, there are
workarounds. But google's datastore is so slow, and their limits so
tight that it just seemed silly to add any extra overhead. BigTable
is *not* an RDBMS. Trying to shoehorn it into acting like one doesn't
make a lot of sense. You lose its advantages, and you just get
something that looks vaguely like a crippled RDBMS.

It might count as premature optimization, but you should be able to
tell pretty quickly. Set up the models with the suggested work-
arounds. Bulk upload a few thousand records. Set up a page or three
that run the most complicated queries you expect to see. Then check
your logs.

If you have warnings about CPU quota, request time, or actually get
time-outs, these suggestions will not work for you.

Based just on what I've seen on the GAE mailing list, I think you'll
need to make the same switch I did.

> But I am
> bit struck up finding the right place to define this(model/controller).

I started out with a GaeModels.py in my app's modules folder. I
defined my models (just using the GAE API) there. Then I added a
helper class and only access my models through it. That way, if I do
have to switch later, I only have to update those 2 things.
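
Roughly, the shape is something like this sketch. It is simplified and uses an in-memory backend so it runs anywhere; TeamStore, InMemoryBackend and their methods are hypothetical names for this example, not my actual code and not a GAE or web2py API.

```python
class InMemoryBackend:
    """Stands in for the module that would define the real GAE models."""

    def __init__(self):
        self._teams = {}
        self._next_id = 1

    def put_team(self, name):
        team_id = self._next_id
        self._next_id += 1
        self._teams[team_id] = {'id': team_id, 'name': name}
        return team_id

    def get_team(self, team_id):
        return self._teams[team_id]


class TeamStore:
    """The only thing controllers talk to; the backend stays hidden."""

    def __init__(self, backend):
        self._backend = backend

    def create(self, name):
        return self._backend.put_team(name)

    def name_of(self, team_id):
        return self._backend.get_team(team_id)['name']


store = TeamStore(InMemoryBackend())
team_id = store.create('Mumbai')
team_name = store.name_of(team_id)
```

Swapping the datastore for something else then means writing a new backend with the same two methods and leaving the controllers alone.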

[Later, after I'd trimmed down my web2py installation immensely, so I
wasn't as worried about the file count limit, I switched that to a
GaeModels folder with an __init__.py that imports all my models and
then broke each model into its own class. I'm mentioning this only
because you said you're new to python, so I'm guessing you haven't
seen this trick yet].

If you're totally not worried about portability, just define them in
your models folder.

> Also how do i use it with SQLFORM?. How well does it play with rest of web2py
> code.

I don't think I've ever used SQLFORM. I'm perfectly fine with
building my own forms, saving my own data, and doing my own
validation. Actually, at this point, most of my interaction is
switching to using dojo with Ajax and json. I may be missing one of
those basic pieces of web2py that would make my life *much* easier,
but I'm okay with that.

It would be interesting to see what it would take to make SQLFORM work
directly with google's datastore models.

> I also came across the following thread, http://groups.google.com/group/web2py/browse_thread/thread/13f3fb91f9...
>
> where creating new types like 'list_of_intgers', 'list_of_floats',
> 'list_of_stings' are discussed. But couldnt find further information on Gae
> specific datatypes (particularly
> ListProperty<http://code.google.com/appengine/docs/datastore/typesandpropertyclass...>
> )
>
> Anybody using these gae data types in web2py. Can somebody share some
> thoughts on this?
>
> Sorry if i am missing something obvious here.

I think you see the situation pretty clearly.


> Sudhakar.M
> A mind once stretched by a new idea never regains its original dimension. -
> Oliver Wendell Holmes

Regards,
James

sudhakar m

Jan 3, 2009, 6:00:09 PM
to web...@googlegroups.com
Thanks James for sharing the info.

> Set up the models with the suggested work-
> arounds.  Bulk upload a few thousand records.  Set up a page or three
> that run the most complicated queries you expect to see.  Then check
> your logs.

I reached the same stage 2 days back. I designed my models the web2py way, using the newly introduced many-to-many from trunk. The good thing about the new many-to-many is that it eliminates the need for a join table. When I finished my bare-bones prototype, the models ended up very slim & sleek: I used 6-7 many-to-manys & a few one-to-manys, & it drastically reduced the number of tables in my final design. I would say it was a rather unconventional design, which followed neither the RDBMS model nor GAE's flat model. I didn't have any data duplication (except for the keys) while having the benefits of both worlds.

But after a little thought I started worrying about a text field being used for the join. Although it is a very good idea, it's at a very early stage & there are no benchmarks of its performance, considering that text fields are not indexed. I felt ListProperty to be a much better choice.

> Based just on what I've seen on the GAE mailing list, I think you'll
> need to make the same switch I did.

I am thinking of staying closer to GAE. I know it will be a pain to work at the lower level, directly against the GAE API without any framework. But again, if I end up hacking both web2py & webapp, it will become equally difficult to maintain (oops, this is my personal opinion). I am yet to check out the other big beast, Django. I have heard that it plays nicely with GAE, but I am still skeptical about using it in my app, as it's too heavy & the learning curve will be very steep compared to web2py (I especially hate it if I have to learn for months to develop a single app).

> [Later, after I'd trimmed down my web2py installation immensely, so I
> wasn't as worried about the file count limit, I switched that to a
> GaeModels folder with an __init__.py that imports all my models and
> then broke each model into its own class.  I'm mentioning this only
> because you said you're new to python, so I'm guessing you haven't
> seen this trick yet].

Yes, very much. I haven't even started using modules!! I would say that I am still evaluating how to extract the best out of web2py & GAE.

> I don't think I've ever used SQLFORM.  I'm perfectly fine with
> building my own forms, saving my own data, and doing my own
> validation.

Out of curiosity, I have a few questions.

ReferenceProperty & SelfReferenceProperty aren't used for defining relations; IntegerProperty is used instead. That is a conventional representation on a very unconventional DB. Are you facing any problems with this? Are you defining your models the conventional web2py way or with the webapp API?

How did you approach many-to-many relationships in your design?

Are you using the Expando property in your models along with web2py? I think this is a cool feature & of course it's unconventional again.

What is the final feature set for web2py on GAE? As far as I can tell, it is the following:
1. Small codebase (one can always read the code to learn)
2. Automatic migration
3. Clean templating
4. SQLFORM of course (speeds up prototyping!!)
5. Multiple db support (may come in handy if we want to sync GAE with an external db. Honestly, I didn't think of it before writing this)
6. Admin UI (yes, I hacked it & hosted it on a free remote shared host, so deployment is a single click again)

Anything else in the feature list? As of now I am prepared to sacrifice the above for performance (pure academic interest in this). If I can find some way of unobtrusively using GAE's features, I will certainly jump back ;)

Thanks,
Sudhakar.M
Little Knowledge is always dangerous!!

James Ashley

Jan 4, 2009, 12:37:25 PM
to web2py Web Framework
[This is starting to veer *way* off-topic, and might be more
appropriate on the GAE list. My apologies]

On Jan 3, 5:00 pm, "sudhakar m" <sudhakar...@gmail.com> wrote:
(OK, this was from a different message):
> Yes I did a small test with a simple model. It was taking around
> 1sec/request for single write on an entity. I am not sure if that is
> acceptable in real world apps. As Robin mentioned earlier, we can use batch
> write to speed up writes. Probably read should be much faster & it can
> cached to have even better response

From what I've gathered, avoid entity groups, pretty much at all
costs. Use the bulk uploader if you can. And, really, don't plan on
updating more than 1 or 2 entities per request.


> Thanks James for sharing the info.

No problem. And remember, like everyone else, I'm still getting my
head wrapped around GAE as well.

>   Set up the models with the suggested work-
>
> > arounds.  Bulk upload a few thousand records.  Set up a page or three
> > that run the most complicated queries you expect to see.  Then check
> > your logs.
>
> I reached the same stage 2 days back. I designed models in web2py way using
> the newly introduced many-to-many from trunk. The good thing about the new
> many-to-many is that it eliminates the need for join table. When i finished
> my bare-bones prototype design, I ended up in designing model's very slim &
> sleek. I mean I used 6-7 many-to-many's & few one-to-many's & it drastically
> reduced no of tables in my final design. I would say it was rather
> un-conventional design, which followed neither RDBMS nor GAE's flat model. I
> didnt have any data duplication (except for the key's) at the same time
> having the benifits of both worlds.

If it's fast enough for you, then great! Run with it. But try not to
get hung up on data duplication. That's one thing I've learned from
google's engineers: don't sweat it (even though maintaining multiple
copies of the same data is a pain). [That doesn't apply in an RDBMS,
of course]

> But after little thought I started
> worrying about text-field being used for join. Although it was a very good
> idea, its very much in early stage & there is no benchmarks about the
> performance, considering the fact that text-fields are not indexed.

Have you tried running any benchmarks yet? (BTW, this sort of thing
*really* needs to be run on google's servers. The fake datastore they
use in the SDK is awful).

> I felt
> ListProperty<http://code.google.com/appengine/docs/datastore/typesandpropertyclass...> to
> be a much better choice.

At this point, I still basically have JOIN tables, sort of. They're
meaningful tables (say, the Team) that I'll be querying for
themselves. Each Team will have a ListProperty of KeyProperty's back
to its Players (likewise for the Players back to the Teams they're
on), and Matches would have a ReferenceProperty back to each team that
played (maybe Home and Visitor, or some such).

(Huh. I'm not doing anything with sporting events at all, but this
pattern matches up almost precisely with my problem domain. I wonder
how common it really is).

So, if I have a set of matches and want to retrieve the players, I
have a lot of querying/joining/reference-following to do. And what
happens if players get traded in the middle of the season? It would
probably make more sense to just take the players (and their status)
for both teams in a given match, pickle that, and store it in a Blob.
If you have to worry about those things.
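
The snapshot idea is simple enough to sketch with the standard pickle module. The function names and the sample roster below are made up for this example; on GAE the resulting bytes would go into a blob-type property on the match entity.

```python
import pickle


def snapshot_match(home_players, visitor_players):
    """Freeze both rosters, as they were at match time, into one blob."""
    payload = {'home': home_players, 'visitor': visitor_players}
    return pickle.dumps(payload, protocol=2)


def load_snapshot(blob):
    """Restore the rosters without following any references."""
    return pickle.loads(blob)


blob = snapshot_match(
    [{'name': 'Asha', 'status': 'active'}],
    [{'name': 'Ben', 'status': 'injured'}],
)
restored = load_snapshot(blob)
```

The trade-off is the usual one for denormalized data: the snapshot stays correct for that match even if a player is traded later, but nothing inside it can be queried.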

In my case, I can query for a Team, then use Ajax to load the players
and matches (basically grabbing one record per request) and build a
tree that way.

It just depends on what you need. GAE does force you to do more up-
front analysis and design than I'm used to.

>> Based just on what I've seen on the GAE mailing list, I think you'll
>> need to make the same switch I did.
>
> I am thinking to staying more close to GAE. I know it will be pain to work
> at the lower level directly at GAE API without any frame work. But again, If
> I end up hacking both web2py & webapp, it will become equally difficult to
> maintain (oops this is my personal opinion).

I wouldn't want to mix/match those, either.

> I am yet to check out other big
> beast Django. I have heard that it plays nice with GAE, but I am still
> skeptical to use it in my app as it's too heavy & learning curve will be
> very steep (I especially hate it, if i have to learn for months to develop a
> single app)when compared to web2py

I know conventional wisdom says to use django. But in and of itself,
it's really focused on a very niche problem domain, which doesn't
match my needs at all. After they took out the pieces that just won't
work on GAE, it seemed pretty useless to me.

And then tons of people are trying to install 1.0 (instead of the more-
or-less 0.96 that GAE provides), and failing. There seem to be
several projects around to make this work, but I haven't had the
patience/motivation yet to actually try it.

> > [Later, after I'd trimmed down my web2py installation immensely, so I
> > wasn't as worried about the file count limit, I switched that to a
> > GaeModels folder with an __init__.py that imports all my models and
> > then broke each model into its own class.  I'm mentioning this only
> > because you said you're new to python, so I'm guessing you haven't
> > seen this trick yet].
>
> Yes very much. I havent even started using modules!! I would say that I am
> still evauating how to extract best out of web2py & GAE.

Don't neglect the python side of things ;-). There's tons of gooey
goodness in this language.

> > I don't think I've ever used SQLFORM.  I'm perfectly fine with
> > building my own forms, saving my own data, and doing my own
> > validation.
>
> Out of my curiosity I have few questions.
>
> ReferenceProperty &
> <http://code.google.com/appengine/docs/datastore/typesandpropertyclass...>
> SelfReferenceProperty<http://code.google.com/appengine/docs/datastore/typesandpropertyclass...> arent
> used for defining relations. Instead IntegerProperty is used.
> Conventional representation on very-unconventional DB. Are you facing any
> problems in this. Are you defining model web2py conventinal way or with
> webapp api?

I'm using google's raw datastore api (google.appengine.ext.db). I gave up on
web2py's style of models when I really needed a ListProperty of
KeyProperty's.

No biggy. I can write to google's "low-level" ORM (I think they
actually snagged the interface from django) where I need it, then use
web2py's when I don't. (Actually, I like google's ORM better, so I've
wound up using it pretty much exclusively). And still keep the rest
of web2py's goodness.


> How did you approach many-to-many relationship in your design?

I cheat. :-D


> Are you using the Expando
> <http://code.google.com/appengine/docs/datastore/expandoclass.html>
> property in your models along with web2py?

I haven't come across a situation where I could find any use for it.
Yet.

> I think this is a cool feature & of course it's unconventional again.

I agree totally. One of those examples of tradeoffs with an RDBMS.
"If you try to JOIN tables, the performance is going to be awful.
Let's not even talk about transactions. But, hey, try doing *this*
with SQL Server!" As long as you will have the penalties, no matter
what, you might as well take advantage of the benefits.
>
> What is the final feature set for web2py on GAE.

I'm definitely not the person to ask about this. I've done so much
heavy customization (mainly yanking things out) that I haven't
switched versions in months. I should probably post my changes back
somewhere (be happy to, if anyone's interested), but they aren't
anything that will wind up in the trunk, and I'm not about to suggest
branching.

> As far as I can list, I can
> list the following things
> 1. Small codebase (one can always see the code for learning)
> 2. Automatic migration
> 3. Clean templating
> 4. SQLFORM of course (speeds up prototyping!!)

Like I think I said, it would be *very* interesting to see how this
works using GAE's models instead of web2py's. (Or rather, how hard it
would be to make it work).

> 5. Multiple db support (may come in handy if we want to sync GAE with an
> external db. Honestly didn't think of it before writing this)

Do you mean having one app running on GAE, running against BigTable,
with another running on your own host, using a different back-end?

I haven't really considered it (all that much), but that does sound
powerful.

> 6. Admin UI (yes, I hacked it & hosted it on a remote shared free host. So
> deployment is single click again)

Sweet! That sounds like a patch you should send.

It does remind me of one more piece of advice. You're probably better
off developing against the SDK server, rather than web2py's. (You'll
pretty much have to do that if you switch to google's Models). I
don't remember ever hitting anything I could specifically put my
finger on, but I never felt comfortable developing against, say,
sqlite with web2py's server, then publishing to google's servers.
>
> Anything else in feature list?

Hmm. I'm sure the people with more web2py experience than I can give
a more authoritative list. But, off the top of my head:

* Heavily tested code, that's been looked at by a lot of people (not
as many as django, but the parts I've compared are also much more
readable)
* *Much* higher level abstractions than webapp. Google's pretty much
deprecated that approach.
* Quite a lot of built-in security that's just silly to try to re-
invent
* I haven't really checked out T2 and T3, but they sound extremely
interesting
* Easy, intuitive access to things like REQUEST and RESPONSE.
* Responsive, friendly community. That may just be a side-effect of
python...it seems like *all* python programmers are friendly and
helpful. Maybe because we aren't all stressed out about all the
nonsense our programming language forces on us? <G>


>. As of now I am prepared to sacrifice the above
> for performance (pure academic interest in this).

*If* it turns out that you have performance problems (I've been wrong
before. Premature optimization is the root of all evil!), I really
recommend you just swap out the database layer. Maybe put a light
wrapper over web2py's ORM and only access it. As needed, you can
shift that wrapper to use GAE's, and the calling code will never know
you changed anything.
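
A minimal sketch of that wrapper idea, in plain Python so it runs anywhere. Every name here (InMemoryBackend, ContactStore, add_contact, contacts_by_type) is made up for illustration; none of it is web2py or GAE API. The in-memory backend only stands in for whichever data layer sits underneath.

```python
# Hypothetical names throughout: ContactStore, InMemoryBackend and
# add_contact are illustrations, not part of web2py or GAE.

class InMemoryBackend(object):
    """Stand-in for a real data layer (web2py's ORM or GAE's db.Model)."""

    def __init__(self):
        self._rows = {}
        self._next_id = 1

    def insert(self, **fields):
        # Store a copy of the fields and hand back the new row id.
        row_id = self._next_id
        self._next_id += 1
        self._rows[row_id] = dict(fields)
        return row_id

    def find(self, **criteria):
        # Return every row whose fields equal all the given criteria.
        return [
            dict(row, id=row_id)
            for row_id, row in sorted(self._rows.items())
            if all(row.get(k) == v for k, v in criteria.items())
        ]


class ContactStore(object):
    """The only class the controllers talk to."""

    def __init__(self, backend):
        self._backend = backend

    def add_contact(self, name, phone_type):
        return self._backend.insert(name=name, phone_type=phone_type)

    def contacts_by_type(self, phone_type):
        return self._backend.find(phone_type=phone_type)


store = ContactStore(InMemoryBackend())
store.add_contact("Alice", "home")
store.add_contact("Bob", "work")
print(store.contacts_by_type("work"))
```

Swapping the data layer then means writing another backend with the same insert/find methods and passing it to ContactStore; the controllers stay untouched.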

That may be the easiest approach to using SQLFORM: have wrapper
classes over GAE's ORM with the same interface as web2py's ORM, and
feed those to SQLFORM. Like I said, I haven't really looked at
SQLFORM at all, so I have no idea how much work would be involved.

Or maybe just forget about web2py's ORM and use GAE's. Just because
that part's a performance brick wall doesn't mean you should also
scrap the rest of it.

> If I could find some way
> of unobtrusively using GAE's features, I will certainly jump back ;)

It's no big deal. The customizations I've made are mainly my own
personal quirks (simplejson is included with GAE's django, so I don't
also need the version web2py includes in contrib. gluon.import_all
imports a lot of pieces that don't exist on GAE, so I got rid of them
because I got tired of seeing the error messages at startup. Etc).

Hmm. Simple example. This is old code, so no promises about whether
it still works (or is actually worth looking at):

In applications/init/modules/GaeModels I have email_list.py:

#! /usr/bin/env python

''' People who want to be notified about updates '''

from google.appengine.ext import db

import applications.init.modules.Base as Base

class email_list(db.Model, Base.BasePermission):
    nickname = db.StringProperty(default="Anonymous")
    email_address = db.EmailProperty(required=True)
    musician = db.BooleanProperty(default=False)
    validated = db.BooleanProperty(default=False)
    opted_in = db.DateTimeProperty(default=None)

In my wrapper class, one of the methods that accesses that:

class MailingListSignUp(Base.Base):
    ...
    def BasicAddToMailingList(self):
        ''' If the user hasn't already declined to be a part of this,
        request to participate '''
        existing_requests = GaeModels.email_list.all().filter(
            'email_address', self.email_address)

        # For now, if they've requested once, don't send another email.
        # This is a compromise that probably needs to be reconsidered
        # Then again, the email list is really a stop-gap
        for existing_request in existing_requests:
            raise MyExceptions.EmailRejected(self.email_address)

        list_add_request = GaeModels.email_list(
            email_address=self.email_address)
        if self.nickname:
            list_add_request.nickname = self.nickname
        list_add_request.musician = self.musician
        list_add_request.put()

        return list_add_request.key()

My Controller can make an instance of that, call
BasicAddToMailingList(), do whatever it wants with the result, and pass
its results on to the View. It has no idea which ORM I'm using.

Not perfect, but it strikes me as tons better than just dropping
web2py completely.

Good luck,
James

mdipierro

unread,
Jan 4, 2009, 12:56:08 PM1/4/09
to web2py Web Framework
> I'm using google's raw datastore api (google.appengine.ext.db). I gave up on
> web2py's style of models when I really needed a ListProperty of
> KeyProperty's.

Have you checked out the latest IS_IN_DB(....multiple=True)?
I do not think it is very different than the ListProperty.

Massimo

James Ashley

unread,
Jan 4, 2009, 3:34:09 PM1/4/09
to web2py Web Framework


On Jan 4, 11:56 am, mdipierro <mdipie...@cs.depaul.edu> wrote:
> > I'm using google's raw datastore api (google.appengine.ext.db). I gave up on
> > web2py's style of models when I really needed a ListProperty of
> > KeyProperty's.
>
> Have you checked out the latest IS_IN_DB(....multiple=True)?
> I do not think it is very different than the ListProperty.

In all fairness, I have not. I have seen posts discussing it, and
I've taken a quick glance at the code.

It's more a gut reaction based on [very limited] experience and
exposure to issues I see on the GAE mailing list. Well, those and a
lack of time.

IS_IN_DB seems to solve other issues than the ones I've run across,
though (I could very well be misunderstanding its use).

In one example, I have an ACL of users/groups allowed to do various
things. Under each security condition, I keep a ListProperty of User/
Group keys. If one of those keys matches the currently logged-in user
(or any of the groups the user's in), the user can do whatever that
permission level allows. I'm leaning toward storing the user's ACL
rights as a pickled BLOB instead (in a record corresponding to each
user), but, for now, this is performing acceptably.
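
A rough sketch of that check, using plain strings in place of datastore keys so it runs without the SDK. In the datastore version the lists would be db.ListProperty(db.Key) values. The is_allowed helper, the acl dict and the "user:"/"group:" naming are assumptions made for the example. The last few lines show the pickled per-user alternative.

```python
import pickle

# Hypothetical helper; keys are plain strings here, where the datastore
# version would hold db.Key values in a db.ListProperty(db.Key).
def is_allowed(allowed_keys, user_key, group_keys):
    """True if the user, or any group the user belongs to, is listed."""
    candidates = set(group_keys)
    candidates.add(user_key)
    return not candidates.isdisjoint(allowed_keys)


# One list of user/group keys per security condition.
acl = {
    "edit": ["user:alice", "group:admins"],
    "view": ["group:everyone"],
}

print(is_allowed(acl["edit"], "user:bob", ["group:admins"]))
print(is_allowed(acl["edit"], "user:carol", ["group:everyone"]))

# The per-user alternative: precompute the rights and store them as one blob.
carol_rights = sorted(
    perm for perm, keys in acl.items()
    if is_allowed(keys, "user:carol", ["group:everyone"])
)
blob = pickle.dumps(carol_rights)
print(pickle.loads(blob))
```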

I switched to doing it that way a few months back, because I couldn't
figure out how to do it at all with web2py's ORM.

In another example, I have a tree data structure. Child nodes contain
references to their parent nodes. The parent node has a
BooleanProperty to indicate whether it has any children (this is used
in the tree view to indicate whether a given node can be expanded or
not). Since ReferenceProperty's work both ways, I can get the Node's
children in a single query when the client tries to expand a node.
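
The shape of that structure, sketched in plain Python rather than against the datastore. Node and children_of are hypothetical stand-ins. With GAE models the parent link would be a db.ReferenceProperty, and the back-reference collection it creates on the parent would play the role of children_of.

```python
class Node(object):
    """Hypothetical in-memory stand-in for a datastore entity."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent          # reference from child to parent
        self.has_children = False     # lets the tree view draw an expander
        if parent is not None:
            # In the datastore this would be a second write to the parent.
            parent.has_children = True


def children_of(all_nodes, parent):
    """One pass over the store: every node whose parent is this one."""
    return [n for n in all_nodes if n.parent is parent]


root = Node("root")
docs = Node("docs", parent=root)
src = Node("src", parent=root)
readme = Node("README", parent=docs)
nodes = [root, docs, src, readme]

print(root.has_children, src.has_children)
print([n.name for n in children_of(nodes, root)])
```

The flag is the de-normalized part: it duplicates what the child references already say, in exchange for not having to query for children just to draw the tree.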

Before I switched to that method, App Engine timed out all over the
place.

I'm tempted to de-normalize that as well, but I also query directly
for child nodes, and I want to avoid the duplication as much as I can.

Anyway, looking through the newest sources, it looks like the id
column has been switched to use GAE's key, so maybe these use-cases
could be converted back without much overhead.

Please don't anyone read this as any sort of criticism of web2py's
ORM. Far from it. The fact that web2py runs on GAE out of the box
puts it way ahead of any comparable framework (including Google's old
crippled version of django) that I've been able to find.

I almost always wind up putting a wrapper layer over any data access
layer I use, almost out of habit. I've had to change those so many
times that I don't even really think twice about that particular
step. I made the engineering trade-off decision to switch to google's
ORM a few months back, because it made the most sense for that
situation at that time for that current project.

Based upon Sudhakar.M's description, I suspected that he'd run into
many of the same issues, so I recommended he run some tests and
benchmarks to see.

(One of the GAE trade-offs, as far as I've been able to tell, is that
performance doesn't seem to vary much with the size of your data
store, though it does with the size of your results. So querying a
few thousand records should give you a fairly accurate benchmark. I
really should check that assumption).
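
One way to check that kind of assumption is a small timing harness. This is only a sketch: time_query is a made-up helper, and the lambda at the bottom is a placeholder for a real datastore query.

```python
import time


def time_query(run_query, repeats=5):
    """Return (best, worst) wall-clock seconds over several runs."""
    timings = []
    for _ in range(repeats):
        start = time.time()
        list(run_query())   # force the results to actually be fetched
        timings.append(time.time() - start)
    return min(timings), max(timings)


# Stand-in query; in a real test this would wrap a datastore query.
best, worst = time_query(lambda: (n * n for n in range(5000)))
print("best %.6fs, worst %.6fs" % (best, worst))
```

Running it against the same query with a few hundred and then a few thousand stored records, and separately with small and large result sets, would show which of the two sizes actually matters.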

I have found that working directly with their API has given me a much
better feel for what works and what doesn't. It's easy enough to say
that it isn't an RDBMS. It's a lot harder to break years' worth of
habits of thinking in RDBMS terms. (I still see at least 2-3 posts a
week on their mailing list of new developers complaining that queries
time out, but they aren't willing to de-normalize their data).

Actually, thinking about it from that perspective, it seems like
another good reason to not use web2py's ORM. If/when I do switch to
a different platform, I'll also be switching back to an RDBMS. Which
will involve a fairly hefty database re-design, no matter which way I
look at things. I'll be far less likely to miss something if I have
an obvious re-write where anything I miss will break immediately.

Sorry for the long babble. These are issues that I keep circling back
and questioning myself about.

Regards,
James

mdipierro

unread,
Jan 4, 2009, 5:24:21 PM1/4/09
to web2py Web Framework
I am not familiar with ListProperty, but how does this
(http://groups.google.com/group/web2py/browse_thread/thread/1ac0b91d25047aa0)
differ from it?

Massimo

sudhakar m

unread,
Jan 5, 2009, 2:03:40 PM1/5/09
to web...@googlegroups.com
Oops, it's been a long thread, so I am going to keep it very short.

> Premature optimization is the root of all evil!

Yes. I am completely in line with this. It's time to get things done. Will post back with results!!

Thanks again,
Sudhakar.M