Bug in virtualfields w/ session


Michael Toomim

Jul 29, 2011, 9:05:39 PM
to web2py-users
I think I found a bug in virtualfields. I have the following
controller:

def posts():
    user = session.auth.user
    n = user.name # returns None

Where "name" is defined as a virtualfield on user:

class Users():
    def name(self):
        return self.users.first_name + ' ' + self.users.last_name
db.users.virtualfields.append(Users())

The problem is that user.name returns None, because apparently the
virtualfield isn't loaded into the session variable of user.

I made this work with the following modification to the controller:

def posts():
    user = db.users[session.auth.user.id]
    n = user.name # returns the user name correctly!

I just had to refetch the user from the database.

Anthony

Jul 29, 2011, 9:57:30 PM
to web...@googlegroups.com
auth.user is Storage(table_user._filter_fields(user, id=True)). The _filter_fields method of the auth_user table only selects actual table fields, not virtual fields, so auth.user will not include any virtual fields. Perhaps this should be changed.
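
The filtering described here can be pictured with a small stand-alone sketch. It is not web2py's code: `TABLE_FIELDS` and `filter_fields` are hypothetical names, and a plain dict stands in for the row. It only shows why a computed attribute disappears once a record is reduced to its real columns.

```python
# Simplified stand-in for the behaviour described above; this is NOT
# web2py's actual implementation, only an illustration of the idea.
TABLE_FIELDS = ['id', 'first_name', 'last_name']  # hypothetical real columns

def filter_fields(record, fields=TABLE_FIELDS):
    """Keep only the keys that correspond to real table columns."""
    return dict((k, v) for k, v in record.items() if k in fields)

# A fetched row carries both stored columns and a computed virtual field.
row = {'id': 1, 'first_name': 'Ada', 'last_name': 'Lovelace',
       'name': 'Ada Lovelace'}  # 'name' plays the role of the virtual field

session_user = filter_fields(row)
print(sorted(session_user.keys()))  # 'name' is no longer among the keys
```

Since the virtual field never reaches the stored copy, looking it up on the session object finds nothing, which matches the None reported in the first post.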
 
Anthony

contatog...@gmail.com

Jul 30, 2011, 8:28:56 AM
to web...@googlegroups.com
Do not know if I can help, but I made a screencast showing the use of virtualenv with web2py:

http://vimeo.com/22919392
_____________________________________________
Gilson Filho

Anthony

Jul 30, 2011, 10:02:58 AM
to web...@googlegroups.com
An issue has been submitted, and this should be corrected soon.
 
Anthony

Michael Toomim

Aug 1, 2011, 4:10:28 PM
to Anthony, Massimo Di Pierro, web...@googlegroups.com
Maybe it helps for me to explain my use-case. I mainly use virtual fields as lazy methods, to help traverse related tables. I was actually surprised that lazy evaluation wasn't the default. I noticed a few implications of this:
  - Large queries are slowed by virtualfields, even if they won't be needed, esp if they query db
  - My definitions for virtualfields aren't as clean as they could be, because I have many nested "lazy" funcs in the class definition
  - We can't serialize all objects into session variables

So really I'm just using this because it's a nicer notation to call row.otherthing() instead of getotherthing(row). Maybe I really want some different feature here?

On Aug 1, 2011, at 5:40 AM, Anthony Bastardi wrote:

Note, after looking at this some more, Massimo recalled that the reason auth_user virtual fields were excluded from auth.user (and therefore from saving in the session) is because some virtual fields are objects that cannot be pickled and therefore cannot be serialized to store in the session. So, we're thinking of either creating an option to store auth_user virtual fields in auth.user, or maybe testing to make sure the virtual fields can be pickled, and excluding them if not.
 
Anthony
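
The second option mentioned above (test each value and drop the ones that cannot be pickled) might look roughly like the sketch below. It uses only the standard `pickle` module; `picklable_items` and the sample `user` dict are hypothetical and are not part of web2py.

```python
import pickle

def picklable_items(values):
    """Return only the entries of `values` that pickle can serialize."""
    kept = {}
    for key, value in values.items():
        try:
            pickle.dumps(value)
        except Exception:
            # Lambdas, open files, DB connections and the like end up here.
            continue
        kept[key] = value
    return kept

user = {
    'first_name': 'Ada',
    'name': 'Ada Lovelace',                  # plain string: picklable
    'friends': lambda: ['Charles', 'Mary'],  # lazy virtual field: not picklable
}

safe_user = picklable_items(user)
```

The cost of this approach is one trial serialization per value, so it would probably only be worth doing once at login rather than on every request.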

On Mon, Aug 1, 2011 at 5:30 AM, Michael Toomim <too...@cs.washington.edu> wrote:
Awesome! I did not know there was an issue submission system.

Massimo Di Pierro

Aug 2, 2011, 5:31:53 AM
to web2py-users
We need to work on the speed. This can perhaps help the syntax:

db=DAL()
db.define_table('a',Field('b','integer'))
for i in range(10):
    db.a.insert(b=i)

def lazy(f):
    def g(self,f=f):
        import copy
        self=copy.copy(self)
        return lambda *a,**b: f(self,*a,**b)
    return g

class Scale:
    @lazy
    def c(self,scale=1):
        return self.a.b*scale

db.a.virtualfields.append(Scale())
for row in db(db.a).select():
    print row.b, row.c(1), row.c(2), row.c(3)
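
To see what the decorator does without a database, here is a minimal stand-alone sketch. `FakeRow` and the `calls` list are hypothetical helpers, and the field is read as `self.b` rather than `self.a.b` because there is no table wrapper here. The point is that evaluating the virtual field only builds a closure, and the real work runs when that closure is called.

```python
import copy

def lazy(f):
    # Same decorator as in the post above.
    def g(self, f=f):
        self = copy.copy(self)
        return lambda *a, **b: f(self, *a, **b)
    return g

class FakeRow(object):
    """Hypothetical stand-in for the object passed as `self`."""
    def __init__(self, b):
        self.b = b

calls = []  # records every time the real computation runs

@lazy
def c(self, scale=1):
    calls.append(scale)
    return self.b * scale

row = FakeRow(5)
method = c(row)          # what happens at select time: only a closure is built
calls_after_select = len(calls)
result = method(3)       # the computation happens here, on demand
```

Note that `copy.copy(self)` still runs once per row and per lazy field at select time, so the decorator defers the computation but not all of the overhead.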

Michael Toomim

Aug 2, 2011, 6:03:45 PM
to web...@googlegroups.com
That's way better syntax! Great idea!

Michael Toomim

Aug 8, 2011, 11:38:59 PM
to web2py-users
It turns out the speed problem is REALLY bad. I have a 14,000-row table
with virtualfields. When I run raw sql:

    a = db.executesql('select * from people;')

...the query returns in 121ms. But when I run it through the DAL on
only a subset of the data:

    a = db(db.people.id > 0).select(limitby=(0,1000))

...it returns in 141096.431ms. That's... 141 seconds. So over 1000x
longer on less than a tenth of the table.
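
A comparison like this can be reproduced with a small timing helper. The sketch below is an assumption about how one might measure it, not how the numbers above were obtained: `timed`, `raw_query` and `wrapped_query` are hypothetical, and the two stand-in functions would be replaced by the real `executesql` and `select` calls in the web2py shell. The last two lines just redo the arithmetic on the quoted figures.

```python
from timeit import default_timer

def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed time in milliseconds)."""
    start = default_timer()
    result = fn(*args, **kwargs)
    elapsed_ms = (default_timer() - start) * 1000.0
    return result, elapsed_ms

# Stand-in workloads; in the web2py shell these would be the two queries.
def raw_query():
    return [(i,) for i in range(14000)]

def wrapped_query():
    return [{'id': i} for i in range(1000)]

_, raw_ms = timed(raw_query)
_, dal_ms = timed(wrapped_query)

# Arithmetic on the figures quoted in the post:
slowdown = 141096.431 / 121   # a bit under 1170x
fraction = 1000 / 14000.0     # about 0.07 of the rows
```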

My virtualfields are all lazy functions. I'm looking into what's
causing it and will report back when I find out. It seems it might
have something to do with the lazy decorator func because when I hit
C-c the code is often stuck there... inside import copy or something.

def lazy(f):
    def g(self,f=f):
        import copy
        self=copy.copy(self)
        return lambda *a,**b: f(self,*a,**b)
    return g

Anyway, I'll send an update when I have more info.


Michael Toomim

Aug 9, 2011, 12:29:51 AM
to web2py-users
Mid-status note: it would be great if the profiler worked with the
web2py shell!

Then I could run commands at the command prompt in isolation and see
how long they take.
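
As a possible stopgap, the standard-library `cProfile` module can be driven by hand from any interactive Python prompt. The sketch below assumes that this also holds inside the web2py shell, which has not been tested here; `slow_select` is a hypothetical stand-in for the real `db(...).select()` call.

```python
import cProfile
import pstats

def slow_select():
    # Hypothetical stand-in for db(...).select(); burns a little CPU.
    return [sum(range(200)) for _ in range(2000)]

profiler = cProfile.Profile()
profiler.enable()
rows = slow_select()          # the statement being measured
profiler.disable()

# Show the five most expensive entries by cumulative time.
stats = pstats.Stats(profiler)
stats.sort_stats('cumulative').print_stats(5)
```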


Michael Toomim

Aug 9, 2011, 10:36:05 PM
to web2py-users
Result: Fixed by upgrading. I was seeing this bug:
http://code.google.com/p/web2py/issues/detail?id=345

However, virtualfields still take more time than they should. My
selects take 2-3x longer with virtualfields enabled than without. I
implemented a little hack in the dal that adds methods to rows with
only a 10% overhead (instead of 200-300%) and can share that if
anyone's interested.


Massimo Di Pierro

Aug 10, 2011, 2:16:36 AM
to web2py-users
let us see it!

Michael Toomim

Aug 10, 2011, 8:11:07 PM
to web2py-users
Ok. The basic idea is to allow you to define helper methods on rows,
sort of like the Models of Rails/Django.

You use it like this... I put this in models/db_methods.py:

@extra_db_methods
class Users():
    def name(self):
        return '%s %s' % ((self.first_name or ''),
                          (self.last_name or ''))
    def fb_name(self):
        p = self.person()
        return (p and p.name) or 'Unknown dude'
    def person(self):
        return db.people(db.people.fb_id == self.fb_id)
    def friends(self):
        return [Storage(name=f[0], id=f[1])
                for f in sj.loads(self.friends_cache)]

@extra_db_methods
class People():
    ... etc

These are for tables db.users and db.people. It looks up the table
name from the class name. For each table that you want to extend, you
make a class and put @extra_db_methods on top.

It's implemented with the following @extra_db_methods decorator and a
patch to dal.py. The decorator just traverses the class, pulls out all
methods, and throws them into a "methods" variable on the appropriate
table in dal. Then the dal's parse() routine adds these methods to each
row object, using Python's types.MethodType() to rebind a function
from the class to the row object.
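
For readers unfamiliar with `types.MethodType`, here is a minimal stand-alone sketch of the rebinding step. The `Row` class and the `methods` dict are hypothetical stand-ins for the DAL's row object and the per-table dict; only the `types.MethodType(func, obj)` call is the actual standard-library API.

```python
import types

class Row(object):
    """Hypothetical minimal row: just a bag of attributes."""
    def __init__(self, **fields):
        self.__dict__.update(fields)

def name(self):
    return '%s %s' % (self.first_name or '', self.last_name or '')

methods = {'name': name}  # what a table's "methods" dict might hold

row = Row(first_name='Ada', last_name='Lovelace')
for key, func in methods.items():
    # Bind the plain function to this particular row instance.
    row.__dict__[key] = types.MethodType(func, row)

full_name = row.name()
```

Because the bound method is stored on the instance, nothing is computed until `row.name()` is actually called, and inside the function `self` is the row itself, so fields are read as `self.first_name`.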

The downside is extending dal with yet ANOTHER way of adding methods
to objects. That makes 3 apis to maintain for similar things
(virtualfields, computedfields, and this). And I'm not sure about the
names (like "extra_db_methods") for these things yet. Also I think we
might be able to get it even faster by being more clever with python
inheritance in the Row class. Right now it has roughly 10% overhead on
selects in my tests (uncompiled code).
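
One way to read the "more clever with python inheritance" idea is to build a single Row subclass per table and create rows from it, so that methods are found through the class instead of being bound onto every row. The sketch below is only an illustration under that assumption: `Row`, `make_row_class` and `UsersRow` are hypothetical, and whether this is actually faster inside the DAL has not been measured here.

```python
class Row(object):
    """Hypothetical minimal row: just a bag of attributes."""
    def __init__(self, **fields):
        self.__dict__.update(fields)

def make_row_class(tablename, methods):
    """Create one Row subclass per table, carrying that table's methods."""
    return type('%sRow' % tablename.capitalize(), (Row,), dict(methods))

def name(self):
    return '%s %s' % (self.first_name or '', self.last_name or '')

# Built once per table, not once per row.
UsersRow = make_row_class('users', {'name': name})

rows = [UsersRow(first_name='Ada', last_name='Lovelace'),
        UsersRow(first_name='Alan', last_name='Turing')]
names = [r.name() for r in rows]
```

With this layout the per-row cost is just the object construction; the row's own `__dict__` holds only field values.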

At the bottom of this message is the decorator that implements the
same functionality using the existing virtualfields mechanism and your
"lazy" decorator. Its downside is a 2x to 3x overhead on selects and
instead of self.field you have to say self.<tablename>.field in the
method bodies.

def extra_db_methods(clss):
    # The table name is the lowercased class name, e.g. Users -> db.users
    tablename = clss.__name__.lower()
    if tablename not in db:
        raise Exception("There is no `%s' table to put virtual methods in"
                        % tablename)

    # Copy every function defined on the class into the table's
    # "methods" dict (added by the dal.py patch below)
    for k in clss.__dict__.keys():
        method = clss.__dict__[k]
        if type(method).__name__ in ('function', 'instancemethod'):
            db[tablename].methods[method.__name__] = method

    return clss

--- k/web2py/gluon/dal.py 2011-08-03 16:46:39.000000000 -0700
+++ web2py/gluon/dal.py 2011-08-10 17:04:48.344795251 -0700
@@ -1459,6 +1459,7 @@
             new_rows.append(new_row)
         rowsobj = Rows(db, new_rows, colnames, rawrows=rows)
         for tablename in virtualtables:
+            rowsobj.setmethods(tablename, db[tablename].methods)
             for item in db[tablename].virtualfields:
                 try:
                     rowsobj = rowsobj.setvirtualfields(**{tablename:item})
@@ -4559,6 +4560,7 @@
         tablename = tablename
         self.fields = SQLCallableList()
         self.virtualfields = []
+        self.methods = {}
         fields = list(fields)

         if db and self._db._adapter.uploads_in_blob==True:
@@ -5574,6 +5576,14 @@
         self.compact = compact
         self.response = rawrows

+    def setmethods(self, tablename, methods):
+        if not methods: return self
+        for row in self.records:
+            if tablename not in row: break # Abort on this and all rows. For efficiency.
+            for (k,v) in methods.items():
+                r = row[tablename]
+                r.__dict__[k] = types.MethodType(v, r)
+        return self
     def setvirtualfields(self,**keyed_virtualfields):
         if not keyed_virtualfields:
             return self

---
And here's the implementation using virtualfields:

def lazy(f):
    def g(self,f=f):
        import copy
        self=copy.copy(self)
        return lambda *a,**b: f(self,*a,**b)
    return g

def extra_db_methods_vf(clss):
    ''' This decorator clears virtualfields on the table and replaces
        them with the methods on this class.
    '''
    # First let's make the methods lazy
    for k in clss.__dict__.keys():
        if type(getattr(clss, k)).__name__ == 'instancemethod':
            setattr(clss, k, lazy(getattr(clss, k)))

    tablename = clss.__name__.lower()
    if tablename not in db:
        raise Exception("There is no `%s' table to put virtual methods in"
                        % tablename)
    del db[tablename].virtualfields[:]  # We clear virtualfields each time
    db[tablename].virtualfields.append(clss())
    return clss

You use this just like before but with @extra_db_methods_vf instead of
@extra_db_methods, and write self.<tablename>.field instead of
self.field in the method bodies.

>

Massimo Di Pierro

Aug 11, 2011, 5:55:17 AM
to web2py-users
This is really interesting. Please give me some time to study it.
Meanwhile, so that I do not forget, please open an issue and post the
code there.

Massimo

Michael Toomim

Aug 11, 2011, 3:21:35 PM
to web2py-users
Ok, it's here http://code.google.com/p/web2py/issues/detail?id=374

Thank you for looking into this Massimo! I do not know the best way to
do this... my code is just a first reaction to making something
faster.
