GAE: insert using "key_name"


Quint

Jan 8, 2014, 6:25:42 AM
to web...@googlegroups.com
Hi,

In GAE it's possible to supply a "key_name" when you insert an entity. This is a string that you can then use as an id to efficiently fetch an entity from the datastore.

I would like to be able to do this using web2py. I could also use the GAE API directly, but then I could not make use of things like calculated Fields etc. in my web2py tables.

I made the following modification to the GoogleDatastoreAdapter insert() function:

    def insert(self,table,fields):
        dfields = dict((f.name,self.represent(v,f.type)) for f,v in fields)
        # table._db['_lastsql'] = self._insert(table,fields)

        # Field name 'gae_key_name' can be used insert using key_name for both DB and NDB.
        keyname = None
        if 'gae_key_name' in dfields:
            keyname = dfields['gae_key_name']
            if self.use_ndb:
                dfields['id'] = dfields.pop('gae_key_name')
            else:
                dfields['key_name'] = dfields.pop('gae_key_name')

        tmp = table._tableobj(**dfields)
        tmp.put()
        key = tmp.key if self.use_ndb else tmp.key()
        rid = Reference(key.id())
        (rid._table, rid._record, rid._gaekey) = (table, None, key)
        return rid

Now one can insert using a field named 'gae_key_name'.
DB and NDB expect different parameter names in the model constructor to supply a key name: 'key_name' vs 'id'.
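
The renaming can be isolated in a small helper. This is only a sketch: `key_name_kwargs` is a hypothetical name that is not part of web2py or the attached patch, and it assumes nothing beyond what is said above (`key_name` for DB models, `id` for NDB models).

```python
def key_name_kwargs(dfields, use_ndb):
    """Return a copy of dfields with 'gae_key_name' renamed for the target API.

    Hypothetical helper for illustration only: the DB model constructor
    takes key_name=..., the NDB model constructor takes id=...
    """
    result = dict(dfields)  # do not mutate the caller's dict
    if 'gae_key_name' in result:
        target = 'id' if use_ndb else 'key_name'
        result[target] = result.pop('gae_key_name')
    return result


print(key_name_kwargs({'gae_key_name': 'user-42', 'spam': 'alot'}, use_ndb=True))
print(key_name_kwargs({'gae_key_name': 'user-42', 'spam': 'alot'}, use_ndb=False))
```

Input without 'gae_key_name' passes through unchanged, so ordinary inserts would not be affected.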

Now the only problem is that _listify() in Table requires that all fields are defined in the Table.
We could define a Field "gae_key_name" in every Table that we want to use this feature for.
Another option is to modify the _listify function a bit to accommodate this:

    def _listify(self,fields,update=False):
        new_fields = {} # format: new_fields[name] = (field,value)

        # store all fields passed as input in new_fields
        # 'gae_key_name' is GAE specific and should not be defined in the table.
        for name in fields:
            if not name in self.fields:
                if name not in ['id', 'gae_key_name']:
                    raise SyntaxError(
                        'Field %s does not belong to the table' % name)
                if name == 'gae_key_name':
                    # Create stub Field for 'gae_key_name' so it can be included
                    # without being defined in the model.
                    field = Field('gae_key_name', 'string')
                    new_fields['gae_key_name'] = field, fields['gae_key_name']
            else:
                field = self[name]
                value = fields[name]
                if field.filter_in:
                    value = field.filter_in(value)
                new_fields[name] = (field,value)

The last option feels a bit hacky but removes the requirement to define a Field 'gae_key_name' in every Table.

What do you think?

I attached a patch that includes both changes.

Regards,

Quint



Enable_insert_using_key_name_for_GAE.patch

Alan Etkin

Jan 8, 2014, 8:18:53 AM
to web...@googlegroups.com

> I would like to be able to do this using web2py. I could also use the GAE API directly, but then I could not make use of things like calculated Fields etc. in my web2py tables.

One issue with forcing key names is that there's no guarantee that the new key will provide a valid integer ID when retrieved, and the DAL API requires record insertions to provide that kind of value. Note that calls to Key.id() can return None, and that value is what the insert method uses to create the db reference object. IIRC, the exception to that rule is the case of keyed tables, but I'm not sure if they are supported for GAE NoSQL. If they are supported, the patch could instead use keyed tables for storing entity key names.

From the datastore Python api:

"... The identifier may be either a key name string, assigned explicitly by the application when the instance is created, or an integer numeric ID, assigned automatically by App Engine when the instance is written (put) to the Datastore ..."

I have not tested the behavior but according to the above I think that the datastore api would return None instead of valid integer identifiers for the case of entities stored with manual key names.
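
The concern can be illustrated without the SDK. The class below is a stand-in written for this example, not the real datastore Key class; it only mimics the rule quoted above (a key carries either a name or a numeric ID), so whether the real API behaves the same way remains untested here.

```python
class StandInKey(object):
    """Hypothetical stand-in for a datastore key: holds a name OR a numeric id."""

    def __init__(self, id=None, name=None):
        if (id is None) == (name is None):
            raise ValueError('exactly one of id or name must be given')
        self._id, self._name = id, name

    def id(self):
        return self._id    # None when the key was created with a name

    def name(self):
        return self._name  # None when the id was assigned automatically

    def id_or_name(self):
        return self._id if self._id is not None else self._name


auto_key = StandInKey(id=5629499534213120)
named_key = StandInKey(name='user-42')

print(auto_key.id(), named_key.id())  # the named key has no integer id
print(named_key.id_or_name())
```

With a key like `named_key`, a call such as `Reference(key.id())` would receive None instead of an integer.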

Quint

Jan 8, 2014, 9:39:57 AM
to web...@googlegroups.com
Yes,

Your post reminds me that I forgot something. I thought that when a key_name is used, insert() could simply return that as the id.
But if you say that the DAL requires it to return an integer id, then yes, that would break it.
When you use a key_name, the records do not have a numeric id.
But does that mean you should not allow it to be inserted that way?
Nothing restricts users from operating (using the web2py API) on rows inserted with the GAE API....

   
        dfields=dict((f.name,self.represent(v,f.type)) for f,v in fields)
        # table._db['_lastsql'] = self._insert(table,fields)

        #quint
        keyname = None
        if 'gae_key_name' in dfields:
            keyname = dfields['gae_key_name']
            if self.use_ndb:
                dfields['id'] = dfields.pop('gae_key_name')
            else:
                dfields['key_name'] = dfields.pop('gae_key_name')

        tmp = table._tableobj(**dfields)
        tmp.put()

        if keyname:
            return keyname

        key = tmp.key if self.use_ndb else tmp.key()
        rid = Reference(key.id() if self.use_ndb else key.id_or_name())
        (rid._table, rid._record, rid._gaekey) = (table, None, key)
        return rid

Alan Etkin

Jan 8, 2014, 9:52:18 AM
to web...@googlegroups.com
> But does that mean you should not allow it to be inserted that way?
> Nothing restricts users from operating (using web2py api) on rows inserted with GAE api....

There's no restriction on the use of the datastore, since it is not a service reserved for web2py apps, although I believe storing records without id assignment would be restrictive for apps (and I doubt it would actually be compatible at all, with the exception of the keyed tables mentioned before, if they are supported).

Quint

Jan 8, 2014, 11:07:25 AM
to web...@googlegroups.com
Sorry, I don't understand this line.

> although I believe storing records without id assignment would be restrictive for apps

Do you mean web2py apps or GAE apps? What did you mean by "restrictive"?

> (and I doubt it would actually be compatible at all

Compatible with what?

Sure, when you insert using a key_name, and without a numeric id, there will be web2py API features that won't work anymore (the ones that expect a numeric id to be returned).
But you would know that you don't need those features if you decided to use a key_name.

Alan Etkin

Jan 8, 2014, 11:52:47 AM
to web...@googlegroups.com
> Compatible with what?

With any web2py feature expecting entity keys (web2py records) with integer ids.

> But you would know that you don't need those features if you decided to use a key_name

I think this is usually called a "corner case". I wouldn't add the feature for the purpose of storing entities by name, since it does not require much coding to implement it per application using the GAE API.

Alan Etkin

Jan 8, 2014, 4:33:48 PM
How about adding support in dal.py for the following:

# processes the field input and adds defaults, computes, etc. (does not make the actual insertion).
>>> values = db._insert(spam="alot", ...)
{"spam": "alot", ...}

So one can use the output for storing named keys (with app specific logic or maybe plugin for extended datastore functions).

AFAIK, the patch is trivial (it would need adding a _insert adapter method like the case of mongodb or imap)

EDIT: the proposed patch is for enhancing the datastore adapter
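
A rough sketch of what such a pre-processing step would return, in plain Python with no web2py imports. The names `preprocess`, `defaults` and `computes` are made up for this example and do not mirror the DAL internals; the point is only that the input is completed and handed back, and nothing is written.

```python
def preprocess(values, defaults, computes):
    """Fill in defaults and computed fields; return the dict, store nothing."""
    row = dict(defaults)
    row.update(values)
    for name, func in computes.items():
        if name not in values:  # explicit input wins over a computed value
            row[name] = func(row)
    return row


defaults = {'status': 'new'}
computes = {'slug': lambda r: r['spam'].lower().replace(' ', '-')}

values = preprocess({'spam': 'A Lot'}, defaults, computes)
print(values)
```

Application code could then pass the returned dict to the model constructor together with a key name.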

Alan Etkin

Jan 9, 2014, 8:00:08 AM
to web...@googlegroups.com
> How about adding support in dal.py for the following:
>
> # processes the field input and adds defaults, computes, etc. (does not make the actual insertion).
> >>> values = db._insert(spam="alot", ...)
> {"spam": "alot", ...}

I made a PR to support _insert for processing values without applying changes to the database, as proposed in the previous post:

https://github.com/web2py/web2py/pull/341


Message has been deleted

Quint

Jan 9, 2014, 3:01:22 PM
Works great!

On Thursday, January 9, 2014 7:21:45 PM UTC+1, Quint wrote:
> Excellent!
>
> I already included it to try it out but how do I use it?
>
> When I use your example I get:
>
> AttributeError: 'DAL' object has no attribute '_insert'
>
> (obviously because there is no _select() in DAL)
>
> When I try:
> db._adapter._insert(props)
>
> I get:
>
>  File "C:\Users\Quint*********\gluon\dal.py", line 5302, in <genexpr>
>     return dict((f.name,self.represent(v,f.type)) for f,v in fields)
> ValueError: too many values to unpack
>
> What do I need to supply to _select()?
> It looks like I need to supply a collection of Fields.
>
> I'm missing something...

Alan Etkin

Jan 9, 2014, 4:14:43 PM
> Works great!

On Thursday, January 9, 2014 7:21:45 PM UTC+1, Quint wrote:
> Excellent!
>
> I already included it to try it out but how do I use it?

Your message is confusing: does it work or not? It worked in my environment (recent release of the GAE SDK and development server).

Specifically, what worked was:

>>> myvalues = db._insert(k=v, ...)
{k: <processed value>}

Where db is a google app engine connection (DAL instance)

EDIT: wrong example, the correct syntax is

>>> db.<table>._insert(<kwargs>)
{k: v, ...}

I think that db.table._update can also be useful

Quint

Jan 9, 2014, 4:22:57 PM
to web...@googlegroups.com
Sorry for the confusion.
But it works fine. I will definitely use this.

(I posted something stupid and removed the stupid post but left some traces. ;-))

Alan Etkin

Jan 9, 2014, 5:18:19 PM
to web...@googlegroups.com
> But it works fine. I will definitely use this.

OK. If you can, it would be useful to share your tests of which features are supported for records lacking id values, and any issues you find with setters and getters using key names. Note that the Rows class supports constructing objects from dictionaries. Finally, I believe a plugin that extends the DAL class so that it can retrieve and set records by key name would also be useful.

Alan Etkin

Jan 11, 2014, 8:07:56 AM
to web...@googlegroups.com

> Sorry for the confusion.
> But it works fine. I will definitely use this.
>
> (I posted something stupid and removed the stupid post but left some traces. ;-))


Well there are good reasons not to add the feature, at least with the current name:

https://groups.google.com/d/msg/web2py-developers/bN0WS9_skzw/NJq4bs5M8KIJ

Sorry; however, you could temporarily implement a similar method by subclassing GoogleDatastoreAdapter so it pre-processes data without applying changes.

http://www.web2py.com/books/default/chapter/29/06/the-database-abstraction-layer#Note-on-new-DAL-and-adapters

Quint

Jan 12, 2014, 7:33:33 AM
ok, understood.

This is really the only thing I need:

    def _pre_process(self, table, **fields):
        """
        Takes a w2p table and a dictionary with values and processes
        the field input and add defaults, computes, etc using the web2py table.
        """
        fields = table._defaults(fields)
        return dict((f.name,table._db._adapter.represent(v,f.type)) for f,v in table._listify(fields))

At this moment this works for me and I don't need to subclass the adapter.

But:

In the other thread you spoke about people possibly relying on _<something> functions (for debugging, logging, etc.).
I guess my question is: can users safely rely on _<something> methods? What is w2p's "policy" regarding underscore-prefixed functions?
Are they considered private?

So can I use the above function as it is?

Oh and,

> Sorry; however, you could temporarily implement a similar method by subclassing GoogleDatastoreAdapter so it pre-processes data without applying changes.

What do you mean by "temporarily"?
Could the feature be implemented in the future with a different name?

Quint

Alan Etkin

Jan 12, 2014, 8:56:43 AM
to web...@googlegroups.com
 
> I guess my question is: can users safely rely on _<something> methods?

The change was reverted because of this concern about API reliability, so that adapters behave as specified in

http://www.web2py.com/books/default/chapter/29/06/the-database-abstraction-layer?search=_insert#Raw-SQL

> What is w2p's "policy" regarding underscore-prefixed functions?
> Are they considered private?

An underscore is required in some cases, for example to avoid name collisions. I suppose that the policy on the use of underscores follows the PEP 8 style guide, except for special cases described in the book. For example (chapter 4):
Functions that take arguments or start with a double underscore are not publicly exposed and can only be called by other functions.
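
For reference, this is how plain Python treats the two prefixes. It is a generic illustration of the PEP 8 side of the answer, not web2py code, and the class and method names are invented: a single leading underscore is only a convention, while a double leading underscore triggers name mangling inside a class.

```python
class Adapter(object):
    def _helper(self):
        # single underscore: "internal by convention", still callable from outside
        return 'helper'

    def __hidden(self):
        # double underscore: stored under the mangled name _Adapter__hidden
        return 'hidden'


a = Adapter()
print(a._helper())
print(hasattr(a, '__hidden'))
print(a._Adapter__hidden())
```

Nothing in the language prevents calling `_helper` from application code; the risk is only that such a method may change between releases.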

> So can I use the above function as it is?

I think so.

> What do you mean by "temporarily"?
> Could the feature be implemented in the future with a different name?

It could, if there's agreement about adding it to the adapter.
