1.3.1 SDK Prerelease - help us verify


Ikai Lan

Feb 3, 2010, 5:03:22 PM
to google-a...@googlegroups.com
Hello App Engine Developers,

As part of our ongoing efforts to improve release quality and
transparency, we will start prereleasing SDKs for early testing. We
hope this gives developers a chance to participate in our release
process by trying out new changes and sending feedback. As of this
morning, the prerelease SDK for our next release, 1.3.1, is available
in the familiar download location (note that the filename ends in
'prerelease.zip'):


If you're interested, please download and give it a try locally with
your favorite App Engine code. Please note that, as a prerelease, this
SDK is not yet supported and still subject to change. Thus, please
don't take critical dependencies or make substantial changes to
production apps based on this SDK.

Importantly, this prerelease is purely for the SDK and is intended for
local testing and development in dev_appserver. The server-side of App
Engine (our production environment) is not at 1.3.1, so deploying with
this SDK is not yet supported. In the future, we might enable a
complete SDK and server test environment for prereleases.

Please try 1.3.1 for local development and send us your feedback!

Thanks,

App Engine Team


Python
=========
- New support for Datastore Query Cursors
- New support for Transactional Task Creation
- Additional file extensions permitted when sending mail including .doc and .xls 
- New Grab Tail added to Memcache API
- Support for Custom Admin Console pages
- New "month" and "synchronized" syntax for Cron configuration
- Application Stats library now included with the SDK
- Bulk Loader supports bulk downloading all kinds simultaneously
- appcfg.py validates SSL certificates for HTTPS connections
- Support for ETags and the If-Match and If-None-Match HTTP headers, as well as 304 error codes, now available on static files (not available on the dev_appserver or Blobstore blobs) http://code.google.com/p/googleappengine/issues/detail?id=575

Java
=========
- Datastore Query Cursors
- Transactional Tasks
- Additional file extensions permitted when sending mail including .doc and .xls http://code.google.com/p/googleappengine/issues/detail?id=494
- Grab Tail added to Memcache API
- Support for Custom Admin Console pages
- Java Precompilation is now on by default.
- Developers can opt-out of precompilation by setting the flag in appengine-web.xml
  <precompilation-enabled>false</precompilation-enabled>
- New built-in support for unit testing (see appengine-testing.jar) http://code.google.com/p/googleappengine/issues/detail?id=326
- net.sf.jsr107 package included as an alternative to the low-level Memcache API
- javax.annotation.Resource/Resources added to the package whitelist
- New "month" and "synchronized" syntax for Cron configuration
- URLFetch supports asynchronous requests
- appcfg.sh uses HTTPS for application deployment
- appcfg.sh adds request_logs --append
- Changed the order in which results are returned for queries without a specified sort order. Only queries that use IN will see different results.
- Added support for multiple != filters on the same property
- Improved support for keys-only queries when using IN and != filters
- Support for ETags and the If-Match and If-None-Match HTTP headers, as well as 304 error codes, now available on static files (not available on the dev_appserver or Blobstore blobs)
- Fixed issue where the maximum transform count was enforced for composite operations
- Fixed issue with whitespace on the end of strings in web.xml
- Fixed "Not Found" issue when defining <error-page> in web.xml
- Fixed issue when defining <welcome-file-list> in web.xml
- Fixed issue where cancelling a deployment in progress would unintentionally delete packages
- Fixed issue with QuotaService.getCpuTimeInMegaCycles() returning 0
- Fixed issue where JSP exceptions were incorrectly cast, causing a ClassCastException

--
Ikai Lan
Developer Programs Engineer, Google App Engine
http://googleappengine.blogspot.com | http://twitter.com/app_engine

sabiancrash

Feb 3, 2010, 6:32:52 PM
to Google App Engine
Is there any documentation on the new functionality released for
python, specifically datastore query cursors?


Jason C

Feb 3, 2010, 9:07:09 PM
to Google App Engine
I think it's mean to refer to 304 as an "error code" - it's the best
HTTP response code of all, and think of what a wonderful place the
Internet would be if everyone knew how to use it properly! ;)

j
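
For anyone wondering why 304 is a success story rather than an error, here is a minimal sketch of conditional-GET handling in plain Python. It is independent of App Engine: the function names and the choice of an MD5-based ETag are assumptions made for illustration, not a description of how App Engine serves static files.

```python
import hashlib


def make_etag(body):
    """Return a quoted ETag derived from the response body (bytes)."""
    return '"%s"' % hashlib.md5(body).hexdigest()


def respond(body, request_headers):
    """Return (status, headers, payload) for a GET of `body`.

    If the client's If-None-Match header carries the current ETag,
    answer 304 Not Modified with an empty payload instead of the body.
    """
    etag = make_etag(body)
    header = request_headers.get('If-None-Match', '')
    candidates = [tag.strip() for tag in header.split(',')]
    if etag in candidates or '*' in candidates:
        return 304, {'ETag': etag}, b''
    return 200, {'ETag': etag}, body
```

The first request gets a 200 with an ETag; a repeat request that echoes that ETag back gets a 304 and no body, which is where the bandwidth saving comes from.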


Koen Bok

Feb 4, 2010, 5:01:33 AM
to Google App Engine
Seems like an exciting update!

- New Grab Tail added to Memcache API

What does this mean?

Nick Johnson (Google)

Feb 4, 2010, 7:18:28 AM
to google-a...@googlegroups.com
On Thu, Feb 4, 2010 at 10:01 AM, Koen Bok <ko...@madebysofa.com> wrote:
> Seems like an exciting update!
>
> - New Grab Tail added to Memcache API
>
> What does this mean?

See the inline docs, here: http://code.google.com/p/googleappengine/source/browse/trunk/python/g...

--
Nick Johnson, Developer Programs Engineer, App Engine
Google Ireland Ltd. :: Registered in Dublin, Ireland, Registration Number: 368047

theillustratedlife

Feb 4, 2010, 2:08:35 PM
to Google App Engine
I believe that the CRON changes are to address this issue:
http://code.google.com/p/googleappengine/issues/detail?id=1261&can=5&colspec=ID%20Type%20Status%20Priority%20Stars%20Owner%20Summary%20Log%20Component

Can you please explain the new syntax?

Thanks!

brandoneggar

Feb 4, 2010, 2:11:21 PM
to Google App Engine
Finally, Datastore Query Cursors. Thank you.

Charlie

Feb 4, 2010, 3:06:22 PM
to Google App Engine
Datastore Query Cursors seem very interesting. Where can we find
documentation (inline or otherwise) on them?


Guria

Feb 4, 2010, 7:09:17 PM
to Google App Engine
Maybe this isn't the best place for this question, but...
Do you have any plans to finish the XMPP API with <iq/> and <presence/>
support (http://code.google.com/p/googleappengine/issues/detail?id=2071)?


niklasro.appspot.com

Feb 4, 2010, 7:35:30 PM
to Google App Engine
Very interesting it couldn't some language .mo dialectindependent adds
for instance arabic months some noted should. Easy output enables
arabic months we can display العملية 04 فبراير أهلا،  دخول

Greg Temchenko

Feb 5, 2010, 6:57:41 AM
to Google App Engine
As far as I can see, this code was added in r93, which is SDK 1.2.8.
Does that mean you just didn't mention this feature in 1.2.8?


Kyle Heu

Feb 4, 2010, 6:31:02 PM
to Google App Engine
Do the Datastore Cursors solve the 1000 query limit?


alf

Feb 5, 2010, 2:04:19 PM
to Google App Engine
and 30 sec time window?

thx.

Adam Crossland

Feb 7, 2010, 12:49:29 PM
to Google App Engine
Nick, et al.:

Could we get a pointer or pointers to in-line documentation for
query cursors? If this is what it sounds like, it is far and away
the biggest and most useful new feature in the SDK, but we need some
hints about how it is used. I've browsed the source code and found
plenty of references to cursors, but there's not much actually useful
info about how they are used.

Ross Karchner

Feb 7, 2010, 8:45:31 PM
to google-a...@googlegroups.com
I'm loving cursors and transactional tasks (which may be the geekiest sentence I've ever written)-- I hope send_mail is next to get transactional treatment.


Nickolas Daskalou

Feb 7, 2010, 8:51:28 PM
to google-a...@googlegroups.com, google-a...@googlegroups.com
Where did you find the documentation for how to use these two new features?

Takashi Matsuo

Feb 7, 2010, 8:52:29 PM
to google-a...@googlegroups.com
Hi Ross,

Agreed!

For the time being, you can use the following strategy as a workaround:
1) prepare a handler for sending the particular mail
2) put this handler into the task queue in a transactional manner

Regards,

--
Takashi Matsuo
Kay's daddy

Ross Karchner

Feb 7, 2010, 9:13:10 PM
to google-a...@googlegroups.com
Poking around the code, and trial and error.

For transactional tasks, just add transactional=True to any tasks you create in a transaction-- if the transaction fails, the task won't be created.

For cursors, after doing a fetch() on a Query, .cursor() (on the Query itself, not the results) will return a big ugly string that acts as the cursor.

Later (probably in a separate handler), use .with_cursor(the_cursor) to modify the query. When you fetch, it will pick up where the previous fetch left off.


Excerpted from my code:

    # Keys-only query for this edition's active subscribers.
    subscriber_q = (Query(subscriptions.models.Subscription, keys_only=True)
                    .filter('site =', edition.site)
                    .filter('newsletter =', edition.newsletter)
                    .filter('active =', True))
    if 'cursor' in request.form:  # resume where the previous batch stopped
        subscriber_q = subscriber_q.with_cursor(request.form['cursor'])
    subscribers = subscriber_q.fetch(BATCH_SIZE)

(it might look a little funny since I'm using YARO for my request object: http://lukearno.com/projects/yaro/)
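
The fetch() / cursor() / with_cursor() round trip described above can be tried out without the SDK. `FakeQuery` below is a toy in-memory stand-in, not the App Engine `Query` class; only the three method names are taken from the post. Its cursor is just a base64-encoded list position, which is purely an assumption of this toy and says nothing about what a real cursor contains.

```python
import base64


class FakeQuery(object):
    """Toy stand-in for a datastore query over an in-memory list."""

    def __init__(self, rows):
        self._rows = list(rows)
        self._start = 0   # where the next fetch begins
        self._end = 0     # where the last fetch stopped

    def with_cursor(self, cursor):
        """Resume from a string returned by an earlier cursor() call."""
        raw = base64.urlsafe_b64decode(cursor.encode('ascii'))
        self._start = int(raw.decode('ascii'))
        return self

    def fetch(self, limit):
        """Return up to `limit` rows starting at the current position."""
        batch = self._rows[self._start:self._start + limit]
        self._end = self._start + len(batch)
        return batch

    def cursor(self):
        """Return an opaque string marking where the last fetch stopped."""
        raw = str(self._end).encode('ascii')
        return base64.urlsafe_b64encode(raw).decode('ascii')


def process_in_batches(rows, batch_size):
    """Walk all rows one batch at a time, as separate requests would."""
    seen, cursor = [], None
    while True:
        query = FakeQuery(rows)      # each "request" rebuilds the same query
        if cursor is not None:
            query = query.with_cursor(cursor)
        batch = query.fetch(batch_size)
        if not batch:
            return seen
        seen.extend(batch)
        cursor = query.cursor()      # hand this string to the next request
```

The point of the pattern is that each request rebuilds the same query and only the cursor string travels between requests, for example as a task queue parameter.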

Nickolas Daskalou

Feb 7, 2010, 9:36:17 PM
to google-a...@googlegroups.com
Thanks Ross, this is very helpful. :)

Nick

风笑雪

Feb 8, 2010, 8:13:14 AM
to google-a...@googlegroups.com
It seems no one replied to me, so I'm copying and pasting it here:

The memcache.grab_tail() function is very useful for counting, but
why does it return only values, not a list of (key, value) pairs or a dict?

If a list of (key, value) pairs could be retrieved, I think counting
would be even easier, like this:

1) add a page view for article 1:

memcache.incr('1', namespace='article counter', initial_value=0)

2) add a page view for article 2:

memcache.incr('2', namespace='article counter', initial_value=0)

3) use cron jobs to update memcache counter to datastore:

counter = memcache.grab_tail(10, 'article counter')

for article_id, pv in counter:
    pass  # update the datastore here


But now, counter only contains pv, no article_id in it, so I can't
find a simple way to implement the same scenario.

杨浩

Feb 8, 2010, 8:16:02 AM
to google-a...@googlegroups.com
^ ^
I see you!


Franck

Feb 7, 2010, 8:50:23 AM
to Google App Engine
I didn't find any docs about "Support for Custom Admin Console pages".

Is this a way to add our own data to the Admin Console web app?

Franck

Feb 7, 2010, 8:56:28 AM
to Google App Engine
I didn't find any docs about the "Support for Custom Admin Console pages".

Could that mean that we can add custom data to the Admin Console web app?

PS.: Ikai, sorry about the direct message

Ikai L (Google)

Feb 8, 2010, 2:06:51 PM
to google-a...@googlegroups.com
The official docs are pending, but here's Nick Johnson to the rescue:

http://blog.notdot.net/2010/02/New-features-in-1-3-1-prerelease-Cursors

Stephen

Feb 8, 2010, 4:33:15 PM
to Google App Engine

On Feb 8, 7:06 pm, "Ikai L (Google)" <ika...@google.com> wrote:
> The official docs are pending, but here's Nick Johnson to the rescue:
>
> http://blog.notdot.net/2010/02/New-features-in-1-3-1-prerelease-Cursors


What are the performance characteristics of cursors?

The serialised cursor shows that it stores an offset. Does that mean
that if the offset is one million, one million rows will have to be
skipped before the next 10 are returned? This will be faster than
doing it in your app, but not as quick as the existing bookmark
techniques which use the primary key index.

Or is the server-side stateful, like a typical SQL implementation of
cursors? In which case, are there any limits to the number of active
cursors? Or what if a cursor is resumed some time in the future; will
it work at all, or work slower?

Alkis Evlogimenos ('Αλκης Ευλογημένος)

Feb 8, 2010, 5:07:24 PM
to google-a...@googlegroups.com
There is no offset. The protocol buffer stores a start_key and a boolean denoting if this start key is inclusive or not. The performance of continuing the fetch from a cursor should be the same as the performance of the first entities you got from a query.

--

Alkis

Ikai L (Google)

Feb 8, 2010, 5:17:46 PM
to google-a...@googlegroups.com
I got beaten to this answer. No, there is no traversal to get to the offset.

BigTable has an underlying mechanism for range queries on keys. An index row is essentially a key composed of the concatenation of application ID, entity type, column, and value. When a filter operation is performed, the datastore looks for the range matching these criteria and returns the set of keys. A cursor also adds the datastore key of the entity, so it is possible to serialize where to begin the query. This is actually a bit awkward to explain without visuals. You can watch Ryan Barrett's talk here:

http://www.youtube.com/watch?v=tx5gdoNpcZM

Hopefully, we'll be able to post an article at some point in the future explaining how cursors work.
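
In the meantime, a rough illustration of the difference between skipping an offset and resuming from a start key, using a sorted Python list as a stand-in for an index. The tuple layout, the sample data and the function names are assumptions made for this sketch; they are not the datastore's actual format.

```python
import bisect

# Stand-in for an index: sorted (property value, entity key) tuples.
INDEX = sorted((value, 'key%03d' % i) for i, value in enumerate('bca' * 4))


def fetch_by_offset(offset, limit):
    """Offset paging: walks past `offset` entries before returning any."""
    skipped = 0
    results = []
    for entry in INDEX:
        if skipped < offset:
            skipped += 1          # work grows with the size of the offset
            continue
        results.append(entry)
        if len(results) == limit:
            break
    return results


def fetch_from_key(start_key, inclusive, limit):
    """Start-key paging: jump straight to the position, then read `limit`."""
    if start_key is None:
        pos = 0
    elif inclusive:
        pos = bisect.bisect_left(INDEX, start_key)
    else:
        pos = bisect.bisect_right(INDEX, start_key)
    return INDEX[pos:pos + limit]
```

Calling `fetch_from_key(last_entry_of_previous_page, False, 5)` returns the same rows as `fetch_by_offset(5, 5)` for the second page, but the lookup cost does not depend on how deep into the results you are.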


Stephen

Feb 8, 2010, 6:25:51 PM
to Google App Engine

Ah right, Nick's blog does say start_key and not offset. My bad.

Maybe there will be warnings in the upcoming documentation, but my
first instinct was to embed the serialised cursor in the HTML as the
'next' link. But that doesn't look like a good idea as Nick's decoded
query shows what's embedded:

PrimaryScan {
start_key: "shell\000TestModel\000foo\000\232bar\000\200"
start_inclusive: true
}
keys_only: false

First, you may or may not want to leak this info. Second, could this
be altered on the client to change the query in any way that's
undesirable?

Once you have a cursor, where do you store it so you can use it again?


On Feb 8, 10:17 pm, "Ikai L (Google)" <ika...@google.com> wrote:
> I got beaten to this answer. No, there is no traversal to get to the offset.
>
> BigTable has an underlying mechanism for range queries on keys. Indexes are
> essentially a key comprised of a concatenation of application ID, entity
> type, column, value. When a filter operation is performed, the datastore
> looks for a range matching this criteria, returning the set of keys. A
> cursor also adds the datastore key of the entity so it is possible to
> serialize where to begin the query. This is actually a bit awkward to
> explain without visuals. You can watch Ryan Barrett's talk here:
>
> http://www.youtube.com/watch?v=tx5gdoNpcZM
>
> Hopefully, we'll be able to post an article at some point in the future
> explaining how cursors work.

Ikai L (Google)

Feb 8, 2010, 7:26:13 PM
to google-a...@googlegroups.com
A cursor serializes to a Base64-encoded string, so you can store it anywhere you want to store strings: Memcache, Datastore, etc. You can even pass it as a URL parameter to task queues.

--
Ikai Lan

Stephen

Feb 8, 2010, 7:59:03 PM
to Google App Engine

I'm asking if it's wise to store it as a query parameter embedded in a
web page.


On Feb 9, 12:26 am, "Ikai L (Google)" <ika...@google.com> wrote:
> A cursor serializes to a Base64 encoded String, so you can store it anywhere
> you want to store strings: Memcached, Datastore, etc. You can even pass it
> as an URL parameter to task queues.

Nick Johnson (Google)

Feb 9, 2010, 6:14:16 AM
to google-a...@googlegroups.com
2010/2/9 Stephen <sde...@gmail.com>

> I'm asking if it's wise to store it as a query parameter embedded in a
> web page.

You're right that it's unwise. Depending on how you construct your query, a user could potentially modify the cursor they send to you to return results from any query your datastore is capable of performing, which could result in you revealing information to the user that they shouldn't know. You should either store the cursor on the server-side, or encrypt it before sending it to the client.

I was going to mention something about this in my post, but it slipped my mind.

-Nick Johnson 
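
One way to act on this advice, sketched with the Python standard library only. Note that it signs the cursor rather than encrypting it: a tampered value is rejected, but the contents stay readable, so this addresses the modification risk and not the information leak. `SECRET_KEY`, `sign_cursor` and `verify_cursor` are hypothetical names, and the cursor is treated as an opaque string.

```python
import hashlib
import hmac

# Assumption: a long random secret that never leaves the server.
SECRET_KEY = b'replace-with-a-long-random-server-side-secret'


def _mac(cursor):
    """Hex HMAC-SHA256 of the cursor string under the server secret."""
    return hmac.new(SECRET_KEY, cursor.encode('utf-8'), hashlib.sha256).hexdigest()


def sign_cursor(cursor):
    """Return 'cursor.signature', suitable for embedding in a link."""
    return '%s.%s' % (cursor, _mac(cursor))


def verify_cursor(token):
    """Return the original cursor, or None if the token was altered."""
    cursor, sep, mac = token.rpartition('.')
    if not sep:
        return None               # no signature part at all
    expected = _mac(cursor)
    if hmac.compare_digest(mac.encode('utf-8'), expected.encode('utf-8')):
        return cursor
    return None
```

Only pass the value to with_cursor() when `verify_cursor` returns something other than None. If the cursor contents themselves must stay hidden, keep the cursor on the server (Memcache or Datastore) and hand the client a random lookup key instead.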

Nickolas Daskalou

Feb 9, 2010, 8:53:20 AM
to google-a...@googlegroups.com
Will we be able to construct our own cursors much the same way that we are able to construct our own Datastore keys (Key.from_path())?

Also along the same lines, will we be able to "deconstruct" a cursor to get its components (offset, start_inclusive etc.), as we can now do with keys (key.name(), key.id(), key.kind() etc.)?



Nick Johnson (Google)

Feb 9, 2010, 9:03:50 AM
to google-a...@googlegroups.com
Hi Nickolas,

2010/2/9 Nickolas Daskalou <ni...@daskalou.com>

> Will we be able to construct our own cursors much the same way that we are able to construct our own Datastore keys (Key.from_path())?

No, not practically speaking.

> Also along the same lines, will we be able to "deconstruct" a cursor to get its components (offset, start_inclusive etc.), as we can now do with keys (key.name(), key.id(), key.kind() etc.)?

While you could do this, there are no guarantees that it'll work (or continue to work), as you'd be digging into internal implementation details. Why do you want to do this?

-Nick Johnson

Nickolas Daskalou

Feb 9, 2010, 9:45:45 AM
to google-a...@googlegroups.com
I'd want to do this so that I could include parts of the cursor (such as the offset) into a URL without including other parts (eg. the model kind and filters). I could then reconstruct the cursor on the server side based on what was passed into the URL.

For example, if I was searching for blog comments that contained the word "the" (with the default order being the creation time, descending), the URL might look like this:

myblog.com/comments/?q=the

With model:

class Comment(db.Model):
  ....
  created_at = db.DateTimeProperty(auto_now_add=True)
  words = db.StringListProperty() # A list of all the words in a comment (forget about exploding indexes for now)
  ...

The query object for this URL might look something like:

....
q = Comment.all().filter('words',self.request.get('q')).order('-created_at')
....

To get to the 1001st comment, it'd be good if the URL looked something like this:

myblog.com/comments/?q=the&skip=1000

instead of:

myblog.com/comments/?q=the&cursor=[something ugly]

so that when the request comes in, I can do this:

....
q = Comment.all().filter('words',self.request.get('q')).order('-created_at')
cursor_template = q.cursor_template()
cursor = db.Cursor.from_template(cursor_template,offset=int(self.request.get('skip')))
....
(or something along these lines)

Does that make sense?

Nick Johnson (Google)

Feb 9, 2010, 9:51:58 AM
to google-a...@googlegroups.com
Hi Nickolas,

2010/2/9 Nickolas Daskalou <ni...@daskalou.com>

> I'd want to do this so that I could include parts of the cursor (such as the offset) into a URL without including other parts (eg. the model kind and filters). I could then reconstruct the cursor on the server side based on what was passed into the URL.

The offset argument you're talking about is specific to the dev_appserver's implementation of cursors. In production, offsets are not used, so this won't work.

-Nick Johnson

Nickolas Daskalou

Feb 9, 2010, 10:07:14 AM
to google-a...@googlegroups.com
Does the production cursor string contain information about the app id, kind, any filter()s or order()s, and (more importantly) some sort of numerical value that indicates how many records the next query should "skip"? If so, and if we could extract this information (and then use it again to reconstruct the cursor), that would make for much cleaner, safer and more intuitive URLs than including the entire cursor string (or some sort of encrypted/encoded cursor string replacement).

Alkis Evlogimenos ('Αλκης Ευλογημένος)

Feb 9, 2010, 10:25:53 AM
to google-a...@googlegroups.com
If the cursor had to skip entries by using an offset, its performance would depend on the size of the offset. This is what the current Query.fetch() API does when you give it an offset. A cursor is a pointer to the entry from which the next query will start. It has no notion of offset.

Alkis

Jeff Schnitzer

Feb 9, 2010, 12:02:19 PM
to google-a...@googlegroups.com
Still, a slightly modified version of the original request does not
seem unreasonable. He would have to formulate his URLs something like
this:

myblog.com/comments/?q=the&first=1234

or maybe:

myblog.com/comments/?q=the&after=1234

I could see this being really useful, since encrypting (or worse,
storing on the server) the cursor is pretty painful. Furthermore, it
seems highly probable that as things are, many people will obliviously
write public webapps that take a raw cursor as a parameter. This
could be the new SQL injection attack.

Jeff


Jeff Schnitzer

Feb 9, 2010, 12:04:45 PM
to google-a...@googlegroups.com
In case it wasn't completely clear - 1234 in this example is the
object's id, not an offset.

Jeff

Nickolas Daskalou

Feb 9, 2010, 10:31:06 PM
to google-a...@googlegroups.com
What Jeff suggests would also be great. So on page 1 we could do something like:

....
next_page_url = 'http://myblog.com/comments/?q=the&first=%d' % cursor.start_key().id()

....

and then reconstruct the cursor on page 2:

....
cursor = db.Cursor(start_key=db.Key('Comment',int(self.request.get('first'))), ....)

....

How about it Google? Will we be able to do this?



Piotr Jaroszyński

Feb 10, 2010, 12:00:34 AM
to google-a...@googlegroups.com
Hello,

Not sure whether it was present in 1.3.0 but there is an unpleasant
bug in 1.3.1 where blobstore request mangling breaks data encoding
[1].

[1] - http://code.google.com/p/googleappengine/issues/detail?id=2749

--
Best Regards
Piotr Jaroszyński

Nick Johnson (Google)

Feb 10, 2010, 6:10:13 AM
to google-a...@googlegroups.com
Hi Nickolas,

2010/2/10 Nickolas Daskalou <ni...@daskalou.com>

> What Jeff suggests would also be great. So on page 1 we could do something like:
>
> ....
> next_page_url = 'http://myblog.com/comments/?q=the&first=%d' % cursor.start_key().id()
> ....
>
> and then reconstruct the cursor on page 2:
>
> ....
> cursor = db.Cursor(start_key=db.Key('Comment',int(self.request.get('first'))), ....)
> ....
>
> How about it Google? Will we be able to do this?

The cursor format is internal, and it's not really amenable to being parsed like this, since it will vary depending on the type of query you're executing.

-Nick Johnson

Christian Schuhegger

Feb 14, 2010, 12:27:19 PM
to Google App Engine
Hello,

I understand that this is a prerelease, but is there a maven
repository where I can reference this new SDK? I did not find it at:
http://maven-gae-plugin.googlecode.com/svn/repository/com/google/appengine/

Karter

Feb 15, 2010, 12:31:49 PM
to Google App Engine
Hi Folks,
great discussion - I'm still unable to understand if cursors can be
used to implement a clear cut pagination solution (assuming we encrypt
the cursor key) and send it to the html page. As of now, the offset
and the total number of results seems like the only way to go.

By pagination, I mean (next and previous) and that being available at
any point:

à la
http://www.google.com/search?hl=en&q=test+internet+speed&start=10&sa=N


Andy Freeman

Feb 16, 2010, 9:24:47 AM
to Google App Engine
> Furthermore, it
> seems highly probable that as things are, many people will obliviously
> write public webapps that take a raw cursor as a parameter. This
> could be the new SQL injection attack.

Can you comment a bit more on the security issues?

AFAIK, cursors can not be used to write anything. The cursor still
has to match the query with its parameters, so I don't see how they
can synthesize a cursor to see anything that they haven't already seen
(replay) or that they'd see by requesting more and more pages (skip
ahead).

The cursor may, as part of its "is this the right query" content,
reveal something about the query.

Hmm - the latter seems somewhat serious. It isn't data modification,
but it is a data reveal.

What information can someone extract from a production cursor? Does
it contain the parameters (bad) or signatures (okay if someone can't
derive one parameter given the other parameters)?

-andy



Christian Schuhegger

Feb 16, 2010, 10:29:18 AM
to Google App Engine
I just noticed that the Maven repository was updated and all the jars that
I need are there. Many thanks.


Nick Johnson (Google)

Feb 16, 2010, 11:46:00 AM
to google-a...@googlegroups.com
Hi Andy,

2010/2/16 Andy Freeman <ana...@earthlink.net>

> > Furthermore, it
> > seems highly probable that as things are, many people will obliviously
> > write public webapps that take a raw cursor as a parameter.  This
> > could be the new SQL injection attack.
>
> Can you comment a bit more on the security issues?
>
> AFAIK, cursors can not be used to write anything.  The cursor still
> has to match the query with its parameters, so I don't see how they
> can synthesize a cursor to see anything that they haven't already seen
> (replay) or that they'd see by requesting more and more pages (skip
> ahead).

I was mistaken when I stated that they shouldn't be sent to the user in the clear. As you point out, in order to use a cursor, you still have to reconstruct the original query, so a user could not modify a cursor to cause you to display records they should not have access to.

> The cursor may, as part of its "is this the right query" content,
> reveal something about the query.
>
> Hmm - the latter seems somewhat serious.  It isn't data modification,
> but it is a data reveal.

This is true, though I wouldn't personally consider it a serious issue. That is up to you, naturally.

> What information can someone extract from a production cursor?  Does
> it contain the parameters (bad) or signatures (okay if someone can't
> derive one parameter given the other parameters).

It contains the complete key of the next record to be returned, along with some extra information about the query. Feel free to experiment and see for yourself, of course. :)

-Nick Johnson

Waldemar Kornewald

Feb 16, 2010, 6:02:37 PM
to Google App Engine
Hi,
what's the advantage of cursors compared to key-based pagination? The
latter at least allows for paginating backwards from any point. Why
don't cursors just build on the same principle?

Bye,
Waldemar Kornewald


ryan

Feb 16, 2010, 10:17:39 PM
to Google App Engine
On Feb 4, 3:31 pm, Kyle Heu <kyleheus...@gmail.com> wrote:
> Do the Datastore Cursors solve the 1000 query limit?

actually, we removed the 1000 result limit independent of cursors. you
can now fetch or iterate more than 1000 results, no cursor needed.

On Feb 5, 11:04 am, alf <alberto....@gmail.com> wrote:
> and 30 sec time window?

definitely not. :P

ryan

Feb 16, 2010, 10:28:12 PM
to Google App Engine
On Feb 7, 5:52 pm, Takashi Matsuo <matsuo.taka...@gmail.com> wrote:
>
> For the time being, you can use following strategy for a workaround.
> 1) prepare a handler for sending particular mail
> 2) put this handler into the task queue in a transactional manner

exactly! we actually don't even consider it a workaround, per se. it's
the recommended way to attach any API call or chunk of code to a
datastore transaction so that it's guaranteed to happen if the
transaction commits.

it would take a significant amount of effort to attach another
individual API (like mail) to datastore transactions in the datastore
backend. given that, we chose to do just the task queue because it can
run arbitrary code, which means you can use it to make any API call
transactional.

granted, enqueueing a task to run an API call does take a little extra
setup. if that's a concern, though, the deferred library mostly
addresses it:

http://code.google.com/appengine/articles/deferred.html

i think it might not be compatible with transactional tasks quite yet:

http://code.google.com/p/googleappengine/issues/detail?id=2721

but assuming that's true i expect we'll fix it soon.
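
To make the idea concrete, here is a toy Python sketch of a task that only reaches the queue if the surrounding transaction commits. FakeTransaction, place_order and the /tasks/send_receipt URL are made-up names for illustration; this is a plain in-memory model of the behaviour described above, not the SDK's datastore or task queue API.

```python
class FakeTransaction(object):
    """Toy transaction: writes and tasks only take effect on commit."""

    def __init__(self, datastore, queue):
        self._datastore = datastore   # dict standing in for the datastore
        self._queue = queue           # list standing in for the task queue
        self._pending_writes = {}
        self._pending_tasks = []

    def put(self, key, value):
        # Buffer the write until the transaction commits.
        self._pending_writes[key] = value

    def add_task(self, url, params):
        # Buffer the task alongside the writes.
        self._pending_tasks.append((url, params))

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            # Commit: apply the writes and release the tasks together.
            self._datastore.update(self._pending_writes)
            self._queue.extend(self._pending_tasks)
        # Returning False re-raises any exception; nothing was applied.
        return False


def place_order(datastore, queue, order_id, fail=False):
    """Hypothetical handler: save an order and schedule a receipt mail."""
    with FakeTransaction(datastore, queue) as txn:
        txn.put(order_id, {"status": "paid"})
        txn.add_task("/tasks/send_receipt", {"order": order_id})
        if fail:
            raise RuntimeError("simulated failure before commit")
```

If place_order raises before the block exits, neither the order nor the receipt task is visible afterwards, which is the property that makes the mail "transactional" in this scheme.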

ryan

Feb 16, 2010, 10:34:56 PM
to Google App Engine
On Feb 8, 1:33 pm, Stephen <sdea...@gmail.com> wrote:
> What are the performance characteristics of cursors?

good question! we'll address this in the docs, but for now...

> The serialised cursor shows that it stores an offset. Does that mean
> that if the offset is one million, one million rows will have to be
> skipped before the next 10 are returned? This will be faster than
> doing it in your app, but not as quick as the existing bookmark
> techniques which use the primary key index.

that offset field is very rarely used, only e.g. if you provide an
offset on the original query, start it but don't actually fetch any
results, then ask for the cursor.

cursors store direct pointer(s) to the index row(s) where the query
will resume scanning. in that sense, they work the same way as the
existing bookmark techniques, except they're (obviously) much easier
to use, work with any query, and don't require any extra
property(ies).

> Or is the server-side stateful, like a typical SQL implementation of
> cursors? In which case, are there any limits to the number of active
> cursors? Or what if a cursor is resumed some time in the future; will
> it work at all, or work slower?

no, cursors are stateless. all necessary information is included in
the cursor blob itself. among other things, this happily means that
resuming a cursor years later is just as fast as resuming it seconds
later.
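
A rough way to picture the difference between an offset and a cursor, as a plain-Python sketch. INDEX, fetch_with_offset and fetch_with_cursor are hypothetical stand-ins, and using a base64-encoded key as the cursor is an assumption made only for illustration; the real cursor format is internal.

```python
import base64
import bisect

# Hypothetical stand-in for a datastore index: rows kept sorted by key.
INDEX = ["comment-%04d" % i for i in range(1000)]


def fetch_with_offset(offset, limit):
    """Skip `offset` rows one by one, then collect `limit` rows.

    Returns the rows and how many index rows had to be scanned.
    """
    scanned = 0
    results = []
    for row in INDEX:
        scanned += 1
        if scanned <= offset:
            continue
        results.append(row)
        if len(results) == limit:
            break
    return results, scanned


def encode_cursor(last_key):
    # The cursor carries everything needed to resume; no server-side state.
    return base64.urlsafe_b64encode(last_key.encode("utf-8")).decode("ascii")


def fetch_with_cursor(cursor, limit):
    """Jump straight to the index position named by the cursor."""
    start = 0
    if cursor is not None:
        last_key = base64.urlsafe_b64decode(cursor.encode("ascii")).decode("utf-8")
        # Binary search for the first row after the last one returned.
        start = bisect.bisect_right(INDEX, last_key)
    results = INDEX[start:start + limit]
    next_cursor = encode_cursor(results[-1]) if results else cursor
    return results, next_cursor
```

In this model the offset version touches every skipped row, while the cursor version does one lookup no matter how deep into the results it resumes, and the cursor string is just as usable a year later because nothing is stored on the server.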

ryan

Feb 16, 2010, 11:42:41 PM
to Google App Engine
hah. ryan, meet groups. groups, meet ryan. that "Newer" link at the
bottom is really useful. it shows all these extra messages that you
don't see at first, sometimes even ones that have already said what
you're about to say. so useful! :P

ryan

Feb 17, 2010, 12:30:08 AM
to Google App Engine
On Feb 10, 3:10 am, "Nick Johnson (Google)" <nick.john...@google.com>
wrote:

> The cursor format is internal, and it's not really amenable to being parsed
> like this, since it will vary depending on the type of query you're
> executing.

+1. the cursor format is an implementation detail. you definitely can
decode a cursor, muck with it, and re-encode it, and it will work.
that's definitely unsupported, though, so we'd discourage it.

On Feb 16, 8:46 am, "Nick Johnson (Google)" <nick.john...@google.com>
wrote:


> As you point out, in order to use a cursor, you still have to
> reconstruct the original query, so a user could not modify a cursor to cause
> you to display records they should not have access to.

+1. this is important.

> It contains the complete key of the next record to be returned, along with
> some extra information about the query. Feel free to experiment and see for
> yourself, of course. :)

specifically, the "extra information" is your app id and the kinds and
some property values, and possibly also property names, of one or more
entities that were query results. the specific properties involved are
the properties in the query.

On Feb 16, 3:02 pm, Waldemar Kornewald <wkornew...@gmail.com> wrote:
> what's the advantage of cursors compared to key-based pagination? The
> latter at least allows for paginating backwards from any point. Why
> don't cursors just build on the same principle?

the main advantage is that cursors are a built in, turnkey solution.
key-based pagination takes much more work on the developer's part:
extracting all of the property values and key required to resume the
query, packing them up together and serializing them, then later
injecting them back into the query as filter values to resume. what's
more, key-based pagination often requires an extra custom index.

you're right, though, key-based pagination does support paging
backward, even if it generally requires yet another custom index.
cursors don't easily support paging backward right now, but they
definitely could. we can think about adding that in the future.
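
For comparison, the manual work that key-based pagination involves might look roughly like this in plain Python. COMMENTS, the "posted" property and the JSON/base64 bookmark encoding are all assumptions for the sake of the example, and the list filtering stands in for what would be query filters (and possibly extra indexes) on the datastore.

```python
import base64
import json

# Hypothetical comments, each with a sort property and a unique key.
COMMENTS = [
    {"key": "c1", "posted": 100},
    {"key": "c2", "posted": 100},
    {"key": "c3", "posted": 105},
    {"key": "c4", "posted": 110},
    {"key": "c5", "posted": 110},
]


def sort_key(entity):
    # Ties on `posted` are broken by key, so the ordering is total.
    return (entity["posted"], entity["key"])


def make_bookmark(entity):
    """Pack the property value and key needed to resume the query."""
    raw = json.dumps([entity["posted"], entity["key"]])
    return base64.urlsafe_b64encode(raw.encode("utf-8")).decode("ascii")


def read_bookmark(bookmark):
    raw = base64.urlsafe_b64decode(bookmark.encode("ascii")).decode("utf-8")
    posted, key = json.loads(raw)
    return (posted, key)


def next_page(bookmark, limit):
    """Forward page: entities strictly after the bookmark."""
    rows = sorted(COMMENTS, key=sort_key)
    if bookmark is not None:
        after = read_bookmark(bookmark)
        rows = [e for e in rows if sort_key(e) > after]
    page = rows[:limit]
    return page, (make_bookmark(page[-1]) if page else None)


def previous_page(bookmark, limit):
    """Backward page: the `limit` entities just before the bookmark.

    This scans in reverse order, which is the part that would need
    the additional index mentioned above.
    """
    before = read_bookmark(bookmark)
    rows = sorted(COMMENTS, key=sort_key, reverse=True)
    rows = [e for e in rows if sort_key(e) < before]
    page = rows[:limit]
    page.reverse()
    return page
```

Passing the bookmark of the first entity on the current page to previous_page gives the page before it, which is the backward paging that cursors do not offer yet.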

Takashi Matsuo

Feb 17, 2010, 11:23:02 AM
to google-a...@googlegroups.com
On Wed, Feb 17, 2010 at 12:28 PM, ryan <ryanb+a...@google.com> wrote:
> On Feb 7, 5:52 pm, Takashi Matsuo <matsuo.taka...@gmail.com> wrote:
>>
>> For the time being, you can use following strategy for a workaround.
>> 1) prepare a handler for sending particular mail
>> 2) put this handler into the task queue in a transactional manner
>
> exactly! we actually don't even consider it a workaround, per se. it's
> the recommended way to attach any API call or chunk of code to a
> datastore transaction so that it's guaranteed to happen if the
> transaction commits.
>
> it would take a significant amount of effort to attach another
> individual API (like mail) to datastore transactions in the datastore
> backend. given that, we chose to do just the task queue because it can
> run arbitrary code, which means you can use it to make any API call
> transactional.

It's nice to know the opinion of the App Engine team!
Agreed that using the transactional task queue for other APIs is
reasonable and reliable.
Having said that, the strategy is still not perfect.

In my opinion, the only problem is that the task for sending mail is *not*
idempotent. In other words, what if the task is executed more than
once? Presumably recipients will receive more than one mail. It's
somewhat annoying, isn't it?

As far as I know, the background process in charge of managing and
triggering tasks *can* mistakenly conclude that a task execution failed
when it actually succeeded, right?

If so, there is still a possibility of receiving redundant mails.
Things are definitely better than before, but still not perfect.

I know it's a very difficult problem, but could you please keep this
room for improvement in mind?
Anyway, thanks for the great new release. The developers around me and I
appreciate your efforts very much!

> granted, enqueueing a task to run an API call does take a little extra
> setup. if that's a concern, though, the deferred library mostly
> addresses it:
>
> http://code.google.com/appengine/articles/deferred.html
>
> i think it might not be compatible with transactional tasks quite yet:
>
> http://code.google.com/p/googleappengine/issues/detail?id=2721
>
> but assuming that's true i expect we'll fix it soon.

It's very good to know that you're aware of the issue I posted :)
(because it was left in the NEW state, I was afraid it was being ignored)

Regards,

--
Takashi Matsuo
Kay's daddy

ryan

Feb 17, 2010, 12:52:13 PM
to Google App Engine
On Feb 17, 8:23 am, Takashi Matsuo <matsuo.taka...@gmail.com> wrote:
> In my opinion, the only problem is the task for sending mail is *not*
> idempotent. In other words, what if the task is executed more than
> once? Presumably recipients will receive more than one mail. Its
> somewhat annoying, isn't it?

true! good point. idempotence is an important, general concern, across
most systems. on app engine, if it's important that you only do
something once, doing it in a request handler directly isn't really
any better than doing it in a task, since both could die at any point
and/or run multiple times.

idempotence is expensive and difficult to get right, and the specifics
often change significantly from project to project, so we generally
leave it to developers themselves or client-side libraries. if you
need it, i actually think i've seen open source libraries for app
engine that do much of the heavy lifting for you.
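
One common way to approach the duplicate-mail problem is to record a marker per task and skip the send if the marker already exists. A minimal sketch, where SENT_MARKERS, OUTBOX and send_mail_once are hypothetical names and the in-memory set stands in for a durable store such as the datastore:

```python
SENT_MARKERS = set()   # stand-in for a durable store such as the datastore
OUTBOX = []            # stand-in for the mail service


def send_mail_once(task_id, recipient, body):
    """Send the mail unless a task with this id has already sent it.

    Returns True if the mail was sent, False if it was skipped.
    """
    if task_id in SENT_MARKERS:
        # A retry of a task that already ran: do nothing.
        return False
    OUTBOX.append((recipient, body))
    SENT_MARKERS.add(task_id)
    return True
```

Note that this only narrows the window: if the task dies after sending but before the marker is written, a retry will still send a second mail. That gap is the part that is hard to get right in general.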

Karthik Ram

Feb 17, 2010, 4:01:24 PM
to google-a...@googlegroups.com
They are faster and use fewer system resources, but I do not see an effective way to paginate using the current implementation of cursors, which are "forward only", unless YOU store the cursor. Given that cursors are "opaque", there is no guarantee that this will keep working over time.
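
Storing the cursors yourself could look something like the sketch below: keep the cursor that starts each page already visited, so "previous" just replays an earlier cursor. ROWS, fetch_page and CursorTrail are made-up names, and the toy cursor (the last row returned) is only a placeholder for whatever opaque string the datastore hands back.

```python
ROWS = ["row-%02d" % i for i in range(25)]


def fetch_page(cursor, limit=10):
    """Toy forward-only fetch; callers treat the cursor as opaque."""
    start = 0 if cursor is None else ROWS.index(cursor) + 1
    page = ROWS[start:start + limit]
    next_cursor = page[-1] if page else cursor
    return page, next_cursor


class CursorTrail(object):
    """Remembers the cursor that starts each page visited so far."""

    def __init__(self):
        # _cursors[n] starts page n; the first page needs no cursor.
        self._cursors = [None]

    def page(self, page_number):
        """Fetch a page reachable from the recorded cursors.

        Pages must first be visited in order; going back is then free.
        """
        results, next_cursor = fetch_page(self._cursors[page_number])
        if page_number + 1 == len(self._cursors):
            # First visit to this page: remember where the next one starts.
            self._cursors.append(next_cursor)
        return results
```

The trail would have to live somewhere per user (session, memcache, or the page itself), and it inherits the caveat above: a stored cursor is only as durable as the cursor format.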


James Ashley

Feb 17, 2010, 9:08:55 PM
to Google App Engine
I don't have time to test this just now, and I hate to waste bandwidth
on what (so far) is just idle speculation, but I did spot what looks
like a potential security issue. I figured it'd be better to share it with
the group than sit on it until I actually *do* have time to confirm/
deny.

On Feb 16, 11:30 pm, ryan <ryanb+appeng...@google.com> wrote:
>
> On Feb 16, 8:46 am, "Nick Johnson (Google)" <nick.john...@google.com>
> wrote:
>
> > As you point out, in order to use a cursor, you still have to
> > reconstruct the original query, so a user could not modify a cursor to cause
> > you to display records they should not have access to.
>
> +1. this is important.

Very much agreed. This seems almost like it might be a reason to not
pass cursors around "in the clear."

> > It contains the complete key of the next record to be returned, along with
> > some extra information about the query. Feel free to experiment and see for
> > yourself, of course. :)
>
> specifically, the "extra information" is your app id and the kinds and
> some property values, and possibly also property names, of one or more
> entities that were query results. the specific properties involved are
> the properties in the query.

This is where I had my "Wait, that sounds wrong" moment. You didn't
mention anything about the currently logged in user.

Off the top of my head, I can think of 2 mutually exclusive
scenarios. They both assume a page requires the user to be logged
in, the original query used the login ID as part of the WHERE clause,
and there's a cursor to page through results.

1) The user has a bunch of personal...whatever. Bookmarks that he
doesn't want to share with his wife. The original query is tied to
his google account. He stashes a browser bookmark halfway through the
list and logs out of the site. Later, his wife uses the same computer
and logs in and checks out the new bookmarks. This one requires her
to log into her google account. From what I'm reading, it sounds like
she'll see his data.
2) The user has a query that she wants to share. So sends some sort
of private site-specific message to one of her friends. In this
case, we want the cursor to maintain its original set of results,
ignoring the currently logged in user. And we want the friend to be
able to access that query from a completely different part of the
site.

I'm guessing it doesn't check to see if the current user "owns" that
query. In many cases, it would be a waste of time to check. And, if
we're using something besides the google user API, it would be a
meaningless check.

Another possibility that springs to mind:
Mr. Cracker With Too Much Time On His Hands has a cursor with data he
shouldn't see that belongs to, say, admin portions of the site. He
browses the rest of your site, looking for anything else that uses
cursors. He replaces those with this one.

This one's probably just paranoia on my part. You did mention that
cursors store Kind information. So (depending on how the API is
implemented), if he happened to plug it into a page with models that
had the same properties, it seems like a risk that duck-typing could
work in his favor. I know I've written a few testing pages that are
designed to dump the details of whichever models I happen to give them
query descriptions for. They were always admin-only, and I haven't
had one make it into production yet, but...

I guess how dangerous this is depends totally on undocumented
implementation details, and the individual app.

But maybe the (hypothetical) user login issues are worth keeping in
mind. I suppose they might suggest other gotchas to list members.

Regards,
James

Wooble

Feb 17, 2010, 10:24:37 PM
to Google App Engine

On Feb 17, 9:08 pm, James Ashley <james.ash...@gmail.com> wrote:
> 1) The user has a bunch of personal...whatever.  Bookmarks that he
> doesn't want to share with his wife.  The original query is tied to
> his google account.  He stashes a browser bookmark halfway through the
> list and logs out of the site.  Later, his wife uses the same computer
> and logs in and checks out the new bookmarks.  This one requires her
> to log into her google account.  From what I'm reading, it sounds like
> she'll see his data.

Only if your app is dumb enough not to check who the current logged in
user is before constructing the query for his data. If your
application will run a query for a given user's private data based
entirely on the URL with no authentication, it doesn't matter if
you're using cursors or not; your application is inherently insecure.

Nick Johnson (Google)

Feb 18, 2010, 12:47:31 PM
to google-a...@googlegroups.com
Hi James,

Good questions. Answers inline.

In this situation, the original query, and the one resumed from the cursor, should contain a filter on the ID of the currently logged in user. When the wife logs in, that filter is different, so the cursor won't point to a valid location in her result set - so it'll return all her own results, or none of them, but not her husband's results.
 
> 2) The user has a query that she wants to share.  So sends some sort
> of private site-specific message to one of her friends.    In this
> case, we want the cursor to maintain its original set of results,
> ignoring the currently logged in user.  And we want the friend to be
> able to access that query from a completely different part of the
> site.

In this case, the rest of the query information (eg, the user sharing the data) needs to be included in the query string, so the app can again construct the same query before calling .with_cursor() on it.

> I'm guessing it doesn't check to see if the current user "owns" that
> query.  In many cases, it would be a waste of time to check.  And, if
> we're using something besides the google user API, it would be a
> meaningless check.
>
> Another possibility that springs to mind:
> Mr. Cracker With Too Much Time On His Hands has a cursor with data he
> shouldn't see that belongs to, say, admin portions of the site.  He
> browses the rest of your site, looking for anything else that uses
> cursors.  He replaces those with this one.

As above, for a cursor to work, the query he takes advantage of must already be capable of selecting the records that he wants to see - otherwise either nothing or the regular results will be returned.

Think of a cursor as being the key of the first result to pick up a query from - it's meaningless unless it's applied to the same query it was originally generated from. If you try and apply it to a query that it wasn't generated from, it'll either give an error outright, or affect the results in an unpredictable manner - but it'll never cause the query to return results it couldn't have on its own.
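
A small plain-Python model of this, for anyone who wants to poke at it. ENTITIES and run_query are hypothetical, and treating the cursor as an (owner, key) index position is an assumption for illustration; the real format is internal and varies by query. The point it shows is that the filter is applied from the logged-in user first, and the cursor only decides where inside that filtered result set to resume.

```python
# Hypothetical bookmarks belonging to two different users.
ENTITIES = [
    {"owner": "husband", "key": 1},
    {"owner": "husband", "key": 2},
    {"owner": "husband", "key": 3},
    {"owner": "wife", "key": 4},
    {"owner": "wife", "key": 5},
]


def run_query(owner, cursor=None, limit=10):
    """Filter on the current user, then resume after the cursor position.

    `owner` must come from the authenticated session, never from the URL.
    """
    # Index rows for this query: (owner, key), restricted to one owner.
    rows = sorted((e["owner"], e["key"]) for e in ENTITIES if e["owner"] == owner)
    if cursor is not None:
        # The cursor can only trim the filtered rows; it cannot add any.
        rows = [r for r in rows if r > tuple(cursor)]
    page = rows[:limit]
    next_cursor = page[-1] if page else cursor
    return page, next_cursor
```

Feeding the husband's cursor into the wife's query returns all of her rows in this model, and feeding hers into his returns none, but neither ever returns the other person's rows.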

-Nick Johnson


