1.5.2 SDK Prerelease


Johan Euphrosine

Jul 12, 2011, 4:59:45 AM
to google-a...@googlegroups.com, google-appe...@googlegroups.com, google-app...@googlegroups.com
Hi everyone,

Prerelease SDK for 1.5.2 is ready for testing, feel free to download
and give it a try for local verification:
http://code.google.com/p/googleappengine/downloads/list

See release notes below.

Happy testing!

App Engine Python SDK - Release Notes

Version 1.5.2
=============================
- You can now specify the minimum pending latency for instances and the maximum
number of idle instances for your application in the Admin Console.
- The datastore now never requires an exploding index.
http://code.google.com/appengine/docs/python/datastore/queries.html#Big_Entities_and_Exploding_Indexes
- The SDK will now never suggest indexes with the same property repeated, as
such indexes are likely to be exploding indexes.
- The SDK now supports multiple concurrent transactions.
- Datastore stats are now available on a per-namespace basis.
- The queue details page in the Admin Console now contains request header
details, previous run information, and a task payload viewer.
- You can modify the lease on a task leased from a pull queue using the
modify_task_lease() method.
- Pull Task maximum size has been increased to 1MB.
- You can now update the number of available backend instances without needing
to first stop the backend using the "backend configure" appcfg.py directive.
- You can now set the "References" and "In-Reply-To" headers with the Mail API.
http://code.google.com/p/googleappengine/issues/detail?id=2802
- The SDK "application" environment variable will now be prefixed with dev~.
The new preferred way of retrieving your app id is to use
app_identity.get_application_id(). The --default_partition flag can be used
for applications whose code relied on a specific environment variable.
- In the Deferred API, defer() now accepts the _target parameter.
- Added a to_dict() function to db.py which converts a model to a dictionary.
- Added a get_original_metadata() method to the Images API to extract EXIF
information from images.
http://code.google.com/p/googleappengine/issues/detail?id=4133
- Added an @transactional decorator to db.py for functions that should
always be run in a transaction.
- Fixed an issue in the SDK where the Deferred API did not work when using
the --backends flag.
http://code.google.com/p/googleappengine/issues/detail?id=5072

App Engine Java SDK - Release Notes

Version 1.5.2
=============
- You can now specify the minimum pending latency for instances and the maximum
number of idle instances for your application in the Admin Console.
- The datastore now never requires an exploding index.
- The SDK will now never suggest indexes with the same property repeated, as
such indexes are likely to be exploding indexes.
- Datastore stats are now available on a per-namespace basis.
- The queue details page in the Admin Console now contains request header
details, previous run information, and a task payload viewer.
- You can modify the lease on a task leased from a pull queue using the
modifyTaskLease() method.
- Pull Task maximum size has been increased to 1MB.
- You can now update the number of available backend instances without needing
to first stop the backend using the "backend configure" appcfg directive.
- You can now set the "References" and "In-Reply-To" headers with the Mail API.
http://code.google.com/p/googleappengine/issues/detail?id=2802
- The whitelist has been updated to include support for JSR 105.
- When the SDK throws a DatastoreNeedIndexException for a missing index
definition, the exception can now be caught.
--
Johan Euphrosine (proppy)
Developer Programs Engineer
Google Developer Relations

Max

Jul 12, 2011, 5:21:38 AM
to google-a...@googlegroups.com, google-appe...@googlegroups.com, google-app...@googlegroups.com
Great job!

May I know more about "The datastore now never requires an exploding index"?

Does that mean we don't need to build exploding indexes, or that we simply can't build them?

Pascal Voitot Dev

Jul 12, 2011, 5:24:09 AM
to google-a...@googlegroups.com, google-appe...@googlegroups.com, google-app...@googlegroups.com
exactly the same question ;)

--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To view this discussion on the web visit https://groups.google.com/d/msg/google-appengine/-/KyL9f70-VtkJ.

To post to this group, send email to google-a...@googlegroups.com.
To unsubscribe from this group, send email to google-appengi...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.

Robert Kluin

Jul 12, 2011, 2:08:25 PM
to google-a...@googlegroups.com
Looking forward to "The queue details page in the Admin Console now
contains request header details, previous run information, and a task
payload viewer." That might be a big help with debugging. :)


kamens

Jul 12, 2011, 2:45:59 PM
to Google App Engine
This sounds like a great release -- thanks proppy!

On Jul 12, 1:59 am, Johan Euphrosine <pro...@google.com> wrote:
> Hi everyone,
>
> Prerelease SDK for 1.5.2 is ready for testing, feel free to download
> and give it a try for local verification: http://code.google.com/p/googleappengine/downloads/list
>
> See release notes below.
>
> Happy testing!
>
> App Engine Python SDK - Release Notes
>
> Version 1.5.2
> =============================
> - You can now specify the minimum pending latency for instances and the maximum
>   number of idle instances for your application in the Admin Console.
> - The datastore now never requires an exploding index.
>    http://code.google.com/appengine/docs/python/datastore/queries.html#B...

sebastián serrano

Jul 12, 2011, 2:54:15 PM
to google-a...@googlegroups.com
Hi, 
  Thanks for the prerelease! It is nice to start playing with the new features for the new pricing model.
  Could you clarify the exploding indexes change? It is not clear to me either.

Cheers, Sebastian

pdknsk

Jul 12, 2011, 6:19:16 PM
to Google App Engine
> The SDK now supports multiple concurrent transactions.

What does this mean exactly?

Alfred Fuller

Jul 12, 2011, 7:34:05 PM
to google-app...@googlegroups.com, google-a...@googlegroups.com, google-appe...@googlegroups.com
Hi,

It means that there are alternatives to using exploding indexes (i.e. they are no longer required to execute a given query). You can still have them (there are cases where they are useful, namely to optimize query speed over write cost) and the SDK will still suggest them in many cases (as it is hard to detect them by looking at the query). However, the SDK will no longer suggest indexes with the same property repeated multiple times (as these are obviously exploding indexes).

One example where this is very important is SearchableModel, which previously was crippled by exploding indexes if you tried to sort results (and now works as expected).

We are working on an article that goes through how to decide when to remove or keep them, but for now the Google I/O talk from 2010, Next Gen Queries (the "Zigzag Merge Join += Sort" part), is a good resource if you want a really deep dive on how this works.

 - Alfred



Alfred Fuller

Jul 12, 2011, 7:47:50 PM
to google-a...@googlegroups.com
The Datastore itself has always supported multiple concurrent transactions. However, the dev_appserver in the Python SDK previously used a global lock and would deadlock if a single thread tried to start more than a single transaction. Now you can have multiple concurrent transactions running at the same time. This is good for the Go runtime (which uses the Python dev_appserver), the Datastore Plus library, and anyone else who wants to use concurrent transactions on the dev_appserver.
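
To illustrate the failure mode described above, here is a minimal sketch in plain Python 3. It is not the dev_appserver code: GlobalLockStub and PerTransactionStub are hypothetical names made up for this example, and the only point is to show why one non-reentrant global lock stalls a thread that opens a second transaction, while per-transaction bookkeeping does not.

```python
import threading


class GlobalLockStub(object):
    """Hypothetical stub: one non-reentrant lock shared by all transactions."""

    def __init__(self):
        self._lock = threading.Lock()

    def begin(self, timeout=0.1):
        # Returns False instead of hanging forever so the sketch terminates;
        # a plain blocking acquire here would be the deadlock described above.
        return self._lock.acquire(timeout=timeout)

    def commit(self):
        self._lock.release()


class PerTransactionStub(object):
    """Hypothetical stub: each transaction gets its own handle, no global lock."""

    def __init__(self):
        self._next_id = 0
        self._open = set()
        self._guard = threading.Lock()  # held only while updating bookkeeping

    def begin(self):
        with self._guard:
            self._next_id += 1
            self._open.add(self._next_id)
            return self._next_id

    def commit(self, txn_id):
        with self._guard:
            self._open.remove(txn_id)


old = GlobalLockStub()
first_ok = old.begin()
second_ok = old.begin()  # same thread, first transaction still open

new = PerTransactionStub()
txn_a = new.begin()
txn_b = new.begin()  # both transactions are open at the same time
```

In the first stub the second begin() cannot proceed until the first transaction commits, which a single thread will never do while it is waiting; in the second stub both handles are live at once.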

On Tue, Jul 12, 2011 at 3:19 PM, pdknsk <pdk...@googlemail.com> wrote:
> The SDK now supports multiple concurrent transactions.

> What does this mean exactly?

Max

Jul 13, 2011, 2:16:11 AM
to google-a...@googlegroups.com
Thanks Alfred, 

For the Datastore Plus project, is there (or will there be) a Java version?

PK

Jul 13, 2011, 4:06:17 AM
to google-a...@googlegroups.com, google-appe...@googlegroups.com, google-app...@googlegroups.com
A lot of great, long-expected features. Thanks!!

I just started to experiment and the dev~ prefix is breaking scripts and processes I have been using. What motivated this change?

Thanks
PK

Ice13ill

Jul 13, 2011, 2:31:54 PM
to Google App Engine
It seems the explanation of "The datastore now never requires an
exploding index" is incomplete. Can you give an example that
illustrates the difference (before/now)?

Matthew Blain

Jul 13, 2011, 4:28:57 PM
to Google App Engine
Hi PK,
The dev~ prefix is there to more closely replicate what happens for
many apps running on the App Engine servers: they have an s~ prefix in
the APPLICATION_ID environment variable (specifically, the ones running
on the High Replication Datastore). This will help developers identify
where they're relying on it.
You should use the new (and possibly not yet included in our docs--if
not, it will be soon) app_identity API.

from google.appengine.api import app_identity
appid = app_identity.get_application_id()

Is this what is causing you issues?

As a workaround (or if you want to simulate the behavior of the Master/
Slave datastore), you can use --default_partition="" on the command
line, but you should move your application to use the API method.

Also note that if you have existing information in your dev appserver
datastore, it will "appear" to have flushed when the full appid
changes (from appid to dev~appid); you can use the flag to keep it as
before, or use the bulkloader to dump/restore across app ids.
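
For readers wondering what the partition prefix looks like in practice, here is a small illustrative sketch. split_app_id is a hypothetical helper written for this example and is not part of the SDK; it assumes the full ID has the form "partition~name" as in the dev~ and s~ examples above. Application code should still call app_identity.get_application_id() rather than parse the environment variable.

```python
def split_app_id(full_app_id):
    """Split 'partition~name' into (partition, name).

    Hypothetical helper for illustration only; partition is '' when the
    ID carries no prefix (as with --default_partition="").
    """
    partition, sep, name = full_app_id.rpartition('~')
    if not sep:
        return '', full_app_id
    return partition, name


dev_parts = split_app_id('dev~myapp')    # dev_appserver in 1.5.2
hrd_parts = split_app_id('s~myapp')      # High Replication Datastore app
plain_parts = split_app_id('myapp')      # no partition prefix
```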

--Matthew

Guido van Rossum

Jul 13, 2011, 4:34:55 PM
to Google App Engine
On Jul 12, 11:16 pm, Max <thebb...@gmail.com> wrote:
> Thanks Alfred,
>
> For Datastore Plus project, is there (or will there be) a Java version?

No, NDB is Python specific in many ways. Java already has a decent
async datastore API based on Futures:

http://code.google.com/appengine/docs/java/datastore/async.html

--Guido van Rossum (not Alfred, but Datastore Plus' author :-)

PK

Jul 13, 2011, 4:51:49 PM
to google-a...@googlegroups.com
Matthew,

thanks for the reply.

In our tests we use the remote API extensively. The same datastore is updated from either the server or remote API code. The remote API scripts need to be given the appid, and we have an environment variable defining it. Of course, this is now broken. I am retreating to a mechanism that says "if you are operating against the dev datastore then the APPID is "dev~" + real_app_id, else it is real_app_id". I am hopeful this will fix the issues.

Between working around this issue and the further "regression" of 3643 (see my comment there), I have not yet been able to pass our automated tests with 1.5.2_prerelease.

Anyway, I am optimistic, and I look forward to simplifying some of my indexes next. I have been waiting more than a year for that :-)

Thanks,
PK

PK

Jul 13, 2011, 5:27:08 PM
to google-a...@googlegroups.com
There seems to be a regression there though, could somebody confirm whether this issue I just filed is a real issue or just a misconfiguration in my environment?


pdknsk

Jul 13, 2011, 5:51:53 PM
to Google App Engine
Does transaction refer to database transactions? Or does "multiple
concurrent transactions" mean asynchronous operations in general? This
is not clear to me. Does it support asynchronous urlfetch now?

And, in a slightly related question, are there plans to raise the
limit of simultaneous asynchronous urlfetch calls on production? It's
currently limited to 10.

Alfred Fuller

Jul 13, 2011, 6:00:31 PM
to google-a...@googlegroups.com
Datastore transactions



Matthew Blain

Jul 13, 2011, 6:44:29 PM
to Google App Engine
Hi PK,
You shouldn't do any sort of special checks. Instead, use the API. In
the case of remote_api, don't specify the appid at all. Instead,
specify the endpoint (e.g. a.appspot.com/_ah/remote_api or
localhost:8080/_ah/remote_api). The tool will look up the app id.

--Matthew

Noah McIlraith

Jul 14, 2011, 4:41:50 AM
to google-appe...@googlegroups.com, google-a...@googlegroups.com, google-app...@googlegroups.com
Still no fulltext search?

Alfred Fuller

Jul 14, 2011, 1:42:04 PM
to google-app...@googlegroups.com, google-a...@googlegroups.com, google-appe...@googlegroups.com
I should point out that the SDK is currently very insistent about its "suggestions" and believes that everything else is wrong (it will throw a NeedIndexError), even though it may not be. We are working on making the SDK smarter in this regard, but until then you will have to test in production (you can test in production now). Here is a short blurb about how you can remove exploding indexes (official docs coming soon):

Consider:

SELECT * FROM Model WHERE list1 = :1 AND list2 = :2 AND ANCESTOR IS :3 AND int = :4 ORDER BY date DESC

Which 'needs' this index (from the perspective of the SDK):
- kind: Model
  ancestor: yes
  properties:
  - name: list1
  - name: list2
  - name: int
  - name: date
    direction: desc

Here is an easy way to figure out which non-exploding indexes can be used instead:
  1. Group all properties with equality/ancestor filters
    1. in this case [ancestor, list1, list2, int]
  2. Split this grouping such that no multi-valued properties are in the same group
    1. one example is: [ancestor, list1], [list2, int], but the most efficient split actually repeats the single-valued properties: [ancestor, list1, int], [ancestor, list2, int]
  3. Create the following indexes
    For each group:
        add index([grouped values] + [query orders/inequality])

    in this case: 
        index(ancestor, list1, int, -date)
        index(ancestor, list2, int, -date)

    or

- kind: Model
  ancestor: yes
  properties:
  - name: list1
  - name: int
  - name: date
    direction: desc
- kind: Model
  ancestor: yes
  properties:
  - name: list2
  - name: int
  - name: date
    direction: desc
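
The three steps above can be sketched in a few lines of Python. This is only an illustration of the "most efficient split" described here (one multi-valued property per index, single-valued properties repeated in each), not code from the SDK; split_exploding_index and its parameters are hypothetical names, and the caller is assumed to know which properties are list properties.

```python
def split_exploding_index(equality_props, multi_valued, orders, ancestor=False):
    """Return one index definition per multi-valued equality property.

    equality_props: property names used in equality filters, in query order.
    multi_valued:   the subset of equality_props that are list properties.
    orders:         (name, direction) tuples for sort orders / inequalities.
    """
    single_valued = [p for p in equality_props if p not in multi_valued]
    lists = [p for p in equality_props if p in multi_valued]
    # With no list properties there is nothing to split: keep one index.
    groups = [[p] for p in lists] or [[]]
    indexes = []
    for group in groups:
        # Equality properties first, then the sort orders, as in step 3.
        props = [(name, 'asc') for name in group + single_valued]
        indexes.append({'ancestor': ancestor,
                        'properties': props + list(orders)})
    return indexes


indexes = split_exploding_index(
    equality_props=['list1', 'list2', 'int'],
    multi_valued={'list1', 'list2'},
    orders=[('date', 'desc')],
    ancestor=True,
)
```

For the example query this yields two definitions, (ancestor, list1, int, -date) and (ancestor, list2, int, -date), which correspond to the two YAML entries above.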

 - Alfred

PK

Jul 14, 2011, 2:35:14 PM
to google-appe...@googlegroups.com, google-app...@googlegroups.com, google-a...@googlegroups.com
Alfred, thanks for the clarification.

However, isn't ancestor a list too, one that could contribute to an explosion (albeit a minor one, assuming shallow hierarchies)? If this is the case, would these indexes help/work?

- kind: Model
  ancestor: yes
  properties:
  - name: int
  - name: date
    direction: desc
- kind: Model
  properties:
  - name: list1
  - name: int
  - name: date
    direction: desc
- kind: Model
  properties:
  - name: list2
  - name: int
  - name: date
    direction: desc

Thanks

Don Barthel

Jul 14, 2011, 4:07:07 PM
to google-a...@googlegroups.com
I too am extremely interested to see an in-depth explanation of "not requiring an exploding index".

My project was shelved a year ago because of the exploding index issue.

johnwlockwood

Jul 14, 2011, 4:08:15 PM
to google-appe...@googlegroups.com, google-app...@googlegroups.com, google-a...@googlegroups.com
Congratulations on this, Alfred. I will try out a filter on multiple words in the same list.

Alexandru Farcaş

Jul 15, 2011, 3:25:29 AM
to google-a...@googlegroups.com, google-app...@googlegroups.com, google-appe...@googlegroups.com
Hi Alfred,

Thanks for the clarification. I also have 2 questions:

1. For this query:
SELECT * FROM Model WHERE list = :1 AND list = :2 AND list = :3 AND string = :4 ORDER BY date DESC
will this index be enough?

- kind: Model
  properties:
  - name: list
  - name: string
  - name: date
    direction: desc

2. After I create this index (or indexes), will I still receive this exception?

com.google.appengine.api.datastore.DatastoreNeedIndexException >> 
The built-in indices are not efficient enough for this query and your data. Please add a composite index for this query..  An index is missing but we are unable to tell you which one due to a bug in the App Engine SDK.  If your query only contains equality filters you most likely need a composite index on all the properties referenced in those filters.

--Alex


Alfred Fuller

Jul 15, 2011, 11:37:26 AM
to google-app...@googlegroups.com, google-a...@googlegroups.com, google-appe...@googlegroups.com
On Fri, Jul 15, 2011 at 12:25 AM, Alexandru Farcaş <alex....@expert-group.biz> wrote:
> Hi Alfred,
>
> Thanks for the clarification. I also have 2 questions:
>
> 1. For this query:
> SELECT * FROM Model WHERE list = :1 AND list = :2 AND list = :3 AND string = :4 ORDER BY date DESC
> will this index be enough?
>
> - kind: Model
>   properties:
>   - name: list
>   - name: string
>   - name: date
>     direction: desc

Yes. This is the index the SDK will now suggest.

> 2. After I create this index (or indexes), will I still receive this exception?
>
> com.google.appengine.api.datastore.DatastoreNeedIndexException >>
> The built-in indices are not efficient enough for this query and your data. Please add a composite index for this query..  An index is missing but we are unable to tell you which one due to a bug in the App Engine SDK.  If your query only contains equality filters you most likely need a composite index on all the properties referenced in those filters.

It is possible. This means that there are lots of results that match each filter and no results that match all filters (in the first 10k results). If you see this, adding the following exploding index should help a great deal:

- kind: Model
  properties:
  - name: list
  - name: list
  - name: string
  - name: date

We plan on removing this exception in the future, but this won't improve the efficiency of the query (the only thing that will do that is adding indexes like this).

> --Alex




Alfred Fuller

Jul 15, 2011, 1:55:17 PM
to google-appe...@googlegroups.com, google-app...@googlegroups.com, google-a...@googlegroups.com
Yes, you are correct and those indexes will work. It's a trade-off: composite indexes 'pre-intersect' properties (at write time), while zigzag merge join 'post-intersects' properties (at read time). I left the ancestor in because it is probably very 'selective', which has the potential to greatly reduce the amount of data that needs to be intersected at read time (though this is very data dependent).
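
The write-time side of this trade-off can be made concrete with some rough arithmetic. The sketch below uses a simplified counting model: a composite index gets one row per combination of list values, single-valued properties contribute a factor of 1, and the ancestor is ignored. The function names are hypothetical and the list sizes are made-up example numbers.

```python
from functools import reduce
from operator import mul


def composite_entries(list_lengths):
    """Index rows written for one entity by a single composite index."""
    return reduce(mul, list_lengths, 1)


def split_entries(list_lengths):
    """Index rows written when each list property gets its own index."""
    return sum(list_lengths)


# One entity with 10 values in list1 and 20 values in list2.
exploding = composite_entries([10, 20])  # list1 and list2 in one index
separate = split_entries([10, 20])       # one index per list property
```

Under these assumptions the single 'pre-intersected' index costs 200 rows per entity write and the two split indexes cost 30, with the intersection work moved to query time.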


PK

Jul 15, 2011, 2:12:14 PM
to google-a...@googlegroups.com, google-app...@googlegroups.com, google-appe...@googlegroups.com
I filed this enhancement request earlier today. If you also see this need as you experiment with indexes, please star it:

Building indexes for large data sets is expensive so deleting and recreating them from scratch needs to be avoided. With "NextGen queries" features the need arises to experiment with different index strategies for functionality and performance reasons. What I propose is a mechanism---probably through the admin interface---to disable an index for query purposes. The index must still be updated but is not used for queries until it is enabled again. When it is enabled the index is fully functional without any rebuilding overhead.

This way a user can put all the possible indexes in her index.yaml file, enable/disable some of them in production and perform tests/measurements. When satisfied she can vacuum the unnecessary ones.

Matija

Jul 15, 2011, 3:26:01 PM
to google-a...@googlegroups.com, google-app...@googlegroups.com, google-appe...@googlegroups.com
OMG... finally... zigzag merge join... 

Any info on query performance 

SELECT * 
FROM Model 
WHERE 
    list = :1 AND 
    list = :2 AND 
    list = :3 
ORDER BY date DESC

with index
- kind: Model
  properties:
  - name: list
  - name: date
 
or is it highly dependent on data distribution?

Matija

Alfred Fuller

Jul 15, 2011, 3:54:32 PM
to google-appe...@googlegroups.com, google-a...@googlegroups.com, google-app...@googlegroups.com
:-)

Performance is data dependent.

Here is a convoluted explanation of performance:
Sx = set of entities where list = :x
smallest_set = min(S1.size(), S2.size(), ...)

It works best when the intersection(S1, S2, S3, ...) is large compared to the smallest_set.
The pathological case is intersection(S1, S2, S3, ...) = 0 and smallest_set = |all data| / 2
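
To see why the size of the intersection matters, here is a small in-memory sketch of the zigzag idea in Python. It is a simplified illustration over sorted lists of unique keys, not the Datastore's implementation; zigzag_intersect is a hypothetical name, and the seek count is only a rough proxy for query cost.

```python
from bisect import bisect_left


def zigzag_intersect(sorted_lists):
    """Intersect sorted key lists by leapfrogging; return (matches, seeks)."""
    if not sorted_lists or any(not keys for keys in sorted_lists):
        return [], 0
    positions = [0] * len(sorted_lists)
    candidate = max(keys[0] for keys in sorted_lists)
    matches, seeks = [], 0
    while True:
        agreed = True
        for i, keys in enumerate(sorted_lists):
            # Seek this scan forward to the first key >= candidate.
            positions[i] = bisect_left(keys, candidate, positions[i])
            seeks += 1
            if positions[i] == len(keys):
                return matches, seeks  # one scan is exhausted: done
            if keys[positions[i]] != candidate:
                # Overshoot: the larger key becomes the new candidate.
                candidate = keys[positions[i]]
                agreed = False
        if agreed:
            matches.append(candidate)
            # Step past the match in the first scan to pick the next candidate.
            positions[0] += 1
            if positions[0] == len(sorted_lists[0]):
                return matches, seeks
            candidate = sorted_lists[0][positions[0]]


# Three scans with a non-empty intersection.
found, _ = zigzag_intersect([[1, 3, 5, 7, 9], [3, 4, 5, 9, 10], [0, 3, 9]])

# Pathological shape: two large, interleaved, disjoint scans.
evens = list(range(0, 100, 2))
odds = list(range(1, 100, 2))
nothing, wasted_seeks = zigzag_intersect([evens, odds])
```

In the pathological shape every seek only skips a single key, so the scans walk nearly all of the data without producing a result.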


yinhm

Jul 17, 2011, 9:55:20 AM
to google-a...@googlegroups.com, google-appe...@googlegroups.com, google-app...@googlegroups.com
On Tue, Jul 12, 2011 at 4:59 PM, Johan Euphrosine <pro...@google.com> wrote:
> - The SDK "application" environment variable will now be prefixed with dev~.

I just upgraded to prerelease 1.5.2, and can't read any old data from my app.

Then I downgraded to 1.5.1 and had no luck, although the
dev_appserver.datastore file size seems correct.

I uploaded some data and then upgraded again; the result was the same.
But this time, after downgrading, the data was there. I don't know what
caused the data to go missing the first time. Maybe it is because I
wrote part of the data, so the metadata was changed.


I upgraded to prerelease 1.5.2 again, uploaded data from an old dump,
and got this error:

{{{
  File "/home/yinhm/apps/APPNAME/parts/google_appengine/google/appengine/datastore/datastore_stub_util.py", line 224, in CheckAppId
    'app "%s" cannot access app "%s"\'s data' % (request_app_id, app_id))
  File "/home/yinhm/apps/APPNAME/parts/google_appengine/google/appengine/datastore/datastore_stub_util.py", line 207, in Check
    raise apiproxy_errors.ApplicationError(error_code, msg)
ApplicationError: ApplicationError: 1 app "dev~APPNAME" cannot access app "APPNAME"'s data
}}}


I'm using the Python SDK with the SQLite datastore.


--
Regards,
twitter:@yinhm
有微码头:http://yinhm.appspot.com
github: https://github.com/yinhm

yinhm

Jul 17, 2011, 12:50:02 PM
to google-a...@googlegroups.com, google-appe...@googlegroups.com
2011/7/17 yinhm <epa...@gmail.com>:

> On Tue, Jul 12, 2011 at 4:59 PM, Johan Euphrosine <pro...@google.com> wrote:
>> - The SDK "application" environment variable will now be prefixed with dev~.
>
> Then I downgraded to 1.5.1 and had no luck, although the
> dev_appserver.datastore file size seems correct.

The data that seemed to be missing after the downgrade may have been
caused by buildout updating rod.recipe.appengine to 2.0.0, which
changed the datastore path.

PK

Jul 17, 2011, 3:27:10 PM
to google-appe...@googlegroups.com, google-a...@googlegroups.com
@yinhm, have you tried running your server with the option --default_partition="" in the 1.5.2 prerelease?

PK

yinhm

Jul 17, 2011, 11:00:03 PM
to google-a...@googlegroups.com, google-appe...@googlegroups.com


With --default_partition="" the datastore works like before. Thanks for pointing this out to me.

I still wonder why this was broken in a minor release.

PK

Jul 18, 2011, 12:57:44 AM
to google-a...@googlegroups.com, google-appe...@googlegroups.com
They are changing the default appid on the dev server to have a dev~ prefix. Data already written in a previous release does not have the prefix, so it is no longer visible.

Please read this earlier post by Matthew for an explanation:


Hope this helps...

PK


yinhm

Jul 18, 2011, 1:18:00 AM
to google-a...@googlegroups.com, google-appe...@googlegroups.com

Oh, sorry. I only glanced at this thread and didn't notice it.


>Matthew

> or use the bulkloader to dump/restore across app ids.

I tried the bulkloader and it was broken. I don't know whether this is because I'm using SQLite.

Brandon Donnelson

Jul 23, 2011, 10:54:02 AM
to google-a...@googlegroups.com, google-appe...@googlegroups.com, google-app...@googlegroups.com
Suggestion: When you prerelease, could you put that in the Eclipse repository too, for ease of download?

Brandon Donnelson

Nevin Freeman

Jul 25, 2011, 6:34:18 PM
to google-a...@googlegroups.com
For developers facing this issue locally, it may be useful to know that I was only able to access the old local datastore by adding the flag --default_partition= (without any quotes).