New cap on indexed properties per entity


ryan

Jun 9, 2008, 9:51:29 PM
to Google App Engine
Hi all! As a heads up, we've recently added a cap on the total number
of indexed property values that a single entity may have. This
includes both normal property values (excluding Text and Blob) and
index rows generated by your app's indexes. The cap is currently 5000.
We'll add this to the docs soon.

For example, this entity:

from google.appengine.ext import db

class Foo(db.Model):
    x = db.ListProperty(int)  # list of ints
    y = db.ListProperty(str)  # list of strings

foo = Foo(x=[1, 2], y=['a', 'b'])

with this index.yaml:

indexes:
- kind: Foo
  properties:
  - name: x
  - name: y

will have *eight* indexed properties. The first four are the two
values for x and two values for y. The second four are generated by
the index. They are (x=1, y='a'), (x=1, y='b'), (x=2, y='a'), and
(x=2, y='b').
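
The count of eight can be reproduced in plain Python. This is only a sketch of the counting rule described above, not datastore code; `index_rows` is a made-up helper name.

```python
from itertools import product

# Values of the example entity: Foo(x=[1, 2], y=['a', 'b'])
entity = {"x": [1, 2], "y": ["a", "b"]}

def index_rows(entity, index_props):
    """Every combination of values across the indexed properties (hypothetical helper)."""
    return list(product(*(entity[name] for name in index_props)))

single_values = sum(len(values) for values in entity.values())  # 2 + 2
rows = index_rows(entity, ["x", "y"])  # rows generated by the x, y index
total = single_values + len(rows)

print(rows)   # [(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b')]
print(total)  # 8
```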

If you attempt to insert an entity with more than 5000 indexed
properties, the put() call will raise a BadRequestError with a
descriptive error message.
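
Since the error only shows up at put() time, an app could estimate the count before writing. A minimal sketch, assuming the counting rule from the example above and an entity represented as a dict of indexed list values; `count_indexed`, `check_before_put`, and `TooManyIndexedProperties` are hypothetical names, not part of the datastore API, and the datastore's own count remains authoritative.

```python
from math import prod

CAP = 5000  # the cap announced above

class TooManyIndexedProperties(Exception):
    """Hypothetical app-level error; not the datastore's BadRequestError."""

def count_indexed(entity, indexes):
    """Property values plus one row per value combination in each index."""
    values = sum(len(v) for v in entity.values())
    rows = sum(prod(len(entity[name]) for name in index) for index in indexes)
    return values + rows

def check_before_put(entity, indexes, cap=CAP):
    """Raise before writing if the estimated count exceeds the cap."""
    total = count_indexed(entity, indexes)
    if total > cap:
        raise TooManyIndexedProperties(
            "%d indexed properties, cap is %d" % (total, cap))
    return total

# Two 100-value lists in one index: 200 values + 100 * 100 rows = 10200
big = {"x": list(range(100)), "y": list(range(100))}
try:
    check_before_put(big, [["x", "y"]])
except TooManyIndexedProperties as err:
    print(err)
```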

-Ryan

DennisP

Jun 10, 2008, 10:25:48 AM
to Google App Engine
Sounds like that could add up fast...is this a temporary cap for the
preview, or is it permanent?
-dennis

peterk

Jun 10, 2008, 11:05:00 AM
to Google App Engine
Unless I'm mixing up jargon, this shouldn't be too restrictive for
99.999% of apps..? 5000 indexed properties max per entity i.e.
instance of a model (or 'row' in the database), should be plenty for
most. Now if it was 5000 per model that'd be a different story..I
certainly hope it doesn't mean that!

ryan

Jun 10, 2008, 12:32:59 PM
to Google App Engine
On Jun 10, 7:25 am, DennisP <DennisBPeter...@gmail.com> wrote:
> Sounds like that could add up fast...is this a temporary cap for the
> preview, or is it permanent?

Good question! Expect the cap to stay for the foreseeable future. The
datastore is designed to scale the total number of entities in your
app, but not necessarily the number of properties or index rows in a
given entity.

Frank

Jun 10, 2008, 1:13:34 PM
to Google App Engine
wait a sec, indexes are based on the kind anyway, right?
so what happens if I have 5000 entities of the same kind with an index
on 10 of their properties? knowing that those 10 properties could all
be different... would that break the cap?

could you please clarify because this can have huge consequences,
basically limiting the number of different values a property of a kind
of entity can have...

I surely hope I misunderstood

thank you

ryan

Jun 10, 2008, 8:26:40 PM
to Google App Engine
You're right, that is an important point to emphasize. The cap only
applies within individual entities, not across entities. In your case,
each individual entity would only have 10 property values and one ten-
property index, for a total of 20 indexed properties, so you wouldn't
hit the cap.

Nerdy Tommy

Jun 11, 2008, 5:29:10 AM
to Google App Engine
Thanks for the explanation Ryan. That wasn't obvious to me at first,
but now it seems to be clear enough.

Cheers,
Tommy

Aral

Jun 28, 2008, 1:40:00 PM
to Google App Engine
As per http://groups.google.com/group/google-appengine/t/a09d3305f27b214e
and the issue reported at http://code.google.com/p/googleappengine/issues/detail?id=527,
I hope that this will be reviewed, and/or a paid option made
available, or, failing all those, that the error will be more
gracefully handled (currently it gives a Google 500 error that cannot
be caught by the app.)

Thanks,
Aral

Primijos

Jul 3, 2008, 4:26:42 PM
to Google App Engine
On Jun 10, 3:51 am, ryan <ryanb+appeng...@google.com> wrote:
> Hi all! As a heads up, we've recently added a cap on the total number
> of indexed property values that a single entity may have. This

Hi Ryan,

just a question: how does that combine with
__searchable_text_index properties? Right now I have a relatively small
Entity which has about 15 fields, defined as a Searchable entity (this
adds the "__searchable_text_index" automatically). The point is, when
put()ing some entities (not all), I get an error ("Too many indexed
properties for entity") relating to an index combining my properties
with the automatic property, one of the properties ('quadrant') is a
list with 154 values, the other one is a 'date' property and the third
is the __searchable_text_index (list?) property. I'm suspecting that,
depending on the values of other string properties of the entity, the
__searchable_text_index grows to a point at which
quadrant(154)*__searchable_text_index(?)*date(1) goes beyond 5000
combinations. Am I guessing right? Any hint regarding this issue?

If it is so, I think that could be a hard limit for text
searching... :-/

thanks in advance, best,
Jose

ryan

Jul 3, 2008, 4:41:28 PM
to Google App Engine
On Jul 3, 1:26 pm, Primijos <primi...@gmail.com> wrote:
> just a question: how does that combine with __searchable_text_index properties?

good question! it means that entities based on SearchableModel from
google.appengine.ext.search are limited to at most 5000 indexable
keywords. that's just the unique indexable keywords, though, so in
practice, you should be able to store much longer TextProperty and
StringProperty values before you hit the cap.

> The point is, when put()ing some entities (not all), I get an error ("Too
> many indexed properties for entity") relating to an index
> combining my properties with the automatic property, one of the
> properties ('quadrant') is a list with 154 values, the other
> one is a 'date' property and the third is the
> __searchable_text_index (list?) property. I'm suspecting that,
> depending on the values of other string properties of the
> entity, the __searchable_text_index grows to a point at which
> quadrant(154)*__searchable_text_index(?)*date(1) goes beyond 5000

that's correct. that index and the quadrant property with 154 values
will severely limit the size of the string and text property values
for entities of this kind. specifically, you'll be cut off at 5000 /
154 = 32 unique indexable keywords. that's still a fair amount, but
it's not a lot. if you want to be able to store more words in the
string and text properties, you'll either need to remove this index or
reduce the number of values in the quadrant property.
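
The arithmetic behind that figure, as a small sketch. It counts only the rows of the combined index, as the estimate above does, and ignores the entity's own property values; `keyword_budget` is a hypothetical helper, not a datastore call.

```python
CAP = 5000

def keyword_budget(other_list_sizes, cap=CAP):
    """Max unique keywords when one index combines the keyword list
    with other list properties of the given sizes."""
    rows_per_keyword = 1
    for size in other_list_sizes:
        rows_per_keyword *= size
    return cap // rows_per_keyword

print(keyword_budget([154, 1]))  # quadrant (154 values) and date (1 value): 32
print(keyword_budget([77, 1]))   # halving the quadrant list: 64
```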

José Oliver Segura

Jul 4, 2008, 3:44:48 AM
to google-a...@googlegroups.com
On Thu, Jul 3, 2008 at 10:41 PM, ryan <ryanb+a...@google.com> wrote:
>
> good question! it means that entities based on SearchableModel from
> google.appengine.ext.search are limited to at most 5000 indexable
> keywords. that's just the unique indexable keywords, though, so in
> practice, you should be able to store much longer TextProperty and
> StringProperty values before you hit the cap.

Uh, I see. That's true, but problems will arise whenever you need to
use some kind of ListProperty instead of just "plain" StringProperties in
your Entity, combined with a SearchableModel...

> that's correct. that index and the quadrant property with 154 values
> will severely limit the size of the string and text property values
> for entities of this kind. specifically, you'll be cut off at 5000 /
> 154 = 32 unique indexable keywords. that's still a fair amount, but
> it's not a lot. if you want to be able to store more words in the

Well, I don't know too much about indexing, but just 32 unique
keywords doesn't seem like much at first sight. Think, for example,
about a simple weblog or publishing system in which you can have,
let's say, a "Post" or "Article" entity with a title, a body and a list of
tags associated... and you want to be able to do "full text search"
on articles by either tag or content (title/body) of the posts/articles
with a combined index (__searchable_text_index/tags)...
This cap means that (words in body+title) * (number of tags) can be at
most 5000. For an article with 5 tags, that means at most 1000 words in
body+title, which doesn't seem like much depending on the content of
the article; maybe I'm wrong. Doesn't everyone think this allows too
little content and too few text search combinations?

> string and text properties, you'll either need to remove this index or
> reduce the number of values in the quadrant property.

Uhm... I think I must give it a twist, since neither of those options
will allow the kind of search I'm trying to do. I need to keep the
values of the 154 quadrants, and allow the combined search on text
plus quadrant. What about changing the quadrants property from
list to String? This would make them appear in the __searchable
property too and, thus, match searches by quadrant and text.
The only problems are that I won't be able to query by exact quadrant
combined with __searchable and that, depending on the names of the
quadrants, I can get "false positives" in text searches that match
quadrants (mixing "text content" and "quadrant values" as __searchable).
The first point (combined "AND" searches) needs a look, since I don't
know exactly how text search works in GAE. The problem I see is that,
given a quadrant and a word, I need to match all entities with both quadrant
and word in the __searchable index, not just one of them.

Maybe that's an option to explore... what do you think?
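
To make the idea concrete, here is a rough sketch in plain Python, with a set standing in for the __searchable index. The "quadrant:" prefix is my own assumption to avoid the false positives mentioned above, and I have not checked whether appengine.ext.search would keep such a prefixed token intact. If only this one merged list were indexed, the count would grow as quadrants + words rather than quadrants * words.

```python
def tokens(text, quadrants):
    """One flat token set: lowercased words plus prefixed quadrant markers."""
    words = set(text.lower().split())
    marks = set("quadrant:" + q.lower() for q in quadrants)
    return words | marks

def matches(entity_tokens, word, quadrant):
    """AND search: both the word and the quadrant marker must be present."""
    return {word.lower(), "quadrant:" + quadrant.lower()} <= entity_tokens

doc = tokens("Storm damage reported north of the river", ["N12", "N13"])
print(matches(doc, "storm", "N12"))  # True
print(matches(doc, "storm", "S01"))  # False
print(matches(doc, "n12", "N13"))    # False, the bare word "n12" is not in the text
print(len(doc))                      # 9 tokens: 7 words + 2 markers
```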

best,
Jose

tijer

Jul 8, 2008, 3:45:16 AM
to Google App Engine
This is somewhat of a dealbreaker. Not that I'm going to withdraw
completely from App Engine, but it just means that App Engine cannot
be used for serious search with these limits.

The maxlength of an article in a (very feature-crippled) blog of "1000
words in body+title" is something I noticed when I tried to turn
search on.

On top of that, for some reason App Engine finds it necessary to add 3
separate __searchable_text_index's to the index - which further
cripples the ability to save.

Sad indeed - I hope some solution is found in the future.



ryan

Jul 8, 2008, 2:03:18 PM
to Google App Engine
out of curiosity, for those of you using appengine.ext.search, are you
simply using it to provide normal full text search over your public
facing app? if so, consider using google's site search or ajax search
api instead. they're not subject to these limitations, and they're
both much more powerful.

more info:

http://www.google.com/sitesearch
http://code.google.com/apis/ajaxsearch/
http://snarfed.org/space/site+search+with+the+Google+AJAX+Search+API

José Oliver Segura

Jul 9, 2008, 4:31:46 AM
to google-a...@googlegroups.com

In my case, I'm using (well, I would like to use) it as an
additional filter, not only doing full text search. I would need to
find items with some text and also with some conditions like having a
given date or a given tag (that is, filtering by additional
properties)

so, if I'm not wrong, I don't think sitesearch/ajaxsearch is an option...

best,
Jose

tijer

Jul 20, 2008, 3:49:11 PM
to Google App Engine
I'm using search (or would be, if it were possible) to search a forum
and newsportal that I'm running on Google App Engine. I mean
everything is cool and perfect, but without search it is difficult to
build a forum where the same questions won't be asked over and over
again...




Calvin Spealman

Jul 20, 2008, 4:00:06 PM
to google-a...@googlegroups.com
Building an index of your own wouldn't be a terribly difficult or
unwieldy thing to do.

For the properties you need to index like that, you could probably
even subclass StringProperty to do some automatic indexing of the
words when the value changes. The index could be of items of this
kind:

class IndexEntry(db.Model):
    word = db.StringProperty(required=True)
    found_in = db.ListProperty(db.Key)  # item type is required; db.Key assumed here

Keeping it updated would be pretty easy.
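
A rough sketch of the update logic in plain Python, with a dict standing in for the IndexEntry rows. The function names and the whitespace tokenizer are placeholders of mine, not an existing library.

```python
from collections import defaultdict

# word -> set of ids of the items containing it; stands in for IndexEntry rows
index = defaultdict(set)

def words_of(text):
    """Naive tokenizer: lowercase, split on whitespace."""
    return set(text.lower().split())

def update(item_id, old_text, new_text):
    """Re-index one item after its text changes."""
    old, new = words_of(old_text), words_of(new_text)
    for word in old - new:
        index[word].discard(item_id)
    for word in new - old:
        index[word].add(item_id)

def search(*words):
    """Ids of items containing every given word."""
    sets = [index.get(w.lower(), set()) for w in words]
    return set.intersection(*sets) if sets else set()

update("post-1", "", "app engine index cap")
update("post-2", "", "full text search on app engine")
update("post-1", "app engine index cap", "app engine property cap")

print(sorted(search("app", "engine")))  # ['post-1', 'post-2']
print(sorted(search("index")))          # []
```

One caveat: by the rule at the top of the thread, a found_in list would presumably count toward its own entity's cap, so very common words might need to be split across several entries.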

Granted, including it would be nice. But, a GAE-friendly text indexer
would be a nice project, yeah? I could take a crack myself.

--
Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://ironfroggy-code.blogspot.com/

thedude

Aug 24, 2008, 5:24:39 PM
to Google App Engine

i'm in the same boat regarding a very basic app that needs search
across blog article content.

the limit was hit within the first hour on a very basic searchable
model structure including only the article content and url. the
content needs to be searched within the system for various reasons
that have been mentioned by others.

appengine is the most amazing thing since sliced bread, but if we
can't build apps with datastores of decent amounts of data to "search"
and filter, i'm afraid it will be relegated to fairly small/midsize
apps, etc.....

hopefully i'm confused. what would google's recommended approach be
for app engine users who, say, want to store a bunch of web sites'
content, keyed by url, and search that content quickly and easily for
keywords?

thanks,
rich