Ah, I see. That's true, but problems will arise whenever you need to
use some kind of ListProperty instead of just "plain" StringProperties in
your Entity, combined with a SearchableModel...
> that's correct. that index and the quadrant property with 154 values
> will severely limit the size of the string and text property values
> for entities of this kind. specifically, you'll be cut off at 5000 /
> 154 = 32 unique indexable keywords. that's still a fair amount, but
> it's not a lot. if you want to be able to store more words in the
Well, I don't know much about indexing, but at first sight 32 unique
keywords doesn't seem like many. Think, for example, of a simple
weblog or publishing system with, let's say, a "Post" or "Article"
entity that has a title, a body and a list of associated tags... and
you want to be able to do "full text search" on articles by either tag
or content (title/body) with a combined index
(__searchable_text_index/tags)...
This cap means that (unique words in body+title) * (number of tags)
can't exceed 5000. For an article with 5 tags, that means at most 1000
unique words in body+title, which doesn't seem like much depending on
the content of the article, though maybe I'm wrong. Doesn't everyone
think this allows too little content and too few text search
combinations?
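To make the arithmetic above concrete, here is a minimal sketch. It
assumes the cap behaves exactly as described in this thread (5000
index entries per entity, one entry per keyword/list-value pair); the
function name is made up for illustration.

```python
def max_unique_keywords(entry_limit, list_values):
    """Upper bound on distinct keywords when every keyword is paired
    with every value of a second list property in a combined index."""
    return entry_limit // list_values

ENTRY_LIMIT = 5000  # per-entity cap quoted earlier in this thread

# 154 quadrant values leave room for very few keywords...
quadrant_cap = max_unique_keywords(ENTRY_LIMIT, 154)
# ...while an article with 5 tags gets a larger, but still finite, budget.
five_tag_cap = max_unique_keywords(ENTRY_LIMIT, 5)

print(quadrant_cap, five_tag_cap)
```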
> string and text properties, you'll either need to remove this index or
> reduce the number of values in the quadrant property.
Hmm... I think I need to give it a twist, since neither of those
options allows the kind of search I'm trying to do. I need to keep the
values of the 154 quadrants and allow the combined search on text
plus quadrant. What about changing the quadrants property from a
list to a String? This would make the quadrants appear in the
__searchable property too, so they would match searches by quadrant
and text.
The only problems are that I won't be able to query by exact quadrant
combined with __searchable and that, depending on the names of the
quadrants, I can get "false positives" in text searches that happen to
match quadrant names (since "text content" and "quadrant values" are
mixed in __searchable).
The first point (combined "AND" searches) needs a closer look, since I
don't know exactly how text search works in GAE. The problem I see is
that, given a quadrant and a word, I need to match all entities with
both the quadrant and the word in the __searchable index, not just one
of them.
Maybe that's an option to explore... what do you think?
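A rough sketch of that idea, in plain Python rather than datastore
code, so it only models the matching logic. The "quadrant:" marker is
a hypothetical way to keep quadrant values from colliding with
ordinary words; it is not a GAE convention, and the real
SearchableModel tokenizer may treat punctuation differently, so that
would need checking.

```python
import re

QUADRANT_PREFIX = "quadrant:"  # hypothetical marker, not a GAE convention

def build_keywords(text, quadrants):
    """Merge the words of the text and the marked quadrant values
    into a single keyword set (a stand-in for the searchable index)."""
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    markers = set(QUADRANT_PREFIX + q.lower() for q in quadrants)
    return words | markers

def matches_all(keywords, word, quadrant):
    """AND semantics: both the word and the quadrant must be present."""
    wanted = {word.lower(), QUADRANT_PREFIX + quadrant.lower()}
    return wanted <= keywords

post = build_keywords("Storm damage reported near the north harbour",
                      ["N12", "N13"])
print(matches_all(post, "storm", "N12"))
print(matches_all(post, "storm", "S01"))
```

Because the tokenizer pattern never produces a token containing ":",
a text search for "n12" cannot hit the quadrant marker by accident.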
best,
Jose
In my case, I'm using (well, I would like to use) it as an
additional filter, not only doing full text search. I would need to
find items with some text and also with some conditions like having a
given date or a given tag (that is, filtering by additional
properties)
So, if I'm not wrong, I don't think sitesearch/ajaxsearch is an option...
best,
Jose
For the properties you need to index like that, you could probably
even subclass StringProperty to do some automatic indexing of the
words whenever the value changes. The index could be made of items of
this kind:

    from google.appengine.ext import db

    class IndexEntry(db.Model):
        word = db.StringProperty(required=True)
        # Keys of the entities whose text contains this word.
        found_in = db.ListProperty(db.Key)
Keeping it updated would be pretty easy.
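To illustrate what "keeping it updated" could look like, here is a
minimal sketch that uses a plain dict of sets in place of IndexEntry
entities, so it runs without the SDK. The helper names and the string
keys are assumptions for illustration; in a real app each dict entry
would correspond to one IndexEntry and its found_in list.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercased set of alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def update_index(index, entity_key, old_text, new_text):
    """Apply only the difference between the old and new word sets."""
    old_words, new_words = tokenize(old_text), tokenize(new_text)
    for word in old_words - new_words:
        index[word].discard(entity_key)
        if not index[word]:
            del index[word]  # drop entries that no longer point anywhere
    for word in new_words - old_words:
        index[word].add(entity_key)

index = defaultdict(set)
update_index(index, "post-1", "", "App Engine text search")
update_index(index, "post-2", "", "Text indexing on App Engine")
# Editing post-1 removes "search" and adds "full".
update_index(index, "post-1", "App Engine text search", "App Engine full text")
print(sorted(index.get("text", set())))
```

Diffing the old and new word sets keeps the number of writes
proportional to what actually changed, rather than to the whole text.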
Granted, having it included would be nice. But a GAE-friendly text
indexer would be a nice project, yeah? I could take a crack at it myself.
--
Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://ironfroggy-code.blogspot.com/