Should I Implement the Search API?


Kaan Soral

Apr 17, 2013, 3:59:10 PM
to google-a...@googlegroups.com
I hadn't even considered reading the Search API documentation until yesterday; its production limitations made me treat it as a non-existent feature.

However, it seems a lot of time has passed since it was first introduced, and after reading the documentation I really want to use it.

Is there any ETA for scalable production usage?
Will it ever reach a point where "Total Index Size" and "API Calls" can grow to terabytes and millions, respectively?

I'm looking forward to hearing about others' experience with, and knowledge of, the Search feature.

Jason Collins

Apr 17, 2013, 6:08:05 PM
to google-a...@googlegroups.com
We've been doing a fair amount of work with the Search API lately. Some of the current challenges are operational. E.g., we're trying to keep the search documents in sync with datastore entities. Because transactions don't extend across the datastore and search indexes, the two can get out of sync. So we occasionally need to look for orphaned search documents, which means walking across all of them and checking the datastore to see whether the corresponding entity still exists.
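The orphan sweep described above can be sketched roughly as follows. This is a minimal, SDK-free illustration, not the poster's actual code: `find_orphans`, `doc_ids`, and `existing_ids` are hypothetical names, and `existing_ids` stands in for whatever batched datastore lookup the application uses.

```python
def find_orphans(doc_ids, existing_ids, batch_size=200):
    """Yield search-document IDs that have no matching datastore entity.

    doc_ids: iterable of document IDs, e.g. from walking a search index.
    existing_ids: hypothetical callable that takes a list of IDs and returns
        the subset that still exist in the datastore (one batched lookup).
    """
    if batch_size < 1:
        raise ValueError("batch_size must be positive")

    def check(batch):
        # One lookup per batch keeps the number of datastore calls low.
        live = set(existing_ids(batch))
        return [doc_id for doc_id in batch if doc_id not in live]

    batch = []
    for doc_id in doc_ids:
        batch.append(doc_id)
        if len(batch) == batch_size:
            for orphan in check(batch):
                yield orphan
            batch = []
    if batch:  # final partial batch
        for orphan in check(batch):
            yield orphan
```

Batching matters here because every lookup counts against quota; the batch size of 200 is an arbitrary placeholder.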

This is very challenging given current quotas, and something like the MapReduce library would be very, very helpful (specifically key-splitting / sharding) so that we could perform this operation in less time.
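To make the key-splitting idea concrete, here is a rough sketch of cutting a document-ID space into contiguous ranges that separate workers could walk in parallel. It is not the MapReduce library's implementation; `split_key_ranges`, `in_range`, and the sampled-ID input are assumptions made for illustration.

```python
def split_key_ranges(sorted_sample, num_shards):
    """Split an ID space into up to num_shards contiguous (start, end) ranges.

    sorted_sample: sorted list of sampled document IDs (hypothetical input).
    A bound of None means open-ended; start is inclusive, end is exclusive.
    """
    if num_shards < 1:
        raise ValueError("num_shards must be positive")
    points = []
    for i in range(1, num_shards):
        if not sorted_sample:
            break
        candidate = sorted_sample[i * len(sorted_sample) // num_shards]
        # Skip duplicate split points so no range is defined twice.
        if not points or candidate > points[-1]:
            points.append(candidate)
    bounds = [None] + points + [None]
    return list(zip(bounds[:-1], bounds[1:]))


def in_range(doc_id, key_range):
    """Return True if doc_id falls inside the (start, end) range."""
    start, end = key_range
    return (start is None or doc_id >= start) and (end is None or doc_id < end)
```

Each worker would then process only the IDs inside its own range, so the full sweep finishes in roughly the time of the largest shard rather than the whole index.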

j

Kaan Soral

Apr 17, 2013, 6:23:53 PM
to google-a...@googlegroups.com
I don't know if it would apply to your scenario, but it might be a good idea to update search documents only when the datastore entity changes significantly or gets removed. In my scenario syncing isn't an issue; however, the document rank would change in proportion to the items' custom rank, so the quotas are still worrying.
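One way to read "changes significantly" is sketched below. The function name, the dict-based entity snapshots, and the 10% rank tolerance are all assumptions for illustration; the right threshold depends on the application.

```python
def sync_action(old, new, indexed_fields, rank_field="rank", rank_tolerance=0.1):
    """Decide what to do with a search document when its entity changes.

    old, new: dict snapshots of the entity before and after (None if absent).
    Returns "delete", "put", or "skip".
    """
    if new is None:
        return "delete"  # entity removed: drop the search document
    if old is None:
        return "put"     # new entity: index it
    if any(old.get(f) != new.get(f) for f in indexed_fields):
        return "put"     # a searchable field changed
    old_rank = old.get(rank_field, 0)
    new_rank = new.get(rank_field, 0)
    # Re-index only if the rank moved by more than the relative tolerance.
    if abs(new_rank - old_rank) > rank_tolerance * max(abs(old_rank), 1):
        return "put"
    return "skip"
```

Skipping small rank changes trades some ranking freshness for fewer API calls, which is the quota concern raised above.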

Bryce Cutt

Apr 17, 2013, 8:37:44 PM
to google-a...@googlegroups.com
Jason,

Have you thought of using transactional tasks to update the search index? Or are you more concerned with concurrent updates to the same model?
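For readers unfamiliar with the idea: a transactional task is enqueued only if the surrounding datastore transaction commits (on App Engine, by passing `transactional=True` to `taskqueue.add` inside a transaction). The toy model below is not App Engine code; `Transaction`, its methods, and the in-memory datastore and queue are stand-ins that only illustrate the commit-or-nothing behavior.

```python
class Transaction:
    """Toy transaction: writes and tasks become visible only on commit."""

    def __init__(self, datastore, queue):
        self._datastore = datastore  # dict standing in for the datastore
        self._queue = queue          # list standing in for a task queue
        self._writes = {}
        self._tasks = []

    def put(self, key, value):
        self._writes[key] = value

    def add_task(self, payload):
        # Buffered until commit, like a transactional task.
        self._tasks.append(payload)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            # Commit: the write and its re-index task land together.
            self._datastore.update(self._writes)
            self._queue.extend(self._tasks)
        return False  # on error, nothing is applied and the error propagates
```

The task itself still runs later, so the search index lags the datastore briefly, but a committed write never lacks its matching index update.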

- Bryce

Kaan Soral

Apr 18, 2013, 4:35:43 PM
to google-a...@googlegroups.com
I've implemented the Search API. It's wonderful, and the implementation was easier than I thought.

However, search documents/indexes perish after an SDK restart. I was using the latest version of devappserver2 and then upgraded to 1.7.7's devappserver2, but the problem persisted no matter what I tried. If anyone else has this issue, please star these:

http://code.google.com/p/googleappengine/issues/detail?can=2&start=0&num=100&q=&colspec=ID%20Type%20Component%20Status%20Stars%20Summary%20Language%20Priority%20Owner%20Log&groupby=&sort=&id=6791
http://code.google.com/p/googleappengine/issues/detail?can=2&start=0&num=100&q=label%3AComponent-FullTextSearch&colspec=ID%20Type%20Component%20Status%20Stars%20Summary%20Language%20Priority%20Owner%20Log&groupby=&sort=&id=9019


Another bug I encountered: whenever you run a query that has no results, it throws an exception that includes:
""" ProtocolBufferEncodeError: Required field: matched_count not set. """

That one isn't too important, since you can catch the exception; the "not persisting" issue, however, is a showstopper for testing.
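A rough sketch of the catch-and-return-empty workaround, written without the SDK so it runs anywhere. `EmptyResultEncodeError` is a stand-in for the SDK's `ProtocolBufferEncodeError`, and `search_or_empty` and `run_query` are hypothetical names; in a real app you would pass the actual exception class and search call.

```python
class EmptyResultEncodeError(Exception):
    """Stand-in for the SDK's ProtocolBufferEncodeError (assumption)."""


def search_or_empty(run_query, query, swallow=(EmptyResultEncodeError,)):
    """Run a search and treat the 'matched_count not set' error as no results.

    run_query: hypothetical callable that executes the search and returns
        an iterable of results.
    """
    try:
        return list(run_query(query))
    except swallow as err:
        # Only hide the specific empty-result bug; re-raise anything else.
        if "matched_count" in str(err):
            return []
        raise
```

Checking the message before swallowing keeps unrelated encoding errors visible instead of silently turning them into empty result lists.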