Full Text Search Options


Jon McMillan

Feb 10, 2012, 11:13:02 AM
to Google App Engine
Hi,

We're 3 weeks from launching our application and in ~8 weeks IndexTank
will no longer be accessible. I've been holding my breath waiting for
a new torch carrier for IndexTank or for App Engine to get full text
search.

I've been a one-man army on this project, and App Engine taking most of
the server stuff off my shoulders has been a blessing. Can anyone
recommend a hosted search option that I can use to bridge the gap
between IndexTank and GAE? Am I best off launching Lucene/Solr on EC2
myself?

We're dealing with 20k Word documents. The text has already been
extracted and can be sent directly. The content doesn't change often
and the search index can lag behind the content by months. I don't need or want page crawling.

Thanks

Ikai Lan (Google)

Feb 10, 2012, 5:03:47 PM
to google-a...@googlegroups.com
Jon,

How do you feel about backends? I don't know how big the Word documents are, but at that scale you might be able to get away with a small backend that you run persistently with an in-memory Lucene or Whoosh instance.
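
To make the in-memory idea concrete, here is a minimal sketch in plain Python. It is not Lucene or Whoosh and uses neither library's API; the `InMemoryIndex` class, its method names, and the sample documents are all assumptions made for illustration. It only shows the core structure such an instance would keep in RAM: an inverted index from terms to document ids.

```python
import re
from collections import defaultdict


class InMemoryIndex:
    """Toy inverted index (hypothetical): maps each lowercased term to the ids of documents containing it."""

    def __init__(self):
        self._postings = defaultdict(set)

    @staticmethod
    def _tokenize(text):
        # Split on anything that is not a lowercase letter or digit.
        return re.findall(r"[a-z0-9]+", text.lower())

    def add_document(self, doc_id, text):
        for term in self._tokenize(text):
            self._postings[term].add(doc_id)

    def search(self, query):
        """Return the sorted ids of documents that contain every query term."""
        terms = self._tokenize(query)
        if not terms:
            return []
        result = None
        for term in terms:
            ids = self._postings.get(term, set())
            result = ids if result is None else result & ids
        return sorted(result)


# Illustrative data only; real documents would be the already-extracted text.
index = InMemoryIndex()
index.add_document("doc-1", "Quarterly report on hosted search options")
index.add_document("doc-2", "Notes on hosted backends and memory limits")
print(index.search("hosted search"))
```

A real Lucene or Whoosh instance adds stemming, ranking, and phrase queries on top of this, so treat the sketch as a picture of the memory footprint rather than a replacement.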

Alternatively, about a year ago I did a couple of experiments hosting Solr on a $20 VPS and found that the network overhead only seemed to add 20ms-100ms. The bigger issue seemed to be that I was pushing the CPU/memory utilization of the $20 VPS through the roof. I never load tested this solution.

--
Ikai Lan 
Developer Programs Engineer, Google App Engine




--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To post to this group, send email to google-a...@googlegroups.com.
To unsubscribe from this group, send email to google-appengi...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.


Barry Hunter

Feb 10, 2012, 5:03:35 PM
to google-a...@googlegroups.com
http://www.houndsleuth.com/ is designed as a hosted Search solution
for AppEngine.

Ernest Criss

Feb 11, 2012, 1:53:13 PM
to google-a...@googlegroups.com
You can try taking a look at http://www.thriftdb.com. I'm currently using it for my project until GAE's full text search API is made available.

Bryce Cutt

Feb 12, 2012, 6:13:48 AM
to Google App Engine, mcmill...@gmail.com
A quick solution for you would be to switch to Searchify
(http://www.searchify.com). Searchify has taken the open source code from
IndexTank and set up a duplicate service. I have switched over one
of my IndexTank sites to their service and things have been working
well so far. They mentioned in an email that they plan to keep the
pricing within 10% of IndexTank's, but that is not final. This will allow
you to keep doing things the way you are now with no code rewrites
(just change your secure URL).

If Searchify decides to shut down or is not as reliable as you like,
you can grab the IndexTank source code and set up your own personal
version on a VPS or whatever. The source code is here:
https://github.com/linkedin/indextank-engine
https://github.com/linkedin/indextank-service

The official GAE Full Text Search is in the trusted tester phase. Unless the
pricing does not make sense, I plan to switch my apps over to this when
it is production ready. I am writing an IndexTank-compatible interface
to it (that works both locally and over HTTP) so that I can switch
over with minimal code changes. I am including a REST interface mostly
because I have a few projects not on GAE that could benefit from
awesome FTS.
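
As a rough illustration of why a compatibility layer keeps the switch cheap, here is a sketch of the general pattern. Everything in it is hypothetical: `SearchBackend`, `LocalBackend`, `make_backend`, and the method names are invented for this example and are not the IndexTank, Searchify, or GAE search API. The application talks only to the interface, so moving to another service means adding one adapter and changing one configuration value.

```python
class SearchBackend:
    """Hypothetical interface; method names are illustrative, not any vendor's API."""

    def add_document(self, doc_id, fields):
        raise NotImplementedError

    def delete_document(self, doc_id):
        raise NotImplementedError

    def search(self, query):
        raise NotImplementedError


class LocalBackend(SearchBackend):
    """In-process stand-in that does case-insensitive substring matching."""

    def __init__(self):
        self._docs = {}

    def add_document(self, doc_id, fields):
        self._docs[doc_id] = dict(fields)

    def delete_document(self, doc_id):
        self._docs.pop(doc_id, None)

    def search(self, query):
        needle = query.lower()
        return sorted(
            doc_id
            for doc_id, fields in self._docs.items()
            if any(needle in str(value).lower() for value in fields.values())
        )


# Registry of available backends; an adapter for a hosted service would be added here.
BACKENDS = {"local": LocalBackend}


def make_backend(name):
    """Build the backend selected by configuration."""
    return BACKENDS[name]()


backend = make_backend("local")
backend.add_document("doc-1", {"text": "Full text search on App Engine"})
backend.add_document("doc-2", {"text": "Deployment checklist"})
backend.delete_document("doc-2")
print(backend.search("full text"))
```

An HTTP or REST variant would be another `SearchBackend` subclass registered under a different name, leaving the calling code unchanged.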

You can get a rough idea about how GAE FTS works from the slides here:
http://www.gstatic.com/io/2011/presentations/full_text_search/

As Ernest mentioned you could also try ThriftDB. It is also in beta. I
have used their service a bit to see how it is and so far things have
gone well. The interface is similar enough to IndexTank that you would
probably find it an easy switch (depending on which features you
need). I have not used it enough to vouch for its reliability.

- Bryce

Brian McHughs

Feb 13, 2012, 6:33:55 PM
to Google App Engine
I know there is a hosted search solution hitting the market March
1st. It is very similar to IndexTank and probably worth checking out.

www.search-demon.com

Good luck!

Brian

Johan Euphrosine

Feb 14, 2012, 7:23:40 AM
to google-a...@googlegroups.com
Hi Jon

Did you already sign up for the FTS trusted tester program?
https://docs.google.com/a/google.com/spreadsheet/viewform?formkey=dEdWcnRJUXZ2VGR3YmVsT1Q1WVB2Smc6MQ


--
Johan Euphrosine (proppy)
Developer Programs Engineer
Google Developer Relations

Adam Sah

Feb 14, 2012, 2:36:24 PM
to google-a...@googlegroups.com
fyi, their FAQ is 404ing and their site lacks developer API docs, so it's unclear what they offer.  too bad.

Adam Sah

Feb 14, 2012, 2:50:59 PM
to google-a...@googlegroups.com
We went with Solr/Lucene hosted on IntoVPS and it's been terrific: very flexible and reliable.

adam

Amy Unruh

Feb 14, 2012, 9:26:05 PM
to google-a...@googlegroups.com
Adam,

What's the URL that is 404-ing?



Lucas

Feb 24, 2012, 7:40:52 AM
to Google App Engine
very popular topic. everyone eyes on same feature =)