Search API Cost question


Keith Johnston

Feb 18, 2015, 2:12:48 PM
to google-a...@googlegroups.com
I am trying to understand the cost of using the search API. I get the total storage and query cost, but I don't understand the "Indexing Searchable Documents" cost. How often is that 2 dollar charge incurred? And what is the storage measurement of? The documents or the index? Or both?



Resource                               Cost
Total storage (documents and indexes)  $0.18 per GB per month
Queries                                $0.50 per 10K queries
Indexing searchable documents          $2.00 per GB

Vinny P

Feb 19, 2015, 2:46:56 AM
to google-a...@googlegroups.com
On Wed, Feb 18, 2015 at 1:12 PM, Keith Johnston <ke...@spotkin.com> wrote:
I am trying to understand the cost of using the search API. I get the total storage and query cost, but I don't understand the "Indexing Searchable Documents" cost. How often is that 2 dollar charge incurred? And what is the storage measurement of? The documents or the index? Or both?



It's the space required for your indexes, expressed as $ per GB per monthly billing cycle. 

$2/GB might seem expensive at first glance, but don't worry about it too much - your indexes are much, much smaller than the storage required for your documents.

 
-----------------
-Vinny P
Technology & Media Consultant
Chicago, IL

App Engine Code Samples: http://www.learntogoogleit.com

kj

Feb 19, 2015, 11:44:28 AM
to google-a...@googlegroups.com
Thanks!

Kaan Soral

Feb 19, 2015, 3:55:52 PM
to google-a...@googlegroups.com
Are there any upper limits on the index size? (It was either 1 GB or 1 TB a while ago; they changed it at some point and I couldn't keep track.)

I have a lot of stuff indexed, including all movies/TV shows and their descriptions, and my usage is 0.52 GB (Stored Data :)

Like kj, I was speculating ahead of time that I would easily hit the limits, yet I see now that this will only happen if my usage/user content really explodes.

Carlos Lallana

Feb 26, 2015, 3:12:46 PM
to google-a...@googlegroups.com
Hey Kaan,

As stated in this section of the documentation [1], the maximum size per index is 10 GB, with an unlimited number of indexes allowed.

Kaan Soral

Feb 26, 2015, 3:17:31 PM
to google-a...@googlegroups.com
Thanks for the reply

I don't think 10 GB is easy to reach, but growth is probably the aim of most App Engine apps, and reaching that limit would probably be a nightmare.

What exactly happens at that point?

I hope the system automatically purges the low-rank documents on its own.

Is it a hard limit, or does an internal bell go off, someone warns you that you're the first to reach the limit, and a discussion starts about what to do?

I really wish it were unlimited, just for peace of mind.

Carlos Lallana

Feb 27, 2015, 4:28:34 AM
to google-a...@googlegroups.com
Hi Kaan,

It is indeed not easy to reach the 10 GB size for an index, yet not impossible. As you can read here, "when an app tries to exceed this amount, an insufficient quota error is returned".

Note that if you have purchased one of our Silver, Gold, or Platinum support packages, you'll be able to request a quota increase for this limit.

But before that, I would suggest sharding (dividing or splitting) your index into multiple indexes. To do so, you should:

1. Create an additional index.
2. Store a new document in the second index if needed.
3. When you need to search for a document, search both indexes. For better performance, you can make asynchronous search calls.

According to what my colleague David (a Google wise man) explained to me, this approach will not scale indefinitely: beyond two or maybe three sharded indexes the technique becomes relatively inefficient. The Search API is not designed for infinitely large document collections, so if you plan to scale a great deal you would need to consider alternative architectures. However, this technique will at least let you double your search capacity (a rough sketch follows below).
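
Not an official recipe, but a minimal Python sketch of that sharded setup could look like this (the "items-N" index names and the hashing helper are made up for the example):

    import hashlib

    from google.appengine.api import search

    NUM_SHARDS = 2  # hypothetical: two indexes named "items-0" and "items-1"

    def shard_for(doc_id):
        """Pick a shard deterministically from the document ID."""
        n = int(hashlib.md5(doc_id).hexdigest(), 16) % NUM_SHARDS
        return search.Index(name='items-%d' % n)

    def put_document(doc):
        # Steps 1 and 2: store each new document in whichever shard it hashes to.
        shard_for(doc.doc_id).put(doc)

    def search_all(query_string):
        # Step 3: query every shard asynchronously and merge the results in memory.
        query = search.Query(query_string)
        futures = [search.Index(name='items-%d' % n).search_async(query)
                   for n in range(NUM_SHARDS)]
        results = []
        for future in futures:
            results.extend(future.get_result().results)
        return results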

Hope that info helps!

Simeon Ivaylov Petrov

Mar 27, 2015, 7:29:21 AM
to google-a...@googlegroups.com
If I search across multiple indexes I will not be able to sort and paginate the merged results correctly, is that right? Is there a solution to this problem?

Kaan Soral

Mar 27, 2015, 8:49:28 AM
to google-a...@googlegroups.com
No solution. Have you reached the limit?

As it seems, reaching the limit means doom: at that point you would have to find an external solution to re-create the entire index with, or hope the limit can be increased, yet that only postpones the issue.

Simeon Ivaylov Petrov

Mar 27, 2015, 9:38:50 AM
to google-a...@googlegroups.com
I must "convert" 10TeraBytes of MySQL data into Search API indexed documents and I think I will definitely reach the 10GigaBytes index limit :). I actually don't know how much "Search API space" (documents+indexes) is needed for 10TB of MySQL data but I think it will be much more than 10GB. So I think I must sacrifice sorting and paginating if I will use more indexes... What do you thing?

Kaan Soral

Mar 27, 2015, 9:42:51 AM
to google-a...@googlegroups.com
I think you should use either the Datastore or the BigQuery solution.

Sorting doesn't work the way you imagine: it's more of an in-memory sort. The actual ordering is based on the rank, and there is only one document rank, so in reality there is just one sort order for search documents, plus many filters, as far as I assess and use the Search API.

Simeon Ivaylov Petrov

Mar 27, 2015, 9:52:52 AM
to google-a...@googlegroups.com
The main purpose of my application is full-text search, so I've chosen the Search API. As far as I know, Datastore and BigQuery don't have full-text search capabilities.

The Search API actually supports many types of sorting (https://cloud.google.com/appengine/docs/python/search/options), but I think they don't apply if you search across multiple indexes and then merge the results... Same with pagination (offsets and cursors apply to single indexes)...
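
For reference, a sorted query against a single index looks roughly like this (the index and field names are just placeholders for the example):

    from google.appengine.api import search

    index = search.Index(name='movies')  # placeholder index name

    sort_opts = search.SortOptions(
        expressions=[search.SortExpression(
            expression='release_year',  # placeholder field name
            direction=search.SortExpression.DESCENDING,
            default_value=0)],
        limit=1000)  # only this many matches are considered for sorting

    options = search.QueryOptions(limit=20, sort_options=sort_opts)
    results = index.search(search.Query('genre:comedy', options=options))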

Kaan Soral

Mar 27, 2015, 10:01:38 AM
to google-a...@googlegroups.com
"There is an explicit property, SortOptions.limit, that controls the size of the sort. You can never sort more than 10,000 docs, the default is 1,000."

So if you have 1,000,000,000 entities, it fetches the top-ranking 10,000 and then sorts only those; the rest are never sorted. I don't call that sorting.

I don't know what your use case is, but this detail probably rules out many use cases.

---

For my use case, the Search API is perfect: I just convert popularity to the search document rank.
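
In case it helps anyone, that just means passing the popularity score as the rank= argument when building the document (the field and variable names here are made up):

    from google.appengine.api import search

    def make_document(doc_id, title, popularity):
        # rank must be a non-negative integer; with no explicit sort options,
        # results come back ordered by descending rank.
        return search.Document(
            doc_id=doc_id,
            fields=[search.TextField(name='title', value=title)],
            rank=int(popularity))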

Simeon Ivaylov Petrov

Mar 27, 2015, 10:12:12 AM
to google-a...@googlegroups.com
If I have 1,000,000,000 entities and I make a query that produces far fewer records than 10,000 (I think I'll have no more than 1,000 results per query), they will be sorted correctly... The Search API fetches the results of your query. I agree with you that if you make an empty query, yes, it will fetch the top-ranking 10,000 and then apply the custom sorting.

Kaan Soral

Mar 27, 2015, 10:19:32 AM
to google-a...@googlegroups.com
Makes sense

I have a hunch the search index size limitations are there to prevent search products that might compete with Google Search itself, because the limits used to be higher but were reduced at one point.

I guess we will wait and see whether a reply from a Google representative comes to help you.

Is that 10 TB of data constant and not growing? (If so, you might get away with a multi-index setup, or just find a way to cram it all into one index, etc.)

Simeon Ivaylov Petrov

Mar 27, 2015, 10:31:42 AM
to google-a...@googlegroups.com
The data is growing, but if I implement the multi-index solution it won't be a problem because, as far as I understood, there is no limit on the number of indexes you can create. I don't know how performant that many asynchronous searches will be, though... For instance, if I have 1,000 indexes and make 1,000 simultaneous asynchronous searches, will it be performant enough?

Kaan Soral

Mar 27, 2015, 10:47:54 AM
to google-a...@googlegroups.com
Generally, across all App Engine services, ~50 is a safe limit for async operations; for some services ~500 is a soft limit, and 1,000 is probably over the hard limit for many.

For search, 5-6 parallel searches might work; they are already heavy operations.
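
If you did end up with many shard indexes, you could at least cap the parallelism; a rough sketch (the cap of 5 just follows the numbers above, it is not an official limit):

    from google.appengine.api import search

    MAX_PARALLEL = 5  # rough cap based on the numbers above, not an official limit

    def search_shards(index_names, query_string):
        """Search many indexes, keeping at most MAX_PARALLEL searches in flight."""
        query = search.Query(query_string)
        results = []
        for start in range(0, len(index_names), MAX_PARALLEL):
            batch = index_names[start:start + MAX_PARALLEL]
            futures = [search.Index(name=n).search_async(query) for n in batch]
            for future in futures:
                results.extend(future.get_result().results)
        return results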

I have a hunch they might improve many of the quotas and break the chains of App Engine in the future, but I don't know when.

10 TB is a lot of data... it's comparable to Wikipedia.

A call to App Engine/Google: it would be great if you could index Wikipedia for experimental purposes from a database dump and report the document size / index size. I always wonder about the limits of the Search API, and that would provide a definite reference point.

Simeon Ivaylov Petrov

Mar 27, 2015, 11:00:36 AM
to google-a...@googlegroups.com
I agree with you... The data is 10 TB of MySQL space; I hope it will occupy much less space with the documents solution. It could be compressed to 1 TB or even 1 GB, who knows :).. I'll try and let you know.

Kaan Soral

Mar 27, 2015, 12:05:06 PM
to google-a...@googlegroups.com
It might cost you ~$400 or so, and executing consistent operations on App Engine is another challenge, but for the sake of experimentation I'm pretty curious - please keep us updated.