As you've likely heard, when Google App Engine leaves Preview in the second half of 2011, the pricing model will change. Prices are listed here: http://www.google.com/enterprise/appengine/appengine_pricing.html. That leaves a lot of questions unanswered, so this FAQ is intended to answer the most frequently asked questions about the new model. We are interested in hearing any additional thoughts and comments you have on it. Once it is relatively stable I'll add it to our official docs. If there is something you want to know that is not yet answered, just ask and I'll try to answer it as clearly as possible. We've made some changes based on the feedback we've gotten (from this group in particular); they are bolded below but not yet updated on the external pages. There are still blanks to fill in, and I will send that information to this group first as it becomes available. Finally, thank you for your questions and for bearing with us while we iron out the details. I and the whole App Engine team very much appreciate it.
Senior Product Manager, Google App Engine
Definitions
Instance: A small virtual environment to run your code with a reserved amount of CPU and Memory.
Frontend Instance: An Instance running your code and scaling dynamically based on the incoming requests but limited in how long a request can run.
Backend Instance: An Instance running your code with limited scaling based on your settings and potentially starting and stopping based on your actions.
Scheduler: Part of the App Engine infrastructure that determines which Instance should serve a request including whether or not a new Instance is needed.
Serving Infrastructure
Q: What’s an Instance?
A: When App Engine starts running your code it creates a small virtual environment to run your code with a reserved amount of CPU and Memory. For example if you are running a Java app, we will start a new JVM for you and load your code into it.
Q: Is an App Engine Instance similar to a VM from infrastructure providers? A: Yes and no. Both have a set amount of CPU and Memory allocated to them, but GAE instances don’t have the overhead of operating systems or other applications running, so a much larger percentage of the CPU and memory is considered “usable.” They also operate against high-level APIs rather than going down through layers of code to virtual device drivers, which is more efficient and allows all the services to be fully managed.
Q: How does GAE determine the number of Frontend Instances to run? A: For each new request, the Scheduler decides whether there is an available Instance for the request, the request should wait, or a new Instance should be created to service the request. It looks at the number of Instances, the throughput of the Instances, and the number of requests waiting. Based on that it predicts how long it will take before it can serve the request (aka the Pending Latency). If it predicts the delay will be over 1 second, a new Instance is created. If it looks like an Instance is no longer needed, it will take that Instance down.
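The decision described above can be sketched roughly as follows. This is only an illustration: how the real Scheduler predicts Pending Latency is not described in this FAQ, so the estimate used here (waiting requests divided by the combined throughput of the running Instances) and all function and parameter names are assumptions. Only the 1 second threshold comes from the answer above.

```python
# From the FAQ: a predicted delay over 1 second triggers a new Instance.
PENDING_LATENCY_THRESHOLD_S = 1.0


def predicted_pending_latency(waiting_requests, instances, rps_per_instance):
    """Hypothetical estimate, in seconds, of how long a request would wait."""
    total_throughput = instances * rps_per_instance
    if total_throughput <= 0:
        # Nothing is serving, so the predicted wait is unbounded.
        return float("inf")
    return waiting_requests / total_throughput


def should_start_instance(waiting_requests, instances, rps_per_instance):
    """Return True when the predicted delay exceeds the threshold."""
    latency = predicted_pending_latency(
        waiting_requests, instances, rps_per_instance)
    return latency > PENDING_LATENCY_THRESHOLD_S
```

With these assumed inputs, 5 waiting requests on 2 Instances serving 10 requests/second each would not start a new Instance, while 50 waiting requests would.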
Q: Should I assume I will be charged for the number of Instances currently being shown in the Admin console? A: No. You can look at the number currently shown as an upper bound on how many Instances you will be charged for. We are working to change the Scheduler to optimize the utilization of instances, so that number should go down somewhat. If you are using Java, you can also make your app threadsafe and take advantage of handling concurrent requests.
Q: How can I control the number of instances running? A: With the new Scheduler you’ll have the ability to choose a set of parameters that will help you specify how many instances are spun up to serve your traffic. More information about the specific parameters and how they will affect the Scheduler will be available within a few weeks.
Q: What can I control in terms of how many requests an Instance can handle? A: The single largest factor is your application’s latency in handling the request. If you service requests quickly, a single instance can handle a lot of requests. Also, Java apps support concurrent requests, so a single Instance can handle additional requests while waiting for other requests to complete. This can significantly lower the number of Instances your app requires.
Q: Will there be a solution for Python concurrency? Will this require any code changes? A: Python concurrency will be handled by our release of Python 2.7 on App Engine. We’ve heard a lot of feedback from our Python users who are worried that the incentive is to move to Java because of its support for concurrent requests, so we’ve made a change to the new pricing to account for that. Python 2.7 support is in progress but not yet done, so we will be providing a half-sized instance for Python (at half the price) until Python 2.7 is released.
Q: How many requests can an average instance handle? A: Single-threaded Instances (Python or Java) can currently handle 1 concurrent request. Therefore there is a direct relationship between the latency and the number of requests that can be handled on the instance per second, for instance: 10ms latency = 100 requests/second/Instance, 100ms latency = 10 requests/second/Instance, etc. Multi-threaded Instances can handle many concurrent requests. Therefore there is a direct relationship between the CPU consumed and the number of requests/second. For instance, for a B4 (approx 2.4GHz) instance: consuming 10 Mcycles/request = 240 requests/second/Instance, 100 Mcycles/request = 24 requests/second/Instance, etc. These numbers are the ideal case, but they are pretty close to what you should be able to accomplish on an Instance. Multi-threaded Instances are currently only supported for Java; we are planning support for Python later this year.
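The arithmetic in the answer above can be reproduced with a short sketch. The function names are made up for illustration, and the figures are the ideal-case numbers quoted in the answer, not measurements.

```python
def single_threaded_rps(latency_ms):
    """Requests/second for an Instance that handles one request at a time."""
    return 1000.0 / latency_ms


def multi_threaded_rps(clock_hz, cycles_per_request):
    """Ideal requests/second when CPU cycles are the only limit."""
    return clock_hz / cycles_per_request


B4_CLOCK_HZ = 2.4e9  # approx 2.4GHz, as quoted in the answer above

print(single_threaded_rps(10))                # 100.0 (10ms latency)
print(single_threaded_rps(100))               # 10.0  (100ms latency)
print(multi_threaded_rps(B4_CLOCK_HZ, 10e6))  # 240.0 (10 Mcycles/request)
print(multi_threaded_rps(B4_CLOCK_HZ, 100e6)) # 24.0  (100 Mcycles/request)
```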
Q: Why is Google charging for instances rather than CPU as in the old model? Were customers really asking for this? A: CPU time only accounts for a portion of the resources used by App Engine. When App Engine runs your code it creates an Instance, which has a maximum amount of CPU and Memory that can be used for running a set of your code. Even if the CPU is not currently working because it is waiting for responses, the instance is still resident and considered “in use,” so it still costs Google money. Under the current model, apps that have high latency (in other words, apps that stay resident for long periods of time without doing anything) are not able to scale because it would be cost-prohibitive to Google. This change is designed to allow developers to run any sort of application they would like but pay for all of the resources that are being used.
Q: What does this mean for existing customers? A: Many customers have optimized for low CPU usage to keep bills low, but in turn are often using a large amount of memory (by having high latency applications). This new model will encourage low latency applications even if it means using larger amounts of CPU.
Q: How will always-on work under the new model? A: Still determining how this will work, answer coming very soon (no seriously, we are almost done).
Q: What is the difference between On-demand Instances and Reserved Instances?
A: On-demand Instances have no pre-commitment in terms of the number that will be used. You pay for them as you use them. Reserved Instances are a pre-commitment to a certain number of Instance Hours in a week. They are cheaper, but you must pay for all the Instance Hours that you have pre-committed to whether you use them or not. This does not mean they have to be running the whole time.
Q: Wait, so Reserved instances don’t mean you have to keep them running the whole time? A: No, it is just a way to get cheaper instance-hours by pre-committing to them.
Q: What is the time granularity of the instance pricing? i.e., if I have an instance up for 5 minutes, what am I charged: $0.08 / 60 * 5?
A: Instances are charged for their uptime, which lasts until they have been idle for 15 minutes (at which point the scheduler takes them down). So if you have an on-demand Instance only serving traffic for 5 minutes, you will pay for 5+15 minutes, or $0.08 / 60 * 20 = approximately 2.7 cents.
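The same calculation can be written as a small helper. This is a minimal sketch using the $0.08/hour on-demand rate and the 15 minute idle period from the exchange above; the function and constant names are illustrative only.

```python
ON_DEMAND_RATE_PER_HOUR = 0.08  # dollars per Instance Hour, from the question above
IDLE_TIMEOUT_MINUTES = 15       # idle time before the scheduler takes the Instance down


def on_demand_cost(serving_minutes):
    """Dollars charged for an Instance that serves traffic for serving_minutes."""
    # The Instance stays up (and billed) through the idle period as well.
    billed_minutes = serving_minutes + IDLE_TIMEOUT_MINUTES
    return ON_DEMAND_RATE_PER_HOUR / 60 * billed_minutes


cost_in_cents = on_demand_cost(5) * 100
print(f"{cost_in_cents:.2f} cents")
```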
Q: You seem to be trying to account for RAM in the new model. Will I be able to purchase Frontend Instances that use different amounts of memory? A: We are only planning on having one size of Frontend Instance.
Q: Do Frontend instances handle Task Queues and Cron? A: Yes.
Q: Can the experimental Go Runtime handle concurrent requests? A: Not currently.
Costs
Q: Is the $9/app/month a fee or a minimum spend?
A: Based on the feedback we’ve received, we are changing this $9 from a fee, as originally listed, to a minimum spend. In other words, you will still have to spend $9/month in order to scale, but you won’t pay an additional $9 on top of your first $9 worth of usage each month. The $500/account/month will still be a fee, as it covers the cost of operational support.
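The difference between a fee and a minimum spend is easiest to see with numbers. The sketch below is only an illustration of the answer above for a paid app; the function names are hypothetical and the usage amounts are made-up examples.

```python
MONTHLY_AMOUNT = 9.00  # dollars per app per month


def bill_with_fee(usage_dollars):
    """Originally listed model: the $9 is charged on top of usage."""
    return usage_dollars + MONTHLY_AMOUNT


def bill_with_minimum_spend(usage_dollars):
    """Revised model: usage counts toward the $9, so you pay whichever is larger."""
    return max(usage_dollars, MONTHLY_AMOUNT)


for usage in (4.00, 9.00, 20.00):
    print(usage, bill_with_fee(usage), bill_with_minimum_spend(usage))
```

Under the revised model an app with $20 of usage pays $20 rather than $29, and an app with $4 of usage pays the $9 minimum.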
Q: Will most customers have to move to Paid Apps? A: No, we expect the majority of current active apps will still fall under the free quota.
Q: Will existing apps be grandfathered in and continue under today’s billing model?
A: No, existing apps will fall under the new billing model once App Engine is out of preview.
Q: Will most customers’ bills increase? If so, why is Google increasing the price for App Engine? A: Yes, most paying customers will see higher bills. During the preview phase of App Engine we have been able to observe what it costs to run the product as well as what typical use patterns have been. We are changing the prices now because GAE is going to be a full product for Google and therefore needs to have a sustainable revenue model for years to come.
APIs
Q: How were the APIs priced? A: For the most part the APIs are priced similarly to what they cost today, but rather than charging for CPU hours we are charging for operations. For instance, the Channel API is $0.01/100 channels. This is approximately what users pay today (although it would be paid as a fraction of a CPU hour). The datastore API is the most significantly changed and is described below.
Q: For the items under APIs on the pricing page that just have a check, what does that mean? A: Those items come free with using App Engine.
Q: For XMPP, how does the new model work? How much do presence messages cost?
A: For XMPP we will only be charging an operation fee for outgoing stanzas. Incoming stanzas are just considered requests similar to any other request, so we’ll charge for the bandwidth used as well as whatever it takes to process the request in terms of Instance Hours. We don’t charge for presence messages other than the bandwidth they consume. This is almost exactly how it works today, with the exception that your bill would show CPU hours as opposed to Stanzas.
Q: For Email, how much do incoming emails cost? A: Incoming emails will just be considered requests similar to any other request and so we’ll charge for the bandwidth used as well as whatever it takes to process the request in terms of Instance Hours. This is in essence how it works today.
Q: Will the Front End Cache feature ever be formalized as an expected, documented part of the service offering? A: We are currently looking at various options, but don’t yet have any plans for when this would happen.
Q: What is being charged for in terms of Datastore operations? What do you expect the ratio to be between the new pricing metric and the Datastore API calls metric we have today? A: Today we charge for the CPU consumed per entity written, index written, entity read, query index scanned, and query result read. Under the new model we will charge per operation rather than CPU, and we will no longer charge for query index scans. This means the cost of your queries will be tied exclusively to the size of your result set. We expect the cost of these operations will be approximately 4x the cost of the equivalent CPU under today’s model, but for apps that make heavy use of indexes, this will be somewhat offset by the fact that we will no longer be charging for query index scans. The admin console today shows total Datastore API Calls, but this is not a good gauge of how many operations you will be charged for under the new model. Your costs will be highly dependent on the types and contents of your API calls, not the number of calls themselves, which is what we currently display. For example a single get() API call may retrieve 1 Entity or 100 Entities, and a beginTransaction() API call doesn’t consume any billable resources.
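To illustrate why the API call count is a poor gauge, here is a rough sketch that compares the number of calls with the number of entities those calls touch. The call log and helper are hypothetical, and counting one operation per entity is a simplification of the answer above: it ignores index writes and query results, and it does not reflect actual prices.

```python
# Hypothetical log of API calls: (call name, number of entities touched).
calls = [
    ("get", 1),               # a get() that retrieves a single entity
    ("get", 100),             # a get() that retrieves 100 entities
    ("beginTransaction", 0),  # consumes no billable resources
]


def count_entity_operations(call_log):
    """Sum the entities touched, which is what drives cost in this sketch."""
    return sum(entities for _name, entities in call_log)


api_call_count = len(calls)
entity_operations = count_entity_operations(calls)
print(api_call_count, entity_operations)  # 3 calls, 101 entity operations
```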
Q: Could emails sent to admins be cheaper or free? A: That’s a possibility that we can look into.
Usage Types
Q: What does the Premier cost of "$500/account" mean? Is it per Google Apps Account, per Developer Account, or per Application Owner Account?
A: It is per Organization (which would translate into per Google Apps account if you are currently a Google Apps customer). So, for instance if you are working at gregco.com and you signed up for a Premier account, gregco.com users will be able to create apps which are billed to the gregco.com account.
Q: Will there be special programs for non-profit usage? A: Possibly, we are currently looking into this.
Q: Will there be special programs for educational usage? A: Possibly, we are currently looking into this.
Q: Will there be special programs for open-source projects? A: Possibly, we are currently looking into this.
Usage Limits
Q: If I migrate to HR Datastore, does that mean I have a "newly created" application, and will get the new, lower, free quota for email? Could you grandfather in migrated apps at the old 2000 limit?
A: Yes, we can grandfather in the email quota for HRD apps that are migrating from M/S apps.