Controlling the maximum number of instances / service in App Engine Standard?


Attila-Mihaly Balazs

May 23, 2017, 08:17:49
to Google App Engine
I have an app (Python 2.7, App Engine standard, EU region) which uses CloudSQL to store some low-volume data (like user profiles). Unfortunately CloudSQL has two limitations with regard to the maximum number of connections:

- a global limit of 4000 (https://cloud.google.com/sql/faq - for second-generation instances)
- a per-instance limit of 12 (from the same URL as above - "Each App Engine instance running in a standard environment cannot have more than 12 concurrent connections to a Google Cloud SQL instance")

To make sure that I respect this I can take two approaches, neither of them foolproof:

- use frontend (F) instances with automatic scaling and set "max_concurrent_requests" sufficiently low. This solves the second issue, but if there are enough requests that GAE scales my app to many instances, I can break the global connection limit (and there is no "max_num_instances" or something similar for the auto scaler)
- use backend (B) instances, where I can specify "max_instances". However, I don't know how many requests are routed to one instance when "threadsafe: true" is set in app.yaml (though I suspect not many), and I frequently get the "Request was aborted after waiting too long to attempt to service your request." error because I have too few instances (I'm running in the EU region, so I can't start more than 25 instances with the basic scaler :( ).

One quick hack would be to use the backend instances but also run multiple services (like backend0, backend1, backend2, etc) and have the client randomly pick one (I control both the server and the client) to work around the 25 instance / service limit, however that feels very much like a hack :(.
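
The client-side half of that workaround could look roughly like the sketch below. The application ID and the number of services are made-up values for illustration; the only thing taken from App Engine itself is the `service-dot-app-id.appspot.com` hostname form used to address a specific service.

```python
import random

# Hypothetical values, for illustration only.
APP_ID = "my-app"      # assumed application ID
NUM_BACKENDS = 4       # assumed number of deployed services backend0..backend3


def pick_backend_host(rng=random):
    """Return the hostname of one randomly chosen backendN service."""
    index = rng.randrange(NUM_BACKENDS)
    return "backend{}-dot-{}.appspot.com".format(index, APP_ID)


if __name__ == "__main__":
    print(pick_backend_host())
```

With 25 instances per service, N such services raise the ceiling to 25 * N instances, at the cost of the client having to know N.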

Nicholas (Google Cloud Support)

May 25, 2017, 16:14:07
to Google App Engine
Those are legitimate concerns when dealing with Cloud SQL at this scale.  I would recommend as you've suggested that you keep the max_concurrent_requests sufficiently low to avoid the per instance limit.  This can also tend to keep your instances somewhat more responsive in general.

Regarding the 4000 connection limit, this limit may not seem like much at a large scale but it can be stretched.  Depending on the nature of what those connections are doing, you may be able to offload read-only operations to a read replica, likely reducing the number of concurrent connections required for the master and read replica(s).
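
As a rough illustration of offloading reads, the sketch below routes plain SELECT statements to replicas in round-robin order and everything else to the primary. The endpoint names are hypothetical, and the prefix check is deliberately naive; a real application would more likely choose the endpoint per code path than by inspecting SQL text.

```python
import itertools

# Hypothetical endpoint names, for illustration only.
PRIMARY = "primary-instance"
REPLICAS = ["read-replica-0", "read-replica-1"]

_replica_cycle = itertools.cycle(REPLICAS)


def choose_endpoint(statement):
    """Send plain SELECTs to a replica (round-robin), everything else to the primary."""
    if statement.lstrip().upper().startswith("SELECT"):
        return next(_replica_cycle)
    return PRIMARY


if __name__ == "__main__":
    print(choose_endpoint("SELECT name FROM profiles WHERE id = 1"))
    print(choose_endpoint("UPDATE profiles SET name = 'x' WHERE id = 1"))
```

Replicas can lag behind the primary, so reads that must see a write that was just made would still need to go to the primary.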

Note that there are also other storage services that may be more appropriate depending on the type of data stored, the volume of data, the availability you need and the scale.  For low-volume data (like user profiles), Cloud Datastore may be better suited.  Not knowing exactly what your needs are, I'd recommend reviewing each storage option and its limitations to decide what's best for your use case.

Jeff Schnitzer

May 26, 2017, 09:11:27
to Google App Engine
I’m confused - this is “low-volume data” but you're worried about exceeding 4000 connections? Maybe I’m missing something in translation; how many QPS do you expect?

4000 active connections is pretty crazy, even for a high-traffic system. http://www.mysqlcalculator.com/ expects that to consume about 11G on the server. 

Just use a connection pool with a hard cap of 12 and a modest timeout. You want to avoid leaving a lot of idle connections around, but if you have 4000 actually busy connections I would be worried more about your single node melting down. You probably want to consider federation or alternative database architectures.
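
A minimal sketch of such a per-instance pool follows, assuming "timeout" means an idle timeout after which unused connections are closed. The class name, its parameters and the fake connection are all hypothetical; `connect` stands for whatever zero-argument factory opens a real database connection.

```python
import threading
import time
from contextlib import contextmanager


class CappedPool(object):
    """Per-instance pool: at most `max_size` open connections, idle ones expire."""

    def __init__(self, connect, max_size=12, idle_timeout=30.0, clock=time.time):
        self._connect = connect            # zero-argument factory returning a connection
        self._idle_timeout = idle_timeout  # seconds an unused connection may stay open
        self._clock = clock
        self._slots = threading.BoundedSemaphore(max_size)
        self._lock = threading.Lock()
        self._idle = []                    # (connection, time it was returned)

    def reap(self):
        """Close idle connections that have been unused for longer than idle_timeout."""
        now = self._clock()
        with self._lock:
            stale = [c for c, t in self._idle if now - t > self._idle_timeout]
            self._idle = [(c, t) for c, t in self._idle if now - t <= self._idle_timeout]
        for conn in stale:
            conn.close()

    @contextmanager
    def connection(self):
        self._slots.acquire()              # blocks while max_size connections are in use
        try:
            self.reap()
            with self._lock:
                conn = self._idle.pop()[0] if self._idle else None
            if conn is None:
                conn = self._connect()     # only reached when no idle connection exists
            try:
                yield conn
            except Exception:
                conn.close()               # do not reuse a connection after a failure
                raise
            with self._lock:
                self._idle.append((conn, self._clock()))
        finally:
            self._slots.release()


class FakeConnection(object):
    """Stand-in for a real database connection (illustration only)."""

    def close(self):
        pass


if __name__ == "__main__":
    pool = CappedPool(FakeConnection, max_size=12, idle_timeout=30.0)
    with pool.connection() as conn:
        print("got", conn)
```

The semaphore is what enforces the hard cap: a thirteenth concurrent caller waits for a slot instead of opening another connection. Stale connections are only closed when `reap()` runs, which here happens on each checkout.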

Jeff

--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to google-appengine+unsubscribe@googlegroups.com.
To post to this group, send email to google-appengine@googlegroups.com.
Visit this group at https://groups.google.com/group/google-appengine.
To view this discussion on the web visit https://groups.google.com/d/msgid/google-appengine/679e332c-21a1-4a38-b5bf-ab7abd681437%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Alex Martelli

May 26, 2017, 13:15:21
to google-a...@googlegroups.com
On Fri, May 26, 2017 at 6:10 AM, Jeff Schnitzer <je...@infohazard.org> wrote:
> I’m confused - this is “low-volume data” but you're worried about exceeding 4000 connections? Maybe I’m missing something in translation; how many QPS do you expect?

I'm not sure why the QPS, per se, would matter here; rather, the key load metric would appear to be the peak number of requests actively being served (which depends both on peak QPS and on time needed to service each request).
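
This relationship is Little's law: the average number of requests in flight equals the arrival rate multiplied by the time each request spends being served. A small worked example, with made-up traffic figures, shows why the same QPS can mean very different connection counts:

```python
def requests_in_flight(peak_qps, seconds_per_request):
    """Little's law: average concurrency = arrival rate * time in system."""
    return peak_qps * seconds_per_request


if __name__ == "__main__":
    # Hypothetical load: 200 requests/second in both cases.
    print(requests_in_flight(200, 0.25))  # fast requests -> 50.0 in flight
    print(requests_in_flight(200, 5.0))   # slow requests -> 1000.0 in flight
```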
 

> 4000 active connections is pretty crazy, even for a high-traffic system. http://www.mysqlcalculator.com/ expects that to consume about 11G on the server.

Yes. However, with, say, 12 connections per instance, having 334 instances would surpass 4000 connections; and, using autoscaled front-ends on App Engine standard, there is no way to guarantee there will never be 334 or more instances active.
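
The arithmetic behind the 334 figure, using only the two limits quoted in the thread:

```python
GLOBAL_LIMIT = 4000       # Cloud SQL connections, from the FAQ quoted above
PER_INSTANCE_LIMIT = 12   # connections per App Engine standard instance

# Largest instance count that cannot exceed the global limit.
safe_instances = GLOBAL_LIMIT // PER_INSTANCE_LIMIT

if __name__ == "__main__":
    print(safe_instances, safe_instances * PER_INSTANCE_LIMIT)  # 333 3996
    print((safe_instances + 1) * PER_INSTANCE_LIMIT)            # 4008
```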
 

> Just use a connection pool with a hard cap of 12 and a modest timeout. You want to avoid

I imagine you mean a pool per instance (it's not that clear how to pool across instances) and that does allow you to put a cap of 12 connections per instance. However, it doesn't cap the number of instances (you can't, with autoscaling).
 
> leaving a lot of idle connections around, but if you have 4000 actually busy connections I would be worried more about your single node melting down. You probably want to

You mean the node (not directly under my control) that's serving CloudSQL?
 
> consider federation or alternative database architectures.

An "alternative architecture" is the approach I would take: a separate service (implemented with B-class instances and thus subject to a cap on the maximum number of instances), each instance with its own connection pool, keeping the front-end service totally free of any SQL connection. Where the front-end service now does a SQL update or query, it would instead queue up a task for the separate service (presumably using task queues, though maybe pubsub could also be considered). Details such as the exact number of instances of the separate service, and on what basis to scale them up or down (maybe from a third, "monitoring" service watching the task queues' length), would need to be tweaked based on load-testing. The hard-coded limit would be no more than N/12 separate-service instances, N being at most 4,000 but maybe less if avoiding overload on the CloudSQL server improves things; that 12 may also be negotiable downwards if fewer connections in the per-instance pool improve things.
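
The shape of that design can be sketched in a single process, with an in-memory queue standing in for the task queue and threads standing in for the capped B-class instances. This is only an illustration of the data flow, written for Python 3; the function names, the SQL statement and the `execute` callback are hypothetical and no App Engine API is used.

```python
import queue
import threading

GLOBAL_LIMIT = 4000                       # Cloud SQL connection limit from the thread
POOL_SIZE = 12                            # per-instance pool size; may be tuned downwards
MAX_WORKERS = GLOBAL_LIMIT // POOL_SIZE   # hard cap on SQL-service instances (N/12)

tasks = queue.Queue()                     # stand-in for a task queue


def frontend_handle(user_id, name):
    """Front end: never opens a SQL connection, only enqueues the work."""
    tasks.put(("UPDATE profiles SET name = %s WHERE id = %s", (name, user_id)))


def sql_worker(execute):
    """SQL-service instance: drains the queue and runs each statement."""
    while True:
        item = tasks.get()
        if item is None:                  # sentinel: shut this worker down
            break
        statement, params = item
        execute(statement, params)


def run_demo(num_workers=3, num_requests=10):
    """Push a few updates through the queue and return what the workers ran."""
    assert num_workers <= MAX_WORKERS
    executed = []
    lock = threading.Lock()

    def execute(statement, params):       # stand-in for running SQL via a pool
        with lock:
            executed.append(params)

    workers = [threading.Thread(target=sql_worker, args=(execute,))
               for _ in range(num_workers)]
    for worker in workers:
        worker.start()
    for i in range(num_requests):
        frontend_handle(i, "user%d" % i)
    for _ in workers:
        tasks.put(None)                   # one sentinel per worker, after all tasks
    for worker in workers:
        worker.join()
    return executed


if __name__ == "__main__":
    print(len(run_demo()), "statements executed by at most", MAX_WORKERS, "workers")
```

The point of the indirection is that the number of database connections now depends on the number of workers, which is capped, rather than on the number of autoscaled front-end instances, which is not.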


Alex
 


Attila-Mihaly Balazs

June 11, 2017, 11:56:05
to Google App Engine
Thank you Alex for articulating the reasons so clearly.

To sum it up: in the current setup you can't use Django on AppEngine with FE instances (and CloudSQL), since you can pretty easily DoS yourself (the auto-scaler spins up too many instances and thus quickly exhausts the connection limit). IMHO this can have an easy fix: add a setting to the auto scaler saying "never start more than X instances".

Alex Martelli

June 11, 2017, 12:40:45
to google-a...@googlegroups.com

This seems to me to be a very reasonable feature request. If you could please open it on the public issue tracker, that would fit best in our workflow (tuned for the normal case of users opening issues, rather than the rarer one of googlers opening them on users' behalf).


Thanks,

Alex

 


Attila-Mihaly Balazs

June 14, 2017, 07:28:04
to Google App Engine
Good idea Alex. I added one here: https://issuetracker.google.com/u/1/issues/62611440

Thanks,
Attila