App Engine sometimes needs to stop backend instances for reasons other than your manual intervention. You can register a shutdown hook to save the state of a half-finished job within the 30-second window you get before the instance is stopped.
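A minimal sketch of such a hook using the Python backends runtime API (the memcache key and the checkpoint variable are placeholders for whatever state your job actually needs to persist):

from google.appengine.api import memcache, runtime

# Hypothetical checkpoint updated by the long-running job's main loop.
last_processed_row = 0

def shutdown_hook():
    # App Engine calls this when it decides to stop the backend instance;
    # you get roughly 30 seconds to persist any half-finished work.
    memcache.set('job-checkpoint', last_processed_row)

runtime.set_shutdown_hook(shutdown_hook)

Long-running loops can also poll runtime.is_shutting_down() and exit cleanly when it returns True.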
I believe it's hanging when I'm calling the AdWords APIs (I have tried breaking these calls up a bit, waiting a little between batches, and not caching the urlfetch), but I lose some of the logging when I get this error, so I can't tell whether local variables are eating up all the memory or whether I need to break up the API calls further. Is there any way to help me diagnose this?
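For reference, AppStats (which produced the trace output below) is enabled by wrapping the WSGI app in appengine_config.py; a minimal sketch using the standard SDK middleware:

# appengine_config.py
def webapp_add_wsgi_middleware(app):
    # Record RPC timings for every request so /_ah/stats can show them.
    from google.appengine.ext.appstats import recording
    return recording.appstats_wsgi_middleware(app)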
service.call            #RPCs   real time
rdbms.Exec               1004    78364ms
urlfetch.Fetch            721   266851ms
rdbms.CloseConnection     497     7027ms
rdbms.OpenConnection      497     7599ms
rdbms.ExecOp              491    48623ms
logservice.Flush           10       41ms
memcache.Set                1        6ms
memcache.Get                1        7ms
@576702ms urlfetch.Fetch real=59837ms api=0ms cost=0 billed_ops=[]
Request: URLFetchRequest<Method=2, Url='https://adwords.google.com/api/a.../v201306/AdGroupCriterionService', ...>
Response: URLFetchResponse<>
OK - here's what I see. The script ran two out of two times successfully with AppStats enabled (of course). There are many, many rdbms and AdWords API calls, but the longest real time spent was on a single AdWords API call: 60 seconds.
- The first time I ran the script successfully with AppStats I got an "Invalid or stale record" error when trying to view the timeline. I made no changes, waited about an hour, then reran the script and it worked...
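Since the default urlfetch deadline is much shorter than 60 seconds, a slow AdWords call like the one in that trace only completes if the deadline has been raised; a sketch of doing that globally (the 60-second value is just an assumption matching the trace above):

from google.appengine.api import urlfetch

# Raise the default deadline for all urlfetch calls in this request,
# so a slow AdWords SOAP call is not cut off early.
urlfetch.set_default_fetch_deadline(60)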
1. Copied all code (including dependencies) to a Google Cloud Storage bucket.
2. Created a bash shell script to be run as a startup script by the GCE instance; it downloads all required libraries and the code from the bucket, and copies the syslog to another bucket upon completion. Put this script in its own bucket on GCS. Side note: the adspygoogle API code logs a lot at the info level if you make a lot of calls, so I forked a copy and changed the log level to debug.
Example library download (used the default debian-7-wheezy-v20130926 image):
apt-get -y install python-mysqldb
Example copy from the GCS bucket via metadata. The cs-bucket field is just the string value of the name of your bucket. Since gsutil copies the files into a folder named after the bucket, I then had to copy the files into my working directory to access them while running my main code. I very much like working with the preinstalled gsutil - would love to see a similar utility for Cloud SQL that does more than administration (e.g. one that can query the databases).
DEST_DIR=$(pwd)
CS_BUCKET=$(curl http://metadata/computeMetadata/v1beta1/instance/attributes/cs-bucket)
gsutil cp -R gs://$CS_BUCKET $DEST_DIR
cp -r $CS_BUCKET/. $DEST_DIR
3. Reserved an external IP on Compute Engine. I need the external IP to support Cloud SQL access, and I authorized this IP explicitly per the Cloud SQL instructions here: https://developers.google.com/cloud-sql/docs/external. A minimal connection sketch follows.
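Connecting to Cloud SQL over that external path from the GCE instance looks something like this; the host IP, credentials, and database name are placeholders, and MySQLdb is what the python-mysqldb package installed above provides:

import MySQLdb  # from the python-mysqldb package in step 2

# Placeholders: your Cloud SQL instance's assigned IP, user, password, database.
db = MySQLdb.connect(host='173.194.0.1', user='root',
                     passwd='my-password', db='adwords_reports')
cursor = db.cursor()
cursor.execute('SELECT 1')
print cursor.fetchone()
db.close()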
4. Modified the module code on GAE to just start a GCE instance. Useful information can be found here: https://github.com/GoogleCloudPlatform/compute-getting-started-python. I added metadata for the GCS bucket locations (startup script and code) - see the sketch after the natIP example below - added the external IP (natIP) to the instance, and toned down some of the logging (the startup script takes a while to run, so I increased the buffer time between attempts to shut down the instance).
If using the gce file from the information above, then adding an external IP looks something like this, where external_ip is the string of the address:
instance['networkInterfaces'] = [{
    'accessConfigs': [{
        'type': 'ONE_TO_ONE_NAT',
        'name': 'External NAT',
        'natIP': external_ip
    }],
    'network': '%s/global/networks/%s' % (self.project_url, network)
}]
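The metadata mentioned in step 4 can be attached to the same instance dict in a similar way; the bucket names here are placeholders, the cs-bucket key is the one the startup script reads back in step 2, and startup-script-url is the standard GCE metadata key for a script stored in GCS:

instance['metadata'] = {'items': [
    # Standard GCE key: the instance fetches and runs this script at boot.
    {'key': 'startup-script-url', 'value': 'gs://my-startup-bucket/startup.sh'},
    # Custom key read back by the startup script via the metadata server
    # (the curl .../attributes/cs-bucket call in step 2).
    {'key': 'cs-bucket', 'value': 'my-code-bucket'},
]}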