Frustrated by logs

155 views
Skip to first unread message

Kristopher Giesing

unread,
May 24, 2013, 11:51:23 AM5/24/13
to google-a...@googlegroups.com
I'm having a devil of a time using the logs to debug my application.

There's a basic pattern of access that I'm trying to troubleshoot.  In order to do this, I would like to analyze correlations between the following properties of requests:

- Request path
- Request status
- Instance ID used to serve the request

If I could filter on the first two properties in the admin console, I would probably be able to diagnose the issue.  But that never seems to work properly; the admin console seems to give me some random subset of the relevant logs - sometimes none, sometimes a few, but never all of them.

So I tried downloading the logs using appcfg.sh, and then filtering the logs myself.  That works - except that the amount of data per request that you get from downloading logs doesn't include the instance ID used to serve the request!

I'm getting enormously frustrated by this.  Am I missing something about the admin console, or is its search function really this broken?  Is there some way to get more verbose logging from the logs-download feature of appcfg.sh?

Any insights or help would be greatly appreciated.

Thanks,

- Kris

Takashi Matsuo

unread,
May 24, 2013, 2:03:52 PM5/24/13
to google-a...@googlegroups.com

Hi Kristopher,

On Fri, May 24, 2013 at 8:51 AM, Kristopher Giesing <kris.g...@gmail.com> wrote:
I'm having a devil of a time using the logs to debug my application.

There's a basic pattern of access that I'm trying to troubleshoot.  In order to do this, I would like to analyze correlations between the following properties of requests:

- Request path
- Request status
- Instance ID used to serve the request

If I could filter on the first two properties in the admin console, I would probably be able to diagnose the issue.  But that never seems to work properly; the admin console seems to give me some random subset of the relevant logs - sometimes none, sometimes a few, but never all of them.

Yes, the process stops searching when it hits a certain amount of time limit. However, if you click 'Next Page' on the result, you can navigate through the result. I hope it works for you.
 

So I tried downloading the logs using appcfg.sh, and then filtering the logs myself.  That works - except that the amount of data per request that you get from downloading logs doesn't include the instance ID used to serve the request!

I'm getting enormously frustrated by this.  Am I missing something about the admin console, or is its search function really this broken?  Is there some way to get more verbose logging from the logs-download feature of appcfg.sh?
 
The instance id is available via 'INSTANCE_ID' environment variable. How about to write the instance_id to application logs, so that you can download them with --severity option?

But yes, I understand you saying there should be a way to easily include the instance ids in the downloaded logs. Can you file a feature request for that?

-- Takashi


Any insights or help would be greatly appreciated.

Thanks,

- Kris

--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to google-appengi...@googlegroups.com.
To post to this group, send email to google-a...@googlegroups.com.
Visit this group at http://groups.google.com/group/google-appengine?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.
 
 



--
Takashi Matsuo | Developers Programs Engineer | tma...@google.com

Vinny P

unread,
May 24, 2013, 2:49:27 PM5/24/13
to google-a...@googlegroups.com
Hello Kristopher,


On Friday, May 24, 2013 10:51:23 AM UTC-5, Kristopher Giesing wrote:
I'm getting enormously frustrated by this.  Am I missing something about the admin console, or is its search function really this broken?  Is there some way to get more verbose logging from the logs-download feature of appcfg.sh?

Yup, you're correct. Searching logs can be a PITA.

Personally, I use a combination of a homebuilt logging inspector (a backend using the Logging API to inspect error logs) and Google BigQuery to analyze my logs. Streak wrote up a blog entry about using Google BQ to inspect logging, you can read it here: http://blog.streak.com/2012/07/export-your-google-app-engine-logs-to.html .



-----------------
-Vinny P
Technology & Media Advisor
Chicago, IL

My Go side project: http://invalidmail.com/

Takashi Matsuo

unread,
May 24, 2013, 3:24:24 PM5/24/13
to google-a...@googlegroups.com

As Vinny said, certainly BigQuery is something you can consider.
Here is another project for ingesting App Engine logs into BigQuery:

-- Takashi


--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to google-appengi...@googlegroups.com.
To post to this group, send email to google-a...@googlegroups.com.
Visit this group at http://groups.google.com/group/google-appengine?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Matthew Blain

unread,
May 24, 2013, 5:00:47 PM5/24/13
to google-a...@googlegroups.com
The instance is available today in the logs you can download. I recently updated logparser.py to not fail when it shows up in the logs.

(Looks like I didn't update the docstring for this, but the fundamentals are correct.)

Takashi Matsuo

unread,
May 24, 2013, 5:32:16 PM5/24/13
to google-a...@googlegroups.com

Thanks Matthew,

I confirmed that you can download logs with instance_id by running the Python appcfg.py with the '--include_all' option as follows:
$ appcfg.py request_logs --oauth2 -A APP_ID --version VERSION --include_all LOG_FILENAME

Unfortunately, it doesn't seem Java appcfg have that option yet though.

Nick

unread,
May 25, 2013, 4:01:12 AM5/25/13
to google-a...@googlegroups.com
The logs, while good to have, are a super PITA to use.

Things that would be great to fix:
  • The search option labelled 'Since' is exactly the opposite of what it says. Everyone i've ever spoken to gets confused by this, the logs work backwards from this time, not forwards. I must have helped at least 10 people understand this to use the logs.
  • On the dashboard you can see errors in the last 24 hours, and if you click through they take you into the logs to see what the problem is. I have never, ever seen this work. The filter it applies (status and path) flat out doesn't work and apparently never has.
  • It would be super, super useful if you could filter out different log levels. For example, if you filter for anything with info and above, you still see all debug logs. This means even when looking for errors or warnings, you have to look by eye through plenty of lines of logs. An extra option to show only log levels above a certain level would be really useful
  • The ability to control how long to search backwards through logs would be great. Sometimes, you really need to know if something happened at all in the last month, and i'd rather pay to search a months or a years worth of logs than sit there clicking 'Use Next link to search older records.'. Maybe having the timestamp filter as a 'from' and 'to', rather than 'Since' would help.
I appreciate that distributed logging and live log search is a non-trivial problem, but it feels like someone already has done all the hard work and that the log viewer just sort of stopped progressing at some point.
Having said that, the addition of being able to search again for older entries back through time was really useful. Before that you had to be lucky enough to catch problems pretty quickly.

Jeff Schnitzer

unread,
May 25, 2013, 3:37:57 PM5/25/13
to Google App Engine
I do want to point out that the log system is actually one of GAE's best features. If you spend any time working on Heroku or Appfog or EB you will be utterly shocked at how primitive the logging systems there are.

This is not to say that GAE logging is perfect, but it's *miles* ahead of other PaaS systems. My biggest complaint with GAE's log system is that it doesn't remember the state of the UI - how many times have I changed the dropdown from 20 to 50 entries? I'm half-tempted to write a chrome extension to "fix" this problem.

Jeff

Kristopher Giesing

unread,
May 25, 2013, 4:39:08 PM5/25/13
to google-a...@googlegroups.com
Comments below...


On Saturday, May 25, 2013 1:01:12 AM UTC-7, Nick wrote:
The logs, while good to have, are a super PITA to use.

Things that would be great to fix:
  • The search option labelled 'Since' is exactly the opposite of what it says. Everyone i've ever spoken to gets confused by this, the logs work backwards from this time, not forwards. I must have helped at least 10 people understand this to use the logs.
This was the crux of my issue.  I suspected that "since" didn't do what I wanted, but I had no idea what it actually did.

Now that I know how the system works, I have to say, the UI is awfully misleading.  I know Google isn't exactly famous for its HCI, but... jeez. I've never seen a search return an empty results list and then give you a "next page" link.  "Next page" of what result set?  Isn't it empty?  It doesn't make any sense.

Imagine using Google to search the web and have it give you an empty result set, with a link at the bottom that says in tiny print "Last IP address searched: 53.207.121.84. Use the Next link to search higher IP ranges."  Would you use that search engine?

I understand that the search can take a long time if you search *everything*.  This is why I thought "Since" would you a maximum time into the past where the search would halt.  And additionally the search would stop after the maximum rows specified in the "Rows" pop-up.  If that was the way the search feature actually worked, it would be pretty useful; you'd just have to change the default for "Since" to (now minus 1 hour) rather than (now) and viola, very usable UI.
  • On the dashboard you can see errors in the last 24 hours, and if you click through they take you into the logs to see what the problem is. I have never, ever seen this work. The filter it applies (status and path) flat out doesn't work and apparently never has.
It *does* actually work... as long as you click "next" an indeterminate number of times.  The filter is correct, but the time range searched is an arbitrary distance from the request log you want.  I didn't realize this until this thread educated me on how the time ranges really function.

- Kris

Kristopher Giesing

unread,
May 25, 2013, 4:41:06 PM5/25/13
to google-a...@googlegroups.com
Thanks Takashi.  I think I used the Python appcfg.py script for something else a long time ago - so IIRC it's possible to use this script with Java apps.  Is that true?

- Kris

Kristopher Giesing

unread,
May 25, 2013, 7:35:18 PM5/25/13
to google-a...@googlegroups.com, je...@infohazard.org
And as an addendum, once I finally understood how to get to the logs I needed, I was able to diagnose the issue (at least from the server side - I now know the fault is somewhere on the client).  Without being able to trace all the events, including server instance IDs, I wouldn't have been able to make that determination.

David Hardwick

unread,
May 27, 2013, 8:47:11 PM5/27/13
to google-a...@googlegroups.com
We used Splunk to analyze our GAE logs and it was extremely helpful.  It can be costly, but the insight it gives you can also save you a lot of costs too.

Takashi Matsuo

unread,
May 28, 2013, 2:08:57 AM5/28/13
to google-a...@googlegroups.com
On Sat, May 25, 2013 at 1:41 PM, Kristopher Giesing <kris.g...@gmail.com> wrote:
Thanks Takashi.  I think I used the Python appcfg.py script for something else a long time ago - so IIRC it's possible to use this script with Java apps.  Is that true?

Yes!
Reply all
Reply to author
Forward
0 new messages