Benchmark adb timeout on /proc/timer_list parsing

53 views
Skip to first unread message

Jamshed Vesuna

unread,
Apr 22, 2015, 4:20:39 AM4/22/15
to tele...@chromium.org
About 1 in 3 trials of running a custom benchmark* fails due to an adb timeout. It looks like a timer is calculating differences based on system jiffies by parsing /proc/timer_list on the device (running this on a tablet), but times out as the timer_list file is extremely large and takes a while to parse. 
Has anyone encountered this and found a solution? My output is below.

*Note that my custom benchmark is just a page_cycler benchmark with a custom page set (called url0)


> sudo ./run_benchmark telemetryBenchmarks.url0  

WARNING:root:Found (system,reference), but you do not have a DISPLAY environment set.

WARNING:root:--browser omitted. Using most recent local build: android-jb-system-chrome

[ RUN      ] url0

77 KB/s (3383 bytes in 0.042s)

WARNING:root:Unable to install test certificate authority on device: 300467da71d37100. Will fallback to ignoring certificate errors.

WARNING:root:Invalid output for showmap ['/system/bin/sh: showmap: not found']

WARNING:root:Invalid output for showmap ['/system/bin/sh: showmap: not found']

# 04/21/15 23:16:37: No response for adb -s 300467da71d37100 shell 'cat "/proc/timer_list" 2> /dev/null', retrying

CRITICAL:root:********************************************************************************

CRITICAL:root:Stack dump for thread 'TimeoutThread-1-for-MainThread'

CRITICAL:root:********************************************************************************

CRITICAL:root:File: "/usr/lib/python2.7/threading.py", line 783, in __bootstrap

CRITICAL:root:  self.__bootstrap_inner()

CRITICAL:root:File: "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner

CRITICAL:root:  self.run()

CRITICAL:root:File: "/home/jamshed/src/build/android/pylib/utils/reraiser_thread.py", line 76, in run

CRITICAL:root:  self._ret = self._func(*self._args, **self._kwargs)

CRITICAL:root:File: "/home/jamshed/src/build/android/pylib/utils/timeout_retry.py", line 150, in RunOnTimeoutThread

CRITICAL:root:  ret[0] = func(*args, **kwargs)

CRITICAL:root:File: "/home/jamshed/src/build/android/pylib/device/decorators.py", line 50, in impl

CRITICAL:root:  return f(*args, **kwargs)

CRITICAL:root:File: "/home/jamshed/src/build/android/pylib/device/device_utils.py", line 907, in ReadFile

CRITICAL:root:  return self.old_interface.GetProtectedFileContents(device_path)

CRITICAL:root:File: "/home/jamshed/src/build/android/pylib/android_commands.py", line 1206, in GetProtectedFileContents

CRITICAL:root:  return command_runner(command)

CRITICAL:root:File: "/home/jamshed/src/build/android/pylib/android_commands.py", line 727, in RunShellCommand

CRITICAL:root:  "'%s'" % command, timeout_time).splitlines()

CRITICAL:root:File: "/home/jamshed/src/third_party/android_testrunner/adb_interface.py", line 99, in SendShellCommand

CRITICAL:root:  retry_count=retry_count)

CRITICAL:root:File: "/home/jamshed/src/third_party/android_testrunner/adb_interface.py", line 80, in SendCommand

CRITICAL:root:  retry_count=retry_count)

CRITICAL:root:File: "/home/jamshed/src/third_party/android_testrunner/run_command.py", line 56, in RunCommand

CRITICAL:root:  return_output=return_output, stdin_input=stdin_input)

CRITICAL:root:File: "/home/jamshed/src/third_party/android_testrunner/run_command.py", line 128, in RunOnce

CRITICAL:root:  t.join(timeout_time)

CRITICAL:root:File: "/usr/lib/python2.7/threading.py", line 960, in join

CRITICAL:root:  self.__block.wait(delay)

CRITICAL:root:File: "/usr/lib/python2.7/threading.py", line 358, in wait

CRITICAL:root:  _sleep(delay)

CRITICAL:root:********************************************************************************

Jamshed Vesuna

unread,
Apr 22, 2015, 5:56:11 AM4/22/15
to tele...@chromium.org
For now, I fixed this by changing the command from 'cat /proc/timer_list', which returns about 27,000 lines, to 'grep -i "jiffies:" /proc/timer_list', which returns about 5 lines, speeding up the parsing time.
Message has been deleted
Message has been deleted
Message has been deleted

John Budorick

unread,
Apr 23, 2015, 11:16:05 AM4/23/15
to Ned Nguyen, tele...@chromium.org, rne...@chromium.org
It is, but not for the reason you might think. That stack trace uses a version of DeviceUtils that is almost three months old. perezju@ converted ReadFile off of android_commands in https://codereview.chromium.org/806843002/, which landed in late January.

On Wed, Apr 22, 2015 at 10:32 PM Ned Nguyen <nedn...@google.com> wrote:
+John, Randy

This finding is interesting.


On Wednesday, April 22, 2015 at 1:20:39 AM UTC-7, Jamshed Vesuna wrote:

Ned

unread,
Apr 23, 2015, 2:35:26 PM4/23/15
to John Budorick, tele...@chromium.org, rne...@chromium.org
So I assume that fixing https://code.google.com/p/chromium/issues/detail?id=476709 should also fix this? 

John Budorick

unread,
Apr 23, 2015, 2:40:19 PM4/23/15
to Ned, tele...@chromium.org, rne...@chromium.org
No, that's not relevant to this. The code in question is already using DeviceUtils.ReadFile (instead of the equivalent via old_interface).

I'm not sure if there's still an underlying issue here or not, but the stack trace is obsolete and has been for months. If it recurs with an updated version of the code, I'll be happy to look at it again.

Ned

unread,
Apr 23, 2015, 2:49:18 PM4/23/15
to John Budorick, tele...@chromium.org, rne...@chromium.org
I see, so he's probably using an outdated version of telemetry. 

However, we are still reading '/proc/timer_list' directly & parse jiffies ourselves here: https://code.google.com/p/chromium/codesearch#chromium/src/tools/telemetry/telemetry/core/platform/linux_based_platform_backend.py&l=57

Note that android_platform_backend subclasses linux_based_platform_blackend, so the optimization Jamshed points out still applies.

John Budorick

unread,
Apr 23, 2015, 4:02:38 PM4/23/15
to Ned, tele...@chromium.org, rne...@chromium.org
I'm not sure I understand your meaning about the "optimization Jamshed points out." AndroidPlatformBackend has been using the DeviceUtils.ReadFile interface (rather than the AndroidCommands equivalent via old_interface) since last July: https://codereview.chromium.org/358993003/

Ned

unread,
Apr 23, 2015, 4:56:30 PM4/23/15
to John Budorick, tele...@chromium.org, rne...@chromium.org
Hi John, 
I mean that instead of pulling the huge file "/proct/timer_list" from device to host (https://code.google.com/p/chromium/codesearch#chromium/src/tools/telemetry/telemetry/core/platform/linux_based_platform_backend.py&l=56), and parse line with 'jiffies', we can use 'grep -i "jiffies:" /proc/timer_list" to reduce the size of data we transfer from the device to the host.

John Budorick

unread,
Apr 23, 2015, 5:23:50 PM4/23/15
to Ned, tele...@chromium.org, rne...@chromium.org
Ah, yes, I had definitely lost you. That's a completely reasonable optimization, though it'd obviously be on the telemetry side.

Ned

unread,
Apr 27, 2015, 12:43:42 PM4/27/15
to John Budorick, tele...@chromium.org, rne...@chromium.org
For the record, patch https://codereview.chromium.org/1106043003/ does the optimization mentioned above.

Randy Nephew

unread,
Apr 27, 2015, 12:57:42 PM4/27/15
to Ned, John Budorick, tele...@chromium.org
I'm now seeing this outputed to stdout when running telemetry:
'  .idle_jiffies   : 269191'
'  .last_jiffies   : 269193'
'  .next_jiffies   : 269194'
'jiffies: 269193'
'  .idle_jiffies   : 273945'
'  .last_jiffies   : 273945'
'  .next_jiffies   : 273964'
'jiffies: 273945'
Was this intentional? It seems to add a lot of spam to the output without having any -v flags set.
.....
[ RUN      ] http://siriuslymeg.tumblr.com/
'  .idle_jiffies   : 42552378'
'  .last_jiffies   : 42552378'
'  .next_jiffies   : 42552380'
'jiffies: 42552379'
WARNING:root:Dumpsys battery not receiving updates. Run dumpsys battery reset if this is in error.
'  .idle_jiffies   : 42557231'
'  .last_jiffies   : 42557583'
'  .next_jiffies   : 42557584'
'jiffies: 42557583'
[       OK ] http://siriuslymeg.tumblr.com/ (57065 ms)
[ RUN      ] http://wapbaike.baidu.com/
'  .idle_jiffies   : 42557938'
'  .last_jiffies   : 42557961'
'  .next_jiffies   : 42557962'
'jiffies: 42557962'
WARNING:root:Dumpsys battery not receiving updates. Run dumpsys battery reset if this is in error.
'  .idle_jiffies   : 42562740'
'  .last_jiffies   : 42562740'
'  .next_jiffies   : 42562817'
'jiffies: 42562741'
[       OK ] http://wapbaike.baidu.com/ (51574 ms)
[  PASSED  ] 10 tests.

Ned

unread,
Apr 27, 2015, 1:00:05 PM4/27/15
to Randy Nephew, John Budorick, tele...@chromium.org
I think some implementation of linux_based_platform_backend.RunCommand() output the command's output to the stdout
Reply all
Reply to author
Forward
0 new messages