Re: Stability of the JSON formats from Telemetry


Ned

Mar 2, 2018, 9:58:56 AM
to Egor Pasko, chrome-spe...@google.com, telemetry, gb...@chromium.org, benjh...@chromium.org

On Fri, Mar 2, 2018 at 9:37 PM Egor Pasko <pa...@chromium.org> wrote:
Hello Telemetry owners,

George (gbiv@) and I are considering using the "raw JSON" exported from results.html in Colaboratory notebooks. In the past I used chartjson, and I have now discovered that there is also "merged JSON" (which I assume to be a few "raw" JSONs merged together, though I'm not sure).

For Colab import, some sort of validation of the data format would be nice (if done as part of the notebook). This makes us wonder about a few things:

1. What are the recommended formats for the use case of data analysis over a number of Telemetry run comparisons?
2. Are these formats stabilized? Do they change frequently?
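
A minimal sketch of the kind of in-notebook validation mentioned above. It assumes only that the exported file is a JSON array of objects; the function name `validate_export` and the `required_keys` default are hypothetical placeholders, not part of any Telemetry API, and should be adjusted to whatever keys the export actually contains.

```python
import json


def validate_export(text, required_keys=("name",)):
    """Parse exported JSON text and run basic shape checks.

    Assumption: the export is a JSON array of objects. `required_keys`
    is a hypothetical placeholder; replace it with keys your data has.
    """
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError(
            "expected a top-level JSON array, got %s" % type(data).__name__)
    for index, entry in enumerate(data):
        if not isinstance(entry, dict):
            raise ValueError("entry %d is not a JSON object" % index)
    # Entries missing a required key are reported, not rejected, since a
    # file may legitimately mix several kinds of objects.
    incomplete = [
        index for index, entry in enumerate(data)
        if any(key not in entry for key in required_keys)
    ]
    return data, incomplete
```

Calling this at the top of a notebook makes a format change show up as an early, explicit error (or as a non-empty `incomplete` list) rather than as a confusing failure further down.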

Thanks!

Egor Pasko

Mar 2, 2018, 10:03:00 AM
to telemetry, gb...@chromium.org, Ned Nguyen, benjh...@chromium.org

Ned

Mar 2, 2018, 10:04:00 AM
to Egor Pasko, telemetry, gb...@chromium.org, benjh...@chromium.org, Ethan Kuefner, Simon Hatch
Ooops, sorry.

-chrome-speed-services@ to avoid cross posting @google & @chromium

Ben Hayden

Mar 2, 2018, 12:53:29 PM
to Ned, Egor Pasko, telemetry, gb...@chromium.org, Ethan Kuefner, Simon Hatch
We'd like to deprecate chartjson and promote use of HistogramSet JSON for all analysis of telemetry and chromeperf dashboard data.

HistogramSet JSON doesn't change frequently, but there are a few changes in the pipeline:
  • Sample objects,
  • deprecating RelatedHistogramMap and RelatedHistogramBreakdown, and
  • possibly eventually a deeper rearchitecting of HistogramSet JSON to reduce repetition of diagnostic keys and type names in order to reduce memory and bandwidth usage.
With that in mind, I'd like to see if there's a way to use the Python HistogramSet API in colab so that you don't need to reimplement the format over and over.
In catapult#3806, I speculated about deploying a kernel containing that API, plus possibly some helper functions to fetch data from the dashboard and generate tables and charts.
Does that kernel sound like a good idea? Would you be interested in setting that up?

Juan Antonio Navarro Pérez

Mar 5, 2018, 4:33:57 AM
to Ben Hayden, Ned, Egor Pasko, telemetry, gb...@chromium.org, Ethan Kuefner, Simon Hatch
My bet is that the path of least friction would be to use the Python HistogramSet API, as Ben suggested, to load the histograms and then dump the data you need into some CSV files (I haven't tried it myself, but there is also a histograms2csv script already). Those files can then be easily imported and worked with in colab.

After we all gather a bit more experience doing this we will probably have a better picture of what a "colab kernel for histograms" will look like. (That sounds like an exciting idea too!)


Juan Antonio Navarro Pérez

Mar 5, 2018, 11:46:46 AM
to Ben Hayden, Ned, Egor Pasko, telemetry, gb...@chromium.org, Ethan Kuefner, Simon Hatch
For what it's worth, given a results.html file and using the scripts in tracing/bin, I just did:

$ results2json results.html results.json
$ histograms2csv results.json results.csv

Then I imported the CSV into a colab and it worked wonderfully.

George Burgess

Apr 5, 2018, 6:08:21 AM
to per...@google.com, benjh...@chromium.org, Ned Nguyen, pa...@chromium.org, tele...@chromium.org, eaku...@chromium.org, simon...@chromium.org
Thanks for all the pointers, especially to tracing/bin!

It looks like the Python API gives all the functionality we could ask for here. histograms2csv doesn't quite cover all the information we need: IIRC, it was lacking the pageset repeat number, and I couldn't figure out how to get it to give non-summarized data. (My memory is hazy though, since I did this 3 weeks ago and meant to update this thread with my findings then...) Still, whipping up a quick script using the provided API was really straightforward.

Making a colab kernel once we have a better idea of what all we want sounds like a good idea. I'm not anywhere near a colab expert, though, so... :)