WTF Testing Service API

Adrian Yee

Jun 20, 2011, 9:22:48 PM
to web-testin...@googlegroups.com
Hi Everyone,

We really liked the idea of the Web Testing Framework APIs and wanted to
create an implementation of it for GTmetrix. Upon development however,
we found that the Testing Framework API wasn't quite complete, and parts
of it were a little too WebPagetest specific. To address these issues,
we've taken the API and essentially "completed" it by filling the
missing pieces and adding some changes to make it more generic and RESTful.

We're hoping this will kick off further discussion and finalize a
framework that can be used by everyone. Please take a look at our
proposed API specification here:

http://gtmetrix.com/api/

A few thoughts regarding the API:

* We should have a core set of standard test parameters. Not all test
services will support them, but the names and values will be standard.
For non-standard (or yet to be standard) parameters, I propose that
these be sent as x-{service}-{param} (eg. x-metrix-adblock).

* The parameters that the GTmetrix API currently supports are NOT a
complete list of what the WTF API will support.

* We've tried to make the API more RESTful by making the responses a
little more self documenting. You should be able to figure out what to
do from just the response. For example, the response to starting a test
contains a poll_state_url attribute with the URL to poll, instead of
having the user put the URL together themselves.

* With a poll state response, we return the state (so the user knows
when to stop polling) and when the test is complete, the results and
URLs to other downloadable resources.

* The results object will vary depending on the service. GTmetrix
returns a summary of the test results, but other services may return,
for example, a HAR file.

* The resources object may not exist for some services. GTmetrix
returns URLs for other downloadable resources like the HAR file or Page
Speed and YSlow beacons or PDF report. We think this is better than
encoding large chunks of binary data and putting it into a HAR file.

* We've also added the /locations request for returning a list of the
test locations. We might want to also incorporate what user agents are
available in each location.
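To make the response shapes above concrete, here is a sketch of the start-test and poll-state exchange. Only the field names mentioned in this thread (poll_state_url, state, results, resources) come from the proposal; the hostnames, paths, and other fields are illustrative assumptions, not the real GTmetrix API.

```python
import json

# Hypothetical response to starting a test.  Only "poll_state_url" is
# from the proposed spec; the URL and "test_id" are made-up examples.
start_response = json.loads("""
{
  "test_id": "abc123",
  "poll_state_url": "https://example.test/api/0.1/test/abc123"
}
""")

# Hypothetical poll-state response once the test has completed.  The
# contents of "results" and "resources" will vary by service.
poll_response = json.loads("""
{
  "state": "completed",
  "results": {"page_load_time": 2300},
  "resources": {
    "har": "https://example.test/api/0.1/test/abc123/har",
    "report_pdf": "https://example.test/api/0.1/test/abc123/report.pdf"
  }
}
""")

# The self-documenting idea: a client only follows URLs the service
# hands back, rather than assembling them itself.
print(start_response["poll_state_url"])
print(poll_response["state"])
```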

Authentication

This was discussed on the list previously, but there was no real
consensus on what to do. We decided to keep things simple and stick
with the standard HTTP Basic Access Authentication route. This, coupled
with HTTPS, should keep things secure without having to reinvent the wheel.
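For anyone implementing a client, Basic Access Authentication is just a base64-encoded `user:password` pair in the Authorization header. A minimal sketch (the credentials are the well-known RFC 2617 example, not real ones):

```python
import base64

def basic_auth_header(username, password):
    # HTTP Basic auth: base64("user:password").  Safe only because the
    # API is served over HTTPS, so the header is never sent in the clear.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}

# The classic RFC 2617 example credentials:
header = basic_auth_header("Aladdin", "open sesame")
print(header["Authorization"])  # Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
```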

Response Format

Again, for the sake of simplicity, we've used REST principles and HTTP
status codes for status and errors, passing the error message in the
content body as JSON.
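In practice that means a client branches on the HTTP status code first and only then parses the JSON body. A sketch, under the assumption that error bodies carry a top-level "error" key (the exact key name is not yet pinned down in the spec):

```python
import json

def interpret_response(status_code, body):
    # Status and errors ride on the HTTP status code; the JSON body
    # carries the payload or, for errors, a human-readable message
    # (assumed key: "error").
    if 200 <= status_code < 300:
        return ("ok", json.loads(body))
    return ("error", json.loads(body).get("error", "unknown error"))

print(interpret_response(200, '{"state": "queued"}'))
print(interpret_response(402, '{"error": "Insufficient API credits"}'))
```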

More Discussion

* How can we make the API work with services which are able to return a
result (almost) immediately (eg. Page Speed Online)? Perhaps the
response from the start test can return results/resources?

* What about returning the queue position with the test state?

* Is there any way to get JSONP working with all this?

* Should we also support returning XML responses, as well as JSON?

* Do we want to support callbacks?

Based off this revised specification, we've created an implementation of
the API for GTmetrix. We invite the web performance community to take a
closer look and share their thoughts.

Hopefully this brings us one step closer to finalizing the WTF API
specification!

Adrian

P.S. Unfortunately, we weren't able to make it to Velocity this year, so
please excuse us if some of the above has already been discussed.

Pat Meenan

Jun 22, 2011, 12:17:45 PM
to web-testin...@googlegroups.com
Awesome, thanks for pushing forward on this. If it's not too much to
ask, would you mind updating the project wiki with the changed
interfaces (unless there are any objections)?
http://code.google.com/p/web-testing-framework/wiki/TestingServiceAPI I
added your account as a member, just let me know if I need to add a
different account.

My comments inline below...

> * We should have a core set of standard test parameters. Not all test
> services will support them, but the names and values will be standard.
> For non-standard (or yet to be standard) parameters, I propose that
> these be sent as x-{service}-{param} (eg. x-metrix-adblock).

Sounds good to me.

> * We've tried to make the API more RESTful by making the responses a
> little more self documenting. You should be able to figure out what to
> do from just the response. For example, the response to starting a
> test contains a poll_state_url attribute with the URL to poll, instead
> of having the user put the URL together themselves.
>
> * With a poll state response, we return the state (so the user knows
> when to stop polling) and when the test is complete, the results and
> URLs to other downloadable resources.

Does the initial response also include the state as well as the URL? I
was basically shooting to have the initial response just be the same as
the poll and results responses (depending on whether the test is finished or not).

> * The results object will vary depending on the service. GTmetrix
> returns a summary of the test results, but other services may return,
> for example, a HAR file.

As much as possible I would like to standardize this (and I think it's
pretty critical that we do). My main question is if we want to use HAR
as the response container (and just populate it as much as makes sense
for a given service) and extend the HAR spec as needed or if we want a
structured summary response that tells you where you can go to get the HAR.

> * The resources object may not exist for some services. GTmetrix
> returns URLs for other downloadable resources like the HAR file or
> Page Speed and YSlow beacons or PDF report. We think this is better
> than encoding large chunks of binary data and putting it into a HAR file.

Are there services where the response will be transitional and not
stored on the server (and do we have to plan for that)? I could see
extending the HAR spec to allow for including optimization checks
(though a report PDF, etc., not so much). Is the PDF report generated
from the other available data and just in a different format, or does it
include information not otherwise available? If it can be reconstructed
from the HAR and optimization data, then I'd love to see that be a
service that sits on top of something like the data access APIs
(basically a service that can take your resulting HAR and turn it into a
PDF report).

> * How can we make the API work with services which are able to return
> a result (almost) immediately (eg. Page Speed Online)? Perhaps the
> response from the start test can return results/resources?

My thinking was that in this case the response from submitting the test
request would be the same as you would get if you went to retrieve the
results later (basically what you are proposing). That would allow for
stateless servers that just return the result immediately and don't have
to keep anything on the server for later.

> * What about returning the queue position with the test state?

Yes, please :-) (optional since it may not make sense for everyone).

> * Is there any way to get JSONP working with all this?

Without auth or if the auth could be done using query parameters then it
should be trivial. Doing it with any of the auth schemes we discussed,
not so much. I don't know that it's really a big deal but it does mean
that you won't be able to have a completely browser-based interface that
talks cross domain to a testing service (well, easily anyway).

> * Should we also support returning XML responses, as well as JSON?

My vote would be no (even though that's my current API). There are
reasonable JSON libraries for all languages and it's generally a whole
lot easier than XML to parse. Allowing for both will just mean that
some tools will not work with some services because the service may or
may not have implemented the XML interfaces.

> * Do we want to support callbacks?

Yes. The main use case I see for this is daisy-chaining the testing to
the data APIs so you can have the test results pushed directly into the
data layer without having to have an active agent do a pull/push
manually. It streamlines the flow a bunch (we probably need to support
multiple callbacks so people can tee the data into several different
processing pipelines).
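A sketch of what registering multiple callbacks at test-start time might look like. The parameter name "callback_urls" and both endpoints are hypothetical placeholders, not part of any agreed spec:

```python
# Daisy-chaining sketch: the start-test request registers several
# callback URLs so the service can tee results into multiple pipelines
# when the test completes.  Everything here is illustrative.
test_request = {
    "url": "http://example.com/",
    "callback_urls": [
        "https://data-api.example.net/ingest",   # hypothetical data layer
        "https://dashboard.example.net/notify",  # hypothetical dashboard
    ],
}

for callback in test_request["callback_urls"]:
    # The service would POST the completed results to each URL.
    print("would notify:", callback)
```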

> Based off this revised specification, we've created an implementation
> of the API for GTmetrix. We invite the web performance community to
> take a closer look and share their thoughts.
>
> Hopefully this brings us one step closer to finalizing the WTF API
> specification!

Sweet, thank you.

-Pat

Adrian Yee

Jun 22, 2011, 4:29:55 PM
to web-testin...@googlegroups.com
On 06/22/11 09:17, Pat Meenan wrote:
> Awesome, thanks for pushing forward on this. If it's not too much to
> ask, would you mind updating the project wiki with the changed
> interfaces (unless there are any objections)?
> http://code.google.com/p/web-testing-framework/wiki/TestingServiceAPI I
> added your account as a member, just let me know if I need to add a
> different account.

Will do when I get a chance. Would like to get more discussion going
and get a general agreement on things before updating it though.

>> * We've tried to make the API more RESTful by making the responses a
>> little more self documenting. You should be able to figure out what to
>> do from just the response. For example, the response to starting a
>> test contains a poll_state_url attribute with the URL to poll, instead
>> of having the user put the URL together themselves.
>>
>> * With a poll state response, we return the state (so the user knows
>> when to stop polling) and when the test is complete, the results and
>> URLs to other downloadable resources.
>
> Does the initial response also include the state as well as the url? I
> was basically shooting to have the initial response just be the same as
> the poll and results responses (depending if the test is finished or not).

It currently doesn't, but it probably should. In this case,
poll_state_url won't exist and state = "completed". Would test errors
be handled the same way as polling (i.e. state = "error" and error =
"message")? It seems a bit weird that in this case we don't use the HTTP
status codes to report an error, but to keep things consistent with
polling, I'm willing to accept this inconsistency.
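Under that proposal, the body's "state" field becomes the single thing a client switches on, for both polling and the initial response. A sketch of what that client logic might look like (the state values "completed" and "error" are from this thread; everything else is assumed):

```python
import json

def handle_poll(body):
    # Per the proposal above, "state" drives the client even for test
    # errors, so the initial and poll responses stay consistent.
    data = json.loads(body)
    state = data["state"]
    if state == "completed":
        return ("done", data.get("results"))
    if state == "error":
        return ("failed", data.get("error"))
    return ("keep polling", None)

print(handle_poll('{"state": "queued"}'))
print(handle_poll('{"state": "error", "error": "Page timed out"}'))
```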

>> * The results object will vary depending on the service. GTmetrix
>> returns a summary of the test results, but other services may return,
>> for example, a HAR file.
>
> As much as possible I would like to standardize this (and I think it's
> pretty critical that we do). My main question is if we want to use HAR
> as the response container (and just populate it as much as makes sense
> for a given service) and extend the HAR spec as needed or if we want a
> structured summary response that tells you where you can go to get the HAR.

I'm not sure you can standardize on the response. Services are all
going to return completely different things and forcing everything into
a HAR file seems inefficient. As you can see with GTmetrix, it can
return a whole slew of HAR, beacons, and other files, but some other
service might only return a single score.

A user of the API may not want all the resources returned from the test,
so returning a large chunk of data would be wasteful in that case. Add
to that the space and processing required to Base64 encode/decode binary
data (in the case of WebPagetest, this could be a large video). I guess
you could specify which resources you wanted when you start the test.

One thing I forgot to take into consideration was returning multiple
results (i.e. first and repeat views). results should probably be
changed to an array to support this, and resources would be moved into
results.
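A sketch of what that restructured response might look like: "results" as an array with one entry per view, each carrying its own "resources". The "view" key and the URLs are illustrative assumptions:

```python
import json

# Proposed change sketched out: "results" becomes an array (one entry
# per view) and "resources" moves inside each result.  Field names other
# than state/results/resources are made up for illustration.
poll_body = json.loads("""
{
  "state": "completed",
  "results": [
    {"view": "first",  "resources": {"har": "https://example.test/1.har"}},
    {"view": "repeat", "resources": {"har": "https://example.test/2.har"}}
  ]
}
""")

for result in poll_body["results"]:
    print(result["view"], "->", result["resources"]["har"])
```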

>> * The resources object may not exist for some services. GTmetrix
>> returns URLs for other downloadable resources like the HAR file or
>> Page Speed and YSlow beacons or PDF report. We think this is better
>> than encoding large chunks of binary data and putting it into a HAR file.
>
> Are there services where the response will be transitional and not
> stored on the server (and do we have to plan for that)? I could see
> extending the HAR spec to allow for including optimization checks -
> though a report PDF, etc not so much). Is the PDF report generated from
> the other available data and just in a different format or does it
> include information not otherwise available? If it can be reconstructed
> from the HAR and optimization data then I'd love to see that be a
> service that sits on top of something like the data access API's
> (basically a service that can take your resulting HAR and turn it into a
> PDF report).

With GTmetrix a PDF report is just a PDF version of the report page that
GTmetrix creates on the web front end.

I don't think you can standardize everything (I'd like to be proven
wrong though!) since it needs to be flexible enough for usage that we
don't envision.

Adrian
