HTTP Tracing - Export Format    
This document describes the HTTP Archive (HAR) format to be used when exporting data from the Firebug Net panel. The current version of the format isn't finalized and is open to further proposals.

Firebug Net Panel

The purpose of Firebug's Net panel is collecting and displaying various info about network activity related to a web page. This info is useful for verifying page functionality from the network perspective and also for analyzing page load performance. The latter is also one of the Net panel's goals - assisting a web developer to optimize and accelerate page load.

Since tracing data is collected on the client side (within the browser), it's important to be able to export all of it from Firebug so that other tools can process it. This is useful for further analysis by other (custom) tools and could also be useful for statistical page processing.

The Net panel export is currently implemented as an extension (download here) for Firebug 1.4 (a26 and higher); a suitable and flexible export format is the key to this feature. The source code can be downloaded from here.

HTTP Trace Tools

There are several existing tools that can be used for HTTP tracking.
They each have their own advantages. In-browser tools can easily group requests by page and analyze browser cache usage, while network-level tools can easily gather low-level detailed info (for example HTTP compression). But in general, they are all intended to track HTTP traffic.

HTTP Archive v1.1

One of the goals of the HTTP Archive format is to be flexible enough that it can be adopted across projects and tools. This should allow effective processing and analysis of data coming from various sources. Note that the resulting HAR file can contain privacy- and security-sensitive data, and user agents should find some way to notify the user of this fact before they transfer the file to anyone else.

The current proposal described below is based on HTTPWatch's existing XML-based export format, but uses JSON.

HAR files must be saved in UTF-8 encoding; other encodings are forbidden.

A web-based viewer for HTTP Archive data is available here.
A NetExport extension for Firebug 1.4.0a26 and higher is available here.

Exported Data Structure

Let's take a look at the structure definition.

<log>
This object represents the root of all exported data.

{
    "log": {
        "version" : "1.1",
        "creator" : {},
        "browser" : {},
        "pages": [],
        "entries": []
    }
}
  • version [string] - Version number of the format. If empty, string "1.1" is assumed by default.
  • creator [object] - Name and version info of the log creator application.
  • browser [object, optional] - Name and version info of the browser used.
  • pages [array, optional] - List of all exported (tracked) pages. Leave out this field if the application does not support grouping by pages.
  • entries [array] - List of all exported (tracked) requests.

There is one <page> object for every exported web page and one <entry> object for every HTTP request. If an HTTP trace tool isn't able to group requests by page, the <pages> object is empty and individual requests don't have a parent page.
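
The relationship between pages and entries can be sketched in code. The helper below is only an illustration: groupEntriesByPage is a hypothetical name, not part of the format, and it assumes the HAR data has already been parsed into a JavaScript object.

```javascript
// Sketch: group HAR entries under their parent page.
// `groupEntriesByPage` is a hypothetical helper, not defined by the spec.
function groupEntriesByPage(har) {
    var pages = har.log.pages || [];   // "pages" is optional
    var byPage = {};                   // page id -> array of entries
    var orphans = [];                  // entries without a known parent page

    pages.forEach(function (page) {
        byPage[page.id] = [];
    });

    har.log.entries.forEach(function (entry) {
        var ref = entry.pageref;       // optional field
        if (ref !== undefined && Object.prototype.hasOwnProperty.call(byPage, ref)) {
            byPage[ref].push(entry);
        } else {
            orphans.push(entry);
        }
    });

    return { byPage: byPage, orphans: orphans };
}
```

Entries whose pageref is missing, or points to a page that is not listed, end up in orphans, which covers tools that cannot group requests by page.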


<creator> & <browser>
These objects share the following structure.

"creator": {
    "name": "Firebug",
    "version": "1.5"
}

"browser": {
    "name": "Firefox",
    "version": "3.5"
}
  • name [string] - Name of the application/browser used to export the log.
  • version [string] - Version of the application/browser used to export the log.

<pages>

This object represents the list of exported pages.

"pages": [
    {
        "startedDateTime": "2009-04-16T12:07:25.123+01:00",
        "id": "page_0",
        "title": "Test Page",
        "pageTimings": {...}
    }
]

  • startedDateTime [string] - Date and time stamp for the beginning of the page load (ISO 8601 - YYYY-MM-DDThh:mm:ss.sTZD, e.g. 2009-07-24T19:20:30.45+01:00).
  • id [string] - Unique identifier of a page within the <log>. Entries use it to refer to the parent page.
  • title [string] - Page title.
  • pageTimings [object] - Detailed timing info about page load.

<pageTimings>
This object describes timings for various events (states) fired during the page load. All times are specified in milliseconds. If timing info is not available, the corresponding field is set to -1.

"pageTimings": {
    "onContentLoad": 1720,
    "onLoad": 2500
}
  • onContentLoad [number, optional] - Content of the page loaded. Number of milliseconds since page load started (page.startedDateTime). Use -1 if the timing does not apply to the current request.
  • onLoad [number, optional] - Page is loaded (onLoad event fired). Number of milliseconds since page load started (page.startedDateTime). Use -1 if the timing does not apply to the current request.
Depending on the browser, the onContentLoad property represents the DOMContentLoaded event or document.readyState == interactive.
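
Because both timings are offsets from page.startedDateTime, a reader can turn them into absolute times. The following is a minimal sketch; pageEventDate is a hypothetical helper name, and it assumes the JavaScript engine parses ISO 8601 strings with a time zone designator (as ES5 and later engines do).

```javascript
// Sketch: convert a pageTimings offset into an absolute Date.
// `pageEventDate` is a hypothetical helper, not defined by the spec.
function pageEventDate(page, eventName) {
    var timings = page.pageTimings || {};
    var offset = timings[eventName];          // e.g. "onLoad" or "onContentLoad"
    if (typeof offset !== "number" || offset < 0) {
        return null;                          // field left out or set to -1
    }
    var start = Date.parse(page.startedDateTime);
    if (isNaN(start)) {
        return null;                          // unparsable time stamp
    }
    return new Date(start + offset);
}
```

The function returns null rather than a date whenever the timing is missing or -1, so callers can tell "not available" apart from a real value.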


<entries>
This object represents an array with all exported HTTP requests. Sorting entries by startedDateTime (starting from the oldest) is the preferred way to export data since it can make importing faster. However, the reader application should always make sure the array is sorted (if required for the import).
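
A reader that needs sorted input could sort defensively, as in this sketch. The name sortEntriesByStart is an assumption made for illustration. Time stamps are compared as parsed instants rather than as strings, because two entries may carry different time zone designators.

```javascript
// Sketch: return a copy of the entries sorted from oldest to newest.
// `sortEntriesByStart` is a hypothetical helper, not defined by the spec.
function sortEntriesByStart(entries) {
    return entries.slice().sort(function (a, b) {
        // Compare instants, not strings: "+01:00" and "Z" values
        // do not order correctly under plain string comparison.
        return Date.parse(a.startedDateTime) - Date.parse(b.startedDateTime);
    });
}
```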

"entries": [

    {
        "pageref": "page_0",
        "startedDateTime": "2009-04-16T12:07:23.596Z",
        "time": 50,
        "request": {...},
        "response": {...},
        "cache": {...},
        "timings": {}
    }
]
  • pageref [string, unique, optional] - Reference to the parent page. Leave out this field if the application does not support grouping by pages.
  • startedDateTime [string] - Date and time stamp of the request start (ISO 8601 - YYYY-MM-DDThh:mm:ss.sTZD).
  • time [number] - Total elapsed time of the request in milliseconds. This is the sum of all timings available in the timings object (i.e. not including -1 values).
  • request [object] - Detailed info about the request.
  • response [object] - Detailed info about the response.
  • cache [object] - Info about cache usage.
  • timings [object] - Detailed timing info about request/response round trip.

<request>
This object contains detailed info about the performed request.

"request": {
    "method": "GET",
    "url": "http://www.example.com/path/?param=value",
    "httpVersion": "HTTP/1.1",
    "cookies": [],
    "headers": [],
    "queryString" : [],
    "postData" : {},
    "headersSize" : 150,
    "bodySize" : 0
},
  • method [string] - Request method (GET, POST, ...).
  • url [string] - Absolute URL of the request (fragments are not included).
  • httpVersion [string] - Request HTTP Version.
  • cookies [array] - List of cookie objects.
  • headers [array] - List of header objects.
  • queryString [array] - Structured (parsed) info about the query string.
  • postData [object, optional] - Posted data info.
  • headersSize [number] - Total number of bytes from the start of the HTTP request message until (and including) the double CRLF before the body. Set to -1 if the info is not available.
  • bodySize [number] - Size of the request body (POST data payload) in bytes. Set to -1 if the info is not available.
The total request size sent can be computed as follows (if both values are available):
var totalSize = entry.request.headersSize + entry.request.bodySize;


<response>
This object contains detailed info about the response.

"response": {
    "status": 200,
    "statusText": "OK",
    "httpVersion": "HTTP/1.1",
    "cookies": [],
    "headers": [],
    "content": {},
    "redirectURL": "",
    "headersSize" : 160,
    "bodySize" : 850
 },
  • status [number] - Response status.
  • statusText [string] - Response status description.
  • httpVersion [string] - Response HTTP Version.
  • cookies [array] - List of cookie objects.
  • headers [array] - List of header objects.
  • content [object] - Details about the response body.
  • redirectURL [string] - Redirection target URL from the Location response header.
  • headersSize [number] - Total number of bytes from the start of the HTTP response message until (and including) the double CRLF before the body. Set to -1 if the info is not available.
  • bodySize [number] - Size of the received response body in bytes. Set to zero in case of responses coming from the cache (304). Set to -1 if the info is not available.

The total response size received can be computed as follows (if both values are available):
var totalSize = entry.response.headersSize + entry.response.bodySize;
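
Both size formulas only hold when neither value is -1. A guarded variant might look like the sketch below; totalMessageSize is a hypothetical helper name that works for either a request or a response object.

```javascript
// Sketch: total size of a request or response, or null if unknown.
// `totalMessageSize` is a hypothetical helper, not defined by the spec.
function totalMessageSize(message) {
    var headers = message.headersSize;
    var body = message.bodySize;
    if (typeof headers !== "number" || typeof body !== "number") {
        return null;               // a size field is missing
    }
    if (headers < 0 || body < 0) {
        return null;               // -1 means "info not available"
    }
    return headers + body;
}
```

For example, totalMessageSize(entry.response) yields the received size when both values are known, and null otherwise instead of a misleading sum that includes -1.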


<cookies>
This object contains the list of all cookies (used in <request> and <response> objects).

"cookies": [
    {
        "name": "TestCookie",
        "value": "Cookie Value",
        "path": "/",
        "domain": "www.janodvarko.cz",
        "expires": "2009-07-24T19:20:30.123+02:00",
        "httpOnly": false
    }
]
  • name [string] - The name of the cookie.
  • value [string] - The cookie value.
  • path [string, optional] - The path pertaining to the cookie.
  • domain [string, optional] - The host of the cookie.
  • expires [string, optional] - Cookie expiration time. (ISO 8601 - YYYY-MM-DDThh:mm:ss.sTZD, e.g. 2009-07-24T19:20:30.123+02:00).
  • httpOnly [boolean, optional] - Set to true if the cookie is HTTP only, otherwise false.

<headers>
This object contains the list of all headers (used in <request> and <response> objects).

"headers": [
    {
        "name": "Accept-Encoding",
        "value": "gzip,deflate"
    },
    {
        "name": "Accept-Language",
        "value": "en-us,en;q=0.5"
    },
    ...
]


<queryString>
This object contains the list of all parameters & values parsed from a query string, if any (embedded in the <request> object).

"queryString": [
    {
        "name": "param1",
        "value": "value1"
    },
    {
        "name": "param1",
        "value": "value1"
    }
]

HAR format expects NVP (name-value pairs) formatting of the query string.
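
One way an exporter could build this list is sketched below. It relies on the standard URL and URLSearchParams APIs found in current browsers and Node.js, which is an assumption about the environment; queryStringPairs is a hypothetical helper name. Note that URLSearchParams percent-decodes names and values, and the format description does not say whether decoded or raw values are expected.

```javascript
// Sketch: derive name-value pairs from a request URL.
// `queryStringPairs` is a hypothetical helper, not defined by the spec.
function queryStringPairs(url) {
    var pairs = [];
    // searchParams iterates in document order and keeps repeated names.
    new URL(url).searchParams.forEach(function (value, name) {
        pairs.push({ name: name, value: value });
    });
    return pairs;
}
```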


<postData>
This object describes posted data, if any (embedded in <request> object).

"postData": {
    "mimeType": "multipart/form-data",
    "params": [
        {
            "name": "paramName",
            "value": "paramValue",
            "fileName": "example.pdf",
            "contentType": "application/pdf"
        }
    ],
    "text" : "plain posted data"
}

  • mimeType [string] - Mime type of posted data.
  • params [array] - List of posted parameters (in case of URL encoded parameters).
  • text [string] - Plain text posted data.
  • params.name [string] - name of a posted parameter.
  • params.value [string, optional] - value of a posted parameter or content of a posted file.
  • params.fileName [string, optional] - name of a posted file.
  • params.contentType [string, optional] - content type of a posted file.

Note that text and params fields are mutually exclusive.
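
A consumer could check this constraint with something like the following sketch. The helper name checkPostData is hypothetical, and treating "mutually exclusive" as "not both non-empty" is an interpretation, not something the text above states explicitly.

```javascript
// Sketch: report problems found in a postData object.
// `checkPostData` is a hypothetical helper; the exclusivity test below
// assumes that an empty array or empty string counts as "not present".
function checkPostData(postData) {
    var problems = [];
    var params = Array.isArray(postData.params) ? postData.params : [];
    var hasParams = params.length > 0;
    var hasText = typeof postData.text === "string" && postData.text.length > 0;

    if (typeof postData.mimeType !== "string") {
        problems.push("mimeType must be a string");
    }
    if (hasParams && hasText) {
        problems.push("text and params are mutually exclusive");
    }
    params.forEach(function (param, index) {
        if (typeof param.name !== "string") {
            problems.push("params[" + index + "].name must be a string");
        }
    });
    return problems;
}
```

An empty result means no problem was detected by these particular checks; it is not a full validation of the object.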


<content>
This object describes details about the response content (embedded in the <response> object).

"content": {
    "size": 33,
    "compression": 0,
    "mimeType": "text/html; charset=utf-8",
    "text": "<html><head></head><body/></html>\n"
}

  • size [number] - Length of the returned content in bytes. Should be equal to response.bodySize if there is no compression and bigger when the content has been compressed.
  • compression [number, optional] - Number of bytes saved. Leave out this field if the information is not available.
  • mimeType [string] - MIME type of the response text (value of the Content-Type response header). The charset attribute of the MIME type is included (if available).
  • text [string, optional] - Response body sent from the server or loaded from the browser cache. This field is populated with textual content only. This text is HTTP decoded (decompressed & unchunked), then transcoded from its original character set into UTF-8. Leave out this field if the information is not available.

<cache>
This object contains info about a request coming from the browser cache.

"cache": {
    "beforeRequest": {},
    "afterRequest": {}
}
  • beforeRequest [object, optional] - State of a cache entry before the request. Leave out this field if the information is not available.
  • afterRequest [object, optional] - State of a cache entry after the request. Leave out this field if the information is not available.

This is how the object should look if no cache information is available (or you can just leave out the entire field).

"cache": {}

This is how the object should look if the info about the cache entry before the request is not available and there is no cache entry after the request.

"cache": {
    "afterRequest": null
}

This is how the object should look if there is no cache entry before or after the request.

"cache": {
    "beforeRequest": null,
    "afterRequest": null
}

This is how the object should look to indicate that the entry was not in the cache but was stored after the content was downloaded by the request.

"cache": {
    "beforeRequest": null,
    "afterRequest": {
        "expires": "2009-04-16T15:50:36",
        "lastAccess": "2009-04-16T15:50:34",
        "eTag": "",
        "hitCount": 0
    }
}

Both the beforeRequest and afterRequest objects share the following structure.

"beforeRequest": {
    "expires": "2009-04-16T15:50:36",
    "lastAccess": "2009-04-16T15:50:34",
    "eTag": "",
    "hitCount": 0
}
  • expires [string, optional] - Expiration time of the cache entry.
  • lastAccess [string] - The last time the cache entry was opened.
  • eTag [string] - ETag of the cache entry.
  • hitCount [number] - The number of times the cache entry has been opened.
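
The examples above distinguish three situations for each field: left out (info not available), null (no cache entry), and an object (entry exists). A reader might encode that distinction as in this sketch; cacheEntryState and wasStoredByRequest are hypothetical helper names.

```javascript
// Sketch: interpret the beforeRequest / afterRequest fields.
// Both helpers are hypothetical and not defined by the spec.
function cacheEntryState(cache, field) {
    if (!cache || cache[field] === undefined) {
        return "unknown";          // field (or the whole cache object) left out
    }
    if (cache[field] === null) {
        return "absent";           // explicitly no cache entry
    }
    return "present";              // an object describing the entry
}

// True when the entry was not cached before the request but is afterwards.
function wasStoredByRequest(cache) {
    return cacheEntryState(cache, "beforeRequest") === "absent" &&
           cacheEntryState(cache, "afterRequest") === "present";
}
```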


<timings>
This object describes various phases within the request-response round trip. All times are specified in milliseconds.

"timings": {
    "blocked": 0,
    "dns": -1,
    "connect": 15,
    "send": 20,
    "wait": 38,
    "receive": 12
}
  • blocked [number, optional] - Time spent in a queue waiting for a network connection. Use -1 if the timing does not apply to the current request.
  • dns [number, optional] - DNS resolution time. The time required to resolve a host name. Use -1 if the timing does not apply to the current request.
  • connect [number, optional] - Time required to create TCP connection. Use -1 if the timing does not apply to the current request.
  • send [number] - Time required to send HTTP request to the server.
  • wait [number] - Waiting for a response from the server.
  • receive [number] - Time required to read entire response from the server (or cache).

The send, wait and receive timings are not optional and must have non-negative values.

An exporting tool can omit the blocked, dns and connect timings on every request if it is unable to provide them. Tools that can provide these timings can set their values to -1 if they don’t apply. For example, connect would be -1 for requests that re-use an existing connection.

The time value for the request must be equal to the sum of the timings supplied in this section (excluding any -1 values).
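
This rule can be checked mechanically. The sketch below sums the six timings listed above while skipping omitted fields and -1 values; sumTimings and timeMatchesTimings are hypothetical helper names. It uses exact equality, which is fine for integer milliseconds but may need a tolerance if a tool writes fractional values.

```javascript
// Sketch: verify that entry.time equals the sum of the supplied timings.
// Both helpers are hypothetical and not defined by the spec.
function sumTimings(timings) {
    var names = ["blocked", "dns", "connect", "send", "wait", "receive"];
    var total = 0;
    names.forEach(function (name) {
        var value = timings[name];
        // Skip omitted fields and -1 ("does not apply") values.
        if (typeof value === "number" && value >= 0) {
            total += value;
        }
    });
    return total;
}

function timeMatchesTimings(entry) {
    return entry.time === sumTimings(entry.timings);
}
```

With the sample timings object shown earlier in this section, the sum is 0 + 15 + 20 + 38 + 12 = 85, since dns is -1 and is excluded.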

Custom Fields

The specification allows adding new custom fields to the output format. The following rules must be applied:

  • Custom fields and elements MUST start with an underscore (spec fields should never start with an underscore).
  • Parsers MUST ignore all custom fields and elements if the file was not written by the same tool loading the file.
  • Parsers MUST ignore all non-custom fields that they don't know how to parse because the minor version number is greater than the maximum minor version for which they were written.
  • Parsers can reject files that contain non-custom fields that they know were not present in a specific version of the spec.
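
One simple way for a parser to ignore custom fields written by another tool is to drop them while loading, as in this sketch. The helper name stripCustomFields is hypothetical, and removing the fields (rather than just skipping them on access) is a design choice made here for illustration.

```javascript
// Sketch: return a deep copy of parsed HAR data without custom fields,
// i.e. without any property whose name starts with an underscore.
// `stripCustomFields` is a hypothetical helper, not defined by the spec.
function stripCustomFields(value) {
    if (Array.isArray(value)) {
        return value.map(stripCustomFields);
    }
    if (value !== null && typeof value === "object") {
        var copy = {};
        Object.keys(value).forEach(function (key) {
            if (key.charAt(0) !== "_") {
                copy[key] = stripCustomFields(value[key]);
            }
        });
        return copy;
    }
    return value;   // strings, numbers, booleans, null
}
```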

Versioning Scheme

The spec number has the following syntax:


<major-version-number>.<minor-version-number>


Where the major version indicates overall backwards compatibility and the minor version indicates incremental changes. So, any backward-compatible change to the spec will result in an increase of the minor version. If an existing field had to be broken, the major version would increase (e.g. 2.0).


Examples:

1.2 -> 1.3

1.111 -> 1.112 (in case of 111 more changes)

1.5 -> 2.0 (2.0 is not compatible with 1.5)



So the following construct can be used to detect an incompatible version if a tool supports HAR since 1.1.

if (majorVersion != 1 || minorVersion < 1)
{
    throw "Incompatible version";
}


In this example a tool throws an exception if the version is e.g.: 0.8, 0.9, 1.0, but works with 1.1, 1.2, 1.112 etc. Version 2.x would be rejected.
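
The check above presumes that majorVersion and minorVersion have already been extracted from the version string. A sketch of that step follows; parseHarVersion and isCompatibleVersion are hypothetical helper names. The two parts are parsed as separate integers, because treating "1.112" as a decimal number would wrongly make it smaller than "1.2". An empty or missing version falls back to "1.1", as described for the version field of the log object.

```javascript
// Sketch: split a HAR version string into integer major and minor parts.
// Both helpers are hypothetical and not defined by the spec.
function parseHarVersion(version) {
    var text = (version === undefined || version === "") ? "1.1" : String(version);
    var match = /^(\d+)\.(\d+)$/.exec(text);
    if (!match) {
        throw new Error("Malformed HAR version: " + text);
    }
    return { major: parseInt(match[1], 10), minor: parseInt(match[2], 10) };
}

// Same rule as the construct above, for a tool that supports HAR since 1.1.
function isCompatibleVersion(version) {
    var parsed = parseHarVersion(version);
    return parsed.major === 1 && parsed.minor >= 1;
}
```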






Jul 22 2009 by Jan Odvarko