I want to create a HAR file of a PUT request. The PUT request uploads
a binary file. As expected, in the resulting HAR file I can see my
uploaded file data in "request.postData.text". "text" is defined in
the spec as "Plain text posted data". Therefore, there is some
encoding that has to happen to represent my binary file as text. But
since there is not an "encoding" attribute in the "postData"
structure, how do I know how to decode "text" to get back to the
original binary data?
In the case of Charles, when it creates the HAR file it treats the
original binary data as text in some platform-specific encoding (on OS
X it chooses MacRoman), then converts that to Unicode and encodes
non-ASCII characters as Unicode escape sequences of the form "\u00ce".
This is bad, because if I'm writing a HAR parser I would need to know
that the original encoding was MacRoman so that I can do the Unicode ->
MacRoman -> binary transformation. But again, this isn't Charles's
fault, it's the spec's.
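
To make the problem concrete, here is a minimal Python sketch of that
round trip. The sample bytes and the helper name recover_bytes are
made up for illustration, and the writer side only imitates the
behaviour described above; it is not Charles's actual code.

```python
import json

# Hypothetical upload body; any bytes outside the ASCII range would do.
raw = b"\x89PNG\r\n\x1a\n\x00"

# Imitate the writer: read the bytes as MacRoman text, then let the JSON
# serializer escape non-ASCII characters as \uXXXX sequences.
har_text = json.dumps({"postData": {"text": raw.decode("mac_roman")}})

def recover_bytes(text: str, assumed_encoding: str) -> bytes:
    """Re-encode postData.text with the encoding the writer is assumed to have used."""
    # errors="replace" keeps a wrong guess from raising; it yields wrong bytes instead.
    return text.encode(assumed_encoding, errors="replace")

text = json.loads(har_text)["postData"]["text"]
recovered = recover_bytes(text, "mac_roman")     # correct guess
wrong_guess = recover_bytes(text, "iso-8859-1")  # incorrect guess
```

Only the correct guess reproduces the original bytes, and nothing in
the HAR file itself tells the parser which guess is correct.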
It seems this problem was solved in 1.2 for response data by adding an
"encoding" attribute. Could this also be added for request data?

Thanks,
Ben
I'm the developer of Charles. It's interesting that Charles just
chooses the platform default encoding in this instance. I'm going to
change Charles to use ISO-8859-1 consistently. I like ISO-8859-1 for
this purpose because interpreting an arbitrary array of bytes with it
is not a lossy conversion, whereas I think UTF-8 can be lossy when the
bytes contain invalid sequences. MacRoman has the same property, so
this change shouldn't have any material impact.
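
The difference between the two encodings can be checked with a short
Python sketch. This only illustrates the round-trip property; the
variable names are arbitrary, and errors="replace" is one assumed way
of handling invalid UTF-8 sequences.

```python
# Every possible byte value, as a stand-in for arbitrary binary data.
raw = bytes(range(256))

# ISO-8859-1 maps each byte to the code point with the same number,
# so decoding and re-encoding returns the original bytes.
as_latin1 = raw.decode("iso-8859-1")
latin1_round_trip = as_latin1.encode("iso-8859-1")

# Strict UTF-8 decoding rejects this input outright.
try:
    raw.decode("utf-8")
    strict_utf8_failed = False
except UnicodeDecodeError:
    strict_utf8_failed = True

# Lenient decoding substitutes U+FFFD for invalid sequences,
# so the original bytes can no longer be recovered.
utf8_round_trip = raw.decode("utf-8", errors="replace").encode("utf-8")
```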
I agree that an "encoding" attribute for postData seems reasonable and
consistent. We'd then include the body as base64 (if necessary) and no
encoding issues would exist. We might need to use base64 whenever we
don't know the encoding, as otherwise we end up doing some undefined
transcoding.
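
A rough Python sketch of how that could look follows. The "encoding"
key on postData is the proposal under discussion, not part of the
current spec; the function names are hypothetical; and treating any
body that is valid UTF-8 as text is an assumption made only for this
example.

```python
import base64

def make_post_data(body: bytes, mime_type: str) -> dict:
    """Build a postData object, falling back to base64 for non-text bodies."""
    try:
        # Assumption for this sketch: a body that is valid UTF-8 is text.
        return {"mimeType": mime_type, "text": body.decode("utf-8")}
    except UnicodeDecodeError:
        return {
            "mimeType": mime_type,
            "text": base64.b64encode(body).decode("ascii"),
            "encoding": "base64",  # proposed attribute, mirroring response content
        }

def read_post_data(post_data: dict) -> bytes:
    """Recover the original request body from a postData object."""
    if post_data.get("encoding") == "base64":
        return base64.b64decode(post_data["text"])
    return post_data["text"].encode("utf-8")

binary_entry = make_post_data(b"\x89PNG\r\n\x1a\n\x00\xff", "image/png")
text_entry = make_post_data("name=caf\u00e9".encode("utf-8"), "text/plain")
```

With the attribute present, a reader can get the bytes back without
knowing anything about the platform that wrote the file.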
cheers,
Karl