Adding media type for sources to source maps

Andy Sterland

Jun 2, 2014, 7:01:28 PM
to dev-js-s...@lists.mozilla.org, Jonathan Turner, Dan Moseley, Ron Buckton
Hi all,

I'm looking to get some feedback on a proposal to add a prefixed* property to the source map schema to describe the language of a source file. The proposal covers both files that are compiled from one source language and the rarer case of multiple languages contributing to the compiled file that executes. Though not covered in the proposal below, it could also be useful to include the language of the generated file, e.g. text/css or application/javascript.

For the case where there is only one source language the value of the x_ms_mediaTypes property is an array with only one string defining a media type for all the source files. The string itself should be a media type as defined by RFC6838. As there really isn't an exhaustive list of media types it's really an opaque string between source map producers and consumers but ideally producers should document their media type.

Single source language example:
{
"version": 3,
"file": "combined.js",
"sources": ["a.ts", "b.ts"],
"names": ["method"],
"mappings": "AAAA,...",
"x_ms_mediaTypes": ["application/x.typescript;version=1.0.3.0"]
}
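
A consumer would probably want to split such a string into the bare media type and its parameters before choosing a language service. The following is a minimal sketch of that step, not part of the proposal; the function name `parse_media_type` is hypothetical and the parsing is deliberately simpler than the full RFC 6838 grammar.

```python
def parse_media_type(value):
    """Split a media type string into (type, parameters).

    Assumption: parameters are plain `name=value` pairs separated by `;`,
    which covers strings such as the TypeScript example in this thread.
    """
    essence, *raw_params = [part.strip() for part in value.split(";")]
    params = {}
    for raw in raw_params:
        if not raw:
            continue  # tolerate a trailing semicolon
        name, _, param_value = raw.partition("=")
        params[name.strip().lower()] = param_value.strip().strip('"')
    return essence.lower(), params


media_type, params = parse_media_type("application/x.typescript;version=1.0.3.0")
# media_type is "application/x.typescript"; params is {"version": "1.0.3.0"}
```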

For the case where there is mixed source content the source map would look like:
{
"version": 3,
"file": "combined.js",
"sources": ["a.ts", "b_old.ts"],
"names": ["method"],
"mappings": "AAAA,...",
"x_ms_mediaTypes": ["application/x.typescript;version=1.0.3.0", "application/x.typescript;version=1.0.0.0"],
"x_ms_sourceMediaTypes": [0, 1]
}
In this case the x_ms_mediaTypes array is a list of all the unique media types, and an additional property, x_ms_sourceMediaTypes, maps source files to media types using array indices. In the example above, the source file b_old.ts is at index 1 of the sources array; the entry at index 1 of the x_ms_sourceMediaTypes array has the value 1, which points to index 1 of x_ms_mediaTypes, whose value is "application/x.typescript;version=1.0.0.0".
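
The lookup described here could be sketched roughly as follows. This is an illustration under stated assumptions rather than the F12 or TypeScript implementation: the function name `media_types_for_sources` is hypothetical, and a missing x_ms_sourceMediaTypes is taken to mean that the first media type applies to every source.

```python
def media_types_for_sources(source_map):
    """Return one media type (or None) per entry in `sources`."""
    sources = source_map.get("sources", [])
    media_types = source_map.get("x_ms_mediaTypes")
    if not media_types:
        # The property is optional, so older maps simply yield no information.
        return [None] * len(sources)

    indices = source_map.get("x_ms_sourceMediaTypes")
    if indices is None:
        # Single-language case: the one media type covers all sources.
        indices = [0] * len(sources)
    if len(indices) != len(sources):
        raise ValueError("x_ms_sourceMediaTypes must parallel the sources array")

    return [media_types[index] for index in indices]


mixed_map = {
    "version": 3,
    "sources": ["a.ts", "b_old.ts"],
    "x_ms_mediaTypes": [
        "application/x.typescript;version=1.0.3.0",
        "application/x.typescript;version=1.0.0.0",
    ],
    "x_ms_sourceMediaTypes": [0, 1],
}
resolved = media_types_for_sources(mixed_map)
# resolved[1] is the media type for b_old.ts
```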

The main scenario we want to support is removing ambiguity around the source language. In the IE F12 developer tools we load a language service that handles things such as syntax coloring and determining the span of statements. F12 has language services for JavaScript, TypeScript, CSS, CoffeeScript, XML, HTML, C#, etc. The trouble with source maps is that we don't always know which language service to load for source-mapped files. Today we pick the language service based on the URI and the Content-Type header. That has had mixed success: more than a few sites send information that causes F12 to load the wrong language service, and there is no workaround. Beyond the language itself, the version of the language is virtually impossible to work out, yet it can matter because language versions can differ syntactically and so require a different language service.

We're looking at adding this in both the TypeScript compiler and the F12 debugger in IE and would love to get feedback from everyone as it seems like it would be useful for all.

Thoughts?

-Andy


* With the hope of it becoming part of the standard in the future.

Brian Slesinsky

Jun 2, 2014, 8:00:35 PM
to Andy Sterland, Jonathan Turner, Dan Moseley, dev-js-s...@lists.mozilla.org, Ron Buckton
It makes sense to me.

- Brian




Fitzgerald, Nick

Jun 2, 2014, 8:22:55 PM
to dev-js-s...@lists.mozilla.org
We talked about this a while back:
https://groups.google.com/d/msg/mozilla.dev.js-sourcemap/bZGz99CsN6I/5UiGQ3v69RoJ

What I like about your proposal is that it's backwards compatible,
because it feels too late to mess with the sourcesContent.

Overall I like it! I've added a couple notes inline below.

Also, sorry to begin the bikeshedding, but I'd like to propose that we
name the new property "sourcesMediaType" which is in the same vein as
"sourcesContent" for adding more info to the "sources" property. Nice to
be consistent, right? :)

On 6/2/14, 4:01 PM, Andy Sterland wrote:
> For the case where there is only one source language the value of the x_ms_mediaTypes property is an array with only one string defining a media type for all the source files. The string itself should be a media type as defined by RFC6838. As there really isn't an exhaustive list of media types it's really an opaque string between source map producers and consumers but ideally producers should document their media type.

So you're saying that the one media type would apply for all sources?
That works well when you have only one media type, but what about when
you have a bunch of typescript files and one js file? It seems like it
would be more general to define that this media type is for the first n
sources, this media type for the second m sources... etc. Something like:

```
sources: ["a.ts", "b.ts", "c.ts", "d.js"],
sourcesMediaType: [
[3, "application/x.typescript;version=1.0.3.0"],
[1, "application/javascript"]
]
```

The other benefit of this approach is that there is no magical
distinction between a length 1 list of media types and a length > 1 list
of media types.

Not married to that specific form of [n, "some/mediaType"], but think
that the idea of generalizing + unifying the use cases is valuable.
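
For illustration, a consumer could expand the [n, "some/mediaType"] pairs suggested above into one entry per source along these lines. This is only a sketch of the idea as stated in this message; the function name `expand_runs` is hypothetical.

```python
def expand_runs(sources, runs):
    """Expand [count, media_type] pairs into a source -> media type mapping."""
    expanded = []
    for count, media_type in runs:
        expanded.extend([media_type] * count)
    if len(expanded) != len(sources):
        # This is the kind of counting mistake a run-length form makes possible.
        raise ValueError("run lengths do not add up to the number of sources")
    return dict(zip(sources, expanded))


by_source = expand_runs(
    ["a.ts", "b.ts", "c.ts", "d.js"],
    [
        [3, "application/x.typescript;version=1.0.3.0"],
        [1, "application/javascript"],
    ],
)
# by_source["d.js"] is "application/javascript"
```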

> We're looking at adding this in both the TypeScript compiler and the F12 debugger in IE and would love to get feedback from everyone as it seems like it would be useful for all.

It would be nice to come to an agreement and properly add this in the
spec before you guys ship it as an extension, since this is something
we've expressed interest in before :)

Nick

Brian Slesinsky

Jun 2, 2014, 8:57:11 PM
to fit...@mozilla.com, dev-js-s...@lists.mozilla.org
On Mon, Jun 2, 2014 at 5:22 PM, Fitzgerald, Nick <nfitz...@mozilla.com>
wrote:

>
> So you're saying that the one media type would apply for all sources? That
> works well when you have only one media type, but what about when you have
> a bunch of typescript files and one js file? It seems like it would be more
> general to define that this media type is for the first n sources, this
> media type for the second m sources... etc. Something like:
>
> ```
> sources: ["a.ts", "b.ts", "c.ts", "d.js"],
> sourcesMediaType: [
> [3, "application/x.typescript;version=1.0.3.0"],
> [1, "application/javascript"]
> ]
> ```

This seems slightly less straightforward since there's more room for
calculating the index wrong while reading or writing it, compared to a
parallel array.

I'm not sure we need to worry about the size of the parallel array. Sorting
the source files by language might be a good idea if it compresses better,
but even without that, we're talking about two characters per source file
(assuming fewer than 10 languages). If you do sort it, repeating the same
number is going to compress very well. So perhaps better to stick with the
more straightforward parallel array and let gzip handle it?

Having a special case for when every source file is the same language seems
useful mostly for readability, since a list of all zeros would compress
well too. A slightly more general rule would be to extend the parallel
array with zeros if it's shorter than the list of source files. You could
put your most commonly used language at the end (with index 0) and have a
pretty short list of mappings. But perhaps that's more confusing than
readable.
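
The zero-extension rule and the size estimate above can be sketched as follows. The helper names `pad_indices` and `compact_size` are hypothetical, and the size figure assumes single-digit indices serialized as compact JSON.

```python
import json


def pad_indices(indices, source_count):
    """Extend a short parallel array with zeros, as the rule above suggests."""
    if len(indices) > source_count:
        raise ValueError("more media type indices than sources")
    return list(indices) + [0] * (source_count - len(indices))


def compact_size(indices):
    """Length in characters of the array when written as compact JSON."""
    return len(json.dumps(indices, separators=(",", ":")))


padded = pad_indices([1, 2], 5)   # the three missing entries become index 0
size = compact_size([0] * 100)    # one digit plus one comma per entry, roughly
```

With fewer than 10 media types, each entry costs a digit and a comma, which is where the "two characters per source file" estimate comes from.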

- Brian

Fitzgerald, Nick

Jun 3, 2014, 12:16:17 PM
to Brian Slesinsky, dev-js-s...@lists.mozilla.org, fit...@mozilla.com
On 6/2/14, 5:57 PM, Brian Slesinsky wrote:
>
> This seems slightly less straightforward since there's more room for
> calculating the index wrong while reading or writing it, compared to a
> parallel array.

Less straightforward, but it's consistent.

>
> I'm not sure we need to worry about the size of the parallel array.
> Sorting the source files by language might be a good idea if it
> compresses better, but even without that, we're talking about two
> characters per source file (assuming fewer than 10 languages). If you
> do sort it, repeating the same number is going to compress very well.
> So perhaps better to stick with the more straightforward parallel
> array and let gzip handle it?
>
> Having a special case for when every source file is the same language
> seems useful mostly for readability, since a list of all zeros would
> compress well too. A slightly more general rule would be to extend the
> parallel array with zeros if it's shorter than the list of source
> files. You could put your most commonly used language at the end (with
> index 0) and have a pretty short list of mappings. But perhaps that's
> more confusing than readable.

I was mostly concerned with removing the special case and making it
consistent across various scenarios because relying on the length of the
array doesn't feel elegant to me.

Relying on gzip is fine for the network, but source maps can get pretty
large on disk, which is frustrating as well. David Nolen was just
expressing this to me at JSConf.

The more I think it over, though, the less the special case is bothering
me, now.

Another option would be to separate the media types from the parallel
array of media types for specific sources. We could VLQ the parallel
array as relative indices into the list of all media types. You know,
the same thing we do in the rest of source maps ;)

{
...
sources: ["a.ts", "b.ts", "c.ts", "d.js"],
sourcesMediaType: "CAAD", // 1, 0, 0, -1
mediaTypes: ["application/javascript",
"application/x.typescript;version=1.0.30"]
}

What do you guys think of this? I like it because it is consistent
without special cases, compresses both all-one-media-type and
mostly-one-media-type pretty well, and fits with the way we do things in
source maps already.
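
As a rough sketch of how a consumer might read that form, the snippet below decodes the Base64 VLQ string into relative indices and accumulates them into positions in mediaTypes. The function names are hypothetical; the decoding follows the usual source map VLQ convention (5 data bits per character, a continuation bit, and the sign in the lowest bit).

```python
BASE64_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def decode_vlq(text):
    """Decode a run of Base64 VLQ values (no separators) into integers."""
    values = []
    shift = 0
    accumulated = 0
    for char in text:
        digit = BASE64_CHARS.index(char)
        accumulated |= (digit & 0b11111) << shift
        if digit & 0b100000:
            # Continuation bit set: the value continues in the next character.
            shift += 5
            continue
        magnitude = accumulated >> 1
        values.append(-magnitude if accumulated & 1 else magnitude)
        shift = 0
        accumulated = 0
    return values


def resolve_media_types(sources, encoded, media_types):
    """Map each source to a media type using relative indices."""
    index = 0
    resolved = {}
    for source, delta in zip(sources, decode_vlq(encoded)):
        index += delta
        resolved[source] = media_types[index]
    return resolved


deltas = decode_vlq("CAAD")
resolved = resolve_media_types(
    ["a.ts", "b.ts", "c.ts", "d.js"],
    "CAAD",
    ["application/javascript", "application/x.typescript;version=1.0.30"],
)
```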

Nick

John Lenz

Jun 3, 2014, 1:19:57 PM
to fit...@mozilla.com, Brian Slesinsky, dev-js-s...@lists.mozilla.org
I think we should use "sourcesDefaultMediaType" and "sourcesMediaType" rather
than overloading sourcesMediaType. As these are optional, generally useful, and
we can add them without changing the meaning of any existing source maps,
we can add this to the spec without problem, as long as we can agree on
the form.

Regarding size, Brian and the Google Web Toolkit team have been pretty
successful in reducing the size of the source maps by removing information
that isn't useful to the debuggers (reducing them to basically line
mappings rather than token maps). This is controlled by the source map
producer but can be done as a post-process. But that is a discussion for
another thread.

If we do add this, I would like to document common "known" media types in
the spec appendix (CSS,HTML,JS,CoffeeScript,TypeScript,SASS,etc) to reduce
the ambiguity that naturally comes along with using media types.





Fitzgerald, Nick

Jun 3, 2014, 2:02:55 PM
to John Lenz, Brian Slesinsky, dev-js-s...@lists.mozilla.org, fit...@mozilla.com
On 6/3/14, 10:19 AM, John Lenz wrote:
> I think "sourcesDefaultMediaType" and "sourcesMediaType" rather than
> overloading sourcesMediaType.

I'm not sure I follow, can you expand?

> As these are optional, generally useful, and we can add them without
> changing the meaning of any existing source maps, we can add this to
> the spec without problem.

Yup, not breaking old source map parsers is definitely a requirement.

Every proposal in this thread has fulfilled this, unless I'm missing
something.

> As long as we can agree on the form.
>

+Infinity

> Regarding size, Brian and Google Web Toolkit team have been pretty
> successful in reducing the size of the source maps by removing
> information that isn't useful to the debuggers (reducing them to
> basically line mappings rather than token maps). This is controlled
> by the source map producer but can be done as a post-process. But
> that is a discussion for another thread.

I did the same thing in [0] but my goal was speeding up the time it
takes to generate a source map.

[0]
https://github.com/mozilla/pretty-fast/blob/master/pretty-fast.js#L658-L699

>
> If we do add this, I would like to document common "known" media types
> in the spec appendix (CSS,HTML,JS,CoffeeScript,TypeScript,SASS,etc) to
> reduce the ambiguity that naturally comes along with using media types.

Sounds good to me.

Ron Buckton

Jun 3, 2014, 2:44:50 PM
to John Lenz, fit...@mozilla.com, Brian Slesinsky, dev-js-s...@lists.mozilla.org
When Andy and I were discussing this initially, I had proposed the following:

- Add "x_ms_mediaTypes" to the source map
  - This contains an array of unique media types
- Add "x_ms_sourceMediaTypes" to the source map
  - This is a string that contains a variable sized Base64VLQ encoded set of offsets
  - Each entry from left-to-right is respective to the same ordinal entry within the "sources" array
  - Missing entries are right-filled to the end of the "sources" array
  - Each offset is encoded using the Base64 VLQ format, without separators
  - If this property is not present, it is assumed that it is filled with offset zero
  - No assumption is made on the order of entries in the "sources" array, as that is currently implementation dependent

An example of this encoding might be:

```
{
...
"sources": ["a.js", "b.ts", "c.js", "d.js", "e.js", "f.js"],
...
"x_ms_mediaTypes": ["application/javascript", "application/x.typescript"],
"x_ms_sourceMediaTypes": "ACDAAA"
}
```

Or, with right-filling:

```
{
...
"sources": ["a.js", "b.ts", "c.js", "d.js", "e.js", "f.js"],
...
"x_ms_mediaTypes": ["application/javascript", "application/x.typescript"],
"x_ms_sourceMediaTypes": "ACDA"
}
```

The reasons I originally considered using a modified application of Base64 VLQ were:

- By using offsets over indices, long runs of the same media type can be more readily compressed.
- Using a modified application of Base64 VLQ (without separators) reduces uncompressed bytes over the wire, as well as being more readily compressed.
- Tools that understand source maps already understand Base64 VLQ

We ended up going with a parallel array of indices to make it more human readable, though compression and reduced footprint may win out in the end.
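
For comparison, a producer following the offset scheme could be sketched like this. It is an illustration only: the function names are hypothetical, and it assumes that right-filling means entries dropped from the end of the string are treated as offset zero.

```python
BASE64_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def encode_vlq(value):
    """Encode one signed integer as Base64 VLQ (sign in the lowest bit)."""
    vlq = (abs(value) << 1) | (1 if value < 0 else 0)
    chars = ""
    while True:
        digit = vlq & 0b11111
        vlq >>= 5
        if vlq:
            digit |= 0b100000  # more characters follow
        chars += BASE64_CHARS[digit]
        if not vlq:
            return chars


def encode_offsets(indices, trim=True):
    """Turn per-source media type indices into an offset string."""
    offsets = []
    previous = 0
    for index in indices:
        offsets.append(index - previous)
        previous = index
    if trim:
        # Trailing zero offsets can be omitted and right-filled by the consumer.
        while offsets and offsets[-1] == 0:
            offsets.pop()
    return "".join(encode_vlq(offset) for offset in offsets)


full = encode_offsets([0, 1, 1, 1], trim=False)
short = encode_offsets([0, 1, 1, 1])
```

A long run of the same media type turns into a run of zero offsets, which is what makes the trailing entries droppable.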

Ron


John Lenz

Jun 5, 2014, 10:26:24 AM
to Ron Buckton, Brian Slesinsky, dev-js-s...@lists.mozilla.org, fit...@mozilla.com
The more I think about this the more I like the original proposal. I'm fine
with VLQ encoding as well if folks feel it is necessary.

Nick?

Fitzgerald, Nick

Jun 5, 2014, 2:09:26 PM
to Ron Buckton, Brian Slesinsky, dev-js-s...@lists.mozilla.org, John Lenz
I wrote up (and edited a little) this proposal. I like it because we
came up with basically the same thing independently, and it best
abstracts away repeated data.

Should everyone agree to it, this should be able to just drop into the
spec: https://gist.github.com/fitzgen/35d7e3905a915238aa14

Note that I opted to use indices directly instead of relative offsets
because the space savings gained by relative offsets don't kick in until
you have 16 or greater unique media types and it didn't seem worth
complicating the spec any further for such an edge case.
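
The threshold of 16 follows from the Base64 VLQ format: each character carries 5 data bits, one of which is the sign, so values from -15 to 15 fit in a single character. A small sketch (with a hypothetical helper name) makes that concrete:

```python
def vlq_length(value):
    """Number of Base64 characters needed to VLQ-encode a signed integer."""
    vlq = (abs(value) << 1) | (1 if value < 0 else 0)
    length = 1
    while vlq >= 32:  # more than 5 bits left, so another character is needed
        vlq >>= 5
        length += 1
    return length


one_char = [vlq_length(index) for index in range(16)]  # indices 0 through 15
two_chars = vlq_length(16)
```

With 16 or fewer media types, every absolute index is already one character, so relative offsets cannot make the string shorter.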

Thoughts?

Brian Slesinsky

Jun 5, 2014, 2:13:50 PM
to fit...@mozilla.com, dev-js-s...@lists.mozilla.org, John Lenz, Ron Buckton
I'd rather stick with the original proposal. The complexity of VLQ encoding
doesn't seem worth it here, at least not without some benchmark showing it
makes a difference on realistic data.





John Lenz

Jun 5, 2014, 3:43:51 PM
to Brian Slesinsky, Ron Buckton, dev-js-s...@lists.mozilla.org, fit...@mozilla.com
I prefer the version without VLQ as well.



Fitzgerald, Nick

Jun 5, 2014, 4:30:40 PM
to John Lenz, Brian Slesinsky, Ron Buckton, dev-js-s...@lists.mozilla.org, fit...@mozilla.com
Well if everyone else prefers that version, I'll defer to that.

Andy Sterland

Jun 5, 2014, 5:58:13 PM
to fit...@mozilla.com, John Lenz, Brian Slesinsky, dev-js-s...@lists.mozilla.org, Ron Buckton
That sure works for me.

Out of curiosity, would a change like this change the version number? I'm not that familiar with the rules for when the version number changes (for example, only for breaking/incompatible changes).

John Lenz

Jun 5, 2014, 6:05:43 PM
to Andy Sterland, Brian Slesinsky, Ron Buckton, dev-js-s...@lists.mozilla.org, fit...@mozilla.com
No, because it is optional and doesn't change the meaning of any existing
source maps. Existing consumers are required to ignore additional fields
they don't understand, so we just rev the spec. Generally, I like to get
"sign off" from the browser vendors on this list (Chrome, Firefox, and now
Internet Explorer). It seems unlikely that the Chrome folks will object,
but I pinged Paul Irish just in case. At this point I think we can start
the spec changes.



Andy Sterland

Jun 11, 2014, 6:07:27 PM
to John Lenz, Brian Slesinsky, Ron Buckton, dev-js-s...@lists.mozilla.org, fit...@mozilla.com
Great. Thanks for explaining.

Do you have an updated gist without the VLQ? (I believe that’s the consensus.)


Ron Buckton

Jul 11, 2014, 5:55:39 PM
to Andy Sterland, John Lenz, Brian Slesinsky, dev-js-s...@lists.mozilla.org, fit...@mozilla.com
I haven’t heard any updates on this since June, but I went ahead and forked Nick Fitzgerald’s gist with revisions based on this discussion: https://gist.github.com/rbuckton/02048395a8dd1330b29a

Thanks,
Ron