* interface VTTCue
* attribute DOMString vertical;
* attribute boolean snapToLines;
* attribute (long or AutoKeyword) line;
* attribute long position;
* attribute long size;
* attribute DOMString align;
* attribute DOMString text;
* DocumentFragment getCueAsHTML();
* attribute DOMString regionId
* enum DirectionSetting { "" /* horizontal */, "rl", "lr" };
* enum AlignSetting { "start", "middle", "end", "left", "right" };
* attribute DirectionSetting vertical;
* attribute AlignSetting align;
Do you plan to stage these changes behind a runtime flag?
It sounds like the kind of thing that might be doable entirely in one CL...
On Fri, Aug 9, 2013 at 5:29 PM, Adam Barth <aba...@chromium.org> wrote:
> Do you plan to stage these changes behind a runtime flag?

It is possible, though probably easier not to use a flag. What is your preference?

> It sounds like the kind of thing that might be doable entirely in one CL...

Yes.
On Fri, Aug 9, 2013 at 4:34 PM, Glenn Adams <gl...@skynav.com> wrote:
> On Fri, Aug 9, 2013 at 5:29 PM, Adam Barth <aba...@chromium.org> wrote:
> > Do you plan to stage these changes behind a runtime flag?
>
> It is possible, though probably easier not to use a flag. What is your preference?
>
> > It sounds like the kind of thing that might be doable entirely in one CL...
>
> Yes.

It's probably better to do as one CL. I presume the other implementations are making this change as well.
On Fri, Aug 9, 2013 at 5:36 PM, Adam Barth <aba...@chromium.org> wrote:
> It's probably better to do as one CL. I presume the other implementations are making this change as well.

I'm also making that presumption, but it wouldn't hurt to investigate their plans.
On Fri, Aug 9, 2013 at 4:40 PM, Glenn Adams <gl...@skynav.com> wrote:
> I'm also making that presumption, but it wouldn't hurt to investigate their plans.

Ok. LGTM once you've double-checked with other implementers.
Hi Glenn,
You say that you're going to move .text from TextTrackCue to VTTCue,
but also that "The existing TextTrackCue constructor is retained to
instantiate generic cues containing raw text ..."
For those not aware, a spec fork has appeared [1] since Silvia and Ian
disagree on this. Compare WHATWG [2] and W3C [3].
Which spec do you intend to follow?
I don't think that keeping the
TextTrackCue constructor and text property makes a lot of sense after
TextTrackCue has been stripped of its WebVTT semantics. As far as I
can tell, a TextTrackCue created by script can't be rendered at all,
since it doesn't have any rendering rules. In other words, I think the
WHATWG spec makes more sense here.
LGTM. I am happy to review the code for you when it becomes available. I'm assuming that existing applications will simply need to check for the presence of VTTCue to determine which constructor they should use. Correct?
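The presence check mentioned above could look roughly like the following. This is only a sketch under stated assumptions: `CueCtor`, `CueScope`, `pickCueConstructor` and `makeCue` are hypothetical names introduced for illustration, and the scope object stands in for the page's global object so the logic can be exercised outside a browser.

```typescript
// Hypothetical shape of a cue constructor: (startTime, endTime, text) => cue.
type CueCtor = new (startTime: number, endTime: number, text: string) => object;

// The subset of the global scope this sketch inspects (assumption: both
// constructors, when present, accept the same three arguments).
interface CueScope {
  VTTCue?: CueCtor;
  TextTrackCue?: CueCtor;
}

// Prefer VTTCue when the UA exposes it; otherwise fall back to the legacy
// TextTrackCue constructor. Returns undefined if neither is available.
function pickCueConstructor(scope: CueScope): CueCtor | undefined {
  if (typeof scope.VTTCue === "function") return scope.VTTCue;
  if (typeof scope.TextTrackCue === "function") return scope.TextTrackCue;
  return undefined;
}

// Build a cue with whichever constructor was detected.
function makeCue(scope: CueScope, start: number, end: number, text: string): object {
  const Ctor = pickCueConstructor(scope);
  if (!Ctor) throw new Error("no cue constructor available");
  return new Ctor(start, end, text);
}
```

In a page, the scope argument would be the window object; whether a fallback is needed at all depends on which browsers an application targets.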
As for formats other than WebVTT and TTML I have no specific
knowledge, but I don't understand why the generic TextTrackCue is the
appropriate interface for those formats, as opposed to the
format-specific interface, with the non-renderable data stuck in a
.data property or some such.
Can we hold off on this change until we can get some agreement between the specs? I'd hate for us to end up in a world where we have half of each of the specs since that's really harmful to developers.

What's the opinion of other browser vendors too? We shouldn't be following the HTMLWG spec if other vendors are going to follow the WHATWG spec.
On Wed, 21 Aug 2013, Glenn Adams wrote:
> On Wed, Aug 21, 2013 at 4:44 AM, Philip Jägenstedt <phi...@opera.com> wrote:
> >
> > As for formats other than WebVTT and TTML I have no specific
> > knowledge, but I don't understand why the generic TextTrackCue is the
> > appropriate interface for those formats, as opposed to the
> > format-specific interface, with the non-renderable data stuck in a
> > .data property or some such.
>
> See [1], Section 5.1.4 for an example of a generic use of TextTrackCue
> unrelated to WebVTT and TTML, where a specific use of the text attribute
> is prescribed.
>
> [1] http://www.cablelabs.com/specifications/CL-SP-HTML5-MAP-I02-120510.pdf

Note that this is because that spec is broken. It should be introducing a
new interface, just like WebVTTCue, it shouldn't be using the
intentionally abstract TextTrackCue.
We shouldn't be messing up HTML and WebKit to support a spec that's just
doing things wrong in the first place, IMHO.
On Wed, Aug 21, 2013 at 11:56 AM, Elliott Sprehn <esp...@chromium.org> wrote:
> Can we hold off on this change until we can get some agreement between the specs? I'd hate for us to end up in a world where we have half of each of the specs since that's really harmful to developers.
>
> What's the opinion of other browser vendors too? We shouldn't be following the HTMLWG spec if other vendors are going to follow the WHATWG spec.

That seems a larger question than presented by this case in point. Mozilla has indicated to me they intend to implement the changes outlined here as indicated in the HTMLWG specs.

My own opinion is that W3C specifications should take precedence if there is an inconsistency.
> It is also worth noting the existing implementation of an early attempt at a
> generic cue [4][5], which does expose a text attribute by means of
> subclassing the existing WebVTT flavor of cue, and then voiding it of its
> WebVTT semantics: a rather odd and roundabout way to achieve this it seems.
> The fact that this early generic cue was implemented in this fashion
> indicates to me that placing the text attribute on VTTCue (only) and not
> retaining it on TextTrackCue leads to a convoluted implementation, and not a
> better one.
>
> [4]
> https://code.google.com/p/chromium/codesearch#chromium/src/third_party/WebKit/Source/core/html/track/TextTrackCueGeneric.h
> [5]
> https://code.google.com/p/chromium/codesearch#chromium/src/third_party/WebKit/Source/core/html/track/TextTrackCueGeneric.cpp

I admit zero knowledge of the internals of the WebKit/Blink
implementation, and will leave it to the reviewers to comment on the
best way to implement whichever spec they think Blink should follow.
On Fri, 23 Aug 2013, Glenn Adams wrote:
>
> You appear to be assuming that (1) all forms of cues need rendering
> rules

Not all forms of cues need to be rendered, no. But it doesn't matter what
the rendering rules are for those that don't need to be rendered.

> and (2) that one should define a new cue format specific sub-interface
> for every distinct format.

Not necessarily every distinct format; some will have similar needs and
can reuse the same interface. For example MicroDVD and PowerDivX would
probably use the same interface, if either got implemented by an HTML UA.

> As has been pointed out a number of times, there are already
> implementations and JS client code using this technique.

Where?
☆PhistucK

On Sat, Aug 24, 2013 at 1:32 AM, Glenn Adams <gl...@skynav.com> wrote:
> On Fri, Aug 23, 2013 at 4:16 PM, Ian Hickson <i...@hixie.ch> wrote:
> > > As has been pointed out a number of times, there are already
> > > implementations and JS client code using this technique.
> >
> > Where?
>
> I think I've pointed this out to you at least four times before, but I'll do so again:
>
> See section 5.2 Closed Captioning.
FWIW, this is the wrong forum for this discussion. I recommend moving it
to somewhere more appropriate, like the WHATWG list.
On Sat, 24 Aug 2013, Silvia Pfeiffer wrote:
>
> The TextTrackCueGeneric class was implemented by Apple to deal with cues
> that come from in-band text tracks (i.e. from inside a video file), but
> are not in WebVTT format and therefore don't follow the WebVTT rendering
> rules. If following the W3C spec, that functionality would indeed now be
> provided by the TextTrackCue object and does not need creation of a
> separate class. There is no solution for the needs of the
> TextTrackCueGeneric class in the WHATWG right now. It's one of the key
> reasons the W3C spec has this unfortunate fork.

To implement the formatting rules of a non-VTT format, you'd need a new
cue type, not TextTrackCue, since if you used TextTrackCue you wouldn't
get any rendering (since it's not associated with a format).

> Their use case is a common one.

Indeed. That's why we adjusted the spec to support exactly that: multiple
formats, each with their own cue rendering rules, each with their own cue
interface. Note that the use case here isn't non-rendering cues, it's cues
in a different format than VTT. There's no "generic" cue need as far as I
can tell.

> Focusing just on the needs for a .text attribute, you might come to the
> conclusion that there will be cue types that won't have text attributes.

That's the right conclusion, as evidenced by the fact that there are cue
types that don't have text (DVD image subtitles, e.g., or prerecorded
audio descriptions, or binary data blobs).

> The Web as we know it is based on text.

Text is certainly important on the Web, but it stands aside images, video,
audio, proprietary binary blobs, and many other formats.

> Everything on the Web that provides information has a text equivalent.

This is clearly false (much to the chagrin of many of us). There's no sane
text equivalent to Rachmaninoff's Piano Concerto No. 2 in C minor. There's
no sane text equivalent to the binary data that describes how to create
the graph on a slide as a video of a professor drawing that graph plays in
the background. And more importantly, even if there could be, and even if
there should be, there's not necessarily an _actual_ equivalent in the
format in which that data is encoded. It's just not accurate to say that
every timed cue format will always have textual data representing each cue.
On Sun, 25 Aug 2013, Silvia Pfeiffer wrote:
>
> Having no rendering is the whole idea of it. The rendering is left to
> the JS dev. The browser just exposes the cues (that are in non-VTT
> format) to the JS dev.
If the use case is browsers half-heartedly implementing some other text
track format by parsing its cues but not implementing the rendering rules
for them, then we shouldn't support the use case. Such half-hearted
support is bad for the Web. It causes fragmentation, it leads to standards
failure, it's actively harmful.
If you have some other use case in mind, then you should bring it up on
the WHATWG list. I'm not aware of any having been brought up that would
involve new text track interfaces that aren't already handled.
On Sun, 25 Aug 2013, Glenn Adams wrote:
> On Sat, Aug 24, 2013 at 10:40 PM, Ian Hickson <i...@hixie.ch> wrote:
> > On Sun, 25 Aug 2013, Silvia Pfeiffer wrote:
> > >
> > > Having no rendering is the whole idea of it. The rendering is left
> > > to the JS dev. The browser just exposes the cues (that are in
> > > non-VTT format) to the JS dev.
> >
> > If the use case is browsers half-heartedly implementing some other
> > text track format by parsing its cues but not implementing the
> > rendering rules for them, then we shouldn't support the use case. Such
> > half-hearted support is bad for the Web. It causes fragmentation, it
> > leads to standards failure, it's actively harmful.
>
> This is where your thinking goes wrong: exposing content from non-VTT
> cues via text is not a "half hearted" implementation when there is no
> intention that the UA render the cue.

Let's talk concrete formats here. Exactly what format are we talking about
browsers implementing the parsing of that don't have any rendering rules?

> I would suggest you defer to Silvia's judgment on this matter,
> particularly since you have said that this is now her "baby".

What does "this" refer to in this sentence?
This spec [MP2MAP] Section 5.2 also defines a mapping for CEA-708 (including embedded 608) captions, which in MPEG-2 are encoded in user private data in the video elementary stream. At present, few UAs support the decoding/rendering of embedded 708 (DTVCC) or embedded 608 captions. Consequently, exposing this raw data to JS client code permits one to construct a polyfill to render such captions until such time that such support is widely implemented. However, that day might not come, e.g., due to lower interest in UA vendors in supporting MPEG-2 (than supporting newer formats that support embedded WebVTT or TTML).

In recognition of this state of affairs, [MP2MAP] defines (Section 5.2) a mapping to a generic text track as follows:

TextTrack.kind = "captions"
TextTrack.label = "pid" (where pid denotes the PID that contains the video ES that embeds captions)
and, "for each PES or private data packet in the program stream represented by the TextTrack", a cue is created (by the UA, not client JS) wherein:
On Mon, Aug 26, 2013 at 5:27 PM, Glenn Adams <gl...@skynav.com> wrote:
> On Mon, Aug 26, 2013 at 1:32 AM, Philip Jägenstedt <phi...@opera.com>
> wrote:
>>
>> On Sun, Aug 25, 2013 at 4:44 PM, Glenn Adams <gl...@skynav.com> wrote:
>> > Of these three specified uses, only the last is potentially renderable
>> > as captions, while the first two are clearly unrenderable metadata.
>>
>> Is there any particular reason why a new interface for in-band MPEG-2
>> cues isn't used, as opposed to putting extra information into the
>> label? Also, what is to be done with in-band text tracks which *are*
>> supposed to be rendered? It looks like the only option is to render
>> them using scripts using getCueAsHTML?
>
> I believe Silvia's last message addresses these questions. Let me know if
> you feel more input is required.

Yeah, Silvia cleared up some things for me, but I'm not entirely clear
about which kinds of in-band MPEG-2 tracks you want to expose using
the TextTrackCue interface.
The PDF says "For all MPEG-2 stream types
that are not UA recognized audio or video stream types, the UA MUST
create a new TextTrack in the TextTrackList of the media resource."
Does this mean that you intend to use it for any normal (non-metadata)
tracks which can be rendered but have no particular rendering rules?
As for metadata in-band tracks, are the kinds of in-band metadata
tracks completely open-ended, or why is it not feasible to expose
those using specific interfaces?
For example, for PMT it seems more
reasonable to just have a PMTCue with stream_pid, pid and
es_descriptors rather than encoding that information as JSON and
putting it in TextTrackCue.text.
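
To make the trade-off above concrete, here is a rough sketch of what client code has to do when PMT information arrives as JSON in a cue's text rather than through a dedicated interface. The field names `stream_pid`, `pid` and `es_descriptors` come from the message above; their types, the flat JSON layout, and the names `PmtEntry` and `parsePmtCueText` are assumptions made for illustration, not something any spec cited here defines.

```typescript
// Hypothetical typed view of one PMT entry. Field names follow the thread;
// the field types are assumptions.
interface PmtEntry {
  stream_pid: number;
  pid: number;
  es_descriptors: unknown[];
}

// Parse the JSON carried in a generic cue's text attribute.
// Returns null when the text is not valid JSON or lacks the expected fields.
function parsePmtCueText(text: string): PmtEntry | null {
  let raw: unknown;
  try {
    raw = JSON.parse(text);
  } catch {
    return null;
  }
  if (typeof raw !== "object" || raw === null) return null;

  const fields = raw as Record<string, unknown>;
  const streamPid = fields["stream_pid"];
  const pid = fields["pid"];
  const descriptors = fields["es_descriptors"];

  if (typeof streamPid !== "number" || typeof pid !== "number" || !Array.isArray(descriptors)) {
    return null;
  }
  return { stream_pid: streamPid, pid: pid, es_descriptors: descriptors };
}
```

With a format-specific cue interface, the validation in `parsePmtCueText` would be done by the UA and scripts would read the attributes directly.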
Well, we might create such sub-interfaces in the future, but we should not do so at the expense of providing support for JS client parsing approaches that don't depend on future, unknown standardization activities.
On Tue, 27 Aug 2013, Glenn Adams wrote:
>
> To make it clear, I don't care if these are exposed using the TextTrackCue
> interface or some other GenericCue or MetadataCue interface derived from
> TextTrackCue.

Then why are we still debating this?
On Wed, 28 Aug 2013, Glenn Adams wrote:
> On Wed, Aug 28, 2013 at 2:21 PM, Ian Hickson <i...@hixie.ch> wrote:
> > On Tue, 27 Aug 2013, Glenn Adams wrote:
> > >
> > > To make it clear, I don't care if these are exposed using the
> > > TextTrackCue interface or some other GenericCue or MetadataCue
> > > interface derived from TextTrackCue.
> >
> > Then why are we still debating this?
>
> Because you and Silvia haven't fixed the specs so that (1) there is no
> fork and (2) generic cues are explicitly supported.

Well Silvia can fix the W3C fork any time she wants; in the meantime, the
WHATWG spec already supports generic cues. You just create a new interface
for your cue, and derive it from TextTrackCue, in exactly the same way as
WebVTT does VTTCue. There's no changes needed to the HTML spec for this.
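
The derivation pattern described above (an abstract base cue with format-specific interfaces derived from it) can be modelled roughly as follows. This is a sketch only: the `Sketch*` classes and `describeCue` are hypothetical stand-ins, not the real DOM interfaces, and the data-carrying cue is an invented example of a format whose cues have no text.

```typescript
// Stand-in for the abstract base cue: timing only, no text, no rendering rules.
abstract class SketchTextTrackCue {
  readonly startTime: number;
  readonly endTime: number;
  constructor(startTime: number, endTime: number) {
    this.startTime = startTime;
    this.endTime = endTime;
  }
}

// Stand-in for a text-bearing, format-specific cue (the role VTTCue plays).
class SketchVTTCue extends SketchTextTrackCue {
  text: string;
  constructor(startTime: number, endTime: number, text: string) {
    super(startTime, endTime);
    this.text = text;
  }
}

// Hypothetical cue type for a format that carries binary data instead of text.
class SketchDataCue extends SketchTextTrackCue {
  data: Uint8Array;
  constructor(startTime: number, endTime: number, data: Uint8Array) {
    super(startTime, endTime);
    this.data = data;
  }
}

// Client code branches on the concrete cue type instead of assuming .text exists.
function describeCue(cue: SketchTextTrackCue): string {
  if (cue instanceof SketchVTTCue) return `text: ${cue.text}`;
  if (cue instanceof SketchDataCue) return `data: ${cue.data.length} bytes`;
  return "unknown cue type";
}
```

Under this model the text attribute lives only on the cue types that actually have text, which is the design question the thread is arguing about.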
On Wed, 28 Aug 2013, Glenn Adams wrote:
> On Wed, Aug 28, 2013 at 2:30 PM, Ian Hickson <i...@hixie.ch> wrote:
> > On Wed, 28 Aug 2013, Glenn Adams wrote:
> > > On Wed, Aug 28, 2013 at 2:21 PM, Ian Hickson <i...@hixie.ch> wrote:
> > > > On Tue, 27 Aug 2013, Glenn Adams wrote:
> > > > >
> > > > > To make it clear, I don't care if these are exposed using the
> > > > > TextTrackCue interface or some other GenericCue or MetadataCue
> > > > > interface derived from TextTrackCue.
> > > >
> > > > Then why are we still debating this?
> > >
> > > Because you and Silvia haven't fixed the specs so that (1) there is
> > > no fork and (2) generic cues are explicitly supported.
> >
> > Well Silvia can fix the W3C fork any time she wants; in the meantime,
> > the WHATWG spec already supports generic cues. You just create a new
> > interface for your cue, and derive it from TextTrackCue, in exactly
> > the same way as WebVTT does VTTCue. There's no changes needed to the
> > HTML spec for this.
>
> ok, then Silvia needs to define a GenericCue interface sub-type in the
> real HTML (i.e., W3C) spec, and you can ignore it in your WHATWG sandbox
> as you see fit; that works for me

It's not something you'd put in HTML. It's something you'd put in whatever
spec creates the cues (e.g. the MPEG2 spec or whatever).
Given that, do you have a strong preference
for whether the .text property should go on TextTrackCue or VTTCue? I
don't think it's very important, the only thing is that it's a lot
easier to later move it from VTTCue to TextTrackCue than the other way
around, so I slightly favor moving it.
If you're not intending to implement the MPEG-2 spec right now, do the
Blink changes need to block on resolving the remaining issues?
Philip