Aryeh Gregor
May 8, 2016, 7:47:01 AM
to Simon Pieters, Philip Jägenstedt, dev-pl...@lists.mozilla.org, Florin Mezei
On Sun, May 8, 2016 at 9:15 AM, Simon Pieters <sim...@opera.com> wrote:
> httparchive (494,168 pages):
>
> SELECT COUNT(*) AS num, REGEXP_EXTRACT(LOWER(body),
> r'<track\s(?:[^>]+\s)?kind\s*=\s*([a-z]+|["\'][^"\']+["\'])') as match
> FROM [httparchive:har.2016_04_15_chrome_requests_bodies]
> GROUP BY match
> ORDER BY num DESC
>
> Row  num       match
> 1    17616286  null
> 2    523       "subtitles"
> 3    108       "captions"
> 4    58        "metadata"
> 5    6         "subtitle"
> 6    6         'subtitles'
> 7    5         "thumbnails"
> 8    3         'captions'
> 9    1         "dotsub"
> 10   1         "${assettracktype}"
> 11   1         'subtitle'
>
>
> We could add "subtitle" as a new keyword if that turns out to be a problem.
Thanks for the data! Looks like we're talking on the order of 0.001%
of pages, so I think this can be safely landed.