
How Long Before My New Pages Get 'Scraped' by the Importer


Joe Medley

Aug 20, 2015, 12:19:18 PM
to mozilla...@lists.mozilla.org
Gang,

I'm trying to be a good citizen here by using the importer (https://browsercompat.herokuapp.com/importer/) on any pages that I edit. Several days ago I created some new pages and I'm still getting a message that they have 'not been scraped'.

How long does scraping typically take? I've got some large new APIs I need to work on.

Joe

Jean-Yves Perrier

Aug 20, 2015, 12:40:46 PM
to dev...@lists.mozilla.org, Jwhi...@mozilla.com
Hi!

The list is manually feeded to the importer: so there is no predictible
time for scrapping new pages.

John will have the definitive answer ;-)


--
Jean-Yves
> _______________________________________________
> dev-mdc mailing list
> dev...@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-mdc
> MDN contributor guide: http://bit.ly/ContributorGuide
> Doc project Trello board: https://trello.com/b/HAhl54zz/status
>


--
Jean-Yves Perrier
Senior St. Project Manager / Documentation Project / MDN

Jeremie Patonnier

Aug 20, 2015, 1:47:11 PM
to Jean-Yves Perrier, dev-mdc, Jwhi...@mozilla.com
Hi!

FWIW, we are still in alpha testing for browsercompat and not in the
process of scrapping all MDN yet.

++
Jeremie
Jeremie
.............................
Web : http://jeremie.patonnier.net
Twitter : @JeremiePat <http://twitter.com/JeremiePat>

Eric Shepherd

Aug 20, 2015, 6:37:01 PM
to dev...@lists.mozilla.org
Jean-Yves Perrier wrote:
> The list is manually feeded to the importer: so there is no
> predictible time for scrapping new pages.
Jeremie Patonnier wrote:
> The list is manually feeded to the importer: so there is no
> predictible time for scrapping new pages.
OK... I've seen a bunch of folks (seems to be very common among native
French speakers) make this mistake, and today, seeing it a few times in
a row, I figured it would be a decent time to point it out:

It's not "scrapping." That's the present participle form of "scrap," or
to throw away or discard.

It's "scraping," which is the present participle form of "scrape," or to
drag a tool across a target in order to remove something.

I have actually gotten briefly confused and horrified to learn that
something was being "scrapped" (which would be the past tense of
"scrap"), and relieved moments later to realize it's meant to be
"scraped." :)

Hope this helps. :)

--

Eric Shepherd
Senior Technical Writer
Mozilla <https://www.mozilla.org/>
Blog: http://www.bitstampede.com/
Twitter: http://twitter.com/sheppy
Check my Availability <https://freebusy.io/eshe...@mozilla.com>

Sebastian Zartner

Aug 20, 2015, 6:44:48 PM
to Eric Shepherd, MDC Mailinglist
On 21 August 2015 at 00:36, Eric Shepherd <eshe...@mozilla.com> wrote:

> Jean-Yves Perrier wrote:
> > The list is manually feeded to the importer: so there is no
> > predictible time for scrapping new pages.
> Jeremie Patonnier wrote:
> > The list is manually feeded to the importer: so there is no
> > predictible time for scrapping new pages.
> OK... I've seen a bunch of folks (seems to be very common among native
> French speakers) make this mistake, and today, seeing it a few times in
> a row, I figured it would be a decent time to point it out:
>
> It's not "scrapping." That's the present participle form of "scrap," or
> to throw away or discard.
>
> It's "scraping," which is the present participle form of "scrape," or to
> drag a tool across a target in order to remove something.
>
> I have actually gotten briefly confused and horrified to learn that
> something was being "scrapped" (which would be the past tense of
> "scrap"), and relieved moments later to realize it's meant to be
> "scraped." :)
>
> Hope this helps. :)
>

Maybe we need to get some English lessons from you the next time we see us.
:-)

Sebastian

Eric Shepherd

Aug 20, 2015, 6:58:59 PM
to Sebastian Zartner, MDC Mailinglist
Sebastian Zartner wrote:
> Maybe we need to get some English lessons from you the next time we
> see us. :-)
LOL

Seriously though, I hope when I point out little things like that that
it's taken in the spirit in which it's given -- trying to help
contributors' already frightfully good grasp of English writing get even
better.

I know Jérémie and I have talked about that in the past and he's been
enthusiastically grateful when I point things out, but that won't apply
universally, so I try to be cautious. :)

John Whitlock

Aug 20, 2015, 7:06:56 PM
to Jean-Yves Perrier, dev-mdc
Hi Joe. Can you give me one or more URLs for the pages you are editing?

On Thu, Aug 20, 2015 at 11:40 AM, Jean-Yves Perrier <jype...@gmail.com>
wrote:

> Hi!
>
> The list is manually feeded to the importer: so there is no predictible
> time for scrapping new pages.
>

Sebastian Zartner

Aug 21, 2015, 1:54:36 AM
to John Whitlock, Jean-Yves Perrier, dev-mdc
Hi John,

here's the list of Joe's changes:
https://developer.mozilla.org/en-US/dashboards/revisions?user=jpmedley

Btw. it would be nice if that list could be filtered by new pages. See bug
1197080 <https://bugzilla.mozilla.org/show_bug.cgi?id=1197080>.

Sebastian

Jeremie Patonnier

Aug 21, 2015, 5:10:17 AM
to Eric Shepherd, MDC Mailinglist, Sebastian Zartner
Hi!

2015-08-21 0:58 GMT+02:00 Eric Shepherd <eshe...@mozilla.com>:

> I know Jérémie and I have talked about that in the past and he's been
> enthusiastically grateful when I point things out, but that won't apply
> universally, so I try to be cautious. :)
>

Indeed I am :) I wasn't aware of the difference between scrap and scrape, so
thank you :) For me, writing "scrapping" here was a mistake that comes from
French. In French it's customary to double the consonant letters when they
are between vowels, so I often do this in English if I don't double-check.
This was a case where it had an impact on the meaning. So thanks for pointing
that out :)

Best,

John Whitlock

Aug 21, 2015, 11:08:22 AM
to Sebastian Zartner, Jean-Yves Perrier, dev-mdc
The new pages are under /Web/API, so they would be added to the importer when
I run the mirror_mdn_features tool. This is a manual process, because I
review the change list which potentially deletes data (for example, if an
MDN page is removed or moved). I'll re-run the tool at my next opportunity
to capture the new pages.

I've opened bug 1197210 to track this issue:

https://bugzilla.mozilla.org/show_bug.cgi?id=1197210

From a quick look at the new pages, they don't appear to have issues that
would cause the importer to cry. New pages aren't usually the problem -
it's the old monsters from the early days of the standardized BrowserCompat
tables. Fixing some existing pages will give you a sense of what works and
what is problematic.

Fixing pages will be easier with the re-written importer, which handles
issues that the current importer marks as section_skipped or halt_import.
That change is in code review, and will hopefully ship by end of August.

John