Wikimedia Commons grab update


Federico Leva (Nemo)

Nov 11, 2016, 6:05:48 AM
to wikiteam...@googlegroups.com
FYI I'm now uploading new items like
https://archive.org/details/wikimediacommons-2014-08-01 (for the first
time in 2 years). I'll keep updating
http://archiveteam.org/index.php?title=Wikimedia_Commons .

I'd especially appreciate help fixing the errors found by
commonschecker.py, whose logs I'll upload with each item (most
items have at least a couple of errors). For instance, the item linked
above has a couple of "corrupt" files (partially downloaded, or of
unexpected size) and some lines which say "empty file" (I have no idea
what these are). It might suffice to download those few files separately
and upload the "errata" manually; or, if you're a coder, you might want
to find a way for commonschecker.py to also redownload the missing files
and update the corresponding zip before I upload or delete it.
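
As a rough illustration of the "update the corresponding zip" step, here is
a minimal sketch, not part of commonschecker.py. The function name
replace_members and its arguments are hypothetical, and it assumes the
freshly downloaded bytes are already in hand. Python's zipfile module
cannot overwrite a member in place, so the sketch copies the archive to a
temporary file and swaps it in:

```python
import os
import zipfile


def replace_members(zip_path, replacements):
    """Rewrite zip_path, substituting the bytes of the named members.

    replacements maps a member name to its new bytes (hypothetical input,
    e.g. files redownloaded by hand). Returns the names actually replaced.
    """
    tmp_path = zip_path + ".tmp"
    replaced = []
    with zipfile.ZipFile(zip_path) as zin, \
            zipfile.ZipFile(tmp_path, "w", allowZip64=True) as zout:
        for info in zin.infolist():
            if info.filename in replacements:
                # Keep the member's metadata, swap in the new content.
                zout.writestr(info, replacements[info.filename])
                replaced.append(info.filename)
            else:
                # Copy untouched members as they are (read into memory).
                zout.writestr(info, zin.read(info.filename))
    # Swap the rewritten archive into place only after both are closed.
    os.replace(tmp_path, zip_path)
    return replaced
```

Each member is read fully into memory and the whole archive is rewritten,
so this needs free disk space equal to the zip size; for very large
archives a streaming copy would be preferable.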

Due to https://phabricator.wikimedia.org/T134148 , I'm now running the
scripts on a workstation with 7400 GiB disk on the GARR network (which
right now manages to upload at about 150 Mbit/s to the Internet Archive).

Nemo

Emilio J. Rodríguez-Posada

Nov 11, 2016, 7:56:46 AM
to wikiteam...@googlegroups.com
What is the average percentage of files that fail to download?




Federico Leva (Nemo)

Nov 11, 2016, 7:58:11 AM
to wikiteam...@googlegroups.com
Emilio J. Rodríguez-Posada, 11/11/2016 13:56:
> What is the average percentage of files that fail to download?

I've not calculated it.

Nemo

Federico Leva (Nemo)

Nov 14, 2016, 6:10:16 PM
to wikiteam...@googlegroups.com
Emilio J. Rodríguez-Posada, 11/11/2016 13:56:
> What is the average percentage of files that fail to download?

The commonschecker output is being uploaded too now (e.g.
https://archive.org/download/wikimediacommons-2014-05-03/2014-05-03.log
), so whoever wants to help can now perform such analysis and checks.
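
For anyone attempting that analysis, a minimal sketch follows. It assumes,
without having checked the real log layout, that each problem line
mentions either "corrupt" or "empty file" (the two markers described
earlier in this thread), and that the total number of files in the item is
known from elsewhere. The names tally_errors and failure_percentage are
hypothetical:

```python
from collections import Counter


def tally_errors(lines):
    """Count log lines mentioning each error marker (case-insensitive).

    Assumption: one problem file per line; the real log format may differ.
    """
    counts = Counter()
    for line in lines:
        lowered = line.lower()
        if "corrupt" in lowered:
            counts["corrupt"] += 1
        elif "empty file" in lowered:
            counts["empty file"] += 1
    return counts


def failure_percentage(counts, total_files):
    """Share of files with any error, as a percentage of total_files."""
    if total_files <= 0:
        raise ValueError("total_files must be positive")
    return 100.0 * sum(counts.values()) / total_files
```

Usage would be along the lines of opening a downloaded .log file, passing
the file object to tally_errors, and dividing by the item's file count.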

Nemo