Thanks


Eckhard Licher

Oct 11, 2015, 1:49:49 AM
to openmundi
Gerald,

I had almost given up on the World Factbook. The HTML changes every year, which made the WFB an extremely fast-moving target to scan. Having created scanners for at least four HTML versions over the last decade and a half, I was about to stop. I was even checking the openmundi group for news less and less frequently.

Having checked this morning, I am happy to see important progress. In particular, the path taken in de-chroming the crufty original markup seems very promising for the near to mid-term future.

So thanks A LOT for the de-chromed HTML version. The cleaned-up HTML pages seem to scan nicely with my unmodified Python bs4-based scanner. I need to do some more tests, but the preliminary results are very promising. Only the scanning of the header information from the de-chromed HTML pages needs to be changed, and that is obviously a piece of cake.
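
For readers who have not seen this kind of scanner, here is a minimal sketch of the idea. It is not the scanner described above: it uses Python's standard-library `html.parser` rather than bs4 so that it runs without extra packages, and the sample markup (`h1` for the entity name, `h2` for sections, `h3` for fields, `p` for values), the name `Exampleland`, and the `PageScanner` and `scan` names are all assumptions made for illustration. The real de-chromed pages may be structured differently.

```python
import json
from html.parser import HTMLParser

# Hypothetical sample of a "chrome-less" page; the real markup may differ.
SAMPLE_HTML = """
<h1>Exampleland</h1>
<h2>Geography</h2>
<h3>Location</h3>
<p>somewhere on the map</p>
<h3>Climate</h3>
<p>placeholder climate text</p>
<h2>People</h2>
<h3>Languages</h3>
<p>placeholder language text</p>
"""


class PageScanner(HTMLParser):
    """Collects section -> field -> value from the assumed heading layout."""

    def __init__(self):
        super().__init__()
        self.data = {}
        self.section = None  # text of the most recent <h2>
        self.field = None    # text of the most recent <h3>
        self.tag = None      # tag currently being read, if relevant

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3", "p"):
            self.tag = tag

    def handle_endtag(self, tag):
        if tag == self.tag:
            self.tag = None

    def handle_data(self, data):
        text = data.strip()
        if not text or self.tag is None:
            return
        if self.tag == "h1":
            self.data["name"] = text
        elif self.tag == "h2":
            self.section = text
            self.data[text] = {}
        elif self.tag == "h3":
            self.field = text
        elif self.tag == "p" and self.section and self.field:
            self.data[self.section][self.field] = text


def scan(html):
    """Parse one page and return its content as a nested dict."""
    scanner = PageScanner()
    scanner.feed(html)
    scanner.close()
    return scanner.data


if __name__ == "__main__":
    print(json.dumps(scan(SAMPLE_HTML), indent=2))
```

Once a page is reduced to a nested dict like this, writing it out as .json is a single `json.dumps` call, and other output formats can be generated from the same structure.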

Before updating my toolchain and making it reliant on your de-chromed HTML pages, I would like to know your update policy for these pages. If you are going to produce at least annual updates, I will follow and create .json, .md, .html, .tex and other versions from them on a regular basis as well.

Once again, thanks a lot for the good work. Keep it up.

Eckhard

Gerald Bauer

Oct 11, 2015, 6:13:59 AM
to Eckhard Licher, openmundi
Hello,
As always, thanks for your comments and for trying out the new
machinery. Welcome back.

I'm about half done with the new update machinery. The plan is
to also update the factbook gem / library and to add a new build
script that (auto-)runs using the skriptbot once a month, once a
week or even once a day.

The idea is to update the "chrome-less" HTML pages and also update
the factbook.json files [1]. This time I am following your model, that
is, using all factbook entities and codes (instead of "just" a selection
of countries identified by internet top-level domain country codes).

Keep up your great work on openfactbook - great to see it updated,
it's a great resource.

Cheers.

[1] github.com/skriptbot/factbook.json