World Factbook Country Profiles (Pages) without Chrome (Just "Core" Data in HTML) - New Repo /factbook


Gerald Bauer

Sep 23, 2015, 12:25:41 PM
to openmundi
Hello,

In case anyone follows along, I'm trying to update the World
Factbook country profiles.

As of today (Sep/23) there's no new download package on the official
CIA World Factbook site. The last package for download is from 2014.

Thus, the new idea is to use the online "live" pages. These pages
have a lot of chrome (page decoration such as site navigation,
headers, footers, etc.) around the "core" country profile data, so
I've started with a script that cleans up each profile page and
strips it down to the basics.
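
A minimal sketch of that kind of cleanup, for illustration only (this is not the script in the repo). It assumes the chrome sits in `script`, `style`, `nav`, `header` and `footer` elements, which may not match the real pages; `CHROME_TAGS`, `ChromeStripper` and `strip_chrome` are made-up names. It uses only Python's standard library:

```python
from html.parser import HTMLParser

# Assumption: these elements hold the page "chrome"; the live pages may differ.
CHROME_TAGS = {"script", "style", "nav", "header", "footer"}


class ChromeStripper(HTMLParser):
    """Re-emit the HTML it is fed, minus chrome elements and their content."""

    def __init__(self):
        # Keep entities such as &amp; untouched so they can be written back as-is.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a chrome element

    def handle_starttag(self, tag, attrs):
        if tag in CHROME_TAGS:
            self.skip_depth += 1
        elif self.skip_depth == 0:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags (e.g. <br/>) never open a skipped region.
        if self.skip_depth == 0 and tag not in CHROME_TAGS:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in CHROME_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.out.append(data)

    def handle_entityref(self, name):
        if self.skip_depth == 0:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if self.skip_depth == 0:
            self.out.append(f"&#{name};")


def strip_chrome(html):
    """Return html without chrome elements (comments and doctype are dropped too)."""
    parser = ChromeStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Calling `strip_chrome(page_html)` on a downloaded page would return only the markup outside those elements. A real cleanup would likely also need to select the profile container itself, which this sketch does not attempt.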

See the page for Austria (au) or Brazil (br) as first examples in
the new /factbook repo [1]. Using these "core" pages without any
extras should make it easier to keep track of changes and turn it into
structured data (e.g. factbook.json) even as some formatting and
styles get rearranged (as happened sometime back in April).

Questions? Comments? As always welcome. Cheers.

[1] github.com/openmundi/factbook

Gerald Bauer

Sep 23, 2015, 6:48:57 PM
to openmundi
Hello,

    In case anyone follows along - I've added an index page [1] for live browsing of the profile pages. Again, see the Austria (au) [2] and Brazil (br) [3] pages as examples.

   For now the styling is tuned for testing (and not for reading) e.g.

   - Sections get a navy (dark blue) text (foreground) color
   - Subsections get a blue text (foreground) color
   - Category (e.g. class category) gets a lime (light green) background (and green if inline)
   - Category Data (e.g. class category_data) gets a silver (light gray) background (and gray if inline) 

    That should help to see / double check the structure. Cheers.
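
To make the color scheme above concrete, here is a small sketch that writes those rules out as a stylesheet. Only the class names `category` and `category_data` come from the post; the `h2`/`h3` selectors for sections and subsections, the `div`/`span` split for block vs. inline, and the names `DEBUG_STYLES` and `to_css` are assumptions for illustration:

```python
# Hypothetical selectors: only the classes "category" and "category_data"
# are named in the post; h2/h3 and the div/span split are assumed.
DEBUG_STYLES = {
    "h2": {"color": "navy"},                               # sections
    "h3": {"color": "blue"},                               # subsections
    "div.category": {"background-color": "lime"},
    "span.category": {"background-color": "green"},        # inline variant
    "div.category_data": {"background-color": "silver"},
    "span.category_data": {"background-color": "gray"},    # inline variant
}


def to_css(styles):
    """Render a {selector: {property: value}} mapping as CSS rules, one per line."""
    rules = []
    for selector, props in styles.items():
        body = " ".join(f"{name}: {value};" for name, value in props.items())
        rules.append(f"{selector} {{ {body} }}")
    return "\n".join(rules)
```

`print(to_css(DEBUG_STYLES))` would emit one rule per selector, ready to paste into a test stylesheet.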


Gerald Bauer

Sep 26, 2015, 1:48:24 PM
to openmundi
Hello,

In case anyone follows along - the new /factbook repo now includes
all 200+ country profiles [1] (as cleaned-up HTML pages, that is,
without any headers, footers, scripts, etc. - just the "core" country
profile).

Note: You can now also browse the pages online by region [2]. The
regions used (see below) are - of course - copied 1:1 from the
original pages:

- Europe
- South Asia | Central Asia | East & Southeast Asia | Middle East
- Africa
- North America | Central America and Caribbean | South America
- Australia-Oceania
- Antarctica
- Oceans | World
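
As a rough sketch of how a by-region index could be built from the list above: the region names are taken from this post, while `REGIONS`, `group_by_region` and the `(code, region)` input format are hypothetical and not taken from the repo.

```python
from collections import defaultdict

# Region names as listed above, in the same order.
REGIONS = [
    "Europe",
    "South Asia", "Central Asia", "East & Southeast Asia", "Middle East",
    "Africa",
    "North America", "Central America and Caribbean", "South America",
    "Australia-Oceania",
    "Antarctica",
    "Oceans", "World",
]


def group_by_region(profiles):
    """Group (code, region) pairs into {region: sorted codes}, in REGIONS order."""
    grouped = defaultdict(list)
    for code, region in profiles:
        if region not in REGIONS:
            raise ValueError(f"unknown region: {region!r}")
        grouped[region].append(code)
    # Only regions that actually have profiles appear in the result.
    return {r: sorted(grouped[r]) for r in REGIONS if r in grouped}
```

For example, `group_by_region([("br", "South America"), ("au", "Europe")])` lists Austria under Europe and Brazil under South America, with Europe first.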

Cheers.

PS: To download all pages (stored in the _profiles/ folder) clone the
git archive or use the .zip archive download button.

[1] openmundi.github.io/factbook
[2] openmundi.github.io/factbook/regions