[open-government] A Brief Legal History of Open Government Data

Josh Tauberer

Jan 29, 2012, 8:37:53 PM

to openhous...@googlegroups.com, open-go...@lists.okfn.org

Over the last year I've been tracing the history of open government data
laws back in time and have found some interesting stories involving....

The Visigoths, and perhaps the earliest open-law law,

Flour inspectors in the 1760s, and codification of law,

Disbursements disclosure in 18th century China, and

The first FOI law in 18th century Sweden.

These stories and more can be found at
http://opengovdata.io/2012-02/page/4/brief-legal-history-open-government-data.

...which is Chapter 4 of my creatively titled book

Open Government Data: The Book
http://opengovdata.io/

And by "book" I mean website with a lot of words.

I'll be posting about other chapters as they are finalized. Next up will
be "17 Principles of Open Government Data."

Feedback warmly welcome.

--
- Josh Tauberer (@JoshData)
- GovTrack.us | POPVOX.com

http://razor.occams.info | www.govtrack.us | www.popvox.com

_______________________________________________
open-government mailing list
open-go...@lists.okfn.org
http://lists.okfn.org/mailman/listinfo/open-government

Michael Roberts

Jan 29, 2012, 9:14:12 PM

to Josh Tauberer, open-go...@lists.okfn.org, openhous...@googlegroups.com

Hi Josh,

Recently I put together this site, which had been dormant for some time. It is still very much a work in progress. http://www.idmlinitiative.org
It contains information for researchers about early examples of donors and NGOs sharing open aid data through organizations such as INDIX (early 1990s) and Aida (now Aiddata.org), and through protocols such as IDML (1998) and CEFDA. They are early examples of projects with ambitions similar to those of initiatives such as IATI. It is really incredible to see the groundswell of support at the moment for open government data from all levels of government.

Cheers,
Michael

--------------------------------------------------------------------------
Michael Roberts -- Acclar Open Aid Data - Co-Founder
web: www.acclar.org
email: michael...@acclar.org

facebook: http://www.facebook.com/acclar.open
twitter: @acclar
skype: mroberts_112

Holm, Jeanne M (1760)

Jan 30, 2012, 8:55:16 AM

to open-go...@lists.okfn.org

Josh--

Really great work on the Open Data book. Thanks for taking the time to put so much down in one place.

As we evolve the discussions and framework around open government data, it's critical that we understand the history, principles, and examples of how we got to where we are today.

Nice work!

--Jeanne Holm

**********************************************************
Jeanne Holm
Evangelist, Data.gov
U.S. General Services Administration

Cell: (818) 434-5037
Twitter/Facebook/LinkedIn: JeanneHolm

Gregory Slater

Feb 3, 2012, 1:34:14 PM

to openhous...@googlegroups.com, open-go...@lists.okfn.org

Josh Tauberer,

Thanks for the link to your history of open gov data, and the on-line book. Interesting.

- greg slater


Josh Tauberer

Feb 11, 2012, 8:43:54 PM

to openhous...@googlegroups.com, open-go...@lists.okfn.org

Last week the House Committee on House Administration (here in the U.S.)
held a conference on legislative data and transparency. Reynold
Schweickhardt, the committee’s director of technology policy, made an
interesting observation at the start of the day that policy for public
information often is framed in terms of 3 A's:

accessibility,
authenticity, and
accuracy.

I thought about that over the next few hours. They are good principles.
And yet we data geeks so often find ourselves having to start from
scratch explaining why clean data is so important. It seems
contradictory: if accuracy is a concept practitioners in government get,
and if 'clean' is a type of accuracy, then there must be some
communications failure here if we're having a hard time explaining open
data to government agencies. (To be clear, Reynold totally gets it.)

--------------------------------------------
TLDR version: Read chapter 5 of my book at:
http://opengovdata.io/2012-02/page/5/principles-open-government-data
--------------------------------------------

So I was thinking that morning, what other word do we need to add to
those 3 As to work open data in there? At first I thought about adding
"precision". Precision is one thing we're usually asking for when we ask
for open data. Precision is basically granularity. Compared to, say, a
PDF, XHTML is more granular because it is explicit about section
boundaries, paragraphs, and where in the document the important things
(like names and dollar amounts) are. (It is more granular with
respect to the meaning of the document, though not its pagination.)
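
To make the granularity point concrete, here is a minimal Python sketch.
The XHTML fragment, the class names "payee" and "amount", and the payees
themselves are all invented for illustration; they are not taken from any
real government document. Because the markup says explicitly where the
names and dollar amounts are, a few lines of standard-library code can
pull them out:

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML fragment. The class names "payee" and "amount" are
# assumptions made up for this example.
XHTML = """\
<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <div class="section">
      <h2>Office Supplies</h2>
      <p><span class="payee">Acme Paper Co.</span> was paid
         <span class="amount">1,250.00</span>.</p>
      <p><span class="payee">Capitol Toner LLC</span> was paid
         <span class="amount">310.75</span>.</p>
    </div>
  </body>
</html>
"""

# XHTML elements live in this namespace, so lookups must be qualified.
NS = {"x": "http://www.w3.org/1999/xhtml"}


def extract_payments(xhtml_text):
    """Return (payee, amount) pairs from the explicitly marked-up spans."""
    root = ET.fromstring(xhtml_text)
    payments = []
    for para in root.iterfind(".//x:p", NS):
        payee = para.find("x:span[@class='payee']", NS)
        amount = para.find("x:span[@class='amount']", NS)
        if payee is not None and amount is not None:
            # Strip the thousands separator before converting to a number.
            payments.append((payee.text, float(amount.text.replace(",", ""))))
    return payments


payments = extract_payments(XHTML)
for payee, amount in payments:
    print(payee, amount)
```

Nothing here has to guess at layout: the paragraph and span boundaries
carry the meaning directly.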

But precision is too narrow. When Congress releases its institutional
spending records, it does so in a PDF. That PDF has high precision ---
it gets down practically to line items. The problem with the PDF is that
it has low accuracy because getting it into a spreadsheet format and
de-duping names introduces errors.
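
A rough sketch of how the de-duping step alone can introduce errors,
assuming hypothetical payee strings and a deliberately naive matching
rule (last name plus first name, ignoring case and punctuation). Neither
the names nor the rule come from the actual disbursement records:

```python
from collections import defaultdict

# Hypothetical payee strings as they might come out of a PDF-to-text step.
names = ["SMITH, JOHN A.", "Smith, John A", "SMITH, JOHN B.", "Smith,John"]


def naive_key(name):
    """Reduce a 'LAST, FIRST MIDDLE' string to a (last, first) key."""
    last, _, rest = name.partition(",")
    parts = rest.split()
    first = parts[0] if parts else ""
    return (last.strip().lower(), first.strip(".").lower())


def group_names(raw_names):
    """Group raw strings that the naive key treats as the same person."""
    groups = defaultdict(list)
    for raw in raw_names:
        groups[naive_key(raw)].append(raw)
    return dict(groups)


groups = group_names(names)
for key, members in groups.items():
    print(key, members)
```

All four strings collapse into one group, even though "John A." and
"John B." may well be different people. A stricter rule would instead
split records that really are the same person, so either way the
cleanup step is a source of error that the published PDF forces on
every user.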

But accuracy is already one of the three As. So what's missing here?

The Association for Computing Machinery’s Recommendation on Open
Government (February 2009) figured this out:

> "Data published by the government should be in formats and approaches
> that promote analysis and reuse of that data."
http://www.acm.org/public-policy/open-government

Not only is it right, but "analysis" starts with the letter A. Plus, in
order to do any useful analysis on large amounts of information, we need
automation --- another A word. That is fate if I ever saw it.

Proposing a whole 17 distinct principles of open government data (read
the chapter!) might be, let's say, overwhelming in any practical
situation. If we had to make do with just four words, maybe these will do:

accessible,
authentic,
accurate, and
analyzable (using automation, because data is big these days).

Analyzable gives deeper meaning to the other three words. Accuracy is
too vague alone. You can't measure accuracy in the absence of some
process. In the computer science world, accuracy is how often something
comes out right. I think government documents people have taken
that 'something' to be whether a Xerox machine copies enough pixels
correctly. That's not sufficient for analysis anymore. We can't go
hiring thousands of interns to read all of the documents governments
produce. We didn't build computers for nothing.

With analyzable added, the meaning of accuracy is that an *automated
computer process* will get it right. If someone says a document is
accurate because it is a scan, I'll say that's what accurate meant in
the 1960s. If the fourth "A" of government information is analyzable, we
can redefine accuracy for 2012.
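
One way to read "an automated computer process will get it right" is as
a measurable rate. The sketch below is an illustration on invented data:
the four records, the regular expression, and the OCR-style damage
(letter O for zero, a stray space) are all assumptions, not measurements
of any real document.

```python
import re

# Hypothetical records: (text as published, hand-verified dollar amount).
# The second and third lines mimic typical scan/OCR damage.
RECORDS = [
    ("Acme Paper Co.    $1,250.00", 1250.00),
    ("Capitol Toner LLC $31O.75", 310.75),   # letter O instead of zero
    ("Metro Couriers    $ 89.5O", 89.50),    # stray space, letter O
    ("Hill Catering     $2,000.00", 2000.00),
]

# A dollar sign followed by digits, optional commas, and two decimals.
AMOUNT = re.compile(r"\$(\d[\d,]*\.\d{2})\b")


def extract_amount(line):
    """Return the dollar amount found in a line, or None if none parses."""
    match = AMOUNT.search(line)
    return float(match.group(1).replace(",", "")) if match else None


def accuracy(records):
    """Fraction of records where the automated extraction matches the truth."""
    correct = sum(1 for text, truth in records if extract_amount(text) == truth)
    return correct / len(records)


print(accuracy(RECORDS))
```

A person reading the scan would get all four amounts right. The
automated process gets only the two undamaged lines, and it is that
second number that "accurate" has to mean once analyzable is on the
list.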

But if you want the full 17 principles, read the rest of the chapter,
which tackles data quality (accuracy & precision), machine
processability, and other concepts in more detail. There's also a case
study on the House disbursements documents, looking at whether and how
it met the 17 principles:

http://opengovdata.io/2012-02/page/5/principles-open-government-data

Thanks,

David Robinson

Feb 11, 2012, 8:58:30 PM

to openhous...@googlegroups.com, open-go...@lists.okfn.org

This is a great point -- and I think there's a perfect A word for it:

Adaptability.

That captures the spirit of innovation that infuses so much of this work. And if data is adaptable, it is also capable of being analyzed -- or so I would think?

--

David Robinson

Knight Law and Media Scholar

Information Society Project

Yale Law School

JD Class of 2012

David.R...@Yale.edu

(202) 657-9892

--
You received this message because you are subscribed to the Google Groups "Open House Project" group.

To post to this group, send email to openhouseproject@googlegroups.com.
To unsubscribe from this group, send email to openhouseproject+unsubscribe@googlegroups.com.

Tracey P. Lauriault

Feb 12, 2012, 8:11:03 AM

to open-go...@lists.okfn.org

interoperability

also see the ISO 19115 metadata standard for the elements of data quality in geomatics

there will be different standards for different kinds of data, but the most advanced and broadly used are those found in some of the sciences and geomatics,

--
Tracey P. Lauriault
613-234-2805

"Every epoch dreams the one that follows it's the dream form of the future, not its reality" it is the "wish image of the collective".

Walter Benjamin, between 1927-1940, (http://www.columbia.edu/itc/architecture/ockman/pdfs/dossier_4/buck-morss.pdf)

Josh Tauberer

Feb 12, 2012, 6:01:37 PM

to openhous...@googlegroups.com, open-go...@lists.okfn.org

(Some replies of course only went to one list or the other --- apologies
if I'm replying to something you didn't see.)

On 02/11/2012 08:58 PM, David Robinson wrote:
> Adaptability.
>
> That captures the spirit of innovation that infuses so much of this
> work. And if data is adaptable, it is also capable of being analyzed
> -- or so I would think?

I like that this makes the focus broader than just analysis, closer to
the meaning of transformation.

On 02/12/2012 12:57 AM, Justin Grimes wrote:
> In comparison to open source, we only ask that code be licensed to
> be open source. We don’t ask that code compiles? is well documented?
> works well or as intended? etc. Those are things that might be
> expected or desired but certainly not required of it to be "open".

Even in the open source world, there are dozens of popular licenses. The
minimal requirements for 'open source' aren't necessarily natural ---
they no doubt came out of balancing different views and the pragmatic
need for interoperability of licenses.

The pragmatic needs for data, and especially government data, are
different. If data is meant to serve transparency, then it is important
to be able to know what the bits mean, more so than interoperability
(for instance).

On 02/12/2012 04:12 AM, innovation institute wrote:
> There is no accuracy in absolute terms.

That's exactly what I was saying. But in my experience, many agencies
that produce data, or want to, do not have a well-defined sense of
accuracy, or their definition is out of date with respect to data.

On 02/12/2012 01:21 PM, Gregory Slater wrote:
> What about 'API' for the fourth 'A' ?
On 02/12/2012 04:52 PM, Javier Muniz wrote:
> "queryable"

The fear that some of us have with those sorts of recommendations is
that agencies will then skip the bulk data part, and then we'll all have
to start getting API keys and bending over backwards to get large slices
of the underlying data for a large scale analysis.

On 02/12/2012 04:52 PM, Javier Muniz wrote:
> The nice thing about these definitions is that they have real
> (already defined) meaning, and can be tested or measured. Datasets
> could be tagged with their level of normalization, for example "1NF"

"1NF" (or even 3NF) can be a useful definition and recommendation, but
it is very narrow in the types of data it would make sense for (e.g. not
documents).
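
For readers who have not met the term, a minimal sketch of what a "1NF"
tag would assert, using made-up rows (the bill numbers and sponsor names
are hypothetical). First normal form requires every field to hold a
single value, so a cell that packs several sponsors together has to be
split into one row per value:

```python
# Hypothetical records: the "sponsors" field packs several values into
# one cell, which violates first normal form (1NF).
raw_rows = [
    {"bill": "H.R. 1", "sponsors": "Smith; Jones"},
    {"bill": "H.R. 2", "sponsors": "Lee"},
]


def to_1nf(rows):
    """Emit one row per (bill, sponsor) pair so every field holds one value."""
    return [
        {"bill": row["bill"], "sponsor": name.strip()}
        for row in rows
        for name in row["sponsors"].split(";")
    ]


normalized = to_1nf(raw_rows)
for row in normalized:
    print(row)
```

This works because the data is already tabular. There is no comparable
transformation for the text of a bill or a committee report, which is
the sense in which the definition is narrow.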

- Josh Tauberer (@JoshData)
- GovTrack.us | POPVOX.com

http://razor.occams.info | www.govtrack.us | www.popvox.com

Charles Pytleski

Feb 12, 2012, 1:55:29 PM

to openhous...@googlegroups.com, open-go...@lists.okfn.org

Thank you all for the helpful information and good strategy for the days ahead.

Best Regards,

Charles


--
Best regards,
Charles

Gregory Slater

Feb 12, 2012, 1:21:43 PM

to openhous...@googlegroups.com, open-go...@lists.okfn.org

What about 'API' for the fourth 'A' ?

On the other hand, one might argue that ease of programmatic readability is another facet of 'Accessibility', since in the age of 'big data', data is not really accessible if it isn't formatted for programmatic access. In fact, one way of thwarting transparency is to overwhelm the user with enormous volumes of documents that effectively cannot be parsed, summarized, and searched efficiently. Think of the last scene of 'Raiders of the Lost Ark'…

Anyway, I totally agree that programmatic machine readability is absolutely key for big data.
Thanks for the thoughts,

- Greg Slater
