Eight Open Government Data Principles

Josh Tauberer

Dec 9, 2007, 3:25:59 PM
to openhous...@googlegroups.com
As I enjoy sadly non-free wi-fi access in Chicago Midway airport on my
way back from the 'Open Government Working Group' conference in
Sebastopol, CA, I thought I would report to this list one outcome of the
conference. (The conference was organized by Carl Malamud, sponsored by
the usual contenders, and a bunch of people on this list were there.)

Here's our official group-written (read: laboriously edited)
announcement, which is also at www.opengovdata.org (the www. is crucial
for the moment):

---

8 December 2007 - This weekend, 30 open government advocates gathered to
develop a set of principles of open government data. The meeting, held
in Sebastopol, California, was designed to develop a more robust
understanding of why open government data is essential to democracy.

The Internet is the public space of the modern world, and through it
governments now have the opportunity to better understand the needs of
their citizens and citizens may participate more fully in their
government. Information becomes more valuable as it is shared, less
valuable as it is hoarded. Open data promotes increased civil discourse,
improved public welfare, and a more efficient use of public resources.

The group is offering a set of fundamental principles for open
government data. By embracing the eight principles, governments of the
world can become more effective, transparent, and relevant to our lives.

---

The principles can be used to determine whether government data can be
considered "open", and it was suggested that we develop some sort of
branding that we all can make use of to support and point to the
principles. The principles are at:
http://wiki.opengovdata.org/index.php/OpenDataPrinciples

The discussion pages linked from some of the terms in the principles are
editable wiki pages and do need to be fleshed out with suggestions from
anyone.

Also, Dan Newman started some discussion about how to mobilize citizens
at large over transparency issues. I am eager to see how that discussion
continues. I expect some organizing will happen on the (open) mail
list created at the conference (and linked from www.opengovdata.org;
yes, yet another mail list...).

Ok, back to waiting for my plane.

Josh Tauberer

John Wonderlich

Dec 9, 2007, 3:59:19 PM
to openhous...@googlegroups.com
Josh,

Thanks for reporting about what was discussed.

When I tried to address similar concerns in February, I ended up with a very similar set of ideas (what I termed themes) for government information, largely as a result of the discussions on this group.

Here's what I came up with then:

Timeliness: Promptly available information is necessary for meaningful involvement.

Accessibility: The House (and related agencies) should strive to publish information in accordance with W3C recommendations for accessibility.

Format: The House should embrace structured information (XML, etc), standardize formats across different agencies (this effort has been underway for some time), and utilize non-proprietary filetypes.

Preservation: House data is historically significant, so issues regarding archiving and permanence should be considered.

Availability: Balancing the requirements for informed participation in a democracy with the privacy rights of individuals and the pragmatics of a functioning legislature leads to proper levels of disclosure.

Accuracy: Standards for accuracy and completeness preserve meaningful access.

Usability: Information should be presented in a fashion that encourages use.

Interactivity: Appropriate levels of user input should be considered.

The one thing I see missing from this weekend's discussion is permanence.  Digital preservation is far from automatic, and I think addressing a given data set's permanence should be an automatic part of considering whether it should be publicly available.

Even data sets that aren't publicly available raise permanence concerns, and these have also been inadequately addressed, as the current controversies over the CIA interrogation tapes and the White House/RNC emails show.  I'm not making any judgements on those situations, but pointing out that clear expectations about a given data set's permanence let us effectively judge whether the data is being handled appropriately.  This is not a judgement that should be made when one subset of people suspects bad acting on the part of another group.


John


--
John Wonderlich

Program Director
The Sunlight Foundation
(202) 742-1520 ext. 234

Josh Tauberer

Dec 10, 2007, 11:41:14 AM
to openhous...@googlegroups.com
John Wonderlich wrote:
> The one thing I see missing from this weekend's discussion is
> /permanence/. Digital preservation is far from automatic, and I think
> addressing a given data set's permanence should be an automatic part of
> considering whether it should be publicly available.

This came up, though not quite in that way. I think the consensus at the
time was that if information disappeared, it was no longer "open"
because, well, it was no longer being provided in the first place.

But, adding some notes about preservation into either, say, the
'complete' or 'accessible' principles is definitely warranted.

--
- Josh Tauberer
- GovTrack.us

http://razor.occams.info

"Yields falsehood when preceded by its quotation! Yields
falsehood when preceded by its quotation!" Achilles to
Tortoise (in "Gödel, Escher, Bach" by Douglas Hofstadter)

John Brothers

Dec 10, 2007, 11:45:52 AM
to openhous...@googlegroups.com
The only thing I see that's missing is the idea of Consistency.  There are two types of consistency:

In-source consistency - A given Data Source could be open, permanent and structured, but if the structure keeps changing, it can be a huge mess.  (Luckily, this isn't likely to happen)

Cross-source consistency - Ideally, however, we should have some consistency between different data sources - that the thing that identifies, say, an Earmark in one data source is used correctly in another data source.    This is a much harder constraint, but far more beneficial to the consumer of the data.


Perla Ni

Dec 10, 2007, 12:43:20 PM
to openhous...@googlegroups.com
One other thing that would be helpful to add is the idea of Free or At Lowest Marginal Cost.  For those of you who have been following VoterWatch's saga to get congressional video, you know that the Library of Congress wants to pass on to us the cost from their vendor to make the copies.  And the vendor is asking for thousands of dollars for a couple of videos.  It does cost money to make copies of videos, but I think the LOC isn't really making an attempt to find a low-cost vendor.

So here's my attempt to word this:

Free or at Lowest Marginal Cost: All data should be provided free of charge or, in the case of archived data where Congress incurs substantial costs to provide the data in a currently acceptable or distributable format, Congress should get quotes from multiple vendors so that it charges the lowest marginal cost for the data.

Perla Ni
VoterWatch.org
--
Perla Ni
CEO GreatNonprofits
www.greatnonprofits.org
415-902-2659
per...@greatnonprofits.org

Greg Palmer

Dec 10, 2007, 1:32:29 PM
to openhous...@googlegroups.com
In our discussions this weekend, my recollection is that we thought that "free or marginal cost" belonged in the "accessible" bucket and perhaps the "license-free" bucket. The cost issue was definitely discussed at length and considered to be a significant barrier to open government data, but to my mind if something has a huge cost associated with it, it's not accessible and there is an implied license.

One other thing we discussed that did not make it into the short form we've released was the idea of reasonable expectations of government data being digital. I'll summarize by saying that "if any particular government data should reasonably be digital, citizens have the right to expect that it is." I'm not sure I'm capturing the whole idea there, but the idea was that we need to have a reasonable expectation of progress by government. Associated with that idea is the fact that digital data distribution and reproduction is often orders of magnitude cheaper than for physical artifacts.

Perla Ni

Dec 10, 2007, 2:29:40 PM
to openhous...@googlegroups.com
Great.  Thanks for that clarification and elaboration - works for us!

Perla
voterwatch.org

james a. jacobs

Dec 10, 2007, 4:07:03 PM
to openhous...@googlegroups.com
re: the recent discussion of permanence and digital preservation and
the related issues of openness, completeness, accessibility,
licensing, and marginal cost:

Josh and Perla and John and Greg have already addressed some of the
issues intertwined with preservation. Here is my attempt to bring
these threads together, add an explicit preservation-perspective,
enumerate the problems, and provide a starting point to solve the
problems:

1. Preservation will not just happen. Digital preservation in
particular takes planning and resources. I worry that, if
preservation is not addressed explicitly it will not be addressed
adequately.

2. Requiring "access" alone -- even open, complete, free access -- is
not enough. Without planning and funding for long-term access and
preservation, access today can turn into inadvertent loss tomorrow.

3. Relying on the government to provide the only means of long-term
preservation and access will work sometimes but fail when it is most
critical. If the government is the sole-provider and it (intentionally
or unintentionally) amends, alters, loses, abridges, or deletes
content, the content is lost for everyone for ever. Information that
is most embarrassing, most valuable, most useful to citizens in making
government accountable will be most vulnerable to intentional control,
alteration, and loss.

4. Relying on the government to provide sole means of access endangers
privacy as it allows governments to record and track use of government
information by individuals.

5. We can predict, based on what governments have done in the past,
what will happen if we allow or encourage the government to recover
costs for access (even marginal cost of distribution). There will be
two-tier access for users: some will be able to afford access and
others will not. There will be two-tier access for content: 'popular'
content will be free or less expensive, but there will be charges for
less-popular or less-used. There will be two-tier access to
functionality: one user may be able to get one page of a hearing for
free, but it will cost for citizens' groups, libraries, and others who
wish to get mass content (e.g. all hearings for a congress). The
government will rely on private sector vendors to provide access
through outsourcing and will claim that availability of content
through private vendors meets the requirements of 'access.' In an
attempt to recover costs, governments will license access to data and
in doing so, impose licensing restrictions on redistribution and use
and will apply technological locks (i.e., DRM) to enforce license
restrictions.

One possible component of a solution to the above problems is to
require that the government make available en masse and distribute
(without charge or licensing restrictions or DRM) all government
information to libraries, archives, and other memory organizations.
The existing Federal Depository Library Program (FDLP), which is
defined by U.S. Code Title 44 and administered by the Government
Printing Office (GPO), provides a starting place for such a
distribution system. The GPO has, however, been arrogating to itself
the role once given to distributed depository libraries and most
depository libraries have been reluctant to ask for the responsibility
of accepting deposit of digital government files. It will probably be
necessary to write into plans for preservation the explicit role of
government deposit and the role of depository libraries to accept and
preserve that information. With lots of copies in lots of
institutions, free of locks and restrictions on use, it will be
harder to lose, destroy, or control access to government information.
With multiple partners preserving and providing access to the
information, there will be multiple budgets, multiple constituencies,
and multiple technical preservation solutions.

Jim Jacobs
Data Services Librarian Emeritus
University of California San Diego

Dan Manatt

Dec 10, 2007, 4:45:18 PM
to openhous...@googlegroups.com
Gang:

Looks like the Senate is taking its cue from the Sebastopol Summit... CapNews.Net, which we've been testing in beta this fall, will videotape and post the hearing on our YouTube site, YouTube.com/CapNews.Net (our actual site is not yet live).  (BTW, I'll be contacting many of you in coming days and weeks for advice on CapNews -- we've been putting together our organizational plan to greatly expand our coverage of the House & Senate floors and hearings, and would love input).

Senate Homeland Security and Governmental Affairs Committee
E-Government Improvements
Full committee hearing on "E-Government 2.0: Improving Innovation, Collaboration, and Access."
Witnesses: Karen Evans, administrator of electronic government and information technology at the Office of Management and Budget; John Needham, manager of public sector content partnerships at Google, Inc.; Ari Schwartz, deputy director of the Center for Democracy and Technology; and Jimmy Wales, founder of Wikipedia
Location: 342 Dirksen Senate Office Building. 10 a.m.

Dan


Dan Manatt
CapNews.Net
Web Video/Audio News Service
A Service of Talk Radio News





Micah Sifry

Dec 10, 2007, 5:41:47 PM
to openhous...@googlegroups.com

John Wonderlich

Dec 10, 2007, 5:43:54 PM
to openhous...@googlegroups.com
tomorrow morning.

See Adam's post for more background...

Sean Moulton

Dec 10, 2007, 5:44:15 PM
to openhous...@googlegroups.com

I believe it is tomorrow (Tues) at 10am.

 

http://hsgac.senate.gov/index.cfm?Fuseaction=Hearings.Detail&HearingID=513

 

 

Sean Moulton
Director, Federal Information Policy
OMB Watch
1742 Connecticut Ave. NW
Washington, DC 20009
Phone: (202) 234-8494
Fax: (202) 234-8584





Combined Federal Campaign #10201

Leslie Harris

Dec 10, 2007, 5:51:17 PM
to openhous...@googlegroups.com
Tomorrow


Leslie Harris
President/ CEO
Center for Democracy & Technology
1634 I St NW, 11th Floor
Washington, DC 20006


Micah Sifry

Dec 10, 2007, 6:51:51 PM
to openhous...@googlegroups.com
Folks, these are all great comments and it would be great if you felt like adding them to the discussion page here: http://wiki.opengovdata.org/index.php/OpenDataPrinciples


Dan Manatt

Dec 10, 2007, 8:45:06 PM
to openhous...@googlegroups.com
Perla:

Hope all is well --

Per my email earlier, we're ramping up our congressional video service, and I'd love to pick your brains/sound you out on things.  Free to talk late this week?

Dan

Silona Bonewald

Dec 19, 2007, 8:14:44 PM
to openhous...@googlegroups.com
I met with the e-govt group in New Zealand and they gave me a wonderful flyer with many compatible statements from a governmental perspective. And, proving they have truly drunk the Kool-Aid, they also have a wiki at
http://wiki.participation.e.govt.nz/wiki/Guide_to_Online_Participation/Overview
I highly recommend it as a path to suggest to other governmental groups.

I mentioned all of these principles in the presentation I gave at ec3.org last month (a consortium of state CIOs, CTOs, Secretaries of State, and State Auditors): http://www.slideshare.net/silona/social-networks-and-government-application
note slides 54 and 56.

What was so amazing about the presentation was the overwhelmingly positive response I received to these ideas and concepts.  I think we have a bunch of people on our side on this in state government.

Cheers,
Silona