Paper: "Government Data and the Invisible Hand"


Josh Tauberer

Jun 6, 2008, 6:53:48 AM
to openhous...@googlegroups.com, open-go...@googlegroups.com
(Cross-posting. Usual apology here.)

Open House and Open Gov Data friends,

The guys over at Princeton's new Center for Information Technology
Policy wrote a really great paper for the Yale Journal of Law &
Technology on the role data should have, compared to websites, in
government. It articulates a point that I think many of us
subconsciously have had in mind:

"The new administration should specify that the federal government’s
primary objective as an online publisher is to provide data that is
easy for others to reuse, rather than to help citizens use the data in
one particular way or another."

And they suggest an interesting way to push that forward:

"The policy route to realizing this principle is to require that
federal government websites retrieve the underlying data using the
same infrastructure that they have made available to the public. Such
a rule incentivizes government bodies to keep this infrastructure in
good working order, and ensures that private parties will have no less
an opportunity to use public data than the government itself does. The
rule prevents the situation, sadly typical of government websites
today, in which governmental interest in presenting data in a
particular fashion distracts from, and thereby impedes, the provision
of data to users for their own purposes."

I think this is a worthwhile addition to the opengovdata and
publicmarkup.org policy documents --- if not as a direct recommendation
(because I think it may be too much to ask for in a grand form) then
noted as a long-term goal or (in terms of the second paragraph I quoted)
as a benchmark, a concrete way to tell whether data is open.

The full citation is: Robinson, David, Yu, Harlan, Zeller, William P., and
Felten, Edward W., "Government Data and the Invisible Hand," Yale Journal
of Law & Technology, Vol. 11 (2008).

http://ssrn.com/abstract=1138083

I've gotten David, the first author (ehm, and long-time friend), to join
both of these lists, and he's interested in helping hash out good policy
recommendations with us.

--
- Josh Tauberer
- GovTrack.us

http://razor.occams.info

"Yields falsehood when preceded by its quotation! Yields
falsehood when preceded by its quotation!" Achilles to
Tortoise (in "Gödel, Escher, Bach" by Douglas Hofstadter)

John Wonderlich

Jun 6, 2008, 5:14:23 PM
to open-go...@googlegroups.com, openhous...@googlegroups.com
I've been writing and rewriting a more formal response to the paper, but I'd like to share my thoughts first, especially as "The Invisible Hand" gets more coverage (with an Ars Technica article today).

First, I'd like to say that I'm delighted to see this topic addressed in an academic setting, and I also see nothing but potential for expansion from third party government information providers.

Probably the reason I find "Government Data and the Invisible Hand" as provocative as I do is that the paper seems to imply that empowering third-party sites will involve reducing the federal IT footprint; specifically, the essay describes its strategy as an effort to "reduce the federal role in presenting important government information to citizens."

My issue with this suggestion is that government information sites are also service providers, that the services they provide are often a demonstrable public good or a justified monopoly, and that the regulations governing those services reflect their societal role rather than the information's position in an information ecosystem.

Certainly, data-based services have only to gain from better data management practices, more open policies, and the innovations of the digitally public marketplace of ideas and creative problem solving. The image, however, of government service providers as defenders of outdated, inefficient turf may have some truth to it, and government IT contractors are likely just as entrenched and as dependent on public money as defense contractors.

My note of caution comes from what I see as a conflation of government service provision and government data availability. The essay suggests that "elaborate" government web sites stand in opposition to the sort of well-considered data policies that lead to creative third-party sites, like govtrack.us. This strikes me as a somewhat dangerous simplification.

Framing the choice as one between overwrought government sites fraught with bureaucratic complexity on one hand, and open, efficient, creative data sharing on the other, overlooks the importance of the restrictions and reliability that we justifiably demand of our government services.

The essay suggests that authentication is a valid concern, and that public mechanisms may develop to take this into account, using electronic signatures or trusted intermediaries to filter out trustworthy information. This may be a sufficient solution, but the myriad regulations promulgated for government web sites each exist for a reason (although they may be imperfect). As creative data sharing moves into a broader public domain, the accountability mechanisms that govern those services need to be translated into the same public realm.

The handling of private information, the archiving of historical documents, and many other concerns have been well established within a government context, and a solution to public administration problems needs to acknowledge that third parties don't operate under the same mechanisms of public accountability that the government does (or at least should), and never will.

All that said, I agree with the tone and focus of the Invisible Hand. I'd just like to caution against conflating data and public service provision, and to suggest also that public accountability mechanisms are indispensable.

In fact, I think those same mechanisms should be used to foster exactly the type of online creativity that the article describes, and I'm looking forward to seeing more reviews of just how the specific policy suggestions in the paper might work in government -- specifically the government employee/public citizen data parity requirement.
--
John Wonderlich

Program Director
The Sunlight Foundation
(202) 742-1520 ext. 234

Peggy Garvin

Jun 9, 2008, 4:21:06 PM
to openhous...@googlegroups.com

Great comments, John. "Government Data and the Invisible Hand" deserves to be discussed widely, and I hope more people will comment. Thanks to the authors for bringing attention to the topic of how changing information economics relates to public information.

I agree with what I think is the paper's basic proposal, which is not radical. The public would be well served if public information were made available for free and easy download in a structured format that can be re-used by third parties. Separating the data from the presentation software is smart practice.

It is my impression that, along the way to making this proposal, the authors are attacking the already weak parties—the agencies that struggle to collect the information we all want to use—and perhaps hurting their own cause in that way. The low barriers to entry for publishing globally on the web are new. Government agencies aren't the only ones playing catch-up.

In the case of the government, there are agencies that have long done basically what the paper recommends; the U.S. Census Bureau, for example. What has changed is the economics of computing and communications, which now makes individuals (rather than institutions with lots of resources, e.g., universities and large legal publishers) able to access, repurpose, and serve up the data. Expectations have changed (relatively) overnight. Change won't happen overnight, but I don't think that relinquishing any publishing role is going to speed up the process. Having said that, I agree that the agencies (and the Congress and OMB, who oversee their operations) may need to be pushed, and perhaps the Invisible Hand paper can help with that push.

I am not sure if the authors (hello, authors who are reading this!) are aware of a period that shaped the way many in the right-to-know community view the balance between government publishing public information and third parties publishing public information. I dug up something on the web that summarizes the "Reagan and OMB Circular A-130" history: http://sunsite.utk.edu/FINS/Periodicals_and_Newspapers/Fins-PaN-21.txt

Patrice McDermott also provides useful background and insight on the "old A-130" issue in chapter 2 of her book Who Needs to Know? [978-1-59888-050-2]. Take a look at it, and you'll see why some of us get a little nervous about proposals for government to cede its publishing role to third parties.

The authors have a tough challenge in trying to address "government information" (judicial, legislative, executive) as a monolith. Individual data sets may not be available in the free, downloadable, structured way we'd like to see for a variety of reasons. Some reasons are political, some are meant to favor commercial enterprises, some depend on standards development still in progress, and some may be just because the agency is so understaffed it can't make the small push to get there. Oversimplifying the problem could slow down our progress toward the solution.

At this point, I hope I am not sounding too critical, because I do support the public provision of public data in standard formats. But I think the authors overestimate the capabilities of third parties for providing free, neutral, 24x7 access to authenticated public information to all for the lifetime of the data or the republic. Third parties have a big role. They can add value, meet the needs of disparate audiences, combine government data with copyrighted data, etc. The playing field among third parties should be leveled to encourage this innovation, but there is no need to level (or even hobble) the government information providers to do so.

Peggy

pe...@garvinconsulting.com


David Robinson

Jun 9, 2008, 5:57:29 PM
to openhous...@googlegroups.com
Peggy, thanks for your thoughtful and thorough reactions to our paper. The historical information you linked to was also of great interest.

In light of that background, it's probably important to underscore that we do not seek to prohibit the government from doing anything that private parties might do.

But there are some real "teeth" to our proposal: We urge that federal websites be required to make available for third party reuse the same underlying data they use to create their own public sites. In other words, we want to prohibit situations in which the only possible way to get some public data is through a particular government web site, and it is difficult for anyone to create a third-party alternative.

This is what we mean when we argue in the paper that "the best way to ensure that the government allows private parties to compete on equal terms in the provision of government data is to require that federal websites themselves use the same open systems for accessing the underlying data as they make available to the public at large."

Nothing in our proposal would prohibit government from operating any of the web sites that it currently does. In some instances, the weighted average of motivation and skill may make a government site the best available interface for certain data. We expect such cases to be rare, but in any event there is ample room for them to occur. Our requirement speaks only to priority: In order to display data on its own site, a government body must first make that data available in a way that could be displayed on any other site.

Your point about permanence and reliability for the life of the data is also well taken. I would point out that government sites (like THOMAS, as a recent annoying example) are just as capable as private sites of frustrating users by moving things around and breaking old references to data that should still be available. Intuitively, I can imagine an argument that government sites are less likely to break URLs, or will do so less often, because in principle they should care more about stability than the private sector does. But it might also turn out that third-party sites are simply less often broken, period. I'm not aware of any empirical research on this question -- does anyone else know of any?

Regards,

David

--
David Robinson
Associate Director

Center for Information Technology Policy
Princeton University
<citp.princeton.edu>
609.258.2175

John Wonderlich

Jun 9, 2008, 6:36:40 PM
to openhous...@googlegroups.com
I think the question is whether the proposal would prohibit the government from doing anything that only they can do.

Two things that the government needs to do are:
  • selectively release information that is commercially sensitive, personally private, of national security concern, etc. (in other words, justifiably non-public), and
  • provide services, through their web sites, on the basis of that information
...and both with robust public accountability mechanisms.


It seems to me that the intermediary roles government web sites play between raw data sets and public access (that is, providing selective access and providing services based on that access) must both be built on non-public data systems. Government web sites serve government employees, constituents for specific areas, government officials, contracts with private companies, etc., so it seems that limiting their deployment to the same APIs that would empower the public would also limit the government's ability to provide some data or services.

Maybe that's a limiting case that could be dealt with by qualifying the proposal, but it looks like a slippery slope to me that would be used to justify the exclusive data access that most government web sites are probably built on.

Joshua Gay

Jun 9, 2008, 11:15:33 PM
to openhous...@googlegroups.com
I think this proposal is great. I'd like to start helping to refine language such as "easy to reuse" to be more exact. For instance, I think ease of reuse should include particulars such as an accessible, well organized, well documented, and predictable data set, as well as other criteria.

Although not everyone should be a technology expert, I think it's important that everyone concerned with this issue challenge themselves to acquire a somewhat more sophisticated understanding. We need lots of minds from many angles thinking about these problems, and we can't leave it all to the technologists :-) So, I figured I'd elucidate my statements above by giving some examples from xml.house.gov, which I believe has done a remarkable job of creating and steadily improving its XML Schema. It has become a mature and, frankly, beautiful data set, and is constantly improving. I hope this is helpful and coherent (I'm writing a little low on caffeine/sleep).

#Accessible

The House XML files are accessible, as they are published by THOMAS and the documents for each Congress are all stored in a single directory (e.g., http://thomas.loc.gov/home/gpoxmlc110/). This makes it easy to scrape and stay on top of the data in a timely fashion (timely as long as you exclude some of the most important legislative articles, such as the PATRIOT Act, which was not available before being passed).
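
To make "accessible" concrete, here's a rough sketch of mine (not anything official) of a scraper for that directory. It assumes the index page links each .xml file by name, and the function names are my own:

    import re
    import urllib.request

    BASE = "http://thomas.loc.gov/home/gpoxmlc110/"

    def list_bill_files():
        # Assumption: the directory index links each file by name.
        with urllib.request.urlopen(BASE) as resp:
            index = resp.read().decode("utf-8", errors="replace")
        return sorted(set(re.findall(r'href="([^"]+\.xml)"', index)))

    def fetch_bill(filename):
        # Download one document, e.g. "hr371_ih.xml".
        with urllib.request.urlopen(BASE + filename) as resp:
            return resp.read()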

#Well organized

The data in the files are well organized. For instance, look at some lines from a House resolution I picked at random (http://thomas.loc.gov/home/gpoxmlc110/hr371_ih.xml -- you can view source on this file and look at the XML directly if you'd like). The title and header information is in Dublin Core, a mature metadata standard, and lines look like this:
    <dc:title>110 HRES 371 IH: In observance of National Physical</dc:title>
It is followed by well-chosen XML tags, like this one indicating the legislative number:
    <legis-num>H. RES. 371</legis-num>
Bill sponsors are given a sponsor tag with a name-id attribute, such as A000362:
    <sponsor name-id="A000362">Mr. Altmire</sponsor>
which corresponds to their BioGuide ID: http://bioguide.congress.gov/scripts/biodisplay.pl?index=A000362

Bills begin with <preamble> tags, and each "Whereas" line in the resolution aptly has a <whereas> tag ;-) as well as many more tags and sections I skipped over.
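
Because the tags are so regular, pulling these fields out takes only a few lines. A quick sketch of mine, assuming a local copy of hr371_ih.xml and the standard Dublin Core namespace URI:

    import xml.etree.ElementTree as ET

    DC = {"dc": "http://purl.org/dc/elements/1.1/"}

    def summarize_bill(path):
        # Extract the fields discussed above from one bill file.
        root = ET.parse(path).getroot()
        title = root.find(".//dc:title", DC)
        legis = root.find(".//legis-num")
        sponsor = root.find(".//sponsor")
        return {
            "title": title.text if title is not None else None,
            "legis-num": legis.text if legis is not None else None,
            "sponsor": sponsor.text if sponsor is not None else None,
            "bioguide-id": sponsor.get("name-id") if sponsor is not None else None,
        }

    print(summarize_bill("hr371_ih.xml"))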

But, take note, the organization is great on a number of levels. It's consistent across all XML files, and can be checked against an XML Schema and a DTD file that help ensure consistent organization. It also has good "external" organization (for lack of a better phrase), because it reflects wise decisions like choosing the Dublin Core standard that many people understand, and using the unique BioGuide ID when referring to sponsors. This kind of organization will allow efficient and effective systems to be built on top of it, inside the government as well as outside it. In reality, it's the insight and exhaustive special knowledge of those in the House Library, working closely with the data specialists, that allows such a smart system to exist.


#Well documented

The beauty of this data set is that it is largely self-documenting. Furthermore, xml.house.gov provides a style sheet and the DTD, as well as an example program that transforms the data from XML to HTML, which is a superb (the best I've ever seen) XSL file (located here, with files linked to it internally: http://thomas.loc.gov/home/gpoxmlc110/billres.xsl). The actual implementation is more important than just being convenient or nice; it is important in illuminating the detailed decisions made in choosing the particular tags and names of the data. Remember, you may design data to be robust for different types of display systems, but your display systems will often influence the organization of the data. When this is the case, it's important to document it with good examples like the one above.
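
As an illustration, here's a little sketch of running that published stylesheet over a bill yourself. It assumes local copies of the bill and of billres.xsl (plus the files it links to), and the third-party lxml package, since the standard library has no XSLT support:

    from lxml import etree

    # Apply the House's published billres.xsl to one bill document.
    doc = etree.parse("hr371_ih.xml")
    transform = etree.XSLT(etree.parse("billres.xsl"))
    html = transform(doc)
    print(str(html)[:500])  # peek at the generated HTML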

#Predictable

In this case, the data is in a sense predictable, because it must conform to the DTD and XML Schema, and it must generate HTML from the XSL in a predictable way. Also, you can predict that sponsors will have the same unique ID as in the BioGuide. However, there are some things that are not predictable. For instance, consider the following line that starts the body of the resolution:
    <resolution-body id="H444CB208710E48D28665C616FE8E3F17" style="traditional">

We would like to assume that the ID chosen is unique, but we cannot assume as much. The government should provide the algorithm for selecting IDs of this nature. But, more importantly, they should be able to confirm and test the software to make sure it is working in a predictable and consistent fashion, and they should be able to fix any bugs they come across along the way. Unfortunately, in this case, the House Library cannot do this.
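
That said, we can at least test the uniqueness assumption from the outside. A small sketch of mine that walks a local directory of downloaded bill files (the directory name is hypothetical) and flags any repeated IDs:

    import collections
    import glob
    import xml.etree.ElementTree as ET

    seen = collections.Counter()
    for path in glob.glob("gpoxmlc110/*.xml"):
        root = ET.parse(path).getroot()
        for node in root.iter("resolution-body"):
            seen[node.get("id")] += 1  # count each resolution-body id

    duplicates = {i: n for i, n in seen.items() if n > 1}
    print(duplicates or "no duplicate ids found")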


So, I hope that this helped to deepen and enrich the conversation in some way. The ideas of organizing data are as important for our governments to operate efficiently internally as they are for citizens to build software programs to use the data. The separation of data from the presentation of data can be clearly defined, but the two often affect each other, as our choices about how we organize the data are affected by how we wish to use and reuse the data.

Furthermore, this relationship often goes even further when we realize there is more than just storing data and displaying it; there is also some real logic happening between the two. In programming, we often call this the "control" logic, and we create three categories: model, view, and controller. The model defines the data and the ways of accessing the data, the view is how the data can be displayed to the user (including data entry), and the controller operates between the two.

When the government creates controller systems, and these controller systems affect the decisions behind how the data is organized (the model) and how it is displayed internally (the view), it may be important for them to share that with us as well, since each of those decisions may have a large impact on the others. An ideal separation between model, view, and controller is often infeasible, and as such, it is often better to encourage the government to share with us the complete picture of each of these, just as xml.house.gov has done!
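
To make those three categories concrete, here's a toy sketch (all names are mine, purely illustrative) of how the split might look for bill data:

    class BillModel:
        """Model: owns the data and how it is accessed."""
        def __init__(self, bills):
            self._bills = bills  # e.g., {"H. RES. 371": "In observance of ..."}

        def get(self, legis_num):
            return self._bills[legis_num]

    def render_bill(legis_num, text):
        """View: one particular way of displaying the data."""
        return "<h1>%s</h1><p>%s</p>" % (legis_num, text)

    class BillController:
        """Controller: the logic sitting between model and view."""
        def __init__(self, model):
            self.model = model

        def show(self, legis_num):
            return render_bill(legis_num, self.model.get(legis_num))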

So, thank you xml.house.gov and the House library -- you rock!

-Josh

P.S. Senate library, please catch up to the House's awesomeness.


--
Joshua Gay
m: 617.966.9792
Free Software Foundation [http://fsf.org]
Textbook Revolution [http://textbookrevolution.org]
One Laptop per Child [http://laptop.org]
Free Textbook Project [http://freetextbookproject.org]
Transparent Federal Budget [http://transparentfederalbudget.com]

Tom Bruce

Jun 10, 2008, 8:20:39 AM
to Open House Project
I like the paper, very much. I also want to echo and underscore John
W.'s comments, which I think quite properly outline some things that
government *must* do and that private parties can't. At the LII,
we've been mashing up government data since 1992, and nobody would
welcome high quality, interoperable open streams more than we would.

I'd add that changing beliefs about what is authoritative is harder
than adding digital signatures. In court documents, for instance, the
need for parties and interested others to refer to identical,
authoritative sources is part and parcel of a system based on
precedent, and establishing what is authoritative is actually very
difficult (note, for example, that the ultimate authoritative version
of statutes for many Titles of the US Code is not the Code -- it's the
Statutes at Large).
signatures provide mechanical guarantees of accuracy but many will not
find these as reassuring as familiar sources or other kinds of
branding. For much of the public this won't matter much -- they are
doing a different kind of research that is much more about risk
management than about the sort of thing that lawyers do. But
officials will be slow to change, and there will be a lot of folklore
to overcome. Particularly in the judiciary, there is no really
centralized mechanism for enforcing standards. We also lack a
professional class of legislative draftspeople capable of implementing
a detailed standard (though I think perhaps the various legislative-
XML efforts might move things in that direction). And that unique
design feature called "separation of powers" limits some of what can
be done.

In other words, this is an exercise in public education, education of
government officials about what the public really does, and dispelling
a lot of folklore about citizen interaction with government as much as
anything else.

As an exercise in self-aggrandizing windbaggery, let me point to a few
earlier works in this area :
a) Hank Perritt's piece, described in a blog post here:
http://blog.law.cornell.edu/blog/2008/04/10/readings-in-legal-information-pt-2/
-- this one in particular resonates with the Princeton article.
b) An old thing of mine, at http://www2.warwick.ac.uk/fac/soc/law/elj/jilt/2000_2/bruce
, which I'd now revise to include interoperable data as a highly
desirable form of self-publication.
c) Just for laughs, a piece on metadata quality that was written with
government information in mind: http://tinyurl.com/3n6mdh
d) Finally, a look at http://oai4courts.wikispaces.gov/ and at
http://voodoo.law.cornell.edu/uscxml/ might provide a couple of
interesting examples of what's being talked about here.

Finally, a cautionary note: most nonprofit organizations that publish
judicial opinions (see eg http://www.worldlii.org) are having problems
with sustainability; good funding models are hard to come by. Part of
the problem is that it's much easier to get funding for the initial
opening-up or mounting of something than it is to get funding for long-
term operation and maintenance. It's not inevitable, but you may need
something with the stability of government to guarantee the stability
of the information. The "law not com" movement started in 1992, and
it's stunning how many players have come and gone.

Feeling academic today,
Tb.


Thomas R. Bruce
Legal Information Institute
Cornell Law School

Josh Tauberer

Jun 10, 2008, 8:38:44 AM
to openhous...@googlegroups.com
John Wonderlich wrote:
> It seems to me that the intermediary role that government web sites play
> between raw data sets and public access, that is, providing selective
> access and providing services based on that access, both /must/ be built
> on non-public data systems.

For this, you could look at it the way we did in the open government
data doc: it's not saying what should be public, but if something is
going to be public, then here's how you know you did it right.

Daniel Bennett wrote:
> I think that
> it is unnecessary for the writers to create a dichotomy of functional
> sites and homepages versus structured data if they better understood
> how sites can be developed.

If a website can share data and provide an interface at once, all the
more power to the developers. But the dichotomy (not between XML and
HTML, but between allocating resources to sharing data and allocating
resources to providing an interface) is the whole point; it's what says
that such dual use is important.

Josh Tauberer

Jun 10, 2008, 8:40:03 AM
to openhous...@googlegroups.com
And I just realized that the two comments I responded to went to
different lists... Ah well...

--
- Josh Tauberer
- GovTrack.us

http://razor.occams.info

"Yields falsehood when preceded by its quotation! Yields
falsehood when preceded by its quotation!" Achilles to
Tortoise (in "Gödel, Escher, Bach" by Douglas Hofstadter)

Jennifer Bell

Jun 11, 2008, 11:10:02 AM
to Open House Project
Great discussion! Though, like most people, I agree that governments
can't get out of the website business entirely. Clearly, there's a
division between what government websites should (must?) provide:
basic access to government-generated data, and what government
websites should (can?) not provide: forums for citizen-generated
content and analysis. The paper does a great job of highlighting the
areas where third party websites can add value and of stressing the
need for open data frameworks to make this happen.

As for sustainable funding models for 3rd-party sites, Wikipedia is a
good example of how people who find something useful are willing to
donate to ensure its survival. As I'm sure Josh Tauberer would be
quick to point out, once established, the maintenance costs for the
sites are fairly low. Moreover, websites that no one uses or cares
about enough to fund can and should die a quick death, as per the
invisible hand of the market.

How 3rd-party websites come to exist in the first place does need more
fleshing out, however. Traditional advocacy groups often don't have
the technical skills or the mandate to create these sorts of
websites. For technically-inclined people with ideas for websites,
applying to granting agencies to get funding for a project is a long
and daunting process. In Canada, the eventual goal of the non-profit
visiblegovernment.ca is to be an engine for the planning, funding and
implementation of 3rd-party sites, bringing together 1) people with
ideas for websites, 2) people with the skills to flesh them out, and
3) people who are willing to donate money to see these ideas realized.

Jennifer Bell
visiblegovernment.ca

Tom Bruce

Jun 11, 2008, 3:12:15 PM
to Open House Project


On Jun 11, 11:10 am, Jennifer Bell <visiblegovernm...@gmail.com>
wrote:

> As for sustainable funding models for 3rd-party sites, wikipedia is a
> good example of where people who find something useful are willing to
> donate to ensure it's survival. As I'm sure Josh Tauberer would be
> quick to point out, once established, the maintenance costs for the
> sites are fairly low. Moreover, websites that no one uses or cares
> about enough to fund can and should die a quick death, as per the
> invisible hand of the market.

Sorry, but I have to disagree with this, on two counts. First of all,
maintenance costs for the sites per se may be low, but the costs of
providing editorial services and access-enhancing technologies, driven
by a revolution of rising expectations, are not low at all. For
instance, basic access to legal information in Canada (via my friends
and colleagues at CanLII) is financed by a $30/annum head tax on every
practicing lawyer in the country, and there is no surplus there
(incidentally, the average total annual cost of information access for
a Canadian lawyer is now around $2,000, so thirty bucks is a pretty
good deal).

Second, one of the defining characteristics of legal information in
the United States is not so much that nobody cares, and hence is not
paying, but that a) few realize the extent of the problem and b) those
who do typically think it's someone else's problem to deal with it.
If the blessed market had perfect information, yes, invisible handling
might produce a good result. But the market is dealing in imperfect
information, upstream activity by government that sets barriers
artificially high for new entrants, and a practically endless history
of market distortions, as well as private-sector services that could
well be characterized as (until quite recently) a government-subsidized
duopoly.

Then there's the fact -- easily dismissed as whining by the likes of
me -- that many care but few pay. There are many free riders, I fear.

Best,
Tb.

Jennifer Bell

Jun 12, 2008, 3:15:33 PM
to Open House Project


On Jun 11, 3:12 pm, Tom Bruce <t...@cornell.edu> wrote:
> On Jun 11, 11:10 am, Jennifer Bell <visiblegovernm...@gmail.com>
> wrote:
>
> > As for sustainable funding models for 3rd-party sites, wikipedia is a
> > good example of where people who find something useful are willing to
> > donate to ensure it's survival. As I'm sure Josh Tauberer would be
> > quick to point out, once established, the maintenance costs for the
> > sites are fairly low. Moreover, websites that no one uses or cares
> > about enough to fund can and should die a quick death, as per the
> > invisible hand of the market.
>
> Sorry, but I have to disagree with this, on two counts. First of all,
> maintenance costs for the sites per se may be low, but the cost of
> providing editorial services and access-enhancing technologies driven
> by a revolution of rising expectations are not that low at all.

You're right. There's a lot of variability in terms of maintenance
costs -- it all depends on the design of the site.

It's important, from a sustainability perspective, that 3rd-party
sites get people involved in the tool as more than just passive
information consumers. Looking at wikipedia again, while I have no
data on this, I suspect that it's the people that spend a lot of time
pruning and tending their entries that also tend to donate the most to
the site, to ensure the survival of their creations. Sites that give
people a meaningful way to express themselves, share information, and
contribute to the greater good are more likely to attract self-
sustaining donations. It's a happy coincidence that site designs that
encourage participation via collaborative filtering and crowd-sourced
analysis are also the ones that will likely add the most value.

While I agree that the people who care tend to think it's someone
else's problem, it's possible that they just haven't had the right
tools for getting involved. :-) John Wonderlich had a very nice riff
on tapping into the public mass of attention in a recent blog post:

"Even if only a small amount of leisure time gets connected to
politics and government online, and it is well connected to the
substance of oversight and legislation, of politics and elections,
then democracy is going to go through a fundamental change. TV can’t
compete, and the sheer amount of human attention moving online and
getting involved in participatory media has enough weight to shift
both politics and government."

http://www.theopenhouseproject.com/2008/04/27/mass-of-attention/

Of course, online tool development would have to be matched with a
corresponding public education and networking effort, to raise
awareness and expand the pool of people who are willing to
participate.

Thank you for the examples from CanLII. I admit I'm a bit biased on
this topic.

Jennifer Bell
visiblegovernment.ca

Joshua Gay

Jun 12, 2008, 4:44:54 PM
to openhous...@googlegroups.com, Samuel Klein
Jennifer Bell wrote:
> While I agree that the people who care tend to think it's someone
> else's problem, it's possible that they just haven't had the right
> tools for getting involved.  :-)

I thought I'd share some of my experience with people coming together to work on Internet-based, community-driven grassroots projects, which I think resonates well with Jennifer's post. Over the past couple of years, I've run or helped run several events that focused on getting people involved in these types of projects. Most of these events are "jams" or "hackathons" where we come together as a purposeful and focused group to accomplish some small goal that is part of a larger project with larger goals.

Usually a good mix of people come. Some come because they have some useful expertise, like wiki knowledge; others come because they care about the cause; some come to help their friends who care about the cause; and so forth. Lots of different people. But this diversity doesn't just happen, and recruitment is not magical. If I had to share one key to my recruitment success, it would be choosing your recruiters. Pick event organizers who have different personality types and temperaments (look into http://en.wikipedia.org/wiki/Myers-Briggs_Type_Indicator or http://en.wikipedia.org/wiki/Keirsey_Temperament_Sorter, etc.) and have them do the recruiting. You'll find that the extroverts in rock bands tend to have very different event pitches and tactics than introvert computer scientists like myself :-) Also, make sure people can join you remotely (like on IRC or in a phone conference) -- you can get a lot more people attending if you do this (I tend to be better at IRC recruitment than rock stars are).

However, the most important elements of successful events and projects I've found pertain to some pretty basic aspects of human nature. It's not the tools or the goal at hand (although both are necessary); instead, it is the inherent satisfaction of coming together with others to learn and create. Experts teaching newcomers about wikis or version control systems. People shouting out benchmarks and goals. People feeling present and comfortable enough to make mistakes. And, ultimately, knowing that the philosophical goals discussed are being realized, even if it's just in some small way.

Seymour Papert has a child-centered education theory called constructionism, which is rooted in the idea of learning by doing. I like to think of the work I do with communities and projects as "constructionism in everyday life." Software, the Internet, and data are all vital and necessary for us to be productive; however, when we measure the effectiveness of these tools, we should also consider the learning and growing process that individuals go through, how it transforms them and the process they are taking part in, and, ultimately, how part of what people learn in this new "constructionist learning environment" is a cycle that builds on itself, in which people learn how to use new tools to connect with people in new ways and make change, and then learn how to make newer and better tools, ad infinitum.

So, to conclude, the tools and the people are inseparable, as they are both how we communicate and connect and the means through which we are trying to learn, effect change, and make new tools.

Joshua Gay

Douglas Galbi

Jun 22, 2008, 5:38:36 PM
to Open House Project

From http://purplemotes.net/2008/06/22/industrial-organization-for-government-communication/

Concern about too much government control over technologically limited
and costly communication channels has been enormously significant
historically. With the Internet revolution, governments can own and
control communication channels without significantly lessening the
opportunities for non-governmental bodies to do so. Governments that
broadly disseminate government-created content do not preclude others
from broadly disseminating other content. Vertically integrated
government communication now carries much less political risk for the
over-all communications industry. This fundamental change, it seems to
me, favors more vertical integration in government communication with
the public.

A draft of a new scholarly article makes the opposite argument. It
declares:

If the next Presidential administration really wants to embrace
the potential of Internet-enabled government transparency, it should
follow a counter-intuitive but ultimately compelling strategy: reduce
the federal role in presenting important government information to
citizens. ... Rather than struggling, as it currently does, to design
sites that meet each end-user need, we argue that the executive branch
should focus on creating a simple, reliable and publicly accessible
infrastructure that exposes the underlying data. Private actors,
either nonprofit or commercial, are better suited to deliver
government information to citizens....[1]

The idea essentially is to have more vertical disintegration in
government communication. Government would focus on providing a large
amount of detailed, machine-interpretable data that other
organizations' technologies would search, aggregate, re-organize, and
re-use. The anticipated benefit is more rapid innovation in the
provision of information services to citizens.

Some efforts to promote vertical integration clearly are silly. The
Yale Journal of Law and Technology (YJOLT) will publish the draft
article quoted above in Fall 2008. The draft is freely and publicly
available from the websites of SSRN and YJOLT. Yet on the top of every
page of the article appears the bolded imperative "Do NOT cite." That
literally implies that everyone can read the draft article but no one
can discuss it. Many blogs have simply ignored the draft's pagely
imperative (see, e.g., here, here, here, and here). One sheepishly
declared: "it kindly asks us not to cite the draft, but - since it's
out there for everyone to read - I assume a little quoting in a blog
post like this is in order."

Wanting to respect the authors' wishes, I emailed them to ask if they
would mind if I were to discuss their paper, cite it as a draft, and
link to it. One of the authors responded graciously. He thanked me
for my note, explained that YJOLT required the header, and welcomed me
to discuss the draft and link to it. That's a good response. Allowing
persons to discuss what they read increases the value of the time they
spend reading. Moreover, the value of publishing an article in YJOLT
isn't reduced by allowing discussion of the draft. Attempting to deny
readers the freedom to cite a publicly available draft is an absurd
product of an organizational silo-mentality. Fortunately, the specific
issue is relatively easy to deal with in practice.[2]

The more general and important issue concerns supply incentives. With
respect to government data, more important than the allocation of
resources between government data infrastructure and government
provision of data to individual end-users is the extent of investment
in producing, cleaning, organizing, maintaining, and studying data.
Government data collection typically is initiated to serve a narrow
political purpose. Concern about specific statistics and the use of
the data to produce specific reports drives investment in ensuring
accurate reporting, finding and resolving data inconsistencies, and
maintaining the data over time. A data collection effort that expands
over time to serve diverse political interests within government has a
better chance of enduring. To the extent that government data
collection mainly serves non-governmental information intermediaries,
governments will invest less in collecting data and ensuring high data
quality.

Governments have significant advantages as suppliers of web content
and services to end-users. Most adults know the names of the
governments to which they are subject, have experience with those
governments' services, and are concerned to make those services
better. Governments typically spend little on user acquisition (many
even aggressively discourage immigration) and relatively little on
advertising and promoting themselves and their services. For example,
U.S. federal government expenditure amounts to about 20% of GDP, but
U.S. government advertising spending probably amounts to less than 1%
of total U.S. advertising spending. Governments have a highly
differentiated position within the space of user trust, and
governments generate distinctive information flows. Eliminating
governments from the ecology of end-user web content and services
would waste their special institutional advantages.[3]

Stimulating end-user demand for government information is likely to
make more government data available through information
intermediaries. In academia, scholars who generate and share large
amounts of data typically get relatively little academic credit,
prestige, and status. Not surprisingly, only a small number of heroic
academics pursue this unpropitious path. Even initiatives to require
scholars to share data and algorithms necessary to replicate their
published results have not been widely successful. However, the small
share of scholars whose results attract considerable attention
naturally generate demand for the data that they used. Moreover, these
scholars then have some interest in ensuring that the data they used
are widely available. The same dynamic is likely to be operative for
governments. But the information flow is likely to be larger, because
governments have a greater responsibility to supply demands for data
and are less capable of controlling access to it.

Useful government data will get out one way or another. More important
is to ensure that governments have an incentive to generate it.

Notes:

[1] From the abstract of Robinson, David, Yu, Harlan, Zeller, William
P., and Felten, Edward W., "Government Data and the Invisible Hand,"
Yale Journal of Law & Technology, Vol. 11 (2008). Available at SSRN:
http://ssrn.com/abstract=1138083

[2] I didn't try to contact YJOLT and get permission from YJOLT to
cite the paper. When I'm not wearing my bureaucratic hat, I'm more
concerned to respect the desires of human persons than those of
corporate persons. That's particularly true when those desires seem to
me silly or not in the public interest.

[3] As bright discussion of id. has highlighted, the distinctive
characteristics of government also include distinctive forms of end-
user political accountability.

Josh Tauberer

Jun 22, 2008, 7:24:06 PM
to openhous...@googlegroups.com
Douglas Galbi wrote:
> Some efforts to promote vertical integration clearly are silly. The
> Yale Journal of Law and Technology (YJOLT) will publish the draft
> article quoted above in Fall 2008. The draft is freely and publicly
> available from the websites of SSRN and YJOLT. Yet on the top of every
> page of the article appears the bolded imperative "Do NOT cite." That
> literally implies that everyone can read the draft article but no one
> can discuss it.

I think you've misunderstood the point of that imperative.

Putting my PhD student hat on for a change, it is common to see that in
drafts of scholarly publications. My understanding is that it is
directed at academics, warning them that the draft is subject to revision
before it is published in the journal, and that one should not cite the
draft in other academic works as if it were the final publication
because the final publication could differ in crucial ways from the
draft. If that happens, the citation would be incorrect.

I've never thought it was intended to stifle discussion.

Unless you were putting the emphasis on "literally", and in which case I
would not only have to put my student hat on, but my linguistics hat as
well. :)
