OMB: no more feeds...


Joseph Lorenzo Hall

Apr 6, 2009, 7:22:26 PM
to openhous...@googlegroups.com
Erik Wilde (@dret) pointed this out via twitter:

http://www.recovery.gov/?q=node/317

--
Joseph Lorenzo Hall
ACCURATE Postdoctoral Research Associate
UC Berkeley School of Information
Princeton Center for Information Technology Policy
http://josephhall.org/

Eric Kansa

Apr 8, 2009, 3:11:24 PM
to Open House Project
Yep. It's a move toward "black-box transparency". Here's more
discussion:

http://www.alexandriaarchive.org/blog/?p=207

and

http://dret.typepad.com/dretblog/2009/04/new-recovery-act-guidance.html

Micah Sifry

Apr 8, 2009, 3:31:16 PM
to openhous...@googlegroups.com

Thomas Lord

Apr 8, 2009, 4:12:58 PM
to openhous...@googlegroups.com, Eric Kansa
On Wed, 2009-04-08 at 12:11 -0700, Eric Kansa wrote:
> Yep. It's a move toward "black-box transparency". Here's more
> discussion:
>
> http://www.alexandriaarchive.org/blog/?p=207
>
> and
>
> http://dret.typepad.com/dretblog/2009/04/new-recovery-act-guidance.html
>
>


I've listened to hundreds of people around these
issues. Believe it or not, I really do listen more
than I talk. I've listened to people from the government
side and from outside, from many perspectives in both
cases. On the basis of what I've heard and pieced
together:

The main reasons that actual policy implementations
tend towards "black-box" (centralized, limited)
transparency are three-fold:

1. Legitimate legal concerns. Everything some
part of the government says in an official capacity
must be viewed in light of applicable law and
jurisprudence. Mistakes here are hugely costly
to the individual people who make them so there is
a strong culture of avoiding them. "CYA" is the
main credo.

2. Legitimate cost concerns. Steering a bureaucracy,
whether public or private sector, is like trying
to alter the course of a fully loaded super-tanker:
it takes a long time to make much difference and it
burns up a lot of resources in proportion to how fast
you try to do it. If you need to radically alter
the course very quickly, you may as well just blow up
the tanker. Budget managers in gov't bureaucracies
are enormously constrained and, by the way, little bugs
them more than when the law (see problem (1)) leaves
them with an unfunded mandate.

3. Corruption. People drag their feet either
from simple laziness or in order to try to keep
some fraud hidden.

The opengov folks, it seems to me, by outward
appearances, have a lot of good folks in many
corridors all on board for changing course. There
is plenty of problem (3) in the water, but problems
(1) and (2) are the main obstacles. And, of course,
when problem (3) ("corruption") is practiced by
some bad actor, they are quite likely to disguise their
obstruction as a problem of type (1) or (2).

The way to fix problems (1) and (2) ("legal" and "budget"
limitations) is with persistent caution, deliberation,
and gentle pressure backed by broad-based citizen demand.

One of the better ways to fight (3) ("corruption")
is to work towards eliminating the excuses and
disguises afforded by problems (1) and (2).

Everyone seems to know that but people don't often
state it so directly.

Now, here is the current situation, I think:

The Obama campaign differentiated itself by picking
up on themes of transparency and making them elements
of the platform.

That was a huge advance for the open government
movement because it made it politically acceptable
for would-be gov't employee allies of the movement
to step forward and directly work with movement
representatives. Thus, recently, there has been
a lot of progress and activity as new web sites are
set up and so forth.

But the Obama admin's ownership of "transparency"
is also a bit of a set-back. He did not say, for
example, "Sunlight are the clear leaders here" or
anything like that. Rather, the new administration
simply defined the issue as their own. They
stepped out in front of the movement, made a
plausible claim to be its leaders, said "follow us",
and then started walking in the right direction, mostly,
but very, very slowly. They have bigger fish
to fry, from their perspective, and the slower
this stuff goes without going backwards the more
content they are that they don't have to worry about
it (yet still get credit for any upside).

The social dynamic and political trick there should
be familiar to anyone who has worked in a large firm
during a period of significant, cross-cutting change:
those radicals who seemed to have initiated the change
are frequently marginalized while the established
powers that be pwn the emerging movement. Founders
get forced out of their start-ups. Blue-chip
hard-fighting mgt. helps to install new executives
only to then be quickly frustrated that once in
the corner office the execs seem to abandon them.
We all know this social dynamic - it's ordinary.

Absent strokes of mad luck, I don't think the
obstacles will be overcome quickly. If the open
government movement continues to treat the emerging
transparency policies as a *debate* - they'll still
be having the same debates in 8 years. That is
*why* the administration would step out in front
in such a manner: to dominate the terms of the debate.

Well, it's not a debate. This is too important
a set of issues to treat as debate club. It's
not a debate: it's a chess game.

And although I've caught some flak for saying so
before, perhaps my point is clearer and more acceptable
this time: the best move in this chess game is to
work on the open government movement's "pawn structure", so
to speak, by turning greater attention to the demand
side - to broad citizen involvement.

It's not very interesting or challenging to the
existing practices in government to have a modest
amount of data, vetted through a centralized agency,
newly released in forms that are consumed by a few
experts and a diminishing number of professional
journalists. That's a very small constituency of
people who care directly about transparency and
thus while its influence has recently peaked, it
will now reliably diminish. You'll all be stuck in
an ineffectual debate club.

What does it take to build a larger constituency?
One that can put pressure on Congress and the Executive
(and for that matter, the Judiciary) so that the
emerging "prudent" approach can be sped up and be
implemented more meaningfully?

I think it takes a situation where angry citizens
who today just kvetch or read the kvetching of others
feel more empowered to become data sleuths and start
consuming this raw data in larger quantities than they
do. Better tools for citizens and better promotion
to citizens are needed.

Well, that's a can of worms, isn't it? I mean,
what is a "better tool"? Isn't that something
that takes a lot of debate?

No, not to get started. It's not that hard.
Observe:

The complaint coming up here about recovery.gov
is that data publication by the government is
trending towards centralization and therefore an
attendant and inevitable political bias. Essentially,
the emerging policies use the excuse of problems
(1) and (2) ("legal" and "budget" limitations) to
create a wonderful haven for problem (3) ("corruption").

Yet, if we do an end-to-end check on the private
sector open government movement so far we note that it
measures its successes largely in terms of what new,
fairly expensive web services can be set up.

If you think about it, you realize that the centralization
which creates a tendency towards corruption originated
in the private sector, with the open government movement.

Congress has reflected that ethos in its "deals" with
YouTube, Facebook, etc. They are actually getting
quite fast and loose on the issues of problem (1) ("legal")
using the excuse of problem (2) ("budget") which
arguably is a telltale sign of problem (3) ("corruption").

That kind of systemic problem will continue until
there is *investment* in the technology of distributed
and decentralized consumption and reaction to government
data - success not measured in page hits on a few web
sites but in the wide-spread adoption of *private*
means of consumption.

Now, think how that can feed back into the larger
game. "recovery.gov" and similar will continue on
their current course and the debate with opengov folks
will succeed in making minor but not earthshaking course
corrections. Yet, some data is far more free than before -
the movement hasn't utterly failed, only stalled.

If a wide, privacy-respected audience is consuming and
responding in the public sphere to that data, some of
those citizens will form consensus around what are the
important *gaps* in what the government publishes.
It's not up to the debate club to name those gaps:
it's a democratic question.

I don't think we find those gaps by debating them or
by holding an unscientific poll via Wired magazine.
Those techniques can only find the politically safe
low-hanging fruit.

Rather, we find gaps when some constituency or other
is poring over the data that *is* available and they
reach an impasse where they are saying "Wait a minute...
*this* bit of data doesn't make sense. Something is
fishy there. We want to see what's behind this."

At that point even such blunt instruments as FOIA
come to hand as useful tools for creating political
pressure to accelerate the move towards government
transparency in meaningful, democratic ways.

Until then, the movement and the individual projects
such as this one are likely to continue swiftly down
the path of becoming somewhere between marginalized and
corrupt.

Summary:

Yes, it is trending towards the ironic
coinage "black-box transparency".

No, there's nothing that can be done to stop
that given the current political landscape.

The landscape needs to become more broadly
inclusive.

Technologically, that means working on consumption
and reaction to government data in other than
these expensively centralized ways - getting
the open gov movement to lose its own "black box
transparency" habits.

Socially, that then means a campaign to get
a much broader range of citizens interested in
and directly using the raw data, on their own.

From that can come political pressure to
resist the "black box transparency" momentum
within government.


-t

Eric C. Kansa

Apr 8, 2009, 5:38:28 PM
to openhous...@googlegroups.com
Thanks Thomas. Points well taken.

I do think the good folks at OMB and many other agencies honestly want
to do right. There are some challenging issues (see Erik Wilde's post),
and Thomas's points about regulatory issues are also well taken.

This does require some clear articulated vision for a "transparent
architecture" where access to data / processes is a more natural outcome
of government processes, and not something grafted on. Being grafted on
(where Recovery.gov is seen as just another OMB reporting requirement),
you get the foot-dragging and crappy quality of compliance. The graft is
also something that can be easily removed. I see centralized
transparency strategies as something vulnerable to disappearing once
political attention changes to new issues.
-Eric
--
---------------------------------
Eric C. Kansa, PhD.
Executive Director
Information and Service Design Program
Adjunct Professor
UC Berkeley, School of Information
http://isd.ischool.berkeley.edu/
Office: (510) 643-4757
Mobile: (415) 425-7380
Fax: (510) 642-5814
---------------------------------

Thomas Lord

Apr 8, 2009, 5:59:23 PM
to openhous...@googlegroups.com
On Wed, 2009-04-08 at 14:38 -0700, Eric C. Kansa wrote:
> Thanks Thomas. Points well taken.

I was always pretty good at debate :-)

(I do actually have more specific technical ideas
for the data consumption / response side but there
is a severe impedance mismatch in the interfaces I've
found so far to the open government folks who
could conceivably help and, I'm a person of extremely
modest means so I have some difficulty forcing my
foot in the door with demos and such.)

More seriously: thanks.

-t

Gary Bass

Apr 8, 2009, 6:55:54 PM
to openhous...@googlegroups.com
Eric, I appreciate your comments about the OMB guidance heading in the wrong direction. Since this is an important topic, I want to make sure we are all operating from the same set of facts before assessing whether you are correct or not. This requires reviewing some of the OMB guidance and clarifying the Coalition for an Accountable Recovery’s (CAR) position. (OMB Watch is co-chairing the CAR coalition.)

Keep in mind that we are dealing with two different reporting systems under the Recovery Act. The first is what the federal agencies have put out the door; that is, how much money they have spent. The CAR coalition and OMB Watch argue that federal agencies should be reporting that information through their websites, through feeds, and through the Recovery.gov website. With the proper feeds, everyone wins. (And if this works, this radically alters reporting for USASpending.gov for future appropriations.)

The second reporting system is how recipients of Recovery Act funds are using the money. There is no adequate system in place today to get information from recipients of federal funding. As a result, we miss out on performance data that can help us understand how to improve federal spending and minimize waste, fraud and abuse. Sending data to state and/or federal agencies invites delays and introduces the possibility of errors (as we have seen from the data that is used on FedSpending.org, a website very similar to USASpending.gov).

To be clear, the CAR coalition called for a centralized reporting system for the recipients of the funds, not for what the federal agencies report. There will be thousands of entities (contractors, grant recipients, etc.) that will be reporting on how they are using the Recovery Act money. We want accurate, timely information. The best way we could think of getting that is from recipients directly, not passed through lots of companies and government agencies, which is the traditional method today. In this case, the central reporting system should have feeds going to the public and government agencies so they (and we) can use the data directly.

On the first reporting system – what federal agencies are putting out the door -- my understanding is that the agency feeds are in fact required. The OMB guidance states: “For all Major Communications, Funding Notifications, and Financial and Activity Reports, agencies ARE REQUIRED to provide a feed….” (page 68, emphasis added). However, Erik Wilde states in the blog post you referenced that the new OMB guidance would make agency feeds optional. I am not sure how Erik Wilde arrives at the conclusion that “the document as a whole makes it very clear that feeds are optional.” Perhaps the feeds to which Wilde refers are not the ones specified on page 68 of the guidance?

Calling the latest OMB guidance a move toward “black-box transparency” may be overstating the shortcomings of the guidance. In your blog post, you state that the information flowing “from the agencies, to OMB, to Recovery.gov will be opaque to the public.” However, my understanding is that OMB is saying that even as it passes the information to OMB and to Recovery.gov, the agencies must disclose the information directly to the public. The guidelines state: “The required data can either be supplied in the feed, or the feed can point to a file at the agency using the convention noted below. If an agency is immediately unable to publish feeds, the agency should post each near term data flow to a URL directory convention suggested below: www.agency.gov/recovery/year/month/date/reporttype.”
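
A minimal sketch of how the quoted directory convention could be turned into concrete URLs. The guidance as quoted here only gives the pattern www.agency.gov/recovery/year/month/date/reporttype; the zero-padded numeric date parts, the http scheme, and the report type name "weeklyreport" are assumptions made for illustration, and recovery_report_url is a hypothetical helper.

```python
from datetime import date


def recovery_report_url(agency_host, report_date, report_type):
    """Build a URL in the year/month/date/reporttype directory layout.

    Assumption: date parts are numeric and zero-padded; the guidance
    quoted in this thread does not spell out the exact formatting.
    """
    return (
        f"http://{agency_host}/recovery/"
        f"{report_date.year:04d}/{report_date.month:02d}/{report_date.day:02d}/"
        f"{report_type}"
    )


# "weeklyreport" is a made-up report type name used only as an example.
print(recovery_report_url("www.agency.gov", date(2009, 4, 8), "weeklyreport"))
```

With a predictable layout like this, a scraper can at least guess where the next report should appear, though it still cannot learn that a report exists the way it could from a feed.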

This is not to say the OMB guidance is all good or that implementation of the Recovery.gov scheme has been everything we hoped. Perhaps the real problems rest with the type of feed or the specification for the content of the feed? If that is the case, then we should focus on that – and submit comments raising specific concerns and solutions.

For the record, and to clarify your blog post, at no time did OMB Watch ever support only sending information to OMB to build a single database. OMB Watch has always supported comprehensive machine readable feeds (APIs and syndications) from agencies. I also believe that is OMB’s intent based on our reading of the guidance.

Finally, I want to stress the importance of the Recovery.gov scheme being able to support different types of users. Many on this list are tech-savvy and will want machine readable information. But many community groups, activists, and journalists will want one-stop information from a single website. The CAR vision was to accommodate both through the use of feeds, while the government uses those feeds to build an aggregated website.

--------------
Gary D. Bass
Executive Director
OMB Watch
1742 Connecticut Ave., NW
Washington, D.C. 20009
TEL: (202) 234-8494; FAX: (202) 234-8584




Combined Federal Campaign #10201


Thomas Lord

Apr 8, 2009, 7:43:16 PM
to openhous...@googlegroups.com
On Wed, 2009-04-08 at 18:55 -0400, Gary Bass wrote:

> The second reporting system is how recipients of Recovery Act funds are using the money. There is no adequate system in place today to get information from recipients of federal funding. As a result, we miss out on performance data that can help us understand how to improve federal spending and minimize waste, fraud and abuse. Sending data to state and/or federal agencies invites delays and introduces the possibility of errors (as we have seen from the data that is used on FedSpending.org, a website very similar to USASpending.gov).
>


In many cases, aren't local citizens the best ones to
get at that data? They are the ones with the political
connections to know where to look.

For (hypothetical) example: suppose some local
big shot in Berkeley gets a few million. You
can get officially required numbers all day, if
you like, but a much more efficient way to suss out
corruption or waste in many cases is to ask the
locals. You can tell some stuff about how the money
is spent by looking at the reports but you can also
"see a lot by looking" and it's the silent majority
of locals close to funding recipients who have the
best perspective on facts on the ground.

The end around of the bureaucratic obstacles of
gathering reporting from recipients is to recruit
direct observers of recipients.

As for your argument that the new regulatory
guidance might not be so bad since the agencies
are supposed to communicate via feeds -- I think
that misses the point that was raised. The
suggested problem is that the process that generates
those feeds is opaque. Rather than transparency
being pervasive and open-ended - it is being redefined
as "conforming to a finite API". There's a black-box
that generates the feed, the feed goes through a filter
to recovery.gov. The question is what's in that box.
How credible is the feed itself? Where is the capability
to drill down, upon demand, into details?

In a small town government of the Mayberry sort - that
kind of idealized thing - if Aunt May smells a rat she
wanders down to city hall and gets in a snoot with
the clerks until they spill the beans. That can
scale up (but the demand-side needs work :-)

-t
-----------------
"It was a large room. Full of people. All kinds. And they had all
arrived at the same building at more or less the same time. And they
were all free. And they were all asking themselves the same question:
What is behind that curtain? You were born. And so you're free. So happy
birthday." - Laurie Anderson: Born, Never Asked

Death and taxes, man. :-)



Eric C. Kansa

Apr 8, 2009, 8:08:38 PM
to openhous...@googlegroups.com
Hello Gary,

Points taken, and I'm posting an update on my blog post with your comments
to show OMB Watch's views on centralized versus decentralized
approaches. My apologies for misinterpreting the report!

I do disagree about the place of feeds in the new OMB guidelines however.

These guidelines do, as you indicate, initially say on page 68 that feeds
are required. But a little further down on page 68, the guidelines say:
"If an agency is immediately unable to publish feeds, the agency should
post each near term data flow to a URL directory convention suggested
below..."

Basically, this means an agency is required to publish a feed, except
when it feels unable to do so. Now, I'm not very experienced in parsing
OMB guidelines, but I take this to mean feeds are, in fact, optional.
There's no deadline or mechanism in place to make sure an agency that
does not feel capable of making a feed gets up to speed on the technology.
I think this is the most reasonable reading of this document. The lack
of any further instruction and guidance on how to actually implement
feeds reinforces this point. (Feeds get short discussion in Appendix 1,
and there is no other discussion, even in Appendix 2 which should have
pointers on facilitating feed discovery). All this makes it very clear
to me that feeds are turning into nothing more than an afterthought.

Finally, I had a conversation about your last point about tech-savvy
versus other types of users. I absolutely agree that most people will
not want to play with raw XML. However, it's not an either/or: the XML
makes possible a whole range of different interfaces, visualizations,
and forms of presentation that can make these data intelligible to many
types of users. But it is much harder to go the other direction (from
"human-readable" to "machine-readable"). Currently, we have lots of
Excel spreadsheets that are difficult to parse and aggregate and
precious few feeds to let us find updated collections of those
spreadsheets.
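
To illustrate the direction that is easy (machine-readable to human-readable), here is a minimal sketch that turns an Atom feed into a plain-text listing using only Python's standard library. The sample feed, its titles, and its URL are invented for illustration and are not taken from any agency; feed_to_text is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom XML namespace (RFC 4287)

# Invented sample data, not a real agency feed.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Agency Recovery Reports</title>
  <entry>
    <title>Weekly Report 2009-04-03</title>
    <updated>2009-04-03T00:00:00Z</updated>
    <link href="http://www.agency.gov/recovery/2009/04/03/weeklyreport"/>
  </entry>
</feed>"""


def feed_to_text(feed_xml):
    """Render an Atom feed as a short human-readable listing."""
    root = ET.fromstring(feed_xml)
    lines = [root.findtext(ATOM + "title", default="(untitled feed)")]
    for entry in root.iter(ATOM + "entry"):
        title = entry.findtext(ATOM + "title", default="(untitled)")
        updated = entry.findtext(ATOM + "updated", default="")
        link = entry.find(ATOM + "link")
        href = link.get("href", "") if link is not None else ""
        # Keep only the date part of the timestamp for readability.
        lines.append(f"- {title} ({updated[:10]}): {href}")
    return "\n".join(lines)


print(feed_to_text(SAMPLE_FEED))
```

Going the other way, from a hand-formatted spreadsheet back to structured records, needs per-file guesswork about which cells mean what, which is the asymmetry described above.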

Anyway, great discussion!
-Eric

Gary Bass

Apr 8, 2009, 11:38:14 PM
to Eric C. Kansa, openhous...@googlegroups.com
We'll be sure to mention this issue in our comments to OMB. Totally
agree about the Excel spreadsheets vs feeds.

However, let's be clear that this isn't black box transparency. OMB was
attempting to ensure that no agency can get away with non-disclosure by
saying that it can't do a feed. OMB said, well, if in fact you can't do a
feed then provide the information in a structured format so that we can
all scrape the information from the websites.

BTW, we also need to better understand whether this guidance is in
addition to the first guidance or replaces the first one. It seems the
first one was clear that there were to be feeds. This guidance seems to
supplement that, not supplant it.

And thanks for updating your blog about OMB Watch's positions.

On another content matter, I am concerned that the first guidance called
only for summaries of contracts. I was hoping that this guidance would
broaden that to provide the complete contract (redacted where necessary)
along with the RFP. But there is no mention of that in this version.
This is an example of the type of content I'm worried we will not be
getting.

1742 Connecticut Ave., N.W.
Washington, D.C. 20009
TEL: 202-234-8494; FAX: 202-234-8584

Thomas Lord

Apr 9, 2009, 12:09:12 AM
to openhous...@googlegroups.com, Eric C. Kansa, Gary Bass
I don't mean to be a pill but,

On Wed, 2009-04-08 at 23:38 -0400, Gary Bass wrote:
> We'll be sure to mention this issue in our comments to OMB. Totally
> agree about the Excel spreadsheets vs feeds.
>
> However, let's be clear that this isn't black box transparency. OMB was
> attempting to ensure that no agency can get away with non-disclosure by
> saying that it can't do a feed. OMB said, well, if in fact you can't do a
> feed then provide the information in a structured format so that we can
> all scrape the information from the websites.

That is orthogonal to the black-box issue.

The format of the data is not the issue. I don't
know how to say it more clearly. And I don't mean
to "own" the black-box issue but I think it is perfectly
easy to understand what the people who raised it are saying.


You can say that OMB's regulations here aren't making
the black box problem more than incrementally worse -
that's ultimately uncontroversial. I don't think you
can say that this isn't black box transparency, though.

-t

Greg Elin

Apr 9, 2009, 9:41:48 AM
to openhous...@googlegroups.com, Eric C. Kansa
My interpretation of the Updated Guidance parallels Gary's.  The alternative to feeds is there only for those who cannot provide the feeds.

On the centralization issue, OMB's language is "collected centrally".
"Recipient reporting required by Section 1512 of the Recovery Act will be  collected centrally." - page 2, Section 1.5

"OMB intends to oversee the development a central collection system for the information required to be reported by Section 1512 of the Act." - page 24, Section 2.14

Nabors was careful to use the same "collection" phrasing in his testimony.

IMHO, this language choice is significant implying the use of aggregation tools--picking up or "collecting" the information--instead of obligated entities "reporting" the information. The data just has to end up in one place, but there could be a variety of routes in which the data gets there, including data feeds and published web pages.

Also see Section 2.17, page 26:

For those programs where the State is the primary recipient of Recovery Act funds, Federal
agencies should provide States the flexibility to determine the optimal approach for collecting
and transmitting to the Federal government data required by Section 1512 of the Recovery Act. 
For example, a State may prefer to create a central point of contact responsible for transmitting
all Section 1512 data to the Federal government’s central collection solution (or to the individual
Federal agency, if appropriate).  Alternatively, a State may prefer to have individual State
agencies or recipients separately report to the Federal government rather than relying on a single
point of contact to consolidate the information centrally for transmission.  In all cases, however,
Federal agencies should expect the State to assign a responsible office to oversee Section 1512
data collection to ensure quality, completeness, and timeliness of data submissions.   This State
office should play a critical role in assisting Federal agency efforts to obtain quality, complete,
and timely data submissions.  (emphasis added)

I read that again as trying to avoid a single black-box solution, but pushing people toward web-based solutions as much as possible.

Greg
--

Greg Elin
Chief Evangelist for Sunlight Foundation (http://sunlightfoundation.com)
Sunlight Labs (http://sunlightlabs.com)
ge...@sunlightfoundation.com
gr...@fotonotes.net
http://twitter.com/gregelin
skype: fotonotes
aim: wiredbike
cell: 917-304-3488

Erik Wilde

Apr 9, 2009, 1:07:39 PM
to openhous...@googlegroups.com
hello greg.

> *From:* Greg Elin <ge...@sunlightfoundation.com
> <mailto:ge...@sunlightfoundation.com>>


> My interpretation of the Updated Guidance parallels Gary's. The
> alternative to feeds is there only for those who cannot provide the
> feeds.

for me, the important issues are that (a) the only reliable
communications channel right now is email to OMB, (b) the feed
guidelines have not been changed or clarified at all, and they still are
not even required to be discoverable, let alone carry data in some
well-defined format, and (c) the updated guidelines now say they are
working on a web-based submission form, which would basically improve
the email submission process and further dilute the idea of feeds.

> On the centralization issue, OMB's language is "collected centrally".

yes, but only they can do the collecting because nobody else knows what
is out there. i manually searched all known sites
(http://isd.ischool.berkeley.edu/stimulus/feeds/agencies.html) for feeds
and did not find too many
(http://isd.ischool.berkeley.edu/stimulus/feeds/feeds.html). without
radically improved guidelines centering on feeds, the feeds are really
of no practical use. we tried, but we have given up. my personal
favorite: NASA's unifeeds (one feed per report):
http://www.nasa.gov/recovery/reports/weekly/index.html
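
as a rough illustration of what "discoverable" could mean in practice, here is a minimal sketch of feed autodiscovery: scanning a page for <link rel="alternate"> elements that advertise an Atom or RSS feed. this is the common web convention, not something the guidance is known to require; the sample page and the helper names are made up for the example.

```python
from html.parser import HTMLParser

FEED_TYPES = {"application/atom+xml", "application/rss+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect href values of <link rel="alternate"> elements that point at feeds."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        rels = (a.get("rel") or "").lower().split()
        media_type = (a.get("type") or "").lower()
        if "alternate" in rels and media_type in FEED_TYPES and a.get("href"):
            self.feeds.append(a["href"])


def discover_feeds(html):
    """Return the feed URLs advertised in an HTML page, in document order."""
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feeds


# Invented sample page: one advertised feed, one unrelated stylesheet link.
PAGE = """<html><head>
<link rel="alternate" type="application/atom+xml" href="/recovery/feed.atom">
<link rel="stylesheet" href="site.css">
</head><body>Recovery page</body></html>"""

print(discover_feeds(PAGE))
```

if agency recovery pages carried such links, a crawler could build the list of feeds automatically instead of someone searching every site by hand.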

> IMHO, this language choice is significant implying the use of
> aggregation tools--picking up or "collecting" the information--instead
> of obligated entities "reporting" the information. The data just has
> to end up in one place, but there could be a variety of routes in
> which the data gets there, including data feeds and published web pages.

sure, and in theory that could still happen. but from an architectural
point of view, the worst thing that could happen is to establish
redundant channels, ending up with having to reconcile reports available
through more than one channel. the initial and updated guidance both
establish redundant channels, email and feeds. since the feed guidance
was very weak in the initial guidance and was not changed at all, while
the email guidance was used to do the actual collection and now talks
about being augmented with a web form, i think it is reasonable to
assume that feeds may still be allowed, but are just a side-effect.
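
to make the reconciliation cost concrete, here is a minimal sketch of what a consumer has to do once the same report can arrive through two channels. the keying by (agency, week) and the dollar figures are hypothetical and only serve the illustration; reconcile is not part of any existing tool.

```python
def reconcile(email_reports, feed_reports):
    """Compare reports from two channels, keyed by (agency, week).

    Returns the keys grouped by outcome so that every disagreement
    has to be looked at by someone.
    """
    result = {"only_email": [], "only_feed": [], "conflict": [], "agree": []}
    for key in sorted(set(email_reports) | set(feed_reports)):
        if key not in feed_reports:
            result["only_email"].append(key)
        elif key not in email_reports:
            result["only_feed"].append(key)
        elif email_reports[key] != feed_reports[key]:
            result["conflict"].append(key)
        else:
            result["agree"].append(key)
    return result


# Hypothetical data: obligated amounts per agency and week, by channel.
email_channel = {("usda", "2009-04-03"): 1200000, ("nasa", "2009-04-03"): 500000}
feed_channel = {("usda", "2009-04-03"): 1250000, ("doe", "2009-04-03"): 90000}

print(reconcile(email_channel, feed_channel))
```

every key that lands in "conflict", "only_email" or "only_feed" is a question about which channel is authoritative, and that question does not arise with a single channel.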

in my opinion, the only viable way to make feeds work and to expose them
as a reliable and robust source of information would be to require
recovery.gov to also do their collection via feeds. this would be a very
strong incentive to make feeds work.

> I read that again as trying to avoid a single black-box solution, but
> pushing people toward web-based solutions as much as possible.

we were very excited when we saw the initial guidance and saw feeds
being mentioned. however, now that we have two documents and can compare
how recovery.gov worked over the past few weeks, how the guidance was
updated, and what kind of data is available via recovery.gov and agency
feeds, i don't think that the feeds will go anywhere. i'd love to be
proven wrong, but given the current trajectory, we'll end up with opaque
data collection and recovery.gov being the only entity having access to
the reporting channels in the back-end.

i still think that even though feeds are very likely out as the way of
how reporting is done, they should be used by recovery.gov for
publishing data. this is what we currently provide by republishing
scraped data, we have feeds for individual agencies
(http://isd.ischool.berkeley.edu/stimulus/feeds/usda/weekly.atom), and a
feed for all weekly reports scraped from the recovery.gov site
(http://isd.ischool.berkeley.edu/stimulus/feeds/weekly-site.atom). we
generate these by parsing excel and republishing the data as
feed-packaged XHTML/XML. (my apologies for the latter being so big; we
still have to implement feed paging...)
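
a minimal sketch of what feed paging could look like, following the link relations defined for paged feeds in RFC 5005 ("first", "last", "previous", "next"). the "?page=N" URL scheme and the paginate helper are assumptions for illustration, not a description of how our feeds are or will be built.

```python
def paginate(entries, page_size, base_url):
    """Split feed entries into pages with RFC 5005 style link relations.

    Assumption: pages are addressed as base_url?page=N, which is a
    hypothetical URL scheme chosen for this example.
    """
    total = max(1, -(-len(entries) // page_size))  # ceiling division, at least one page
    pages = []
    for n in range(1, total + 1):
        links = {
            "self": f"{base_url}?page={n}",
            "first": f"{base_url}?page=1",
            "last": f"{base_url}?page={total}",
        }
        if n > 1:
            links["previous"] = f"{base_url}?page={n - 1}"
        if n < total:
            links["next"] = f"{base_url}?page={n + 1}"
        pages.append({"entries": entries[(n - 1) * page_size : n * page_size], "links": links})
    return pages


# Five placeholder entries split into pages of two.
for page in paginate(["e1", "e2", "e3", "e4", "e5"], 2, "http://example.org/weekly-site.atom"):
    print(page["entries"], page["links"].get("next"))
```

each page would then be serialized as its own feed document carrying these links, so clients fetch the first page and follow "next" only if they need older entries.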

cheers,

erik wilde tel:+1-510-6432253 - fax:+1-510-6425814
dr...@berkeley.edu - http://dret.net/netdret
UC Berkeley - School of Information (ISchool)

Charlie

Apr 9, 2009, 3:01:44 PM
to openhous...@googlegroups.com
Gary & All: I read the CAR National System for Collection and Dissemination of Government Spending Data and the Interim Recovery.gov Data Reporting Architecture. Consider this as a continuation of the thread here

http://groups.google.com/group/openhouseproject/browse_thread/thread/6aee9315a40eaa31?pli=1

which is a call to action for an open architecture for an open government. Apologies to Eric K. and other folks on that thread for not getting back to them yet.

1. The language in the CAR documents should be strengthened throughout to clarify that reporting publicly by all recipients at the source today is a requirement to achieve the transparency objectives of the President's Transparency and Open Government directive.

2. As a corollary to #1, this means the recipients use THE WEB as their reporting mechanism. If the recipients use THE WEB as their reporting mechanism, the government doesn't require centralized reporting, just collection and reconciliation. On THE WEB centralized reporting not only contradicts transparency and open government, it introduces very high coordination costs.

3. Introduce a requirement in the CAR documents that collection and reconciliation are done by bots, yes BOTs. Every intermediary in a financial reporting system introduces the risk of inaccuracy and/or corruption. A system that asks an agency with a fiduciary responsibility to monitor itself violates a basic accounting principle called separation of duties. If you ran a 7/11, you wouldn't hire the same person to be a cashier and your accountant.

4. Introduce a requirement in the CAR documents that the source code for the BOT collection and reconciliation system goes into open source. That way we can all watch the BOTs do our work for us. Despite what Ray Kurzweil and Co might think, BOTs cannot become corrupt.
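
As a rough illustration of what points 3 and 4 could mean in practice, here is a minimal sketch of a reconciliation bot. It assumes recipients publish their own reports and that each agency publishes a total; the field names, agency keys, and dollar figures are made up, and the CAR documents do not specify any of them. The bot sums the recipient reports per agency and flags every agency whose published total disagrees.

```python
from decimal import Decimal


def reconcile(recipient_reports, agency_totals):
    """Compare summed recipient reports against agency-reported totals.

    Returns a dict mapping agency -> (sum of recipient reports, agency total)
    for every agency where the two figures disagree.
    """
    summed = {}
    for report in recipient_reports:
        agency = report["agency"]
        summed[agency] = summed.get(agency, Decimal("0")) + Decimal(report["amount"])

    mismatches = {}
    # Check agencies that appear on either side, so missing data is flagged too.
    for agency in sorted(set(summed) | set(agency_totals)):
        from_recipients = summed.get(agency, Decimal("0"))
        from_agency = Decimal(agency_totals.get(agency, "0"))
        if from_recipients != from_agency:
            mismatches[agency] = (from_recipients, from_agency)
    return mismatches


# Made-up figures, for illustration only.
recipient_reports = [
    {"agency": "usda", "recipient": "recipient-a", "amount": "1200.00"},
    {"agency": "usda", "recipient": "recipient-b", "amount": "800.00"},
    {"agency": "doe", "recipient": "recipient-c", "amount": "500.00"},
]
agency_totals = {"usda": "2000.00", "doe": "450.00"}

mismatches = reconcile(recipient_reports, agency_totals)
for agency, (from_recipients, from_agency) in mismatches.items():
    print(f"{agency}: recipients report {from_recipients}, agency reports {from_agency}")
```

Because the check is a few lines of published code run against published data, anyone can rerun it, which is the separation-of-duties argument in point 3.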

Frankly, from much of the other information I've seen, including the Virginia Draft Architecture, this is all very much black-box, old school government behavior. And just like Kundra says, for the biggest IT spend in the world citizens should get more for their money. Statements like Virginia's "Transparency should be as close as possible to the system of record" just don't cut it. And Release #5 Conceptual Solution Architecture (mid to long term) for prime recipients to disclose at the source is way too little and way too late. This sounds like government-speak for "we really never plan to get this done, let's just placate the naive public."

On Wed, Apr 8, 2009 at 11:38 PM, Gary Bass <gb...@ombwatch.org> wrote:

Greg Elin

unread,
Apr 9, 2009, 3:06:56 PM4/9/09
to openhous...@googlegroups.com
+1 on Charlie's suggestions, especially #2 and #3.

Greg

Eric C. Kansa

unread,
Apr 9, 2009, 3:48:31 PM4/9/09
to openhous...@googlegroups.com
Wow. This thread has turned into a torrent. Glad the architecture issues
get lots of attention.
@Charlie: Great stuff. Point 2 is spot on.
-Eric

Gary Bass

unread,
Apr 9, 2009, 8:25:46 PM4/9/09
to openhous...@googlegroups.com
It's a very helpful torrent....

While CAR will be submitting comments to OMB, I hope those of you with detailed, specific concerns also submit comments. Or feel free to join the CAR coalition and provide your comments there. There will be an online discussion starting Mon and there will also be a conference call to coordinate our comments.

To join the CAR Google Group, visit http://www.coalitionforanaccountablerecovery.org/.
Combined Federal Campaign #10201


Erik Wilde

unread,
Apr 9, 2009, 9:44:32 PM4/9/09
to openhous...@googlegroups.com
hello.

> To join the CAR Google Group, visit http://www.coalitionforanaccountablerecovery.org/.

http://www.coalitionforanaccountablerecovery.org/sites/default/files/OMB_Watch_CAR_Recovery_Data_Architecture-Final.pdf
(dated march 5) is still the most recent version, it seems.

i am concerned that the architecture presented on page 11 is actually
less transparent (at least in the way it's shown) than the architecture
proposed in the recovery act architecture document, because it portrays
recovery.gov as the only way to get to any data. in my personal
terminology, this image is all about "openness", and not at all about
"transparency". the vision we had (and the one outlined in the recovery
act architecture document) has transparency as the primary goal, making
the data sources available directly to anybody interested in them. which
means that any report produced at any level should be directly
accessible to anybody interested. now, this could still mean that
recovery.gov could act as a hosting platform for these reports, but that
would be a pure implementation detail. logically speaking, data entered
by any recipient would be exposed directly to anybody interested in it.

i know we had some discussion around this in the break-out session in
the meeting in washington, and it was agreed that the way the
information flow was depicted in that figure was a bit unfortunate. i
think it would be worth the effort to clearly separate the logical view
of the information flow from the implementation view of the information
flow, so that it becomes clear that recovery.gov should not be the
centralized system that it looks like in that figure.

it was my understanding that in theory, anybody should be able to
completely replicate the functionality of recovery.gov by directly
tapping into the information sources providing reports. in such a case,
recovery.gov would just play a role such as amazon's S3, just providing
hosting for data that is produced and then made available in a robust
and secure way. they would only host feeds that would be populated by
agencies and contractors publishing reports.
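
as a rough sketch of what "tapping into the information sources" could look like for an outside party: the snippet below merges entries from several atom feeds into one timeline, newest first. the feed documents are inline stand-ins with invented ids, titles, and dates; a real aggregator would fetch them over http from wherever the reports are hosted.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Stand-ins for documents fetched from report feeds; all contents are invented.
SAMPLE_FEEDS = {
    "agency-a": """<feed xmlns="http://www.w3.org/2005/Atom">
      <title>Agency A reports</title>
      <entry><id>urn:example:a:1</id><title>Report A1</title>
        <updated>2009-04-03T12:00:00Z</updated></entry>
      <entry><id>urn:example:a:2</id><title>Report A2</title>
        <updated>2009-04-10T12:00:00Z</updated></entry>
    </feed>""",
    "agency-b": """<feed xmlns="http://www.w3.org/2005/Atom">
      <title>Agency B reports</title>
      <entry><id>urn:example:b:1</id><title>Report B1</title>
        <updated>2009-04-07T09:30:00Z</updated></entry>
    </feed>""",
}


def merge_feeds(feeds):
    """Collect the entries of every feed into one list, newest first."""
    merged = []
    for source, xml_text in feeds.items():
        root = ET.fromstring(xml_text)
        for entry in root.findall(f"{ATOM}entry"):
            merged.append({
                "source": source,
                "id": entry.findtext(f"{ATOM}id"),
                "title": entry.findtext(f"{ATOM}title"),
                "updated": entry.findtext(f"{ATOM}updated"),
            })
    # Timestamps in the same UTC format sort correctly as plain strings.
    merged.sort(key=lambda e: e["updated"], reverse=True)
    return merged


timeline = merge_feeds(SAMPLE_FEEDS)
for item in timeline:
    print(item["updated"], item["source"], item["title"])
```

nothing in this depends on who hosts the feeds, which is the sense in which hosting would be a pure implementation detail.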

i also would be careful when asking for APIs (section 3.2). this is
getting technical again, but since you want services, you should also be
clear about what kind of service, and i think we really don't want APIs
(http://dret.typepad.com/dretblog/2009/02/apis-considered-harmful.html).
asking for APIs is like asking for SOAP services or SPARQL endpoints,
and i think we should not ask for either of these. i think we should
specifically ask for feeds and meaningful services built around these
feeds, because feeds are the only web service that almost any person can
directly use (by using a feed reader).

i'll try to write up some blog post about the "transparency vs.
openness" issue, but i think this is what makes the current CAR draft a
bit imprecise: it does not clearly separate how reporting should
be done and what guidelines should exist for that from what kind of
services oversight groups and normal citizens might want to use based
on that reporting architecture, and how those services should be provided.

cheers,

dret.

Greg Elin

unread,
Apr 10, 2009, 11:13:54 AM4/10/09
to openhous...@googlegroups.com
Erik,

1. As you say, the diagram on page 11 of CAR's architecture proposal is meant to be a logical representation. I would not worry about it as I think everyone is in the process of updating their documents.

2. Your API-considered-harmful post offers excellent insight into some of the risks, but it is also a bit esoteric for the conversation most parties are having. I think your post really falls into the "how to do a good API" category.

I think the technical crowd here needs to try to write up some plain language around the issues with which we are concerned.

Greg

Eric Kansa

unread,
Apr 10, 2009, 12:03:37 PM4/10/09
to openhous...@googlegroups.com, openhous...@googlegroups.com
My stab at this:

The data that you need, in the format that you need, is available through no more fuss and bother than simply following a hyperlink.

-eric

Sent from my iPhone, sorry for typos!