How would you legislative data to be made available?
13 messages
Rob Pierson  
From: "Rob Pierson" <piers...@gmail.com>
Date: Tue, 2 Oct 2007 18:35:38 -0400
Local: Tues, Oct 2 2007 6:35 pm
Subject: How would you legislative data to be made available?

As the Library of Congress (i.e. THOMAS/LIS) completes its work converting
the legislative summaries into XML, it is researching the system through
which the new legislative database might be made available. I'm going to be
working on ensuring that non-Hill folks have access to a bill search system
that is as capable as what is available to congressional staff, but for the
moment I'd like to address the question of the raw data.

Rather than waiting for LoC to produce a proposal of how that legislative
data should be made available, I think it makes sense for this group to
preemptively offer ideas about the way that raw legislative data should be
provided for repurposing on other websites. We should consider the needs of
sites which will repurpose the data, but at the same time the database
format recommended by this group must minimize the load that sites like
GovTrack place on the web server.

Questions:

(please answer other important questions even if I don't know that I should
pose them) :)

   - What file formats/system would you recommend? Is a complete dump of
   the entire database necessary?

   - How could sites be made aware of changes to the system? Rather than
   accessing every bill record every night, is there a way that sites could
   only access records that had been updated (e.g. new cosponsors, bill
   action, etc.)?

   - Is it important that RSS feeds be made available for search terms?
   For example, an RSS feed for all new bills that contain the word Iraq in the
   text.

   - What's the work around for this need today?

Thanks everyone!


 
Josh Tauberer  
From: Josh Tauberer <taube...@govtrack.us>
Date: Thu, 04 Oct 2007 10:34:06 -0400
Local: Thurs, Oct 4 2007 10:34 am
Subject: Re: [openhouseproject] How would you legislative data to be made available?

Rob Pierson wrote:
>     * What file formats/system would you recommend?

HTTP REST-based (i.e. GET) API for getting records, or just HTTP/FTP
access to files directly. No SOAP, web services, or whatever. Simple
simple simple.

>       Is a complete dump of the entire database necessary?

It would be a good idea, for sure. Considering THOMAS has records for
somewhere in the ballpark of 200,000 bills (probably around 1GB of data,
based on my own database), if you want all of it, no one is going to be
happy with 200,000 uncompressed HTTP requests (esp. at their current
maximum permitted rate of one per second). If you're trying to get a
new project going, you might want the whole database.
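
To put rough numbers on that, here is the arithmetic using only the two figures above (about 200,000 records, one request per second):

```python
RECORDS = 200_000          # rough THOMAS bill count cited above
REQUESTS_PER_SECOND = 1    # the maximum permitted rate cited above

seconds = RECORDS / REQUESTS_PER_SECOND
hours = seconds / 3600
days = hours / 24

print(f"{hours:.1f} hours, or about {days:.1f} days of continuous crawling")
```

That is roughly 55.6 hours of uninterrupted requests for a single full copy, before any retries.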

>     * How could sites be made aware of changes to the system? Rather
>       than accessing every bill record every night, is there a way that
>       sites could only access records that had been updated (i.e. new
>       cosponsors, bill action, etc).

That's an absolute must. That's one of the biggest problems I have with
GovTrack. Not all bill updates are reflected in the Daily Digest, and
there's no other way to get a list of changed bills. (The D.D. is also
not machine readable...)

That could be done simply by updating a file with the last modified time
of each record any time a record is modified, or by making a dynamic
page that gives all modified records within a given time frame.
(Critically, these pages should at the very least cover 7 days of
changes in one request and not require paging through 1-50, 51-100, etc.
That's so annoying.) This *could* be done in RSS, which would sort of
make use of standard date formats and things, so long as it refers to
records unambiguously, and that might give it a dual use for
individuals. But, that might be unnecessary.
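
A rough sketch of the first option, a single file listing each record's last-modified time. The tab-separated file format and the record ids are assumptions made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical index format: one "record-id<TAB>ISO-8601 timestamp" per line.
INDEX = """\
hr1234\t2007-10-01T14:05:00+00:00
s567\t2007-10-03T09:30:00+00:00
hr2\t2007-09-20T18:00:00+00:00
"""

def parse_index(text: str) -> dict:
    """Map each record id to the time it was last modified."""
    entries = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        record_id, stamp = line.split("\t")
        entries[record_id] = datetime.fromisoformat(stamp)
    return entries

def changed_since(index: dict, since: datetime) -> list:
    """Return every record modified at or after `since`, in one pass, no paging."""
    return sorted(rid for rid, modified in index.items() if modified >= since)

# Ask for the last 7 days of changes in a single request.
now = datetime(2007, 10, 4, 10, 34, tzinfo=timezone.utc)
to_refetch = changed_since(parse_index(INDEX), now - timedelta(days=7))
```

A client would then re-fetch only the ids in `to_refetch` instead of every bill.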

>     * Is it important that RSS feeds be made available for search terms?
>       For example, an RSS feed for all new bills that contain the word
>       Iraq in the text.

This shouldn't be a point that slows down anything else. RSS feeds by
LIV terms (as I do) are a good starting place, but certainly full text
search feeds would be nice. Not sure if it's computationally/cost
realistic though.
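
For illustration, a per-term feed can be generated from already-indexed records with very little code. This is only a sketch: the records, links, and the `terms` field are invented, and it is not how GovTrack builds its feeds.

```python
import xml.etree.ElementTree as ET

# Made-up records; the "terms" list stands in for subject index terms.
BILLS = [
    {"id": "hr1", "title": "Iraq Redeployment Act", "terms": ["Iraq", "Armed forces"],
     "link": "https://example.gov/bills/hr1"},
    {"id": "s2", "title": "Farm Credit Act", "terms": ["Agriculture"],
     "link": "https://example.gov/bills/s2"},
]

def feed_for_term(bills, term):
    """Build an RSS 2.0 document listing the bills tagged with `term`."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = f"Bills tagged: {term}"
    ET.SubElement(channel, "link").text = "https://example.gov/bills"
    ET.SubElement(channel, "description").text = f"New bills indexed under {term}"
    for bill in bills:
        if term in bill["terms"]:
            item = ET.SubElement(channel, "item")
            ET.SubElement(item, "title").text = bill["title"]
            ET.SubElement(item, "link").text = bill["link"]
            # The guid is a record id, not a URL, so mark it as non-permalink.
            ET.SubElement(item, "guid", isPermaLink="false").text = bill["id"]
    return ET.tostring(rss, encoding="unicode")

iraq_feed = feed_for_term(BILLS, "Iraq")
```

Filtering on pre-assigned terms is a cheap lookup; a full-text feed would need a search index behind it, which is where the cost question comes in.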

--
- Josh Tauberer

http://razor.occams.info

"Yields falsehood when preceded by its quotation!  Yields
falsehood when preceded by its quotation!" Achilles to
Tortoise (in "Gödel, Escher, Bach" by Douglas Hofstadter)


 
Chris Baker  
From: "Chris Baker" <ign...@gmail.com>
Date: Thu, 4 Oct 2007 11:00:34 -0400
Local: Thurs, Oct 4 2007 11:00 am
Subject: Re: [openhouseproject] How would you legislative data to be made available?

Rob,

First off, thanks for asking.

Excuse me if this is something that has already been dealt with. I'm new to
the existing open government projects and players, so I'm still trying to
get myself up to speed. I'm coming from the perspective of someone focused on
the local, on-the-ground aspects of the political process and trying to
bridge that to what's going on in Washington.

In my perfect world representatives from government, watchdog groups and the
media would form a working group to create standards for organizing
legislative data so that it is easily processed using automated tools. I see
the need for standardized ontologies (OWL), vocabularies (SKOS) and RDF
Schemas that don't just apply to Congress, but to the entire process of
government itself.

The biggest problem I keep hearing is that people don't know what's going
on... that Representatives don't know the content of the bills they are
passing, and that the public doesn't know where their tax dollars are being
spent... there's information overload. This to me points to as much a
metadata problem as a data problem.

The data coming out needs to make it as easy as possible for people, both
inside and outside the government, to build tools. IMNSHO RSS is simply too
vague a format. I'd like to see the data as plastic as possible, and to me
that means RDF.

This raises the following questions:

* How much duplication of concerns is there between state legislative
activities and federal so that we don't have to solve the problem over and
over again?
* What existing standards exist in the financial world so that budget
reporting can learn from existing efforts?

On 10/2/07, Rob Pierson <piers...@gmail.com> wrote:

> * What file formats/system would you recommend? Is a complete dump
> of the entire database necessary?

No. Ideally I'd like to see a SPARQL interface.
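
As a sketch of what a client request to such an interface might look like: the endpoint URL and the `ex:` vocabulary below are hypothetical, while `dcterms:modified` is the Dublin Core term for a last-modified date and the `query` URL parameter is how the SPARQL protocol carries a query over HTTP GET.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint and "ex:" vocabulary, for illustration only.
ENDPOINT = "https://example.gov/sparql"

QUERY = """\
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX ex: <https://example.gov/vocab#>
SELECT ?bill ?modified WHERE {
  ?bill a ex:Bill ;
        dcterms:modified ?modified .
  FILTER (?modified >= "2007-10-01T00:00:00Z"^^<http://www.w3.org/2001/XMLSchema#dateTime>)
}
ORDER BY DESC(?modified)
"""

def sparql_get_url(endpoint: str, query: str) -> str:
    """The SPARQL protocol passes the query text in a `query` URL parameter."""
    return endpoint + "?" + urlencode({"query": query})

url = sparql_get_url(ENDPOINT, QUERY)
```

With an interface like this, "which bills changed since October 1" is one request rather than a download of everything.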

> * How could sites be made aware of changes to the system? Rather than
>   accessing every bill record every night, is there a way that sites could
>   only access records that had been updated (i.e. new cosponsors, bill
>   action, etc).

By annotating the data with RDF, it should be possible to easily create
interfaces that would allow users to subscribe to updates via RDF or RSS.

> * Is it important that RSS feeds be made available for search terms? For
>   example, an RSS feed for all new bills that contain the word Iraq in the text.

Personally, I think RSS is useful for many, but it is not enough for easy use by
automated tools. You shouldn't have to scrape and parse the text for key words.
The text should be annotated using defined specifications and vocabularies.

> * What's the work around for this need today?

A lot of crude munging.

Thanks again for everyone's work. You guys are really an inspiration.

Chris Baker
http://semanticcaucus.blogspot.com/


 
Josh Tauberer  
From: Josh Tauberer <taube...@govtrack.us>
Date: Thu, 04 Oct 2007 11:22:06 -0400
Local: Thurs, Oct 4 2007 11:22 am
Subject: Re: [openhouseproject] Re: How would you legislative data to be made available?

Chris Baker wrote:
> I'd like to see the data as plastic as
> possible, and to me that means RDF.

Oh, so been there, done that!

Lately, because of the rate at which these open data things are
improving, and the fact that the LOC people that I talked to don't even
seem to have any interest in public open data, my take is that the best
hope for seeing progress is to suggest the simplest way to go forward.
That means XML, REST, etc.

(Have you seen these?
http://www.govtrack.us/sparql.xpd
http://www.govtrack.us/source.xpd )

(And, btw, I take friendly issue with your blog entry that the WaPo is
leading the way in 21st century democracy with their votes database.
::grin::)

--
- Josh Tauberer

http://razor.occams.info

"Yields falsehood when preceded by its quotation!  Yields
falsehood when preceded by its quotation!" Achilles to
Tortoise (in "Gödel, Escher, Bach" by Douglas Hofstadter)


 
Chris Baker  
From: "Chris Baker" <ign...@gmail.com>
Date: Thu, 4 Oct 2007 11:53:49 -0400
Local: Thurs, Oct 4 2007 11:53 am
Subject: Re: [openhouseproject] Re: How would you legislative data to be made available?

On 10/4/07, Josh Tauberer <taube...@govtrack.us> wrote:

> Chris Baker wrote:
> > I'd like to see the data as plastic as
> > possible, and to me that means RDF.

> Oh, so been there, done that!

> Lately, because of the rate at which these open data things are
> improving, and the fact that the LOC people that I talked to don't even
> seem to have any interest in public open data, my take is that the best
> hope for seeing progress is to suggest the simplest way to go forward.
> That means XML, REST, etc.

I can certainly understand this tactically. I'm more of a Utopian idealist
than a step-in-the-right-direction man. I think that there will be a demand
for this data no matter what, so if legislators don't offer it up, eventually
outside groups will do it themselves and thus control the data feeds.

Unfortunately, it will be a hard sell until there are tools that make the
case, so we're trapped in a chicken-waiting-for-the-egg-waiting-for-the-chicken
holding pattern. This is why my focus is on building tools for use
by people on the ground. The data set is smaller, and it's easier to apply
to real world situations.

> (Have you seen these?

No, and very cool!

> (And, btw, I take friendly issue with your blog entry that the WaPo is
> leading the way in 21st century democracy with their votes database.
> ::grin::)

OK... that's a fair cop 8-)

Let me qualify things. For me, one of the driving forces for technological
political innovation needs to be traditional print media. As information
grows out of control I see a real market for trusted non-partisan sources
that can divine the semantic tea leaves. As print outlets get more and more
marginalized, my hope is that they'll see this and run with it. The
Washington Post is doing that.

Chris
http://semanticcaucus.blogspot.com/


 
Derek Willis  
From: Derek Willis <dwil...@gmail.com>
Date: Thu, 04 Oct 2007 16:17:35 -0000
Local: Thurs, Oct 4 2007 12:17 pm
Subject: Re: How would you legislative data to be made available?
On Oct 4, 11:53 am, "Chris Baker" <ign...@gmail.com> wrote:

Well, we're *trying* to do that. But Josh got there first, and I hope
he knows how many folks in the media appreciate his efforts.

As to the questions raised, I'd prefer an entire database dump, or at
least sections of the database that can be regularly updated, sort of
the way the Federal Election Commission does with its data. Plenty of
other things can be extended from that, including RSS and other stuff,
so if requiring the LoC to have it delays things, then I second Josh's
recommendation. Simplicity works best.

Derek Willis
washingtonpost.com


 
John Brothers  
From: "John Brothers" <joh...@gmail.com>
Date: Fri, 5 Oct 2007 11:14:55 -0400
Local: Fri, Oct 5 2007 11:14 am
Subject: Re: [openhouseproject] How would you legislative data to be made available?

Hello all,

   I'm fairly new to this group, but I figure I should throw my 2 cents in.
I'm the CTO of the Sunlight Foundation, with long experience in open source
and data processing.

   File System/Formats -

     Like Derek, I'd generally prefer everything to be in an actual
database, with timestamps on records and such.   That's the most
flexible format, one that allows us (the consumers of the data) to create
feeds, to sort and group the data, and to run statistics with ease.

     SPARQL looks pretty neat, if verbose and somewhat cumbersome, but I
don't think it particularly buys us anything over a solid open source
database infrastructure.

    Notifications

      A proper database structure would handle this easily.
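
As a rough illustration of timestamped records answering the "what changed?" question, here is a sketch using SQLite. The table layout, column names, and sample rows are assumptions for the example, not a proposed schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE bills (
           id TEXT PRIMARY KEY,
           title TEXT NOT NULL,
           updated_at TEXT NOT NULL  -- ISO-8601 UTC, so string order is time order
       )"""
)
conn.executemany(
    "INSERT INTO bills VALUES (?, ?, ?)",
    [
        ("hr1234", "An example bill", "2007-10-01T14:05:00Z"),
        ("s567", "Another example bill", "2007-10-04T09:30:00Z"),
        ("hr2", "An older bill", "2007-09-20T18:00:00Z"),
    ],
)

def updated_since(conn, since):
    """Ids of records touched at or after `since`, newest first."""
    rows = conn.execute(
        "SELECT id FROM bills WHERE updated_at >= ? ORDER BY updated_at DESC",
        (since,),
    )
    return [row[0] for row in rows]

recent = updated_since(conn, "2007-10-01T00:00:00Z")
```

The catch is that every write path has to keep `updated_at` current (including cosponsor and action changes), or the query silently misses updates.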

    RSS Feeds

      Let third parties (Josh & GovTrack, Sunlight, etc) provide the RSS
feeds, alerts, twitter interfaces and all that other stuff.   Keep it simple
at the source.

    Workaround

       Sorry, I don't know the current situation, so I can't comment.

Contingency plans

   - We all get together and someone agrees to be the one group that
   pulls from the existing data, and creates the database, and the others pull
   from that manufactured database?   Sunlight, for example, certainly has the
   resources to do this, if necessary.

--
CTO @ SunlightFoundation.com - 678 467 3504
Agile Development Blog: IndefiniteArticles.com
Stone Magic: Stonemagic.Picobusiness.com


 
james a. jacobs  
From: "james a. jacobs" <jamesajac...@mac.com>
Date: Fri, 5 Oct 2007 10:27:24 -0700
Local: Fri, Oct 5 2007 1:27 pm
Subject: Re: [openhouseproject] Re: How would you legislative data to be made available?
as we think about these issues, there are two things that i think are  
useful to consider. these are things i have found in over 20 years of  
dealing with legacy data, legacy software, legacy formats and trying  
to use data today that was created years ago:

1. (and this has been mentioned here before)  "simple" gets  
implemented and "complex" does not. that's an oversimplification, of  
course, but we've seen examples:  html being so simple that it  
enabled the web, but with rdf being complex and much slower to create  
the semantic web; the government's "GILS" (Government Information  
Locator Service) being well-thought out, but mostly un-implemented; etc.

2. separation of data from applications is *always* better for  
preservation purposes. when agencies instantiate their information in  
databases it is just too easy for them to leave out data that doesn't  
fit the database design and too tempting to use software-specific  
functionality that gets lost in translation to any other system.

this leads me only to generic suggestions, not specific ones:

a. software-neutral and OS-neutral formats for distribution and  
preservation (i.e., xml)

b. "minimal-level" rules for mark-up and metadata to help ensure that  
some core information *always* gets produced and saved -- even if it  
is *possible* to produce much more complex and demanding and  
expensive information.

c. flexibility:  standards should allow for complete, comprehensive  
markup without limitations (field size, character-encoding, etc.) and  
should allow for change over time as our needs change.
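
a small sketch of suggestion (b), with a hypothetical set of three "core" fields: a record passes as long as the core is present, and anything richer is allowed but never required. the element names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical "core" fields that every record must carry.
REQUIRED = ("identifier", "title", "date")

def missing_core_fields(xml_text):
    """Return the required elements that are absent or empty in a record."""
    root = ET.fromstring(xml_text)
    return [name for name in REQUIRED if not (root.findtext(name) or "").strip()]

# Minimal record: only the core fields.
minimal = "<record><identifier>hr1</identifier><title>A bill</title><date>2007-10-05</date></record>"
# Richer record: extra markup is fine and is simply not checked.
rich = ("<record><identifier>hr2</identifier><title>Another bill</title>"
        "<date>2007-10-05</date><subject>Energy</subject></record>")
# Incomplete record: empty title, no date.
incomplete = "<record><identifier>hr3</identifier><title></title></record>"

problems = missing_core_fields(incomplete)
```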

James A. Jacobs
Data Services Librarian Emeritus
University of California San Diego


 
Discussion subject changed to "article on blogging" by Michael Stern
Michael Stern  
From: "Michael Stern" <stern8...@cox.net>
Date: Fri, 5 Oct 2007 13:42:32 -0400
Local: Fri, Oct 5 2007 1:42 pm
Subject: article on blogging

You may be interested in this article on beltway blogging.  David All is
mentioned.

Beltway Blogroll: National Journal's Cover Story On Blogs
<http://beltwayblogroll.nationaljournal.com/archives/2007/10/national_....php>

Mike Stern


 
Discussion subject changed to "How would you legislative data to be made available?" by John Wonderlich
John Wonderlich  
From: "John Wonderlich" <johnwonderl...@gmail.com>
Date: Wed, 10 Oct 2007 01:13:10 -0400
Local: Wed, Oct 10 2007 1:13 am
Subject: Re: [openhouseproject] Re: How would you legislative data to be made available?

To what degree should legislative metadata standards coordination efforts
(i.e., re-envisioning THOMAS with enhanced functionality and public database
access) take into account existing archival metadata standards?

I've been looking through this LOC website for the Encoded Archival
Description <http://www.loc.gov/ead/>, which looks like a set of DTDs and
schemas for archiving.  In what ways is standardizing legislative information
formats different from standardizing archive descriptions?  In what ways is
THOMAS different from an archive?

I expect we could gain something from understanding the development of this
project, well described here <http://www.loc.gov/ead/eaddev.html>, or
perhaps also from being aware of the sort of standardization discussions LOC is
having publicly on a listserv here
<http://listserv.loc.gov/cgi-bin/wa?A2=ind0710&L=ead&T=0&P=55>.

Are there many similar specialized discussions with relevance to this
issue?  GODORT <http://www.ala.org/ala/godort/godort.htm> is the only other
I'm aware of (oh, and GOVDOC-L <http://govdoc-l.org/>, which is really
fascinating because the conversations have been archived on a Google Group since
1991, making for some interesting discussions about issues like adjusting to
CDs as storage devices, or imagining how government will be affected by the
then-nascent Internet).

John

On 10/5/07, james a. jacobs <jamesajac...@mac.com> wrote:

--
John Wonderlich

Program Director
The Sunlight Foundation
(202) 742-1520 ext. 234


 
Sarah  
From: Sarah <paigele...@yahoo.com>
Date: Wed, 10 Oct 2007 09:12:35 -0700
Local: Wed, Oct 10 2007 12:12 pm
Subject: Re: How would you legislative data to be made available?
> - What file formats/system would you recommend? Is a complete dump of
>   the entire database necessary?

A web service with XML.

> - How could sites be made aware of changes to the system? Rather than
>   accessing every bill record every night, is there a way that sites could
>   only access records that had been updated (i.e. new cosponsors, bill
>   action, etc).

They don't need to be made aware of changes; they just need to develop a
system for re-checking the web service and identifying new content. Queries
don't need to be babysat. If only some records were available at any time,
there would inevitably be a time when you'd need the old ones that were no
longer available.

> - Is it important that RSS feeds be made available for search terms?
>   For example, an RSS feed for all new bills that contain the word Iraq in
>   the text.

Not really. I can understand why a small slice of the population would want
it, but I don't think it's really needed by the broader population. If you
have a web service, you have everything, and you can tease out terms,
simplifying information for others.

> - What's the work around for this need today?

There are a couple of good web services created by a few states already.
They built them in house and are willing to share them freely with others.
LoC doesn't need to pay out a bunch of money for this; it's freely available
to them now, and they just need to make a few changes to suit their needs.
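
The re-checking approach described in this message (poll the service, work out locally what is new) could be sketched as follows. The record ids and bodies are invented, and hashing each response is just one assumed way to detect a change.

```python
import hashlib

def fingerprint(body: str) -> str:
    """Hash a response body so two fetches can be compared cheaply."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()

def diff_snapshot(previous: dict, current_bodies: dict):
    """Compare stored fingerprints against freshly fetched bodies.

    Returns (new_ids, changed_ids, updated_fingerprints).
    """
    updated = {rid: fingerprint(body) for rid, body in current_bodies.items()}
    new_ids = sorted(rid for rid in updated if rid not in previous)
    changed_ids = sorted(
        rid for rid in updated if rid in previous and previous[rid] != updated[rid]
    )
    return new_ids, changed_ids, updated

# First pass: everything is new.
day1 = {"hr1": "<bill>v1</bill>", "s2": "<bill>v1</bill>"}
_, _, seen = diff_snapshot({}, day1)

# Second pass: one record changed, one record is brand new.
day2 = {"hr1": "<bill>v2</bill>", "s2": "<bill>v1</bill>", "hr3": "<bill>v1</bill>"}
new_ids, changed_ids, seen = diff_snapshot(seen, day2)
```

Note the trade-off: the client still has to fetch every record on every pass to compute the hashes, which is the per-record load that a server-side change list would avoid.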


 
Rob Pierson  
From: "Rob Pierson" <piers...@gmail.com>
Date: Thu, 11 Oct 2007 17:50:37 -0400
Local: Thurs, Oct 11 2007 5:50 pm
Subject: Re: [openhouseproject] Re: How would you legislative data to be made available?

Thanks for all of the excellent technical recommendations everyone.

I've been told that folks at LoC are following our conversations here and
have been finding them quite useful. Perhaps the next step in developing a
recommendation on the community's technical and functional requirements
would be a conference call and then collaboration on a google doc?

I'm also looking at holding a discussion with Congressional staff about what
changes they would like to see in LIS and Thomas. I'd invite LIS and Thomas
staff to brief offices on their plans for the future and we could then
discuss what features staff would like to see. I've already spoken to
staffers who want to be able to display cosponsors and other bill data
through some sort of official web service / api, and providing a forum for
those requests could help make that a reality.

Sarah raised a great point, and I was also hoping we could point out to LoC
some concrete examples of legislative databases that were implemented in a
really useful way. Do any state or foreign governments have particularly
good implementations of web services and/or ways of making their raw
legislative database available?

On 10/10/07, Sarah <paigele...@yahoo.com> wrote:


 
sim...@gmail.com
From: "sim...@gmail.com" <sim...@gmail.com>
Date: Sun, 14 Oct 2007 23:18:02 -0000
Local: Sun, Oct 14 2007 7:18 pm
Subject: Re: How would you legislative data to be made available?
UW ITS http://www.its.washington.edu/ does a great job of serving data
for a variety of DOT systems in the Northwest.  One of the cooler
applications created from the data is the now-defunct Bus Monster:
http://www.busmonster.com/

On Oct 11, 2:50 pm, "Rob Pierson" <piers...@gmail.com> wrote:


 