format of the data

Mason Alexander Porter

Dec 25, 2007, 9:22:46 PM
to hansard-...@googlegroups.com, Mason Alexander Porter
Hi,

I just joined the discussion group by invitation of Edward Wood.

Once this data is set up, I would like to study it using certain
mathematical and computational techniques. Other scientists have wanted
to do this for years (a couple of decades, actually), and the lack of
online availability in a good format has made doing so prohibitively
difficult. Basically, a well-chosen format would let a lot of people do a
lot of interesting research pretty quickly.

Such data has been collected for the United States Congress, and that
provides a good model for how to organize voting data so that it can be
analyzed. What would be great is not just to put the information online
but also to include it in an appropriately compact form so that it
can be downloaded and analyzed in great detail.

To see how the data is organized, you can go to
http://voteview.com/dwnl.htm and to download an example dataset directly,
you can click on http://voteview.com/house109.htm

The thing that would be really nice to have would be a "matrix" for each
session of Parliament (one each for Commons and Lords) that, for each
measure, records how each MP voted. For example, if MP 5 voted
'yea' on measure 23, then the entry in the spot marked (5,23) would be
+1; if he/she voted nay, it would be -1; and if he/she abstained or was
absent, it would be 0.
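
To make the layout described above concrete, here is a minimal Python sketch of such a matrix. The record format (member number, measure number, vote label), the labels "yea" and "nay", and the function name are assumptions chosen for illustration; they are not the Voteview format or anything the Hansard project has defined.

```python
# Hypothetical encoding: +1 for yea, -1 for nay, 0 for anything else
# (abstained or absent), as in the description above.
VOTE_CODES = {"yea": 1, "nay": -1}


def build_vote_matrix(records, n_members, n_measures):
    """Return a list of rows, one per member, one column per measure.

    `records` is an iterable of (member, measure, vote) tuples, with
    members and measures numbered from 1.
    """
    # Start from all zeros, so members with no recorded vote count as absent.
    matrix = [[0] * n_measures for _ in range(n_members)]
    for member, measure, vote in records:
        matrix[member - 1][measure - 1] = VOTE_CODES.get(vote, 0)
    return matrix


# Made-up example records for a single session.
records = [(5, 23, "yea"), (2, 23, "nay"), (5, 1, "absent")]
matrix = build_vote_matrix(records, n_members=6, n_measures=23)

# Entry (5,23) in the 1-based notation above is matrix[4][22] here.
entry_5_23 = matrix[4][22]
```

One design question this raises is whether abstentions and absences should share the value 0, as suggested above, or be given separate codes so that they can be told apart later.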

I'm not sure how coherent my description was, so I'd be very pleased to
clarify anything that I mentioned above.


-----
Mason

-----------------------------------------------------------------------------
Mason A. Porter
University Lecturer (and Tutorial Fellow, Somerville College)
Oxford Centre for Industrial and Applied Mathematics
Mathematical Institute, University of Oxford

Homepage: http://www.maths.ox.ac.uk/~porterm, IM: tepid451
Blog: http://masonporter.blogspot.com/
-----------------------------------------------------------------------------
"His beard alone has experienced more than a lesser man's entire body."

--- from a commercial for Dos Equis
-----------------------------------------------------------------------------


Robert

Jan 2, 2008, 2:36:51 PM
to Hansard Prototype
Hello Mason - thanks for joining!

First, the bad news: I'm expecting that it's not going to be that easy
for us to extract divisions (votes) data in a 'complete' manner. We
currently see the following issues:

- separated ayes and noes columns
- missing columns
- missing division titles
- ambiguous division titles

Currently we're not able to estimate even the percentage of divisions
we've correctly identified, let alone the ones we're missing.

Now the good news: the format you've described looks pretty
straightforward. We'll certainly look at making divisions separately
downloadable.

I don't suppose you're on Twitter?

Mason Alexander Porter

Jan 2, 2008, 2:49:43 PM
to Hansard Prototype
> I don't suppose you're on Twitter?

I'm not on Twitter, though I just heard about it yesterday (and it
sounds like I'm behind the times on that, given what I read). I can
join if it makes life easier for these discussions. What handles should I
look for on there?

If the data can be organized in a format that's easy to load into
programs like Matlab, that would really allow people to make great leaps
forward in analyzing this stuff.
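
As one possible illustration of an easy-to-load format, a purely numeric comma-separated file (one row per member, one column per measure) can be read by most numerical tools. The sketch below is in Python and uses only the standard library; the function names and the tiny example matrix are hypothetical.

```python
import csv
import io


def matrix_to_csv(matrix):
    """Serialize a list of integer rows as comma-separated text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(matrix)
    return buffer.getvalue()


def csv_to_matrix(text):
    """Parse comma-separated text back into a list of integer rows."""
    reader = csv.reader(io.StringIO(text))
    return [[int(cell) for cell in row] for row in reader if row]


# Made-up 2-member, 3-measure example using +1 / -1 / 0 entries.
example = [[1, -1, 0], [0, 1, 1]]
text = matrix_to_csv(example)
round_trip = csv_to_matrix(text)
```

A file like this carries no member names or division titles, so a separate lookup table for rows and columns would presumably be needed alongside it.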

Robert

Jan 2, 2008, 3:31:05 PM
to Hansard Prototype
http://twitter.com/robertbrook

We should talk about formats. You currently in Oxford?

Mason Alexander Porter

Jan 2, 2008, 3:59:48 PM
to Hansard Prototype
At the moment, I am physically in California (my original home). I return
to Oxford on the 11th.

Robert

Jan 2, 2008, 4:39:20 PM
to Hansard Prototype
If you've a couple of hours, that might be useful...!

Mason Alexander Porter

Jan 2, 2008, 4:44:40 PM
to Hansard Prototype
> If you've a couple of hours, that might be useful...!

Sure, though it will be better to do this once I return to the UK. (As it
stands, I already doubt I'll finish what I need by the conference.) If I
can give any input so that the format is amenable to subsequent analysis,
I would be very pleased to do so.

-----
Mason


Robert

Jan 3, 2008, 7:03:00 AM
to Hansard Prototype
( Yes - I was meaning in the UK! )
