How should vacant seats be handled?


Dane

Jun 23, 2011, 6:14:45 PM
to Open State Project
I just noticed as I'm getting familiar with the Missouri legislator
scraper that three seats are vacant. Should this scraper somehow catch
these seats and note that they are vacant? The district information is
still valid and maybe useful, like the term/chamber/district
information.

Thanks...

Dane

Jun 24, 2011, 10:56:44 AM
to Open State Project
I guess for now I'll differentiate them within the scraper (in case
they're wanted later) and only publish non-vacant seats...

James Turk

Jun 24, 2011, 11:02:38 AM
to fifty-sta...@googlegroups.com
Vacant seats can be safely excluded. We store the information by
legislator as opposed to by seat, so there really isn't room for vacant
seats.

-James

> --
> You received this message because you are subscribed to the Google Groups "Open State Project" group.
> To post to this group, send email to fifty-sta...@googlegroups.com.
> To unsubscribe from this group, send email to fifty-state-pro...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/fifty-state-project?hl=en.
>
>

Dane

Jun 24, 2011, 11:54:38 AM
to Open State Project
Thanks for the info, I'll then exclude them...

Ray Kiddy

Jun 25, 2011, 6:32:03 PM
to fifty-sta...@googlegroups.com

I actually find the way that OpenState has decided to model this to be incomplete and it leads to this sort of problem. It is an understandable design, but it is legislator-centric. The more important entity in the legislature is a legislative seat. I think it makes sense to model legislators as being attached to and unattached from seats, since this is exactly what happens when people are elected and then leave office or transition to another office. This is obviously a "next major revision" kind of proposal and I may have to put it on the issues list as such.

For example, back in 2010, we had a legislator who resigned to become Lieutenant Governor and someone else, later, won a special election for the seat. This would be modeled as:

Senate seat 15 ->

legislator001: name: Abel Maldonado; start: Dec 1, 2008; end: Apr 27, 2010;

legislator999: name: Vacancy; start: Apr 27, 2010; end: Aug 23, 2010;

legislator002: name: Sam Blakeslee; start: Aug 23, 2010; end: NULL;

As far as I can tell, in the OpenStates system, this would be modeled as something like this:

leg: leg_id: legislator001; name: Abel Maldonado; old_roles: [{chamber: upper; district: 15; term: 2009-2010; start_date: NULL; end_date: NULL}]

leg: leg_id: legislator002; name: Sam Blakeslee; roles: [{chamber: upper; district: 15; term: 2009-2010; start_date: NULL; end_date: NULL}]

It is just not possible to figure out what has happened here. Perhaps if the start_date or end_date fields were not always NULL, this would be clearer. Actually, as I look at it, I am realizing that if the start_date or end_date fields were always accurate, the OpenStates design would be sufficient.

Of course, one would still want to have:

leg: leg_id: legislator999; name: Vacancy; roles: [{chamber: upper; district: 15; term: 2009-2010; start_date: Apr 27, 2010; end_date: Aug 23, 2010}]

If one only has the two legislators above, if the start_date and end_date fields are properly filled in, the lack of a legislator for Apr - Aug could be an error. It is hard to turn the negative fact: "there was no legislator listed for the seat for the time period" to the positive fact: "there was a vacancy for the time period".
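The gap check described here can be sketched roughly as follows. This is only an illustration, assuming a single-seat district and roles whose `start_date` and `end_date` are filled in; `find_vacancies` is a hypothetical helper, not part of the Open States code. It reports the gap but cannot tell a real vacancy from missing data, which is exactly the ambiguity noted above.

```python
from datetime import date


def find_vacancies(roles):
    """Return (gap_start, gap_end) pairs between consecutive dated roles.

    Each role is a dict with 'start_date' and 'end_date' (a date or None).
    Roles without a start_date are ignored; an end_date of None means the
    role is still open.
    """
    dated = sorted(
        (r for r in roles if r["start_date"] is not None),
        key=lambda r: r["start_date"],
    )
    gaps = []
    for prev, nxt in zip(dated, dated[1:]):
        if prev["end_date"] is not None and prev["end_date"] < nxt["start_date"]:
            gaps.append((prev["end_date"], nxt["start_date"]))
    return gaps


# The two legislators from the Senate seat 15 example, with dates filled in.
roles = [
    {"name": "Abel Maldonado", "start_date": date(2008, 12, 1),
     "end_date": date(2010, 4, 27)},
    {"name": "Sam Blakeslee", "start_date": date(2010, 8, 23),
     "end_date": None},
]
vacancies = find_vacancies(roles)
print(vacancies)
```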

I have filter methods that try to characterize the data. The results for the roles, for data pulled on 6/21/2011, are below. If the dates are going to be filled in, there is obviously work to do.

cheers - ray

state: ak, checkLegislatorRolesHaveDates found roles # 475, having dates # 0 and missing dates # 475
state: az, checkLegislatorRolesHaveDates found roles # 425, having dates # 0 and missing dates # 425
state: ca, checkLegislatorRolesHaveDates found roles # 1092, having dates # 0 and missing dates # 1092
state: dc, checkLegislatorRolesHaveDates found roles # 60, having dates # 0 and missing dates # 60
state: fl, checkLegislatorRolesHaveDates found roles # 398, having dates # 0 and missing dates # 398
state: in, checkLegislatorRolesHaveDates found roles # 659, having dates # 0 and missing dates # 659
state: la, checkLegislatorRolesHaveDates found roles # 372, having dates # 0 and missing dates # 372
state: md, checkLegislatorRolesHaveDates found roles # 1089, having dates # 0 and missing dates # 1089
state: mi, checkLegislatorRolesHaveDates found roles # 465, having dates # 0 and missing dates # 465
state: mn, checkLegislatorRolesHaveDates found roles # 995, having dates # 0 and missing dates # 995
state: nc, checkLegislatorRolesHaveDates found roles # 465, having dates # 0 and missing dates # 465
state: nj, checkLegislatorRolesHaveDates found roles # 673, having dates # 0 and missing dates # 673
state: nv, checkLegislatorRolesHaveDates found roles # 390, having dates # 0 and missing dates # 390
state: oh, checkLegislatorRolesHaveDates found roles # 656, having dates # 0 and missing dates # 656
state: pa, checkLegislatorRolesHaveDates found roles # 1376, having dates # 0 and missing dates # 1376
state: sd, checkLegislatorRolesHaveDates found roles # 406, having dates # 0 and missing dates # 406
state: tx, checkLegislatorRolesHaveDates found roles # 1478, having dates # 0 and missing dates # 1478
state: ut, checkLegislatorRolesHaveDates found roles # 352, having dates # 0 and missing dates # 352
state: va, checkLegislatorRolesHaveDates found roles # 1251, having dates # 0 and missing dates # 1251
state: vt, checkLegislatorRolesHaveDates found roles # 539, having dates # 0 and missing dates # 539
state: wa, checkLegislatorRolesHaveDates found roles # 622, having dates # 0 and missing dates # 622
state: wi, checkLegislatorRolesHaveDates found roles # 1204, having dates # 0 and missing dates # 1204
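For readers who want to reproduce a count like this, a check along these lines might look like the sketch below. It is a guess at the logic, not the actual `checkLegislatorRolesHaveDates` filter, and it assumes each legislator record is a dict whose `roles` and `old_roles` are lists of role dicts, as in the example records earlier in this message.

```python
def check_roles_have_dates(legislators):
    """Count roles that carry at least one date versus roles with none."""
    total = 0
    with_dates = 0
    for leg in legislators:
        # Assumption: both current and old roles are plain lists of dicts.
        for role in leg.get("roles", []) + leg.get("old_roles", []):
            total += 1
            if role.get("start_date") or role.get("end_date"):
                with_dates += 1
    return {
        "roles": total,
        "having_dates": with_dates,
        "missing_dates": total - with_dates,
    }


legislators = [
    {"leg_id": "legislator001",
     "old_roles": [{"district": "15", "start_date": None, "end_date": None}]},
    {"leg_id": "legislator002",
     "roles": [{"district": "15", "start_date": None, "end_date": None}]},
]
summary = check_roles_have_dates(legislators)
print(summary)
```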


James Turk

Jun 27, 2011, 3:43:56 PM
to fifty-sta...@googlegroups.com
There are certainly two ways to model this, and to directly find
vacancies it is true that the current system is not ideal.

A seat-based system would present its own trade-offs as the more
natural unit of organization is "how did this legislator's career
evolve" and organizing it by legislator makes that a much easier
question to answer.

In our own usage and that of most people we've talked to the
legislator-centric model is more suited to quickly answering those
questions. Especially when you notice that many legislators hold
different seats over the course of their career (redistricting and
changes between chamber).

Organizing by seat is also quite challenging when you take into
account multi-seat districts, some districts elect 2-5 people with no
distinction between them, so saying "District 8 is currently served by
Bob Smith" is less correct than saying "Bob Smith currently serves in
district 8."

We're unlikely to switch to a seat-based model for these and other reasons.

You're correct however in noticing that begin/end_date are not well
used but exist for the purpose of helping to show resignations/etc.
At the moment there aren't good resources that we've identified to
help find these dates, as often official sites leave old legislators
up for days or even months after they leave office due to resignation
or death.

The approach we are taking is simple for now, when we're notified that
someone we have as active is no longer in office, we'll attempt to
find the date they left office and mark them as such. This has worked
decently at the federal level but will require a bit of effort to
scale it to the state level.

-James

Ray Kiddy

Jun 28, 2011, 12:32:43 AM
to fifty-sta...@googlegroups.com
On Jun 27, 2011, at 12:43 PM, James Turk wrote:

> There are certainly two ways to model this, and to directly find
> vacancies it is true that the current system is not ideal.
>
> A seat-based system would present its own trade-offs as the more
> natural unit of organization is "how did this legislator's career
> evolve" and organizing it by legislator makes that a much easier
> question to answer.
>
> In our own usage and that of most people we've talked to the
> legislator-centric model is more suited to quickly answering those
> questions. Especially when you notice that many legislators hold
> different seats over the course of their career (redistricting and
> changes between chamber).
>
> Organizing by seat is also quite challenging when you take into
> account multi-seat districts, some districts elect 2-5 people with no
> distinction between them, so saying "District 8 is currently served by
> Bob Smith" is less correct than saying "Bob Smith currently serves in
> district 8."
>
> We're unlikely to switch to a seat-based model for these and other reasons.

Well, it does come down to a preference. It is possible to do a complete job with a seat-based system. And it is possible to do a complete job with a legislator-based system. I think the legislator-based system would be fine if dates were put in to clarify things.

> You're correct however in noticing that begin/end_date are not well
> used but exist for the purpose of helping to show resignations/etc.
> At the moment there aren't good resources that we've identified to
> help find these dates, as often official sites leave old legislators
> up for days or even months after they leave office due to resignation
> or death.

When I was building up my own database on CA legislators, I noted that there is a set start and end date for every session. They are not always easy to find. Nobody advertises them. At one point, I had to call a legislative aide and ask when someone was actually sworn in. For example, the 2011-2012 session in CA started on December 6, 2010. The 2009-2010 session ended on November 30, 2010. So, technically, the legislators were unemployed for a week.

I would figure that each session/legislator combination would have its own role dictionary. There is space for a start and stop date, so there you are.
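As a rough sketch of that idea, role dates could default to the session boundaries whenever nothing more specific is known. The `SESSION_DATES` table and `fill_role_dates` helper are hypothetical; only the two CA dates come from the message above, and the unknown boundaries are deliberately left as `None`.

```python
from datetime import date

# Known session boundaries; unknown ones stay None rather than being guessed.
SESSION_DATES = {
    ("ca", "2009-2010"): {"start": None, "end": date(2010, 11, 30)},
    ("ca", "2011-2012"): {"start": date(2010, 12, 6), "end": None},
}


def fill_role_dates(state, role, session_dates=SESSION_DATES):
    """Return a copy of role with missing dates defaulted to the session's."""
    session = session_dates.get((state, role["term"]), {})
    filled = dict(role)
    if filled.get("start_date") is None:
        filled["start_date"] = session.get("start")
    if filled.get("end_date") is None:
        filled["end_date"] = session.get("end")
    return filled


# A role that began mid-session keeps its own start date and
# inherits only the session's end date.
role = {"chamber": "upper", "district": "15", "term": "2009-2010",
        "start_date": date(2010, 8, 23), "end_date": None}
filled = fill_role_dates("ca", role)
print(filled["start_date"], filled["end_date"])
```

A role that ended early (a resignation, say) would already have its own `end_date` and would be left untouched.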

> The approach we are taking is simple for now, when we're notified that
> someone we have as active is no longer in office, we'll attempt to
> find the date they left office and mark them as such. This has worked
> decently at the federal level but will require a bit of effort to
> scale it to the state level.


Yes, the legislatures do not try to make some of this easy. Perhaps there is a way for those of us outside the OS organization to contribute to some of the manual work that is needed.

- ray



Gregory Combs

Jun 28, 2011, 12:33:12 AM
to fifty-sta...@googlegroups.com
Legislator-centric works well for me, since I tie in connections to nimsp and votesmart directly ... I wind up constructing a separate district/seat schema myself, since districts don't change but once every ten years (unless you're feeling frisky in TX). I just use clues from open states to give me the legislator id with a district matching the one I'm inquiring about ... It's relational, but if there's no legislator for that seat, I can still hit up the district to check its map boundaries, etc. Granted, I'm not particularly concerned about who once had the seat, but rather who, if anyone, has it now.

Implementing the start and end dates might be tricky in a few states (where it isn't obvious the change occurred) and necessitate a manual remunge, unless you just rely on the run dates of the scrapers and flag changes whenever there is different legislator bio info ... Then you get into multiple DB table snapshots and fuzzy matching. I'm all in favor of fuzzifying matches (given that it's sort of black magic to me), but I'd really like to see that put to use on transparency data contributors.
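The "rely on the run dates of the scrapers" idea could be sketched like this. Everything here is hypothetical: the `{seat: name}` snapshot layout, the `diff_snapshots` helper, the seat keys, the placeholder names, and the run date. It compares names exactly, so in practice the fuzzy matching mentioned above would have to replace the `!=` test, and the run date is only an upper bound on when the change really happened.

```python
from datetime import date


def diff_snapshots(previous, current, run_date):
    """Compare two {seat: legislator name} snapshots and flag changes.

    A value of None means the scraper found nobody in that seat.
    """
    changes = []
    for seat in sorted(set(previous) | set(current)):
        before = previous.get(seat)
        after = current.get(seat)
        if before != after:
            changes.append({
                "seat": seat,
                "left": before,
                "joined": after,
                "seen_on": run_date,  # when the scraper noticed, not when it happened
            })
    return changes


# Two made-up scraper runs; the names are placeholders.
previous = {"upper-15": "Senator A", "upper-16": "Senator B"}
current = {"upper-15": None, "upper-16": "Senator B"}
changes = diff_snapshots(previous, current, date(2011, 6, 28))
print(changes)
```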

James Turk

Jun 29, 2011, 4:31:53 PM
to fifty-sta...@googlegroups.com
I was giving some thought to this, and I think we might be able to
come up with a metadata extension that would list all possible seats
that would create a sort of middle ground (but would also be useful
for validation purposes). Some states leave old legislators up and so
we end up with 2-3 people occupying the same seat until we're notified
of it, but if we had a canonical list of seats (and how many members are
in each seat) we could automate this process.

I'm thinking something like

# simple listing of all seats, useful for mapping districts, etc.
upper_chamber_seats = ['1', '2', '3', '4', '5']  # ... and so on

# if the value is a dict, it maps seat name to the number of
# simultaneous holders
upper_chamber_seat_occupancy = {'1': 2, '2': 2, '3': 3}
# alternatively, if the value is a number, assume all seats in this
# chamber have the same number of occupants
upper_chamber_seat_occupancy = 2

Compiling these lists will take some time (and I'll be honest, I'm
inclined to delay doing a full sweep of all 50 until redistricting
takes place since we're so close) but would they be useful?
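To make the validation use case concrete, here is a rough sketch of how such metadata might be checked against the scraped legislators. The `check_occupancy` helper, the report layout, and the default of one holder for a seat missing from the dict are all assumptions for illustration, not an existing Open States API.

```python
from collections import Counter


def check_occupancy(seats, occupancy, active_legislators):
    """Report vacant, over-filled, and unknown seats.

    occupancy is either an int (same for every seat in the chamber) or a
    dict mapping seat name to the number of simultaneous holders.
    """
    counts = Counter(leg["district"] for leg in active_legislators)
    report = {
        "vacant": [],
        "overfilled": [],
        # districts seen on legislators but absent from the canonical list
        "unknown": sorted(set(counts) - set(seats)),
    }
    for seat in seats:
        if isinstance(occupancy, dict):
            expected = occupancy.get(seat, 1)  # assumed default: one holder
        else:
            expected = occupancy
        held = counts.get(seat, 0)
        if held < expected:
            report["vacant"].append((seat, expected - held))
        elif held > expected:
            report["overfilled"].append((seat, held - expected))
    return report


seats = ["1", "2", "3"]
occupancy = {"1": 2, "2": 2, "3": 3}
active = [
    {"name": "A", "district": "1"}, {"name": "B", "district": "1"},
    {"name": "C", "district": "2"},
    {"name": "D", "district": "3"}, {"name": "E", "district": "3"},
    {"name": "F", "district": "3"}, {"name": "G", "district": "3"},
]
report = check_occupancy(seats, occupancy, active)
print(report)
```

An over-filled seat would point at a stale legislator left up on the official site, while a short count would flag a possible vacancy.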

-james
