question of preference for users of US Open Civic Data IDs for legislative boundaries

James Turk

Mar 3, 2014, 3:42:18 PM
to open-civic-data
A question has come up quite a bit off-list, and we wanted to get some feedback on how people feel it should work.

Typically when things are coterminous (as with cities/counties that are a single functional unit) we only have one boundary. Generally this makes sense, as otherwise there'd be confusion about which ID to use for a given entity.

The question that's arisen is how to handle coterminous state legislative districts (i.e. when the Senate and House share a set of district boundaries, typically with House districts being multi-member districts). (SLDL and SLDU are the terms we use for lower and upper districts.)

Option A- create SLDL identifiers even where the SLDU is coterminous
Option B- create only SLDU identifiers for coterminous SLDL/SLDUs

Advantages to Option A:
* closer to "expected" behavior: looking for a list of lower districts doesn't turn up an empty list just because districts are currently coterminous
* future-proof: just because districts are currently coterminous (or once were) doesn't mean they always will be; this saves people from having to change IDs if the districts split at a redistricting
* code that expects to use SLDL/SLDU won't need exceptions built in for states with combined districts

Advantages to Option B:
* arguably more consistent with how we handle other types
* (open to further input)

Obviously our preference (here at Sunlight) is to use Option A, so apologies if it is over-represented here.  I'll let others chime in with their preference as I know there are arguments for both sides.

Thanks,

-James

James McKinney

Mar 3, 2014, 3:57:48 PM
to open-ci...@googlegroups.com
I'd go with Option A. The OCD IDs are for the division, not the boundary. The SLDU and SLDL boundaries may be coterminous, but the divisions are distinct. (Abolishing the upper house should not destroy the lower house's divisions, for example.)

On the other hand, in some city-county cases, not only are the boundaries coterminous, but the divisions are indistinct as well, and it makes sense to have only one division and one boundary.

Joe Germuska

Mar 3, 2014, 3:58:32 PM
to open-ci...@googlegroups.com
I’d cast a very strong vote for option A. I haven’t been using the IDs for anything yet, so “more consistent” isn’t compelling, whereas I started to say “Option A especially because of …” and couldn’t even pick one.

When modeling, it’s best to conform with real world conceptual entities. If the real world says that SLDL and SLDU are different things, don’t be fooled that they sometimes are the same shape.

Joe
--
Joe Germuska
J...@Germuska.com * http://blog.germuska.com * http://twitter.com/JoeGermuska

"Learn to fear any church that fears drums." --Regie Gibson

Aaron Strauss

Mar 3, 2014, 3:59:17 PM
to open-civic-data
We discussed this type of issue as a community last summer and I feel strongly that we should not be entering two ocd identifiers for a single set of coterminous districts (Option B). There are actually several analogies to the sldl/sldu case and so far we have not entered "duplicate" ids for any of them. Here are some examples:

  • City/county as James mentioned. A good example is Denver, which is both a city and county.
  • At large congressional districts. Seven states currently have at large congressional districts, which we don't have ocd district identifiers for because the state identifiers serve this purpose. I think this is the best analogy to the sldl/sldu case. People might expect us to have an AK-AL district, but we don't. Rather, we put "state:ak/cd:1" into the exceptions file, which developers can programmatically grab. Also, even better for developers, we have mapping files which list all the congressional districts, including the at large cases. We have these mapping files for sldl and sldu.
  • Place: mayor and at-large city council districts. We don't include at large city council districts because they are coterminous with place.
  • City council and school board districts. Some cities use the same districts for city council and school board. To date, I have only been adding one district. (The Google API has multiple offices connected to that one district.)
  • City council district and wards. Some cities use electoral wards for city council districts. I've only been entering the wards into the repo to avoid data mismatches.
My thoughts on James' three advantages to Option A: (1) developers should *never* query on ocdid type and expect a current, comprehensive, non-overlapping list of districts. With the recent inclusion of historical districts (thanks to sunlight for that), the list will never be "just the current districts". The list may not be comprehensive because of issues such as congressional at large districts. And districts may not be non-overlapping because of cases such as New Hampshire's floterial districts. For all those reasons (and maybe more), I would never want to give developers the expectation that they can query on type. (2) I don't want to add identifiers in anticipation of changes that may never come -- that's just confusing. (3) such code wouldn't work with congressional districts and that's why we have mapping files.

I think I've gone on long enough -- those are some of the reasons I support Option B, and I'd be curious for others' thoughts.

Aaron

Adam Yan

Mar 3, 2014, 4:07:29 PM
to open-ci...@googlegroups.com
Hi All!

(I just joined in the hopes of getting my state of California into this API.  I hope I do not overstep any bounds with my preference here.)

As I understand it, Mr. Germuska makes the best point: do not confuse what is the same today with what will always be the same. For my purposes, Option B would make the OCD API incomplete. I do not see why B would make sense; the state assembly and state senate will not be the same thing.

What if an "SLDU" was the same as a county one day? Would we then get rid of County? (I hope by mere suggestion this does not happen!)

-Adam

Eric Mill

Mar 3, 2014, 4:11:07 PM
to open-civic-data
I'm not involved here really, but making SLDU represent both SLDL and SLDU would be very strange and inconsistent. Maybe you could make a hybrid identifier there, like "SLDU-SLDL"?

From an integrator's perspective, it sounds like Option B would lead to every integrating developer having to understand exceptions that are more serious than just looking up IDs in an exceptions file. The way you process legislative districts would be completely different for states whose SLDL identifiers are "hidden" by SLDU.
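
The integrator's concern can be sketched in a few lines. This is only an illustration: the state code "xx", the ID lists, and the function names are hypothetical, and the coterminous flag stands in for whatever per-state knowledge a developer would have to maintain under Option B.

```python
# Illustrative IDs for a made-up state "xx"; not copied from the real repo.
OPTION_A_IDS = [
    "ocd-division/country:us/state:xx/sldu:1",
    "ocd-division/country:us/state:xx/sldu:2",
    "ocd-division/country:us/state:xx/sldl:1",
    "ocd-division/country:us/state:xx/sldl:2",
]
OPTION_B_IDS = [
    "ocd-division/country:us/state:xx/sldu:1",
    "ocd-division/country:us/state:xx/sldu:2",
]


def districts_of_type(ids, type_name):
    """Return the IDs whose last segment has the given type, e.g. 'sldl'."""
    return [i for i in ids if i.rsplit("/", 1)[-1].split(":", 1)[0] == type_name]


def lower_districts(ids, coterminous=False):
    """List lower-chamber districts.

    Falls back to the sldu IDs when no sldl IDs exist and the caller knows
    the state's chambers share districts (the Option B special case).
    """
    found = districts_of_type(ids, "sldl")
    if not found and coterminous:
        found = districts_of_type(ids, "sldu")
    return found
```

With the Option A list the plain type filter is enough. With the Option B list it returns an empty list unless the caller passes the extra flag, which is the per-state exception being discussed.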

Aaron Strauss

Mar 3, 2014, 4:20:17 PM
to open-civic-data
(To be clear, Option B would not combine sldu and sldl. The option is to merely not include sldl identifiers in the repo when the two districts are legislatively dictated to be the same.)

Currently, in the Google Civic Information API (which uses the Option B ocd division ids), there are mappings between district->office->office holders. These relationships are one-to-many and have fairly clear distinctions. The choice of Option A can be self-consistent, and the community may make that decision, but I just want to throw out the warning that we would be blurring the line between district and office. The following ocd divisions would need to be added to the repo to stay consistent with the principles of Option A:

  • At large congressional districts
  • At large city/county council districts
  • School board districts, water districts, park districts, etc when they are coterminous with city council districts
  • The above, plus city council districts, when they are coterminous with ward
  • Several judicial districts that are coterminous with county
Those are the ones off the top of my head; there are certainly others. We have been using Option B so far and, at least on our side, the API has worked great. There are no confusions about which district is which, and it's difficult to mismatch data. I personally see very little reason to disturb the working status quo.
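
A rough sketch of the district -> office -> office holders shape described above, assuming plain in-memory dictionaries. The division ID, office titles, and holder names are all hypothetical, and this is not the Google Civic Information API's actual data model.

```python
# Hypothetical division ID for a made-up state "xx".
DIVISION = "ocd-division/country:us/state:xx/sldu:1"

# One division can carry several offices (one-to-many) ...
OFFICES_BY_DIVISION = {
    DIVISION: ["State Senator", "State Representative"],
}

# ... and one office can have several holders (one-to-many again),
# as with a multi-member lower-house district.
HOLDERS_BY_OFFICE = {
    (DIVISION, "State Senator"): ["Holder A"],
    (DIVISION, "State Representative"): ["Holder B", "Holder C"],
}


def office_holders(division_id):
    """Return {office: [holders]} for one division, or {} if unknown."""
    return {
        office: HOLDERS_BY_OFFICE.get((division_id, office), [])
        for office in OFFICES_BY_DIVISION.get(division_id, [])
    }
```

Under Option B both chambers' offices hang off the single sldu division; under Option A the representative office would instead attach to a separate sldl division.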

Aaron



Anthea Watson Strong

Mar 3, 2014, 4:22:22 PM
to open-civic-data
Hi everyone,

Like Aaron has already mentioned, we solicited feedback on this issue last summer, and from Google's perspective, this issue was settled.  The Google Civic Information API returns the highest level of OCDID for all coterminous jurisdictions.  

It would require eng time on our part to adjust the response we return through the API to reflect option A. 

I recognize there are cogent arguments to be made for why A is better, but once decisions are made, unless there are compelling reasons to change our mind, it's important for us to stick with them. 

Although I was agnostic at the time we were discussing this, now that development work is in progress, I strongly throw my support behind option B.  In the future, if we decide to publish a new release, we can revisit the topic. 

Anthea



--
____________________
Anthea Watson Strong
Google Politics and Elections

James McKinney

Mar 3, 2014, 4:28:57 PM
to open-ci...@googlegroups.com
I think if the type were "sd" then people would be happier with Option B. Naming a lower house district "upper house district" is bizarre, and that's why I think many people are opting for Option A. If the upper and lower house districts are currently mandated/legislated to be identical in some jurisdictions, why not call them "sd" to eliminate that issue?

Paul Tagliamonte

Mar 3, 2014, 4:36:45 PM
to open-ci...@googlegroups.com
On Mon, Mar 03, 2014 at 04:20:17PM -0500, Aaron Strauss wrote:
> (To be clear, Option B would not combine sldu and sldl. The option is to
> merely not include sldl identifiers in the repo when the two districts are
> legislatively dictated to be the same.)

This may be true (in that the districts are coterminous), but the thing
is OCD IDs are conceptual beasts -- they're the mapping of the conceptual
idea of a SLDL / SLDU district -- the fact we're also mapping these to a
shapefile which is coterminous is coincidence, even if legislatively
mandated for now.

> Currently, in the Google Civic Information API (which uses the Option B
> ocd division ids), there are mappings between district->office->office
> holders. These relationships are one-to-many and have fairly clear
> distinctions. The choice of Option A can be self-consistent, and the
> community may make that decision, but I just want to throw out the warning
> that we would be blurring the line between district and office.

Well, technically speaking, the mapping of person to post is outside the
scope of a geo-id. The post may relate to the ocd-id -- I'm not sure
what we're blurring -- would you mind clarifying that?

> The
> following ocd divisions would need to be added to the repo to stay
> consistent with the principles of Option A:
>
> * At large congressional districts
> * At large city/county council districts
> * School board districts, water districts, park districts, etc when they
> are coterminous with city council districts
> * The above, plus city council districts, when they are coterminous with
> ward
> * Several judicial districts that are coterminous with county
>
> Those are the ones off the top of my head; there are certainly others. We
> have been using Option B so far and, at least on our side, the API has
> worked great.

Just because a shape-file is coterminous doesn't mean they're logically
the same entity. I think it's perfectly fair to put someone's post as
sldl:at-large rather than state:ak -- in fact, this is how we do things
currently for OpenStates and how they're used throughout pupa and other
OCD-centric tools.

> There are no confusions about which district is which, and
> it's difficult to mismatch data. I personally see very little reason to
> disturb the working status quo.

The status quo on our side seems to be option A.

> Aaron


Cheers,
Paul


--
Paul Tagliamonte
Software Engineer

James Turk

Mar 3, 2014, 4:39:29 PM
to open-civic-data
I'd also take issue with the idea that this was settled in the opposite direction in the past, or that it'd take hours of engineering time to convert a system that currently assumes Option B is settled truth.

Any system could convert sldl ids to sldu ones if it chose to (in a single line of code). Just because an ID exists doesn't mean someone has to use it.
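
As a rough illustration of that single-line conversion, assuming IDs follow the "ocd-division/.../sldl:N" pattern and that the upper and lower districts in question share the same numbering (the function name and the state code "xx" are hypothetical):

```python
def sldl_to_sldu(ocd_id):
    """Rewrite an sldl segment as sldu; other IDs pass through unchanged.

    Only meaningful where the two chambers' districts are coterminous and
    numbered identically, which is an assumption of this sketch.
    """
    return ocd_id.replace("/sldl:", "/sldu:")
```

A consumer that prefers a single ID per shape could apply this on input, while everyone else keeps using the sldl IDs directly.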

James Turk

Mar 3, 2014, 4:50:31 PM
to open-civic-data
I was just asked off-list to elaborate on the point of districts changing over time and how that affects this; it might be illustrative here.

Maryland has semi-coterminous districts: House and Senate districts overlap, but some House districts are subdivided (SLDU:5 might be divided into SLDL:5A and SLDL:5B, but SLDU:6 and SLDL:6 might be the same). This can change every 10 years, and indeed has changed as recently as this past election. If we were to adopt Option B now there'd be no clear way to handle this. There'd be a partial set of SLDLs, and you'd have to go to SLDUs for the periods when a district is unified, then back to SLDLs when it is split again.

Other states like Nevada have had similar changes to their redistricting rules; treating SLDL/SLDU like equivalent entities is just asking for trouble.
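
To make the "partial set of SLDLs" concrete, here is a small sketch that derives the ID set for a semi-coterminous state under each option. The mapping reuses the SLDU:5 / SLDU:6 pattern from the example above, shortened to the last ID segment; the data and function names are hypothetical.

```python
# Hypothetical upper-to-lower mapping following the pattern described above.
LOWER_BY_UPPER = {
    "sldu:5": ["sldl:5a", "sldl:5b"],  # upper district is subdivided
    "sldu:6": ["sldl:6"],              # upper and lower are coterminous
}


def ids_option_a(lower_by_upper):
    """Every upper district and every lower district gets its own ID."""
    ids = set(lower_by_upper)
    for lowers in lower_by_upper.values():
        ids.update(lowers)
    return sorted(ids)


def ids_option_b(lower_by_upper):
    """Lower districts get IDs only where they differ from the upper one."""
    ids = set(lower_by_upper)
    for lowers in lower_by_upper.values():
        if len(lowers) > 1:
            ids.update(lowers)
    return sorted(ids)
```

Under this reading of Option B, sldl:6 is missing from the list today and would have to appear (or disappear) whenever a redistricting changes which districts are subdivided.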

Aaron Strauss

Mar 3, 2014, 4:50:56 PM
to open-civic-data
To clarify my "blurring" point: my reading of Option A is that on a city level, there would be separate districts for: the city (place type), at-large city council districts (council_district type), at-large school board (school_board_district type)...but not for the city's elected attorney general (who I assume in folks' APIs would be an office connected to the place: type). That setup blurs lines to me -- school board district gets its own district because it's a separate branch of government? What about the water board's at large districts? It's unclear to me where district identifiers end and offices within identifiers begin.



Aaron Strauss

Mar 3, 2014, 4:57:13 PM
to open-civic-data
Re: districts changing over time, there's no clear way to handle this for other districts (e.g., congressional), outside of this current issue. I think the changing-over-time needs another thread (perhaps after this one closes). What I mean by "no clear way": currently the repo has outdated congressional districts in it, so the only way for a developer to know what the current districts are is to maintain the current list on their end -- which would be the case regardless of whether we pick Option A or Option B here. There are many ways to handle this and I think it needs a new thread.

James Turk

Mar 3, 2014, 5:00:30 PM
to open-civic-data
Changing over time is indeed a bigger issue we can discuss in more detail but there are many examples that are pertinent here.  To add to the MD example, until 2004 we would have had SD in the SLDL/SLDU bucket, but a court then broke apart two of the combined districts necessitating the addition of SLDLs.  If a situation like that were to occur again (as it no doubt will given the intricacies of redistricting) would you only introduce the 4 new SLDL ids and keep the others unified? 

Aaron Strauss

Mar 3, 2014, 5:02:22 PM
to open-civic-data
Yes -- courts creating new districts is a perfect example of when the mostly-stable list of sldu's and sldl's would have to change.

James McKinney

Mar 3, 2014, 5:05:24 PM
to open-ci...@googlegroups.com
I think the issues raised around whether we need to create "at-large" divisions are different from the sldu/sldl issues.

Paul wrote:
> Just because a shape-file is coterminous doesn't mean they're logically
> the same entity. I think it's perfectly fair to put someone's post as
> sldl:at-large rather than state:ak -- in fact, this is how we do things
> currently for OpenStates and how they're used throughout pupa and other
> OCD-centric tools.

In Canada, I haven't been creating and assigning "at-large" type IDs. I just use the parent OCD ID, like "state:ak" in this example. I don't see why an "at-large" type ID is ever appropriate. That conflates posts/positions with divisions. There is no such thing as an "at-large" division that is distinct from the parent division; they are identical political geographies. "At-large" is a concept that is entirely within the realm of organizational structure and people's positions within it. We shouldn't have any "at-large" divisions.


Aaron wrote:
> The following ocd divisions would need to be added to the repo to stay consistent with the principles of Option A:
>
> • At large congressional districts
> • At large city/county council districts
> • School board districts, water districts, park districts, etc when they are coterminous with city council districts
> • The above, plus city council districts, when they are coterminous with ward
> • Several judicial districts that are coterminous with county

I don't see any reason to add all those at-large divisions. The upper house is not the lower house, and the upper house divisions are not the lower house divisions. There are historical examples where boundaries that were once coterminous are not now. If you can present an example where an at-large state district was not the same as the state itself, then there would be an argument in favor of adding at-large districts. Until then, there's no reason to add those at-large districts. Same goes for the other examples.

James

Tom Lee

Mar 3, 2014, 5:05:12 PM
to open-ci...@googlegroups.com
> Just because a shape-file is coterminous doesn't mean they're logically
> the same entity.

This point from Paul, and the related one just made by Aaron, get at the heart of the matter, I think. 

Both sides have solid reasons related to existing implementations for favoring one style of identifier over the other; with the terminological correction that James M suggests, I suspect either scheme could be made livable.

It's the futureproofing that seems most important to me. Perhaps the practical political considerations surrounding jurisdictional levels are the right way of thinking about this (though that would probably make it impossible to draft clear-cut rules). Water board districting schemes seem likely to be substantially altered less often than state legislative districts (as James T points out). Playing games with district boundaries is one of the things state legislators enjoy most, after all. Maybe more to the point, I think we can expect demand for water district data to be lower, so the case for consolidating it into records where it'll be convenient to find and manipulate becomes stronger. I don't think that's as big of a worry for state legislative data.


Paul Tagliamonte

Mar 3, 2014, 5:08:03 PM
to open-ci...@googlegroups.com
On Mon, Mar 03, 2014 at 04:50:56PM -0500, Aaron Strauss wrote:
> To clarify my "blurring" point: my reading of Option A is that on a
> city level, there would be separate districts for: the city (place type),
> at-large city council districts (council_district type), at-large school
> board (school_board_district type)...but not for the city's elected
> attorney general (who I assume in folks' APIs would be an office connected
> to the place: type). That setup blurs lines to me -- school board district
> gets its own district because it's a separate branch of government? What
> about the water board's at large districts? It's unclear to me where
> district identifiers end and offices within identifiers begin.


So the core misunderstanding here is where the line is. That's fair. To
me, it's always been logically and fundamentally distinct entities.

sldu vs sldl are *not* logically nor fundamentally the same, just currently
legislated, and subject to change.

Conversely, I'd say that fundamentally the legislature represents the
state:ak entity.


We've seen districts that were identical break apart (South Dakota in
2005[1]), and I don't think sldu and sldl should be considered the same
just because they're coterminous -- they're just usually done in-step
with the other district(s).

There's a line here, sure, but I don't think this is an intractable
problem.

Cheers,
Paul


[1]: http://en.wikipedia.org/wiki/South_Dakota_State_Legislature

Jonathan Tomer

Mar 3, 2014, 5:08:56 PM
to open-ci...@googlegroups.com
Regardless of the immediate consequences for Google or our users, I think the correct answer is option B here.

Division IDs (as distinct from jurisdiction IDs) are meant to represent geography, not governing bodies; it's important, in my opinion, to keep a single ID for districts that are identical by design. This is where the exceptions file comes in extremely handy; it serves as a reference list of potential aliases for districts, for canonicalization purposes.

If someone releases data (of any kind -- say, air quality measurements) that they want to tag with a geographic ID, it's confusing if they have to decide whether to use the sldl or sldu district when in fact those are identical. By contrast, if someone has a reasonable guess for what some district should be called because of an association with some governing body (e.g. data published about state legislators), the exceptions file makes it easy to canonicalize those to the unique name for a geography.

James Turk

Mar 3, 2014, 5:15:44 PM
to open-civic-data
Division IDs are actually not tied to geography directly though that's a common misconception (see debate from months ago on the difference between divisions/boundaries).

Also, cases like MD and SD show that identical by design and identical in practice are two different things.  I'd like to hear a proposed resolution to the SD issue that doesn't involve changing 28 unaffected ids when a court invalidates two thus rendering the SLDL/SLDU combination void.

Also, tying air quality measurements to either SLDL or SLDU is a bit contrived; you wouldn't want to tie measurements to such a fragile division.

Aaron Strauss

Mar 3, 2014, 5:21:33 PM
to open-civic-data
Maybe I'm being dense, but I don't understand the "doesn't involve changing 28 unaffected ids" -- no one is advocating this. See how the repo handled MD (until this past week) for how SD would be handled.

I'm curious: for the folks in favor of Option A, would you add in the at-large U.S. congressional districts (they are currently not in the repo)? This would help me understand "where the line is" (Paul's email) from various POVs.

James McKinney

Mar 3, 2014, 5:22:02 PM
to open-ci...@googlegroups.com
Divisions *are* geographies, but they are not *boundaries*. That's how we defined it here: https://github.com/opencivicdata/ocd-division-ids

I think the challenge is that some want to be able to say "divisions are geographies entirely separate from governing bodies," but divisions are by definition *political* geographies, and so you will not find a clean break from governing bodies. The SLDU (State Legislative District Upper) and SLDL types have governing bodies written all over them.

I think we need criteria to determine when merging divisions is appropriate or not. Until now, it seems to have been the ad-hoc policy to merge everything, and there is now tension because there is a case where the merging is not desired by some. I think this may be a fruitful direction to work in. I'll try to draft some criteria if anyone agrees.

Adam Yan

Mar 3, 2014, 5:23:30 PM
to open-ci...@googlegroups.com
Why is Google so vested in option B?  

Paul Tagliamonte

Mar 3, 2014, 5:25:33 PM
to open-ci...@googlegroups.com
On Mon, Mar 03, 2014 at 05:21:33PM -0500, Aaron Strauss wrote:
> I'm curious: for the folks in favor of Option A, would you add in
> the at-large U.S. congressional districts (they are currently not in the
> repo)? This would help me understand "where the line is" (Paul's
> email) from various POVs.

I was thinking (as James M pointed out) about a district which was
numbered and also happened to be at-large. I can't remember offhand if
we make a geo-id for the at-large district.

At-large is clearly more of a toss-up, and I think marking an at-large
district as tied to the state isn't unfair, just as I'd tie a
legislature to the state.

A district that is at-large via legislated means (such as sldl:2 being
at-large) should have a new entity created.


Hope that clarifies,

Jonathan Tomer

Mar 3, 2014, 5:25:46 PM
to open-ci...@googlegroups.com
On Mon, Mar 3, 2014 at 5:23 PM, Adam Yan <adam...@gmail.com> wrote:
> Why is Google so vested in option B?

I don't know about vested; we've just discussed it at length internally a long time ago, so we have pretty strong opinions :)

James McKinney

Mar 3, 2014, 5:26:07 PM
to open-ci...@googlegroups.com
Aaron: I stated my opinion on at-large districts in this message: https://groups.google.com/d/msg/open-civic-data/G70l60xqO3A/kZx85gOrdD4J

I'm not necessarily in favor of Option A - I think Option B works fine if people use the exceptions file appropriately. I also think using the type "SLD" instead of "SLDU" or "SLDL" in cases where the two are identical *by design* would be an Option C.

Adam Yan

Mar 3, 2014, 5:26:25 PM
to open-ci...@googlegroups.com
Thanks, Jonathan. But everyone else thinks no, right?

Paul Tagliamonte

Mar 3, 2014, 5:27:28 PM
to open-civic-data
On Mon, Mar 3, 2014 at 5:25 PM, Paul Tagliamonte
<pau...@sunlightfoundation.com> wrote:
> On Mon, Mar 03, 2014 at 05:21:33PM -0500, Aaron Strauss wrote:
>> I'm curious: for the folks in favor of Option A, would you add in
>> the at-large U.S. congressional districts (they are currently not in the
>> repo)? This would help me understand "where the line is" (Paul's
>> email) from various POVs.
>
> I was thinking (as James M pointed out) about a district which was
> numbered and also happened to be at-large. I can't remember offhand if
> we make a geo-id for the at-large district.

(if it's called at-large, sorry, need to read twice :) )

>
> At-large is clearly more of a toss-up, and I think marking an at-large
> district as tied to the state isn't unfair, just as I'd tie a
> legislature to the state.
>
> A district that is at-large via legislated means (such as sldl:2 being
> at-large) should have a new entity created.
>
>
> Hope that clarifies,
> Paul
>
> --
> Paul Tagliamonte
> Software Engineer



--
Paul Tagliamonte
Software Developer | Sunlight Foundation

James McKinney

Mar 3, 2014, 5:28:37 PM
to open-ci...@googlegroups.com
I'm not reading this as Google vs non-Google - at any rate, that's not a fruitful direction in which to resolve these issues.

James Turk

Mar 3, 2014, 5:28:37 PM
to open-civic-data
It'd add more districts unless you wanted to be inconsistent just because of the order in which things were created. In every other state where districts aren't *always* coterminous we have SLDLs and SLDUs. If one or two are coterminous I haven't heard an argument in favor of combining those. If SD suddenly became non-coterminous a year from now (i.e. we were discussing this in 2004 instead of 2005) we'd be left with the decision to either let SD be inconsistent or introduce SLDLs all around. Also, once a court splits districts from coterminous to not, it is likely redistricting will keep it that way (1972 in MD, 2005 in SD).

And no, at-large districts aren't needed. I agree with James that they are a separate issue.

Aaron Strauss

Mar 3, 2014, 5:28:50 PM
to open-civic-data
McKinney's proposed Option C would work for me: "I also think using the type "SLD" instead of "SLDU" or "SLDL" in cases where the two are identical *by design* would be an Option C."

Jonathan Tomer

Mar 3, 2014, 5:30:29 PM
to open-ci...@googlegroups.com
It seems to be the more popular opinion on this thread so far; I hope when we arrive at a consensus it's because people are convinced that one or the other option is decidedly better.

Anthea's right that some of our users may be surprised by a sudden change in division IDs returned for some queries, but we'll conform to settled standards.

James Turk

Mar 3, 2014, 5:30:15 PM
to open-civic-data
I'd be fine with Option C if it addressed the issue of things becoming non-coterminous at arbitrary points.  Suddenly growing 60 SLDLs and SLDUs when they split seems unwise.

Aaron Strauss

Mar 3, 2014, 5:33:58 PM
to open-civic-data
If we use the SLD type whenever the upper and lower districts are coterminous (e.g., including MD and SD) then we'd only have to grow 60 SLDLs/SLDUs when a state's districts underwent a huge change and completely decoupled the upper and lower districts. If a court broke up only one or two, then we'd only have to add a few districts.

James Turk

Mar 3, 2014, 5:36:16 PM
to open-civic-data
So you'd have SLD:1-27, and then SLDL:28A, SLDL:28B, and SLDU:28.

Is that really preferable to treating them as separate political entities from the start with the understanding that eventually those splits will indeed happen and cause inconsistent/unpredictable ids?
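
The mixed ID list that Option C would produce can be sketched as follows. The function name, the lowercase short-form IDs, and the shape of the `split` argument are assumptions made for illustration; the 28-district example mirrors the one above.

```python
def option_c_ids(n_districts, split):
    """Build the ID list for one state under Option C.

    n_districts: number of upper districts, numbered from 1.
    split: dict mapping a district number to the suffixes of the lower
           districts it has been divided into, e.g. {28: ["a", "b"]}.
    Coterminous districts get a single "sld" ID; split ones get an
    "sldu" ID plus one "sldl" ID per subdivision.
    """
    ids = []
    for n in range(1, n_districts + 1):
        if n in split:
            ids.append(f"sldu:{n}")
            ids.extend(f"sldl:{n}{suffix}" for suffix in split[n])
        else:
            ids.append(f"sld:{n}")
    return ids
```

Calling `option_c_ids(28, {28: ["a", "b"]})` gives sld:1 through sld:27 followed by sldu:28, sldl:28a, and sldl:28b, so the type prefix a consumer must look for depends on each district's history.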

Aaron Strauss

Mar 3, 2014, 5:39:37 PM
to open-civic-data
As I mentioned before, since the repo is not a listing of currently active IDs (e.g., it includes old cong districts), developers will have to do the work of tracking changes like this anyway. Thus, I don't think redistricting splits will cause many extra headaches at all. (We may, later, decide to track a current list of IDs for developers, which I think will alleviate your concern ... but that's for the thread on timing issues.)

James McKinney

Mar 3, 2014, 5:50:06 PM
to open-ci...@googlegroups.com
> On Mon, Mar 03, 2014 at 05:21:33PM -0500, Aaron Strauss wrote:
>> I'm curious: for the folks in favor of Option A, would you add in
>> the at-large U.S. congressional districts (they are currently not in the
>> repo)? This would help me understand "where the line is" (Paul's
>> email) from various POVs.
>
> I was thinking (as James M pointed out) about a district which was
> numbered and also happened to be at-large. I can't remember offhand if
> we make a geo-id for the at-large district.
>
> At-large is clearly more of a toss-up, and I think marking an at-large
> district as tied to the state isn't unfair, just as I'd tie a
> legislature to the state.
>
> A district that is at-large via legislated means (such as sldl:2 being
> at-large) should have a new entity created.

Hmmm, if the division is numbered (at the municipal level, most at-large positions are not numbered), then there may be an argument in favor of having an OCD-ID for that division, even if it happens to be coterminous with its parent division.

What this makes me think is that all the OCD IDs in the exceptions should be considered to be full-fledged OCD IDs, just like all other IDs. The exceptions just happen to map (at least temporarily) to other OCD IDs.

Treating the identifiers in the exceptions file as full-fledged OCD IDs may defuse some of the tension. In this understanding, you would be able to use an sldl identifier, and it wouldn't be incorrect, even if the exceptions file would promote it to an sldu identifier. Either identifier is correct.

I think this only requires implementations/systems to care about the exceptions file, which I don't think is a big burden. And this way, we can have and use both SLDU and SLDL identifiers in all cases, although some will only be listed in the exceptions file and not the main identifiers file. I may be missing something, but this seems workable.

James

James Turk

Mar 3, 2014, 5:51:04 PM
to open-civic-data
I like the idea of treating exceptions that way- perhaps a better name would be aliases?

Aaron Strauss

Mar 3, 2014, 5:55:05 PM
to open-civic-data
This is an interesting idea, though be aware that I've used the exceptions file for "common mistakes" as well, such as this entry:

ocd-division/country:us/state:co/place:aurora/ward:i -> ocd-division/country:us/state:co/place:aurora/ward:1

But, in general, I totally agree with the idea that if someone is using an ocdid that is present in the exceptions file, APIs/code should not error but rather gracefully replace the ocdid with the canonical entry. Thanks James.
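A minimal sketch of that graceful replacement, in Python. The names `EXCEPTIONS` and `canonicalize` are hypothetical and not part of the OCD repo; the only mapping shown is the Aurora entry quoted above, and the real exceptions file would need to be loaded into the dict.

```python
# Hypothetical in-memory copy of the exceptions file:
# non-canonical ID -> canonical ID.
EXCEPTIONS = {
    "ocd-division/country:us/state:co/place:aurora/ward:i":
        "ocd-division/country:us/state:co/place:aurora/ward:1",
}


def canonicalize(ocd_id: str) -> str:
    """Return the canonical form of ocd_id, or ocd_id unchanged if it has no entry."""
    return EXCEPTIONS.get(ocd_id, ocd_id)
```

An ID with no entry passes through untouched, so callers can run every incoming ID through `canonicalize` without special-casing.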

James McKinney

Mar 3, 2014, 5:55:49 PM
to open-ci...@googlegroups.com
I like aliases. This way people can use the identifiers that make sense to them, and they don't need to care if the identifier is currently aliased or not.

If we go with that, I would argue that neither identifier in the alias relationship is "primary", as the primacy will often be context-dependent.
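One way to model an alias relationship with no "primary" side is as a group of interchangeable IDs rather than a one-way mapping. The sketch below is only an illustration: `ALIAS_GROUPS`, `GROUP_OF`, and `are_aliased` are hypothetical names, and the `state:xx` IDs are made-up placeholders rather than real entries.

```python
# Each frozenset is one group of interchangeable IDs; no member is "primary".
# The state:xx IDs are placeholders, not real OCD IDs.
ALIAS_GROUPS = [
    frozenset({
        "ocd-division/country:us/state:xx/sldl:1",
        "ocd-division/country:us/state:xx/sldu:1",
    }),
]

# Index every ID to its group so lookups do not depend on direction.
GROUP_OF = {ocd_id: group for group in ALIAS_GROUPS for ocd_id in group}


def are_aliased(a: str, b: str) -> bool:
    """True if a and b are the same ID or are currently listed as aliases."""
    if a == b:
        return True
    group = GROUP_OF.get(a)
    return group is not None and b in group
```

Because the check is symmetric, code that thinks in terms of sldl and code that thinks in terms of sldu can both use their own identifier.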

Aaron Strauss

Mar 3, 2014, 5:55:55 PM
to open-civic-data
And I'm happy with the name "aliases" -- thanks other James.

Jonathan Tomer

Mar 3, 2014, 5:57:01 PM
to open-ci...@googlegroups.com
On Mon, Mar 3, 2014 at 5:50 PM, James McKinney <ja...@opennorth.ca> wrote:
Hmmm, if the division is numbered (at the municipal level, most at-large positions are not numbered), then there may be an argument in favor of having an OCD-ID for that division, even if it happens to be coterminous with its parent division.

What this makes me think is that all the OCD IDs in the exceptions should be considered to be full-fledged OCD IDs, just like all other IDs. The exceptions just happen to map (at least temporarily) to other OCD IDs.

When you say "considered full-fledged OCD IDs", do you mean that people should think of them that way, or that we should also add them to the master OCD ID list? If you mean the former, I agree.
 
Treating the identifiers in the exceptions file as full-fledged OCD IDs maybe defuses some of the tension. In this understanding, you would be able to use an sldl identifier, and it wouldn't be incorrect, even if the exceptions file would promote it to an sldu identifier. Either identifier is correct.

This is pretty close to how I think of things; there's a canonical name for a given geography, and other well-known names, and the exceptions file describes the equivalence.
 
I think this only requires implementations/systems to care about the exceptions file, which I don't think is a big burden.

Agreed -- and this is almost essential in any system that will accept OCD IDs in untrusted input anyway, since the criterion for an alias going into the exceptions file is that people are expected to commonly use it.

Aaron Strauss

Mar 3, 2014, 6:03:47 PM
to open-civic-data
So, there's no need for the SLD type then, right? The exception file's name gets changed to aliases, the sldl->sldu entries are re-inserted, and everything in the alias file is now a valid ocdid. Objections to this apparent consensus?

James Turk

Mar 3, 2014, 6:05:39 PM
to open-civic-data
That makes sense. If neither is primary, do we then keep both in the main mapping file, so that people can consult the aliases to know that they are functionally identical?

One hiccup (and I'd hate to derail this progress with it): does this mean that the aliases file is only valid for a given point in time, whereas the rest of the IDs can be historical?

Aaron Strauss

Mar 3, 2014, 6:09:03 PM
to open-civic-data
Correct...though maybe the structure of the main file changes to show aliases...this might help with database joining...but I'm not passionate about the answer.

And can we wait until the thread about historical/current to figure out the answer to your second question?

James Turk

Mar 3, 2014, 6:10:32 PM
to open-civic-data
Sure- sounds reasonable (& if we're ok with changing the structure of the main file I think that informs my suggested answer to the temporal stuff)

James McKinney

Mar 3, 2014, 6:14:57 PM
to open-ci...@googlegroups.com
For clarity, what would be the new structure of the main file?

What this makes me think is that all the OCD IDs in the exceptions should be considered to be full-fledged OCD IDs, just like all other IDs. The exceptions just happen to map (at least temporarily) to other OCD IDs.

When you say "considered full-fledged OCD IDs", do you mean that people should think of them that way, or that we should also add them to the master OCD ID list? If you mean the former, I agree.

What would be the consequences of adding the aliases to the master list? If they are both correct, I would expect to see them both in the master list, but maybe I'm not considering something. As Aaron pointed out, some of the equivalencies in the current exceptions file are in fact corrections, where an incorrect identifier is mapped to a correct one. I think we should have separate files for aliases and corrections. Incorrect identifiers should not appear in the master list.
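If aliases and corrections were split into separate files along these lines, a consistency check could look roughly like the following Python sketch. Everything here is an assumption for illustration: the function name `check_files`, the in-memory shapes (a set for the master list, dicts for the other two), and the rule that both sides of an alias belong in the master list.

```python
def check_files(master: set, aliases: dict, corrections: dict) -> list:
    """Return a list of problems; an empty list means the files are consistent."""
    problems = []
    # Both sides of an alias are valid IDs, so both should be in the master list.
    for left, right in aliases.items():
        for ocd_id in (left, right):
            if ocd_id not in master:
                problems.append(f"alias ID missing from master list: {ocd_id}")
    # A correction maps an incorrect ID to a correct one; only the target is valid.
    for wrong, right in corrections.items():
        if wrong in master:
            problems.append(f"incorrect ID present in master list: {wrong}")
        if right not in master:
            problems.append(f"correction target missing from master list: {right}")
    return problems
```

The point of the split is visible in the two loops: alias entries are validated on both sides, while the left side of a correction must never appear as a valid ID.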

So, there's no need for the SLD type then, right? The exception file's name gets changed to aliases, the sldl->sldu entries are re-inserted, and everything in the alias file is now a valid ocdid. Objections to this apparent consensus?

Yes, I would retract my Option C (adding SLD). I think we should add a new top-level directory "aliases", since this is not a US-specific issue and there should be a predictable way to find these aliases. However, if we're changing the format of the main file, this point may be moot.

James

Aaron Strauss

Mar 3, 2014, 6:17:57 PM
to open-civic-data
Inline:

On Mon, Mar 3, 2014 at 6:14 PM, James McKinney <ja...@opennorth.ca> wrote:
For clarity, what would be the new structure of the main file?

We need to work this out, but I think we have solid principles to guide the technical decision.
 

What this makes me think is that all the OCD IDs in the exceptions should be considered to be full-fledged OCD IDs, just like all other IDs. The exceptions just happen to map (at least temporarily) to other OCD IDs.

When you say "considered full-fledged OCD IDs", do you mean that people should think of them that way, or that we should also add them to the master OCD ID list? If you mean the former, I agree.

What would be the consequences of adding the aliases to the master list? If they are both correct, I would expect to see them both in the master list, but maybe I'm not considering something. As Aaron pointed out, some of the equivalencies in the current exceptions file are in fact corrections, where an incorrect identifier is mapped to a correct one. I think we should have separate files for aliases and corrections. Incorrect identifiers should not appear in the master list.

SGTM
 

So, there's no need for the SLD type then, right? The exception file's name gets changed to aliases, the sldl->sldu entries are re-inserted, and everything in the alias file is now a valid ocdid. Objections to this apparent consensus?

Yes, I would retract my Option C (adding SLD). I think we should add a new top-level directory "aliases", since this is not a US-specific issue and there should be a predictable way to find these aliases. However, if we're changing the format of the main file, this point may be moot.

I would rather incorporate the structure of the aliases into the main file and, as you say, obviate the need for another top level directory.