Ticket #8425 and USStateField (again)


James Bennett

Dec 22, 2009, 12:29:07 PM
to django-d...@googlegroups.com
I've previously brought up some issues with the removal of certain
options from the choices on localflavor's USStateField[1] as a result
of ticket #8425[2] and, with feature freeze for 1.2 approaching and
perhaps more time soon to be available for such things, I'd like to
call attention to it again since it just bit me pretty hard.

Real-world use case: I'm importing data from a feed provided by the US
Centers for Disease Control. The data's classified according to state
and region (using the US Department of Health and Human Services'
standard regional breakdown[3]), and so for parts of it I'm using a
USStateField.

But HHS and CDC -- like the US Post Office and every other US federal
agency -- consider Palau, the Marshall Islands and the Federated States
of Micronesia to be valid "US" areas for data-gathering and reporting
purposes. And this causes a data importer based on USStateField to
require ugly workarounds, since those are not valid choices for the
field.

Any chance of getting the choices fixed so we can actually make use of
USStateField with this sort of data?


[1] http://groups.google.com/group/django-developers/browse_thread/thread/6b896421e63b6f9e/
[2] http://code.djangoproject.com/ticket/8425
[3] http://www.hhs.gov/about/regionmap.html


--
"Bureaucrat Conrad, you are technically correct -- the best kind of correct."

Russell Keith-Magee

Dec 23, 2009, 10:44:48 PM
to django-d...@googlegroups.com
On Wed, Dec 23, 2009 at 1:29 AM, James Bennett <ubern...@gmail.com> wrote:
> I've previously brought up some issues with the removal of certain
> options from the choices on localflavor's USStateField[1] as a result
> of ticket #8425[2] and, with feature freeze for 1.2 approaching and
> perhaps more time soon to be available for such things, I'd like to
> call attention to it again since it just bit me pretty hard.
>
> Real-world use case: I'm importing data from a feed provided by the US
> Centers for Disease Control. The data's classified according to state
> and region (using the US Department of Health and Human Services'
> standard regional breakdown[3]), and so for parts of it I'm using a
> USStateField.
>
> But HHS and CDC -- like the US Post Office and every other US federal
> agency, considers Palau, the Marshall Islands and the Federated States
> of Micronesia to be valid "US" areas for data-gathering and reporting
> purposes. And this causes a data importer based on USStateField to
> required ugly workarounds, since those are not valid choices for the
> field.
>
> Any chance of getting the choices fixed so we can actually make use of
> USStateField with this sort of data?

I'm +1 to fixing this general mess. The thread you pointed to contains
two workable solutions that I can see:

1) Ship a bunch of state subsets:

LOWER_48 = (...)
NON_CONTIGUOUS = (hi, ak)
PROTECTORATES = (...)
MILITARY_DROPS = (...)

STATE_CHOICES = LOWER_48 + NON_CONTIGUOUS
USPS_SERVICE = LOWER_48 + NON_CONTIGUOUS + PROTECTORATES + MILITARY_DROPS

This is completely backwards compatible as long as we keep
"STATE_CHOICES" to the same subset that exists today. It's a little
more fiddly to use, but it's completely consistent with every other
choice-based field in Django.

2) Given that there is a limited collection of subsets, expose
these specific choices as booleans on the USStateField definition:

USStateField(lower_48=True, non_contiguous=True, military_drops=False, ...)

This is also backwards compatible, as long as the default values in
the field definition match the currently available choices. It's
explicit and easy to use, but it does leave the question of what to do
if you specify choices=[..] AND you tweak the state parameters (at a
guess, manually specified choices trump flags).
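
A minimal sketch of how option 2 might resolve its flags, assuming (as
guessed above) that explicitly passed choices win over the flags. The
helper build_state_choices and the abbreviated sample tuples are
hypothetical and are not part of localflavor:

```python
# Hypothetical, abbreviated subsets; the real lists would be much longer.
LOWER_48 = (("AL", "Alabama"), ("WY", "Wyoming"))
NON_CONTIGUOUS = (("AK", "Alaska"), ("HI", "Hawaii"))
MILITARY_DROPS = (("AA", "Armed Forces Americas"),
                  ("AP", "Armed Forces Pacific"))


def build_state_choices(choices=None, lower_48=True, non_contiguous=True,
                        military_drops=False):
    """Resolve the effective choices for a hypothetical flag-driven field."""
    # Assumption: manually specified choices trump every flag.
    if choices is not None:
        return tuple(choices)
    result = ()
    if lower_48:
        result += LOWER_48
    if non_contiguous:
        result += NON_CONTIGUOUS
    if military_drops:
        result += MILITARY_DROPS
    return result
```

With the defaults shown, the result matches LOWER_48 + NON_CONTIGUOUS,
which is what keeps existing behaviour unchanged in this sketch.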

I could live with either approach existing in the codebase. I won't
express a preference, though - I'll leave the decision of which
approach is preferable to those that will actually have to use it.

Yours,
Russ Magee %-)

Richard Laager

Dec 24, 2009, 2:30:15 AM
to django-d...@googlegroups.com
On Thu, 2009-12-24 at 11:44 +0800, Russell Keith-Magee wrote:
> This is completely backwards compatible as long as we keep
> "STATE_CHOICES" to the same subset that exists today.

Yikes, that's really restrictive. You want that list to remain static
until Django 2.0?

I ask because the Canadian province list includes *incorrect*
abbreviations, which we discovered when trying to do a simple
state/province choices form field.

Richard


James Bennett

Dec 24, 2009, 3:49:04 AM
to django-d...@googlegroups.com
On Wed, Dec 23, 2009 at 9:44 PM, Russell Keith-Magee
<freakb...@gmail.com> wrote:
> I could live with either approach existing in the codebase. I won't
> express a preference, though - I'll leave the decision of which
> approach is preferable to those that will actually have to use it.

Honestly, given both the controversy which prompted the original
change and the varying real-world needs, I think the two-field
solution is more appropriate:

One field -- USStateField -- should only do things which are actually
US states, plus the District of Columbia. Throw in a flag which
defaults to False and which restricts it to the 48 contiguous states +
DC. This avoids the political upheaval because it doesn't include
anything exotic or "not really US" like territories, protectorates, or
COFA nations. It also provides for a couple of extremely common use
cases: companies which will do business with you only if you're in an
actual US state, and companies which will do business with you only if
you're in the "lower 48".

The other field -- USPostalCodeField -- should accept any abbreviation
the US Post Office accepts. This allows the broader use case of
shipping to anywhere the Post Office can handle, and also lines up
with the abbreviations used for data reported by the US federal
government.

The actual sets of choices can be built up as you've described, or in
some other fashion (I don't particularly care, but would lean toward
breaking them into logical sets so people can do more fine-grained
stuff if they want).
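
A rough sketch of the two-field idea, written as plain Python stand-ins
rather than real Django fields. Every name here other than USStateField
and USPostalCodeField (the base class, the contiguous_only flag, the
choice tuples) is an assumption made for illustration, and the sample
data is abbreviated:

```python
# Hypothetical, abbreviated choice sets; real lists would be complete.
CONTIGUOUS_STATES = (("AL", "Alabama"), ("DC", "District of Columbia"),
                     ("WY", "Wyoming"))
NON_CONTIGUOUS_STATES = (("AK", "Alaska"), ("HI", "Hawaii"))
OTHER_POSTAL_AREAS = (("PW", "Palau"), ("MH", "Marshall Islands"),
                      ("FM", "Federated States of Micronesia"))


class SketchChoiceField:
    """Plain-Python stand-in for a choice-validating field (not Django's)."""

    def __init__(self, choices):
        self.choices = tuple(choices)

    def is_valid(self, value):
        return value in {code for code, _name in self.choices}


class USStateField(SketchChoiceField):
    """Actual states plus DC; the flag narrows it to the lower 48 + DC."""

    def __init__(self, contiguous_only=False):
        choices = CONTIGUOUS_STATES
        if not contiguous_only:
            choices = choices + NON_CONTIGUOUS_STATES
        super().__init__(choices)


class USPostalCodeField(SketchChoiceField):
    """Any abbreviation in the (sample) postal list."""

    def __init__(self):
        super().__init__(CONTIGUOUS_STATES + NON_CONTIGUOUS_STATES
                         + OTHER_POSTAL_AREAS)
```

In this sketch USStateField().is_valid("PW") is false while
USPostalCodeField().is_valid("PW") is true, which is the split between
"actual states" and "anything the Post Office accepts".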

Russell Keith-Magee

Dec 24, 2009, 6:08:47 AM
to django-d...@googlegroups.com
On Thu, Dec 24, 2009 at 3:30 PM, Richard Laager <rla...@wiktel.com> wrote:
> On Thu, 2009-12-24 at 11:44 +0800, Russell Keith-Magee wrote:
>> This is completely backwards compatible as long as we keep
>> "STATE_CHOICES" to the same subset that exists today.
>
> Yikes, that's really restrictive. You want that list to remain static
> until Django 2.0?

No - I want the list of choices that currently contains the lower 48 +
HI, AK and DC to be called STATE_CHOICES - just like it is now - for
the foreseeable future. For other uses, we can provide other helpful
groupings, like POSTAL_REGIONS (which contains anything USPS
recognizes); then, the user can choose
USStateField(choices=POSTAL_REGIONS) and get an extended set of
options.

> I ask because the Canadian province list includes *incorrect*
> abbreviations, which we discovered when trying to do a simple
> state/province choices form field.

If this is the case, then it is a bug, which should be logged and fixed.

I'm not saying we shouldn't fix bugs, or that state lists must be
fixed for all time. What I'm saying is that any Django-based project that
is currently deployed and uses USStateField is currently
rejecting Puerto Rico and American Armed Forces Middle East. These
projects should continue to operate as-is without the need for any code
changes on the part of the developer. What we *can* do is open up
options that make it easy for a developer to opt into allowing other
regions as acceptable to USStateField.
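
To illustrate the opt-in behaviour, here is a small sketch. It assumes
abbreviated sample tuples and a hypothetical accepts() helper standing in
for the field's validation; only STATE_CHOICES and POSTAL_REGIONS are
names used in this thread:

```python
# Hypothetical, abbreviated subsets for illustration only.
LOWER_48 = (("AL", "Alabama"), ("WY", "Wyoming"))
DC = (("DC", "District of Columbia"),)
NON_CONTIGUOUS = (("AK", "Alaska"), ("HI", "Hawaii"))
TERRITORIES = (("GU", "Guam"), ("PR", "Puerto Rico"))
MILITARY_DROPS = (("AE", "Armed Forces Europe"),)

# The default keeps the same membership it has today.
STATE_CHOICES = LOWER_48 + DC + NON_CONTIGUOUS
# An extended grouping that a developer has to ask for explicitly.
POSTAL_REGIONS = STATE_CHOICES + TERRITORIES + MILITARY_DROPS


def accepts(code, choices=STATE_CHOICES):
    """Stand-in for field validation: is the code one of the choices?"""
    return any(code == abbreviation for abbreviation, _name in choices)
```

Existing callers that never pass choices keep rejecting "PR", while
accepts("PR", POSTAL_REGIONS) succeeds for anyone who opts in.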

Yours,
Russ Magee %-)

Russell Keith-Magee

Dec 24, 2009, 6:22:40 AM
to django-d...@googlegroups.com
On Thu, Dec 24, 2009 at 4:49 PM, James Bennett <ubern...@gmail.com> wrote:
> On Wed, Dec 23, 2009 at 9:44 PM, Russell Keith-Magee
> <freakb...@gmail.com> wrote:
>> I could live with either approach existing in the codebase. I won't
>> express a preference, though - I'll leave the decision of which
>> approach is preferable to those that will actually have to use it.
>
> Honestly, given both the controversy which prompted the original
> change and the varying real-world needs, I think the two-field
> solution is more appropriate:

My concern with having two fields is that it introduces a false
dichotomy. There aren't just 2 options here - potentially any
permutation of the following list is possible:

Lower 48
DC
AK + HI
US Protectorates
US Military Drops

So as soon as you introduce a USPostalField that includes all of
these, I guarantee that someone will ask for a field that has all the
choices *except* the US Military drops, and then someone will find
some reason why Guam should be on the list, but the American
Marianas shouldn't, until eventually you have a dozen US*Field fields.

The reason this discussion is happening in the first place is that we
couldn't come up with a single set of options that would keep everyone
happy. Increasing the number of options to 2 decreases the number of
disparate groups by 1 - but that doesn't mean we're left with 2
disparate groups.

Adopting USStateField(choices=POSTAL_REGIONS) or
USStateField(lower_48=True, ...) means that end users can mix and
match whatever combinations they need - we provide a single field, and
let them mix whatever combination of options they want.

Yours,
Russ Magee %-)

James Bennett

Dec 24, 2009, 2:55:45 PM
to django-d...@googlegroups.com
On Thu, Dec 24, 2009 at 5:22 AM, Russell Keith-Magee
<freakb...@gmail.com> wrote:
> My concern with having two fields is that it introduces a false
> dichotomy. There aren't just 2 options here - potentially any
> permutation of the following list is possible:

While this is true, there are three common cases, which can be handled
adeptly by two fields:

1. "Any US state"

2. "Any of the lower 48 states"

3. "Anything the US Post Office recognizes"

> So as soon as you introduce a USPostalField that includes all of
> these, I guarantee that someone will ask for a field that has all the
> choices *except* the US Military drops, and then  someone will find
> some reason to why Guam should be on the list, but the American
> Marianas shouldn't, until eventually you have a dozen US*Field fields.

No, this is why you make the choices available in a fine-grained
manner; the built-in fields cover the common use cases above, and then
the ability to pick and choose the set of choices you want for your
own custom stuff covers everything else.

> The reason this discussion is happening in the first place is that we
> couldn't come up with a single set of options that would keep everyone
> happy. Increasing the number of options to 2 decreases the number of
> disparate groups by 1 - but that doesn't mean we're left with 2
> disparate groups.

No, the reason for this discussion is that "USStateField" included a
bunch of things which weren't states, some of which weren't even part
of the US, and people had problems with them being subsumed under that
name. So let's not do that.

> Adopting USStateField(choices=POSTAL_REGIONS) or
> USStateField(lower_48=True, ...) means that end users can mix and
> match whatever combinations they need - we provide a single field, and
> let them mix whatever combination of options they want.

Except your argument turns on its head here -- if your logic above
were correct, you'd be against this as well, since obviously we'd need
to start adding ever more fine-grained arguments to the field to tune
the choices (palau=False, micronesia=True, armed_forces=True,
virgin_islands=False, etc.). But you've recognized that covering the
common case with fields and covering the rest with flexible choice
sets works, so you really ought to be agreeing with me here ;)

Russell Keith-Magee

Dec 24, 2009, 6:38:17 PM
to django-d...@googlegroups.com
On Fri, Dec 25, 2009 at 3:55 AM, James Bennett <ubern...@gmail.com> wrote:
> On Thu, Dec 24, 2009 at 5:22 AM, Russell Keith-Magee
> <freakb...@gmail.com> wrote:
>> My concern with having two fields is that it introduces a false
>> dichotomy. There aren't just 2 options here - potentially any
>> permutation of the following list is possible:
>
> While this is true, there are three common cases, which can be handled
> adeptly by two fields:
>
> 1. "Any US state"
>
> 2. "Any of the lower 48 states"
>
> 3. "Anything the US Post Office recognizes"
>
>> So as soon as you introduce a USPostalField that includes all of
>> these, I guarantee that someone will ask for a field that has all the
>> choices *except* the US Military drops, and then  someone will find
>> some reason to why Guam should be on the list, but the American
>> Marianas shouldn't, until eventually you have a dozen US*Field fields.
>
> No, this is why you make the choices available in a fine-grained
> manner; the built-in fields cover the common use cases above, and then
> the ability to pick and choose the set of choices you want for your
> own custom stuff covers everything else.

Ok - I wasn't clear that your proposal still included making all the
various choices (and choice subsets) available in a manner consumable
by end users.

>> The reason this discussion is happening in the first place is that we
>> couldn't come up with a single set of options that would keep everyone
>> happy. Increasing the number of options to 2 decreases the number of
>> disparate groups by 1 - but that doesn't mean we're left with 2
>> disparate groups.
>
> No, the reason for this discussion is that "USStateField" included a
> bunch of things which weren't states, some of which weren't even part
> of the US, and people had problems with them being subsumed under that
> name. So let's not do that.

The issue was that people who wanted "just states" (for any one of a
couple of definitions of "just states") couldn't get that out of the
USStateField. Once we have the flexibility issue sorted, I'm not
particularly concerned by people that want to make some sort of
Sapir-Whorf argument about class naming.

>> Adopting USStateField(choices=POSTAL_REGIONS) or
>> USStateField(lower_48=True, ...) means that end users can mix and
>> match whatever combinations they need - we provide a single field, and
>> let them mix whatever combination of options they want.
>
> Except your argument turns on its head here -- if your logic above
> were correct, you'd be against this as well, since obviously we'd need
> to start adding ever more fine-grained arguments to the field to tune
> the choices (palau=False, micronesia=True, armed_forces=True,
> virgin_islands=False, etc.). But you've recognized that covering the
> common case with fields and covering the rest with flexible choice
> sets works, so you really ought to be agreeing with me here ;)

How could I have been so mistaken! :-)

I can accept that the palau=False approach is limited in the way you
describe. It's also limited in that you can't control the exact
ordering of choices (e.g., I want protectorates after military drops).
I withdraw that suggestion.
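
As a small illustration of the ordering point: when the choices are
built by concatenating subsets, the caller decides the order, which a
set of boolean flags cannot express. The tuples below are abbreviated,
hypothetical samples:

```python
# Hypothetical, abbreviated subsets for illustration only.
STATES = (("AK", "Alaska"), ("WY", "Wyoming"))
PROTECTORATES = (("GU", "Guam"), ("PR", "Puerto Rico"))
MILITARY_DROPS = (("AA", "Armed Forces Americas"),
                  ("AP", "Armed Forces Pacific"))

# Same members, different display order; the caller picks.
military_first = STATES + MILITARY_DROPS + PROTECTORATES
protectorates_first = STATES + PROTECTORATES + MILITARY_DROPS

military_first_codes = [code for code, _name in military_first]
protectorates_first_codes = [code for code, _name in protectorates_first]
```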

With that option removed, I don't think we're actually disagreeing
that much. If you're proposing that USStateField have a choices
argument, and we provide a bunch of constituent sets that can be used
to build your own "US-statey things including Guam but not Palau"
field, then the only real difference between your proposal and mine is
that you're suggesting including USPostalField() as a shortcut for
USStateField(choices=POSTAL_REGIONS), and adding a flag to
USStateField(lower_48=True) as a shortcut for
USStateField(choices=LOWER_48).

Personally, I'm a bit meh on having the shortcuts (especially the
lower_48 flag shortcut), but if you think there is value in adding
them, I'm not going to get bent out of shape over the issue. After
all, you're the one who will have to use it. AUStateField() works fine
:-).

Russ %-)
