Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Re: Case-insensitive string compare?

495 views
Skip to first unread message

Fredrik Lundh

unread,
Sep 4, 2008, 6:21:59 PM9/4/08
to pytho...@python.org
Robert Dailey wrote:

> I currently have a dictionary object that I'm doing the following with:
>
> if lib not in stage_map:
> # ... do stuff ...
>
> However, this will perform a case-sensitive comparison between lib and
> each key in stage_map. Is there a way to make this do a case-insensitive
> comparison instead?

dictionary lookups use the exact value. to make a case-insensitive
lookup, use key.lower() instead of key when creating the dictionary, and
then do

if lib.lower() not in state_map:
...

</F>

Maric Michaud

unread,
Sep 4, 2008, 6:38:03 PM9/4/08
to pytho...@python.org
Le Friday 05 September 2008 00:14:37 Robert Dailey, vous avez écrit :
> Hi,

>
> I currently have a dictionary object that I'm doing the following with:
>
> if lib not in stage_map:
> # ... do stuff ...
>
> However, this will perform a case-sensitive comparison between lib and each
> key in stage_map. Is there a way to make this do a case-insensitive
> comparison instead? Yes, I realize I could do this:
>
> if lib not in [key.lower() for key in stage_map]
>

This is the normal python way to compare string without case, and the idiom is
not surprising. You''ll often see for loops written like this :

for i in (e for e in iterable if predicate(e)) :
...

> However, I don't want to do this because it'll quickly end up getting
> messy. I want this solution to be an absolute last resort. I'm not a pro
> with python, so I'm hoping there are cleaner alternatives out there.

If you fell hard to read you can break it on two lines :

lowered_keys = (key.lower() for key in stage_map)
if lib.lower() not in lowered_keys :
....

BTW, it's string manipulation, it's normal it takes more than one brain cycle
to achieve. Try the regexp solution and make your choice, more than often, I
keep the straightforward, "read as pseudo-code", python expression.

--
_____________

Maric Michaud

Chris Rebert

unread,
Sep 4, 2008, 6:47:00 PM9/4/08
to Robert Dailey, pytho...@python.org, Fredrik Lundh
On Thu, Sep 4, 2008 at 3:37 PM, Robert Dailey <rcda...@gmail.com> wrote:
> On Thu, Sep 4, 2008 at 5:21 PM, Fredrik Lundh <fre...@pythonware.com>
> wrote:

>>
>> Robert Dailey wrote:
>>
>>> I currently have a dictionary object that I'm doing the following with:
>>>
>>> if lib not in stage_map:
>>> # ... do stuff ...
>>>
>>> However, this will perform a case-sensitive comparison between lib and
>>> each key in stage_map. Is there a way to make this do a case-insensitive
>>> comparison instead?
>>
>> dictionary lookups use the exact value. to make a case-insensitive
>> lookup, use key.lower() instead of key when creating the dictionary, and
>> then do
>>
>> if lib.lower() not in state_map:
>> ...
>
> So you're saying to ensure that stage_map's keys are initially lower-case to
> begin with? Well, I can't do this either since the case of the keys is
> actually valuable later on. It's only for the purposes of this specific
> comparison operation that the case should be ignored.

Then store the string in its original case in the value part of the
key-value pair:

stage_map[key.lower()] = (key,whatever)

- Chris

>
> --
> http://mail.python.org/mailman/listinfo/python-list
>

--
Follow the path of the Iguana...
http://rebertia.com

Maric Michaud

unread,
Sep 4, 2008, 6:59:16 PM9/4/08
to pytho...@python.org
Le Friday 05 September 2008 00:47:00 Chris Rebert, vous avez écrit :
> On Thu, Sep 4, 2008 at 3:37 PM, Robert Dailey <rcda...@gmail.com> wrote:
> > On Thu, Sep 4, 2008 at 5:21 PM, Fredrik Lundh <fre...@pythonware.com>
> >
> > wrote:
> >> Robert Dailey wrote:
> >>> I currently have a dictionary object that I'm doing the following with:
> >>>
> >>> if lib not in stage_map:
> >>> # ... do stuff ...
> >>>
> >>> However, this will perform a case-sensitive comparison between lib and
> >>> each key in stage_map. Is there a way to make this do a
> >>> case-insensitive comparison instead?
> >>
> >> dictionary lookups use the exact value. to make a case-insensitive
> >> lookup, use key.lower() instead of key when creating the dictionary, and
> >> then do
> >>
> >> if lib.lower() not in state_map:
> >> ...
> >
> > So you're saying to ensure that stage_map's keys are initially lower-case
> > to begin with? Well, I can't do this either since the case of the keys is
> > actually valuable later on. It's only for the purposes of this specific
> > comparison operation that the case should be ignored.
>
> Then store the string in its original case in the value part of the
> key-value pair:
>
> stage_map[key.lower()] = (key,whatever)
>

"premature optimization is the root of all evil"

I don't recall the OP wanted a (a bit) faster solution to his problem in
counterpart of memory loss and syntax complication.

If the OP's proposal seems already messy, how about ths one :
if lib.lower() not in ( e[0] for e in stage_map.items() ) :
...

--
_____________

Maric Michaud

Fredrik Lundh

unread,
Sep 5, 2008, 2:24:29 AM9/5/08
to pytho...@python.org
Maric Michaud wrote:

> "premature optimization is the root of all evil"

So is use by that statement by people who don't have the slightest idea
about what it actually means.

The full version is

"We should forget about small efficiencies, say about 97% of the time:
premature optimization is the root of all evil."

Note the use of "small efficiencies". That is, in Python, things like
local binding, inlining, slots, lazy generation of short sequences, etc.
That is, things that takes time to write and reduces maintainability.

It's not about having an excuse for writing crappy code with large
inefficienies.

And it's definitely not about programmers intentionally picking a dumb
solution so they can optimize it later.

> If the OP's proposal seems already messy, how about ths one :
> if lib.lower() not in ( e[0] for e in stage_map.items() ) :
> ...

Given that your solution is just a remarkably inefficient way to write
"lib.lower() not in stage_map", and thus doesn't solve the OP:s problem,
I suspect you're taking the "evil" part of Hoare's dictum a few bits too
literally.

</F>

Fredrik Lundh

unread,
Sep 5, 2008, 2:30:44 AM9/5/08
to pytho...@python.org
Maric Michaud wrote:

> You''ll often see for loops written like this :
>
> for i in (e for e in iterable if predicate(e)) :
> ...

luckily, I don't. most people, when faced with that problem, writes it
in the obvious way:

for i in iterable:
if predicate(i):
...

to avoid an extra iterator and save typing.

</F>

Maric Michaud

unread,
Sep 5, 2008, 7:16:40 AM9/5/08
to pytho...@python.org
Le Friday 05 September 2008 08:30:44 Fredrik Lundh, vous avez écrit :
> Maric Michaud wrote:
> > You''ll often see for loops written like this :
> >
> > for i in (e for e in iterable if predicate(e)) :
> > ...
>
> luckily, I don't. most people, when faced with that problem, writes it
> in the obvious way:
>
> for i in iterable:
> if predicate(i):
> ...

So do I, most often, but this construct is common, I think because it makes
clear what the for loop is iterating over, also it comes naturally once used
to the more elaborated

for i in (expr(e) for e in iterable if predicate(e)) :

This one is truly a gain for readability IMO, compared to :

for i in iterable:
i = expr(i)
if predicate(i):

In the latter, the reader need more effort to figure out what finally "i" is
in the loop.

>
> to avoid an extra iterator and save typing.
>

and at the cost of an extra indentation level.

> </F>

Maric Michaud

unread,
Sep 5, 2008, 7:31:40 AM9/5/08
to pytho...@python.org
Le Friday 05 September 2008 08:24:29 Fredrik Lundh, vous avez écrit :
> Maric Michaud wrote:
> > "premature optimization is the root of all evil"
>
> So is use by that statement by people who don't have the slightest idea
> about what it actually means.
>
> The full version is
>
> "We should forget about small efficiencies, say about 97% of the time:
> premature optimization is the root of all evil."
>
> Note the use of "small efficiencies". That is, in Python, things like
> local binding, inlining, slots, lazy generation of short sequences, etc.
> That is, things that takes time to write and reduces maintainability.
>
> It's not about having an excuse for writing crappy code with large
> inefficienies.
>
> And it's definitely not about programmers intentionally picking a dumb
> solution so they can optimize it later.
>

Yes, I'm aware of that, thanks, but Chris Rebert was proposing to alter the
data structure in order to achieve this simple comparison, I don't think it
fails into your concern.

> > If the OP's proposal seems already messy, how about ths one :
> > if lib.lower() not in ( e[0] for e in stage_map.items() ) :
> > ...
>

Sorry, should be read :

if lib.lower() not in ( e[0] for e in stage_map.values() ) :

and it was the expression for comparison Chris's solution lead to, not a
proposal.

> Given that your solution is just a remarkably inefficient way to write
> "lib.lower() not in stage_map", and thus doesn't solve the OP:s problem,
> I suspect you're taking the "evil" part of Hoare's dictum a few bits too
> literally.
>

I suspect you are coming to conclusions a bit quickly, without taking the pain
of understanding the whole discussion.

J. Clifford Dyer

unread,
Sep 5, 2008, 8:33:22 AM9/5/08
to pytho...@python.org
On Thu, 2008-09-04 at 18:48 -0500, Robert Dailey wrote:
> Thanks everyone for your help. I'm not opposed to using [key.lower()
> for key in stage_map] at all, I was just curious to see if there were
> any cleaner alternatives. It looks like that is what I'll be using.
> I'm not familiar with how python works internally, but coming from C++
> it seems like "remaking" the map would be slow. However, speed is not
> my main concern in my particular situation, I'm just curious to learn
> more about python.

You should be opposed to that particular solution. You have just taken
a dictionary lookup (very fast) and turned it into a list traversal
(slow). Even if speed isn't your main concern, this is an unnecessary
de-optimization. You are deliberately slowing down your program to
avoid a slightly more verbose lookup later. Your data structure might
as well be a list of tuples to begin with, to avoid creating a new list.
You have effectively made your keys useless as keys.

If your lookups need to be case insensitive, make the key lower case,
and store the cased version in the value, whether as a tuple or a dict
(depending on whether you want named access).

d = {
'foo': {'key': 'Foo', 'value': 'val1'}
'spam': {'key': 'sPAm', 'value': 'val2'}
}

search = 'FOO'.lower()
if search in d:
result = d[search]
key = result['key']
value = result['value']

The only reason I wouldn't use this solution is if you expect to have
keys that will be identical when made lowercase, but if you're doing
case-insensitive lookup, you obviously don't expect this to be an issue.

Cheers,
Cliff


J. Cliff Dyer

unread,
Sep 5, 2008, 10:00:39 AM9/5/08
to pytho...@python.org
Please keep the discussion on-list.

On Fri, 2008-09-05 at 15:36 +0200, Maric Michaud wrote:

> The OP has already said the keys are case-sensitive, so this is not an option,
> the only way to do it fast is to index upon insertion all keys in another
> dict, so you get in final :
> d = { "kEy1" : 1, "Key1" : 2}
> indexes = { "key1" : ["kEy1", "Key1" ] }
>

That does not get the same behavior as the OP's original solution. The
OP's solution retrieves the first matching value only. Mine gets one
arbitrary matching value, determined by the algorithm used for
constructing the search dict (not specified in my solution). What the
OP actually said was that he can't throw away the case. You assume that
this means there would be multiple entries distinguished only by case.
That might be true, but given his initial solution, it seemed to me more
likely that he wants to keep the case for display purposes, as in a
dictionary like this:

{
"Michaud": "Maric",
"Dyer": "Cliff",
"D'Amato": "Alphonse",
}

To the OP: If you are looking for multiple matches for a given
case-insensitive key, check out the recent thread on "Python multimap",
for further discussion elaborating on how to map one key to many values
in a dictionary-like object.

At that point, it's probably worth factoring out your solution to a
separate class, so you can work with a clean interface, rather than
repeating the ugly details every time you use it.

Cheers,
Cliff

> > Cheers,
> > Cliff
> >
> >
> > --
> > http://mail.python.org/mailman/listinfo/python-list
>
>
>

Maric Michaud

unread,
Sep 5, 2008, 10:06:00 AM9/5/08
to pytho...@python.org

> Cheers,
> Cliff
>
>
> --
> http://mail.python.org/mailman/listinfo/python-list

--
_____________

Maric Michaud

Maric Michaud

unread,
Sep 5, 2008, 10:30:50 AM9/5/08
to pytho...@python.org
Le Friday 05 September 2008 16:00:39 J. Cliff Dyer, vous avez écrit :
> Please keep the discussion on-list.
>

Sorry for the private email, I sent it again to the list..

> On Fri, 2008-09-05 at 15:36 +0200, Maric Michaud wrote:

> That does not get the same behavior as the OP's original solution. The
> OP's solution retrieves the first matching value only. Mine gets one
> arbitrary matching value, determined by the algorithm used for
> constructing the search dict (not specified in my solution).

string.lower() in indexes gives the same result, but your right, a set of
lowered keys could suffice for the exact OP's requirement.

> What the
> OP actually said was that he can't throw away the case. You assume that
> this means there would be multiple entries distinguished only by case.
> That might be true, but given his initial solution, it seemed to me more
> likely that he wants to keep the case for display purposes, as in a
> dictionary like this:
>
> {
> "Michaud": "Maric",
> "Dyer": "Cliff",
> "D'Amato": "Alphonse",
> }
>

I didn't want to make gratuitous extrapolations on the OP's data structure,
but I still think the OP's question was "how to write cleanly a
case-insensitive comparaison versus all the keys of a dict". Maybe I'm wrong.

--
_____________

Maric Michaud

Fredrik Lundh

unread,
Sep 5, 2008, 1:36:56 PM9/5/08
to pytho...@python.org
Maric Michaud wrote:

> I suspect you are coming to conclusions a bit quickly, without taking the pain
> of understanding the whole discussion.

I'm pretty sure I was the first one to post an answer in this thread,
and I understand Python design and performance issues very well, thank you.

(but given your talk about "the cost of whitespace" in a response to a
comment about performance in that other subthread, it's obvious that
you're just here to provide noise. plonk plonk.)

</F>

Ethan Furman

unread,
Sep 5, 2008, 2:42:13 PM9/5/08
to pytho...@python.org

Actually, the OP said:
-- So you're saying to ensure that stage_map's keys are initially
-- lower-case to begin with? Well, I can't do this either since the
-- *case of the keys is actually valuable* ***later on***. It's only for
-- the purposes of this specific comparison operation that the case
-- should be ignored.

In other words, the key (as a key) is case-insensitive, and the key (as
a value) is case-sensitive -- making Clifford's comments and solution
perfectly acceptable.

~Ethan~

Maric Michaud

unread,
Sep 15, 2008, 8:42:43 AM9/15/08
to pytho...@python.org
Le Friday 05 September 2008 19:36:56 Fredrik Lundh, vous avez écrit :
> Maric Michaud wrote:
> > I suspect you are coming to conclusions a bit quickly, without taking the
> > pain of understanding the whole discussion.
>
> I'm pretty sure I was the first one to post an answer in this thread,
> and I understand Python design and performance issues very well, thank you.
>

What is that ? Some sort of arguments ? If it matters, I know who you are and
I don't try to compare my understanding of python with yours.

> (but given your talk about "the cost of whitespace" in a response to a
> comment about performance in that other subthread,

It's silly that you continue to fight on unrelated arguments, I've already
said that I misunderstood the OP's requirements and the proposed solution.

> it's obvious that
> you're just here to provide noise. plonk plonk.)

I don't think you should make those assumptions, in your words, there are too
much dumb or malevolent peoples on this list for it to be that "obvious".

Nevertheless, I don't really see why I should explain myself for being here
and trying to bring a (little, not like yours, of course) contribution.


--
_____________

Maric Michaud

0 new messages