Fast Case Insensitive String Comparisons

Merk

Dec 12, 2006, 12:14:54 PM
What are some alternatives to using .ToUpper() to perform case insensitive
string comparisons?

The reason I'm asking is that I'm comparing strings in a long loop, looking
for equality, and I want this loop to run as fast as possible. So I'm
looking for a method that would be faster than .ToUpper().

Thanks!


Andy

Dec 12, 2006, 12:26:14 PM
Did you try specifying case insensitivity in the .Compare method?

Marc Gravell

Dec 12, 2006, 12:28:47 PM
In 2.0, IIRC, from tests (now deleted) I believe that
string.Equals(lhs,rhs, ComparerOptions.OrdinalIgnoreCase) is the
fastest.

You can also use StringComparer.OrdinalIgnoreCase.Equals(...) but I
believe that this is a little slower.

Your best bet is to try every option in a tight loop to test;
you could try:

lhs.ToUpper() == rhs.ToUpper()

lhs.Equals(rhs, StringComparison.OrdinalIgnoreCase); // or InvariantCultureIgnoreCase

string.Equals(lhs, rhs, StringComparison.OrdinalIgnoreCase); // or InvariantCultureIgnoreCase

StringComparer.OrdinalIgnoreCase.Equals(lhs, rhs); // or invariant case insensitive

etc

Marc
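A rough sketch of the "try every option in a tight loop" test suggested above, assuming a simple Stopwatch-based harness. The CompareBenchmark class and its Time/Via* helpers are hypothetical names made up for illustration, and timings from a loop like this are only indicative; they will vary with runtime, string lengths, and how often the strings actually match.

```csharp
using System;
using System.Diagnostics;

static class CompareBenchmark
{
    // Hypothetical helper: runs one comparison strategy `iterations` times
    // and returns the elapsed wall-clock time.
    public static TimeSpan Time(Func<string, string, bool> compare,
                                string lhs, string rhs, int iterations)
    {
        bool sink = false; // accumulate results so the calls are not trivially dead
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            sink ^= compare(lhs, rhs);
        }
        watch.Stop();
        GC.KeepAlive(sink);
        return watch.Elapsed;
    }

    // Baseline from the original question; culture-sensitive and allocates two strings.
    public static bool ViaToUpper(string a, string b)
    {
        return a.ToUpper() == b.ToUpper();
    }

    public static bool ViaStaticEquals(string a, string b)
    {
        return string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
    }

    public static bool ViaComparer(string a, string b)
    {
        return StringComparer.OrdinalIgnoreCase.Equals(a, b);
    }

    public static bool ViaCompare(string a, string b)
    {
        return string.Compare(a, b, StringComparison.OrdinalIgnoreCase) == 0;
    }

    static void Main()
    {
        const int iterations = 1000000; // arbitrary; pick something close to the real workload
        string lhs = "Fast Case Insensitive";
        string rhs = "FAST CASE INSENSITIVE";

        Console.WriteLine("ToUpper():           {0}", Time(ViaToUpper, lhs, rhs, iterations));
        Console.WriteLine("string.Equals:       {0}", Time(ViaStaticEquals, lhs, rhs, iterations));
        Console.WriteLine("StringComparer:      {0}", Time(ViaComparer, lhs, rhs, iterations));
        Console.WriteLine("string.Compare == 0: {0}", Time(ViaCompare, lhs, rhs, iterations));
    }
}
```

It is worth running each strategy once before timing it (so JIT compilation is not measured) and testing with both matching and non-matching strings, since an early mismatch can short-circuit the comparison.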

Marc Gravell

Dec 12, 2006, 12:30:41 PM
> ComparerOptions.OrdinalIgnoreCase
I meant StringComparison.OrdinalIgnoreCase, but intellisense would have
told you that...

Marc

bob.abr...@qrm.com

Dec 12, 2006, 12:32:33 PM

System.String.Compare(string a, string b, bool ignoreCase);

or

System.Collections.CaseInsensitiveComparer.DefaultInvariant.Compare(string a, string b);

Anyone have any ideas about my multiple browser problem?

Bruce Wood

Dec 12, 2006, 1:20:03 PM

Are you comparing the same string over and over again? For example, are
you sorting an array? If you are, then store the strings in both their
uppercase and mixed case versions, and compare only the uppercase
versions. You'll incur the cost of uppercasing them only once, and then
get the payback on the comparisons. Trade memory for more speed.

If you test each string only once then, of course, this won't help.

Usually you gain efficiencies when you step back and look at the
overall problem, and how you can avoid doing the same work over and
over again, rather than trying to figure out how to do that work faster.
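The "uppercase once, compare many times" idea above might look roughly like this for the sorting case. KeyedString and SortIgnoringCase are hypothetical names used only for illustration, and the sketch assumes ToUpperInvariant() for the stored key (rather than the culture-sensitive ToUpper()) so the keys do not depend on the current thread culture.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type: keeps the original text next to a key that is upper-cased once.
sealed class KeyedString
{
    public readonly string Original;
    public readonly string Key;

    public KeyedString(string original)
    {
        Original = original;
        // Assumption: invariant casing, so the key is the same on every machine.
        Key = original.ToUpperInvariant();
    }
}

static class PrecomputedKeysDemo
{
    // Sorts case-insensitively by comparing only the precomputed keys.
    public static List<string> SortIgnoringCase(IEnumerable<string> items)
    {
        List<KeyedString> keyed = new List<KeyedString>();
        foreach (string item in items)
        {
            keyed.Add(new KeyedString(item)); // the upper-casing cost is paid here, once per string
        }

        // Each comparison during the sort is a plain ordinal comparison of the keys.
        keyed.Sort(delegate(KeyedString x, KeyedString y)
        {
            return string.CompareOrdinal(x.Key, y.Key);
        });

        List<string> result = new List<string>();
        foreach (KeyedString k in keyed)
        {
            result.Add(k.Original);
        }
        return result;
    }

    static void Main()
    {
        foreach (string s in SortIgnoringCase(new string[] { "pear", "Apple", "banana" }))
        {
            Console.WriteLine(s);
        }
    }
}
```

This trades memory for speed exactly as described: every string is stored twice, so it only pays off when each string takes part in several comparisons.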

Jon Skeet [C# MVP]

Dec 12, 2006, 2:27:06 PM
Marc Gravell <marc.g...@gmail.com> wrote:
> In 2.0, IIRC, from tests (now deleted) I believe that
> string.Equals(lhs,rhs, ComparerOptions.OrdinalIgnoreCase) is the
> fastest.
>
> You can also use StringComparer.OrdinalIgnoreCase.Equals(...) but I
> believe that this is a little slower.
>
> Your best bet is to try every option in a tight loop to test;
> you could try:
>
> lhs.ToUpper() == rhs.ToUpper()

Note that this test is not a culture-safe one. For instance, in Turkey,
I believe (if I remember the bug I had to fix in a system a while ago
:) that "mail".ToUpper() != "MAIL".

Using a StringComparer is a much better way, IMO.

--
Jon Skeet - <sk...@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
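As a small illustration of the StringComparer approach recommended above: a comparer can be handed to a collection once, so every lookup is case-insensitive without any ToUpper() call. ComparerDemo, BuildSet and SameIgnoringCase are hypothetical names, and the choice of OrdinalIgnoreCase (rather than a culture-aware comparer) is an assumption that suits identifiers and keys more than user-facing text.

```csharp
using System;
using System.Collections.Generic;

static class ComparerDemo
{
    // Builds a set whose membership test ignores case using ordinal rules,
    // so the result does not depend on the current thread culture.
    public static HashSet<string> BuildSet(IEnumerable<string> items)
    {
        return new HashSet<string>(items, StringComparer.OrdinalIgnoreCase);
    }

    // Direct equality check through the same comparer.
    public static bool SameIgnoringCase(string a, string b)
    {
        return StringComparer.OrdinalIgnoreCase.Equals(a, b);
    }

    static void Main()
    {
        HashSet<string> known = BuildSet(new string[] { "mail", "news" });

        Console.WriteLine(known.Contains("MAIL"));           // True
        Console.WriteLine(SameIgnoringCase("mail", "MAIL")); // True
        Console.WriteLine(SameIgnoringCase("mail", "male")); // False
    }
}
```

The same comparer can be passed to Dictionary<string, T>, List<string>.Sort and similar APIs that accept an IEqualityComparer<string> or IComparer<string>.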

Mark Wilden

Dec 12, 2006, 3:21:28 PM
"Jon Skeet [C# MVP]" <sk...@pobox.com> wrote in message
news:MPG.1fe912ea1...@msnews.microsoft.com...

>
> Note that this test is not a culture-safe one. For instance, in Turkey,
> I believe (if I remember the bug I had to fix in a system a while ago
> :) that "mail".ToUpper() != "MAIL".

Just out of curiosity, did "mail".ToUpper() == "MAIL".ToUpper()?

///ark


Jon Skeet [C# MVP]

Dec 12, 2006, 3:28:14 PM

Nope :)

using System;
using System.Globalization;
using System.Threading;

class Test
{
    static void Main()
    {
        // Switch the current thread to the Turkish (Turkey) culture.
        CultureInfo info = CultureInfo.CreateSpecificCulture("tr-TR");
        Thread.CurrentThread.CurrentCulture = info;

        // Both lines print False: ToUpper() uses the current culture.
        Console.WriteLine("mail".ToUpper() == "MAIL");
        Console.WriteLine("mail".ToUpper() == "MAIL".ToUpper());
    }
}

ToLower() doesn't work either.

Isn't i18n fun? :)

Marc Gravell

Dec 12, 2006, 4:12:41 PM
Good to know; cheers for the input Jon.

For ref, I only mentioned the ToUpper() as a performance comparison
(since the OP explicitly mentioned it) to the StringComparer and
string.Equals() [with stated comparison], but thanks for the heads-up
and "proof positive" example.

Marc

Lucian Wischik

Dec 12, 2006, 6:30:38 PM
Jon Skeet [C# MVP] <sk...@pobox.com> wrote:

>Mark Wilden <mwi...@communitymtm.com> wrote:
>> Just out of curiosity, did "mail".ToUpper() == "MAIL".ToUpper()?
>Nope :)

Funny!

The issue was that, under the Turkish culture, lowercase "i" gets
capitalised to U+0130, "Latin Capital Letter I With Dot Above", instead
of the more normal U+0049, "Latin Capital Letter I".
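To see the code points for yourself, something like the following sketch should do. CodePointDemo and CodeUnits are hypothetical names; the sketch assumes the runtime has Turkish culture data available (a runtime configured for invariant globalization may behave differently), which is why no output is claimed for the tr-TR line beyond what is described above.

```csharp
using System;
using System.Globalization;
using System.Text;

static class CodePointDemo
{
    // Formats each UTF-16 code unit of `text` as U+XXXX, separated by spaces.
    public static string CodeUnits(string text)
    {
        StringBuilder sb = new StringBuilder();
        foreach (char c in text)
        {
            if (sb.Length > 0)
            {
                sb.Append(' ');
            }
            sb.Append("U+").Append(((int)c).ToString("X4"));
        }
        return sb.ToString();
    }

    static void Main()
    {
        CultureInfo turkish = new CultureInfo("tr-TR");

        // Upper-casing with an explicit Turkish culture; per the post above this
        // should show U+0130 when Turkish culture data is available.
        Console.WriteLine(CodeUnits("i".ToUpper(turkish)));

        // Upper-casing with invariant rules gives the plain capital I.
        Console.WriteLine(CodeUnits("i".ToUpperInvariant())); // U+0049
    }
}
```

Passing the culture explicitly, as here, avoids changing Thread.CurrentThread.CurrentCulture just to run the experiment.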


I'm curious! Are there any Turks here who can explain Turkish
capitalisation?

--
Lucian

Mark Wilden

Dec 12, 2006, 7:41:31 PM
>> "Jon Skeet [C# MVP]" <sk...@pobox.com> wrote in message
>> news:MPG.1fe912ea1...@msnews.microsoft.com...
>> >
>> > Note that this test is not a culture-safe one. For instance, in Turkey,
>> > I believe (if I remember the bug I had to fix in a system a while ago
>> > :) that "mail".ToUpper() != "MAIL".
>>
>> Just out of curiosity, did "mail".ToUpper() == "MAIL".ToUpper()?
>
> Nope :)

Oh well - I guess it's nobody's business but the Turks'.

///ark


Jon Skeet [C# MVP]

Dec 13, 2006, 2:37:24 AM
Mark Wilden <mwi...@communitymtm.com> wrote:
> > Nope :)
>
> Oh well - I guess it's nobody's business but the Turks'.

Are you suggesting a history-insensitive comparison?

StringComparer.IgnoreHistory.Equals("Istanbul", "Constantinople")

Next up: a "man" comparison: Man.Triangle > Man.Particle etc?
