Sorting names: fast or correct, pick one

Eemeli Aro

Feb 14, 2014, 5:47:53 AM
to konop...@googlegroups.com
On 10 February 2014 14:25, Gareth Kavanagh <omeg...@gmail.com> wrote:
>
> On 10 February 2014 12:17, Eemeli Aro <eem...@gmail.com> wrote:
>>
>> On 10 February 2014 13:49, Gareth Kavanagh <omeg...@gmail.com> wrote:
>> > A quick question: how are you handling sorts for characters that do
>> > not exist in the English set?
>> >
>> > I noticed previously that if you are sorting or splitting on, say, Ó,
>> > it basically vanishes from the list.
>>
>> Ah. You're right, it does. The sorting of names is currently by char
>> code value; A is 65, Z is 90, and Ó is 211. What's happening therefore
>> isn't about the characters being non-English, it's about their
>> alphabetic order not matching their char code order.
>>
>> I'll look into this, and probably switch to using localeCompare()
>> instead, but may need to consider older browsers carefully as well
>> since it's a rather recent feature.
>
> Thanks,
>
> It was one of the few bugs I found, and then forgot all about until this
> language thing came up.

As it turns out, pretty much all browsers support localeCompare() at
least to some extent. It's just very, very slow. I currently sort the
people array when loading, and sorting by char code is about 100x
faster than sorting with localeCompare(). With the Arisia data and its
492 programme participants, switching to localeCompare() adds about
half a second to the page's load time on my *desktop* computer running
Chrome 32.
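
Roughly, the difference between the two orderings looks like this. The
names are made up for illustration and are not from the Arisia data,
and the 'en' locale argument is an assumption (older browsers may
ignore it):

```javascript
// Made-up sample names, not the Arisia data.
const names = ['Zoë', 'Ólafur', 'Adam', 'Oscar'];

// The default sort compares UTF-16 code units, so Ó (211) lands after Z (90).
const byCharCode = names.slice().sort();

// localeCompare() follows the locale's alphabetic order instead.
const byLocale = names.slice().sort((a, b) => a.localeCompare(b, 'en'));

console.log(byCharCode); // Ólafur ends up last
console.log(byLocale);   // Ólafur sorts among the O names
```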

There's bound to be a way of doing this both fast and correctly, but
for now I've added a new flag, non_ascii_people, that enables
localeCompare() when set to true. I'd recommend testing with your
actual data before switching it on, as it does bring a serious
slowdown with it.
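
One possible direction for getting both, as a sketch rather than what
the code currently does: pick the comparator once based on the flag,
and where Intl.Collator is available, build a single collator and
reuse its compare function instead of calling localeCompare() for
every pair. Only the flag name non_ascii_people comes from the above;
makeNameComparator, the config object and its locale field are
hypothetical, and whether this is fast enough would need measuring
with real data.

```javascript
// Hypothetical helper: returns a comparator for an array of name strings.
// Only the flag name non_ascii_people is taken from the message above.
function makeNameComparator(config) {
  if (!config.non_ascii_people) {
    // Fast path: plain comparison by UTF-16 code unit value.
    return (a, b) => (a < b ? -1 : a > b ? 1 : 0);
  }
  if (typeof Intl !== 'undefined' && typeof Intl.Collator === 'function') {
    // Build the collator once; its compare function is already bound,
    // so it can be handed straight to Array.prototype.sort().
    return new Intl.Collator(config.locale).compare;
  }
  // Fallback for environments without Intl: correct, but slow.
  return (a, b) => a.localeCompare(b);
}

// Usage with made-up names (config.locale is an assumed, optional field).
const people = ['Zed', 'Ólafur', 'Adam'];
const sorted = people.slice().sort(
  makeNameComparator({ non_ascii_people: true, locale: 'en' })
);
console.log(sorted);
```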

eemeli