Effect of suppressing contractions in the root search collation

Henri Sivonen

Feb 15, 2023, 2:32:07 PM
to cldr-...@unicode.org
Hi,

Do I understand correctly that the suppressContractions line in the root search collation has no effect on which comparisons have "equal" as the result and it is only a performance optimization that has some search-irrelevant effects on greater-than vs. less-than results?

Markus Scherer

Feb 15, 2023, 4:52:21 PM
to Henri Sivonen, cldr-...@unicode.org
On Wed, Feb 15, 2023 at 11:32 AM Henri Sivonen <hsiv...@mozilla.com> wrote:
> Do I understand correctly that the suppressContractions line in the root search collation has no effect on which comparisons have "equal" as the result and it is only a performance optimization that has some search-irrelevant effects on greater-than vs. less-than results?

The characters whose root contractions are suppressed are the ones with the Logical_Order_Exception property, that is, vowels that are written before their consonants but are pronounced after.
The root contractions make vowel+consonant sort like consonant+vowel, for combinations that occur in normal text.

For search, it's better to work visually, for which you need to undo the contractions.
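
The reordering described above can be illustrated with a small toy model. This is only a sketch: the primary weights, the character sets, and the `primaries` helper are all hypothetical and are not taken from CLDR or DUCET data. It shows a prefix vowel plus consonant mapping to swapped primaries when the contraction is active, and to per-character primaries when it is suppressed.

```python
# Toy model only: the weights below are made-up primaries, not CLDR/DUCET data.
PRIMARY = {"ก": 10, "ข": 11, "เ": 20, "แ": 21}
PREFIX_VOWELS = {"เ", "แ"}  # stand-ins for Logical_Order_Exception characters
CONSONANTS = {"ก", "ข"}


def primaries(text, contractions=True):
    """Return the list of toy primary weights for text.

    With contractions=True, a prefix vowel followed by a consonant is treated
    as one contraction that outputs the consonant primary, then the vowel
    primary. With contractions=False (the suppressed case), each character
    is mapped on its own, in the order it is stored.
    """
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else None
        if contractions and ch in PREFIX_VOWELS and nxt in CONSONANTS:
            out += [PRIMARY[nxt], PRIMARY[ch]]  # swapped: consonant first
            i += 2
        else:
            out.append(PRIMARY[ch])
            i += 1
    return out


word = "เก"  # stored vowel first, consonant second
with_contraction = primaries(word, contractions=True)
without_contraction = primaries(word, contractions=False)
print(with_contraction, without_contraction)
```

In this toy model, sorting the two strings "เข" and "แก" by `primaries` gives a different order depending on whether the contractions are active, which is the kind of greater-than vs. less-than effect the question refers to.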

markus

Henri Sivonen

Feb 16, 2023, 5:39:41 AM
to Markus Scherer, cldr-...@unicode.org
On Wed, Feb 15, 2023 at 11:52 PM Markus Scherer <marku...@gmail.com> wrote:
> On Wed, Feb 15, 2023 at 11:32 AM Henri Sivonen <hsiv...@mozilla.com> wrote:
>> Do I understand correctly that the suppressContractions line in the root search collation has no effect on which comparisons have "equal" as the result and it is only a performance optimization that has some search-irrelevant effects on greater-than vs. less-than results?
>
> The characters whose root contractions are suppressed are the ones with the Logical_Order_Exception property, that is, vowels that are written before their consonants but are pronounced after.
> The root contractions make vowel+consonant sort like consonant+vowel, for combinations that occur in normal text.

Thanks. At least the ones I've spot-checked are all of that form: the contraction outputs two primaries in swapped order relative to the mappings for the individual characters.
 
> For search, it's better to work visually, for which you need to undo the contractions.

ECMA-402 only does full-string matching, so the incremental progression while typing the search string doesn't matter. That is, if the only effect of suppressing the contractions is swapping the order of certain primary pairs in both the haystack and the needle, the cases that compare equal are the same, AFAICT.

--
Henri Sivonen
hsiv...@mozilla.com