ICU Design Proposal: Support inverse rule for [] span in RBNF

George Rhoten

Jan 5, 2025, 1:34:14 PM
to icu-design
Dear ICU team & users,

I would like to propose the following RBNF syntax change for: ICU 77
Please provide feedback by: 2025-01-15
Designated API reviewer: Volunteers are welcome

This proposal only affects the documentation and RBNF syntax.

I’d like to extend the RBNF syntax to support more complex grammar by changing the omission rule for square brackets. Currently, everything between the square brackets is omitted when the remainder is 0. My proposal keeps this behavior unless a “|” (pipe symbol) is present between the square brackets. You can think of the pipe as acting like an else statement: everything between the opening square bracket and the pipe behaves as it does today, and everything between the pipe and the closing square bracket is used when the text would otherwise be omitted.
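The bracket behavior described above can be modeled in a few lines. This is only a toy sketch of the semantics, not ICU's parser: the function name `resolve_brackets` is hypothetical, it handles a single bracket span, and it leaves substitutions such as `>>` untouched. The rule text `twenty[->>]` is used as an illustrative rule in the existing style.

```python
def resolve_brackets(rule_text: str, remainder: int) -> str:
    """Toy model of the [] span in an RBNF rule body (not ICU's implementation)."""
    start = rule_text.find("[")
    end = rule_text.find("]", start + 1)
    if start == -1 or end == -1:
        return rule_text  # no bracket span to resolve

    inner = rule_text[start + 1:end]
    # Text before the pipe is used for a nonzero remainder. Text after the
    # pipe (empty when no pipe is present) is used when the remainder is 0.
    main, _, alternative = inner.partition("|")
    chosen = main if remainder != 0 else alternative
    return rule_text[:start] + chosen + rule_text[end + 1:]


# Existing behavior: the span is dropped when the remainder is 0.
print(resolve_brackets("twenty[->>]", 0))        # twenty
print(resolve_brackets("twenty[->>]", 3))        # twenty->>
# Proposed behavior: the text after the pipe replaces the omitted span.
print(resolve_brackets("twent[y->>|ieth]", 0))   # twentieth
print(resolve_brackets("twent[y->>|ieth]", 3))   # twenty->>
```

Without a pipe, the alternative is the empty string, so the existing omission behavior falls out as a special case of the proposed one.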

This behavior is important for supporting large ordinals in Slavic languages. It’s also convenient for other languages, like English.

The test case in the prototype and the ticket provide more examples of the change. Below is a simplified example of the new syntax. Right now, we have the following ordinal rules in English.
%%tieth:
0: tieth;
1: ty-=%spellout-ordinal=;
%spellout-ordinal:
...
20: twen>%%tieth>;
30: thir>%%tieth>;
40: for>%%tieth>;
50: fif>%%tieth>;
That could be simplified to the following rules instead.
%spellout-ordinal:
...
20: twent[y->>|ieth];
30: thirt[y->>|ieth];
40: fort[y->>|ieth];
50: fift[y->>|ieth];
The cardinal and ordinal rules will work on either side of the pipe symbol.
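As a sanity check on the simplification above, the two rule sets can be simulated by hand for 20 through 59. This is a sketch under stated assumptions, not ICU code: `UNIT_ORDINALS` stands in for the elided part of `%spellout-ordinal`, and `format_old` and `format_new` are hypothetical helpers that hard-code how each rule set would be applied.

```python
# Ordinal words for remainders 1..9 (assumed stand-in for the elided rules).
UNIT_ORDINALS = ["", "first", "second", "third", "fourth", "fifth",
                 "sixth", "seventh", "eighth", "ninth"]

# Stems from the current rules, e.g. "20: twen>%%tieth>;".
OLD_STEMS = {20: "twen", 30: "thir", 40: "for", 50: "fif"}
# Rule bodies from the proposed rules, e.g. "20: twent[y->>|ieth];".
NEW_RULES = {20: "twent[y->>|ieth]", 30: "thirt[y->>|ieth]",
             40: "fort[y->>|ieth]", 50: "fift[y->>|ieth]"}


def format_old(n: int) -> str:
    base, remainder = n - n % 10, n % 10
    # %%tieth: "0: tieth;" for remainder 0, "1: ty-=%spellout-ordinal=;" otherwise.
    tail = "tieth" if remainder == 0 else "ty-" + UNIT_ORDINALS[remainder]
    return OLD_STEMS[base] + tail


def format_new(n: int) -> str:
    base, remainder = n - n % 10, n % 10
    prefix, _, span = NEW_RULES[base].partition("[")
    main, _, alternative = span.rstrip("]").partition("|")
    if remainder == 0:
        return prefix + alternative  # text after the pipe
    # ">>" formats the remainder with the current rule set.
    return prefix + main.replace(">>", UNIT_ORDINALS[remainder])


for n in (20, 21, 45):
    print(n, format_old(n), format_new(n))
```

In this simulation both rule sets produce the same text for every value in the range, which is the equivalence the shorter rules rely on.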

I plan to port these changes from ICU4J to ICU4C before creating a pull request.  Once a released version of ICU starts supporting this syntax, the CLDR rules will be able to adopt this new syntax for the languages that need it.

Sincerely,
George

Mark Davis Ⓤ

Jan 6, 2025, 4:45:03 PM
to George Rhoten, icu-design
Looks reasonable to me, but I'd also like to make sure Rich is cool with it.

--
You received this message because you are subscribed to the Google Groups "icu-design" group.
To unsubscribe from this group and stop receiving emails from it, send an email to icu-design+...@unicode.org.
To view this discussion visit https://groups.google.com/a/unicode.org/d/msgid/icu-design/24F70BEC-08D6-4880-94AF-6C138CE933D4%40apple.com.
For more options, visit https://groups.google.com/a/unicode.org/d/optout.

George Rhoten

Jan 6, 2025, 5:52:34 PM
to Mark Davis Ⓤ, icu-design
Thanks!

Rich said that he will try to get to it today.  

Also, porting to ICU4C was simpler than anticipated. The pull request for all the changes is here: https://github.com/unicode-org/icu/pull/3326

George

Rich Gillam

Jan 6, 2025, 7:55:16 PM
to George Rhoten, Mark Davis Ⓤ, icu-design
George and I have talked about this proposal a couple times before.  I’m wholeheartedly in favor; I just wish I’d thought of it first.  :-)

I’m taking a look at the PR right now...

—Rich

George Rhoten

Jan 7, 2025, 10:56:43 PM
to icu-design
Thanks!

Rich approved the pull request at https://github.com/unicode-org/icu/pull/3326

Unless I hear an objection or other feedback, I’ll merge it Wednesday of next week.

George