C API : Does utrans_openU register system transliterators?


douglas mennella

Sep 29, 2025, 12:10:09 AM
to icu-s...@unicode.org
I’m trying to understand how the C API works.

In the utrans_register documentation it says:
“NOTE: After this call the system owns the adoptedTrans and will close it. The user must not call utrans_close() on adoptedTrans.”


But in the utrans_openU documentation it says 

“the transliterator rules. See the C++ header rbt.h for rules syntax. If NULL then a system transliterator matching the ID is returned.”


Does this mean when passing a NULL for rules in utrans_openU that you’re not expected to explicitly close the returned transliterator?

Separately I’m a little unclear if I’m responsible for releasing resources of the “system” as a whole.

Any guidance would be appreciated.

Regards,
Doug


Markus Scherer

Sep 29, 2025, 12:19:46 PM
to douglas mennella, icu-s...@unicode.org
On Sun, Sep 28, 2025 at 9:10 PM douglas mennella <douglas....@gmail.com> wrote:
I’m trying to understand how the C API works.

In the utrans_register documentation it says:
“NOTE: After this call the system owns the adoptedTrans and will close it. The user must not call utrans_close() on adoptedTrans.”


But in the utrans_openU() documentation it says 

“the transliterator rules. See the C++ header rbt.h for rules syntax. If NULL then a system transliterator matching the ID is returned.”

Same link as above. The link to utrans_openU docs is here.

Does this mean when passing a NULL for rules in utrans_openU that you’re not expected to explicitly close the returned transliterator?

No. utrans_openU() returns a pointer to a non-const UTransliterator. That means
- You own it and have to _close() it.
- When you pass it into functions that take non-const UTransliterator, it will be mutated/modified.

If you get “a system transliterator matching the ID” then it's a mutable clone.

Separately I’m a little unclear if I’m responsible for releasing resources of the “system” as a whole.

No.

Best regards,
markus

douglas mennella

Sep 29, 2025, 5:54:02 PM
to Markus Scherer, icu-s...@unicode.org
Thanks very much.  I still have one more question in that case.  If I do register it can I save constructing a new one each time I need it?  If so do I use the same openU interface to get the registered transliterator?  How do I check if it’s registered?


Markus Scherer

Sep 29, 2025, 7:53:06 PM
to douglas mennella, icu-s...@unicode.org
On Mon, Sep 29, 2025 at 2:54 PM douglas mennella <douglas....@gmail.com> wrote:
If I do register it can I save constructing a new one each time I need it?  If so do I use the same openU interface to get the registered transliterator?  How do I check if it’s registered?

Once registered, you should be able to open one via its ID, without the rules. You should get a clone of what you registered, rather than getting the rules parsed again.
If something doesn't work, then you should get a failure UErrorCode.

markus

douglas mennella

Sep 29, 2025, 9:35:11 PM
to Markus Scherer, icu-s...@unicode.org
Thanks though I’m still slightly confused.  For my case I’m not using any rules so I’m using openU with a null pointer for the rules.

The first time I call openU, I get the transliterator.  Then I register it.  The second time, do I get it with openU again?  Should I expect to get a clone of the registered version?  How do I know I don’t need to follow up the second call to openU with a registration?  Can I check if the version I get back is registered?

Thanks again for your help here.


Markus Scherer

Sep 30, 2025, 11:29:30 AM
to douglas mennella, icu-s...@unicode.org
On Mon, Sep 29, 2025 at 6:35 PM douglas mennella <douglas....@gmail.com> wrote:
Thanks though I’m still slightly confused.  For my case I’m not using any rules so I’m using openU with a null pointer for the rules.

The first time I call openU, I get the transliterator.  Then I register it.

Why are you registering a transliterator if you are not creating a new one from rules or providing one with custom code?
Do you have a complex compound ID that takes a long time to parse and build?

douglas mennella

Oct 1, 2025, 2:50:20 AM
to Markus Scherer, icu-s...@unicode.org
Thanks for getting back.  First, I’m new to all of this, so I’m trying to understand how the API is intended to be used.  I wouldn’t read too much into my wild guesses.

As I said earlier I was hoping to avoid the cost of construction of the transliterator each time I need a transliteration.  I could somehow manage to hold onto the first one I construct but I was guessing that registration takes care of some of that work for me.

What I was expecting was something like the following process:

- try to fetch a registered transliterator 
- if that fails construct a transliterator and register it

Having the API point for fetching and constructing be the same function call doesn’t fit that pattern though which led to my confusion.

My first attempt at this was to call openU whenever I need a transliteration, but my hunch is that’s expensive.

Is there any mode of usage similar to what I described above?

Thanks again,
-Doug


Markus Scherer

Oct 1, 2025, 10:19:08 AM
to douglas mennella, icu-s...@unicode.org
I don't know all of the details of the Transliterator API, but
  • As far as I know, the built-in transliterators are cached after you open them.
  • Registering one adds it into the cache; registering a built-in one should be redundant.
  • Getting one via ID from the cache should be a lookup + clone.
  • You get a new object each time you create/open a Transliterator because the object is stateful. If you were to just get a pointer to the cached item, then you would share/mix/corrupt state across call sites and threads.
Hope this helps,
markus

douglas mennella

Oct 2, 2025, 2:03:30 AM
to Markus Scherer, icu-s...@unicode.org
That is helpful.  Thanks.  I still think it feels funny that you can’t fetch a registered one without calling openU which is indistinguishable from constructing one.

Perhaps openU consults the cache?  If so why wouldn’t it be self-registering?

In any event, at this point I have the info I need and am just ruminating on the API.

Thanks again for your help.

Regards,
-Doug


Markus Scherer

Oct 2, 2025, 12:47:24 PM
to douglas mennella, icu-s...@unicode.org
On Wed, Oct 1, 2025 at 11:03 PM douglas mennella <douglas....@gmail.com> wrote:
That is helpful.  Thanks.  I still think it feels funny that you can’t fetch a registered one without calling openU which is indistinguishable from constructing one.

Perhaps openU consults the cache?  If so why wouldn’t it be self-registering?

I don't know what's "funny" here. u<service>_openXyz() functions give you an object that does what you request, like a C++ or Java factory method. Depending on the service, it might use a cache to speed things up, and/or a registry to allow you to use custom implementations. Most service objects are mutable (settings, iterators), so you get and then own a mutable object. If you work in C++, you can use a LocalUTransliteratorPointer (functionally similar to std::unique_ptr) for automatic release.

douglas mennella

Oct 2, 2025, 10:58:22 PM
to Markus Scherer, icu-s...@unicode.org
As the subject suggests my questions are pointed towards the C API.

What’s funny is that it’s not clear what’s expected of the end user.  I seem to have a choice of whether to register it or not but no way of knowing whether it was created “with a cache” or not.  I’m also told that if I don’t register it I’m expected to close it implying I own it.  I thought I was clear in that my interest is in keeping down construction costs.  If the answer is that I have to do that singleton management myself that’s fine but it could be more clear either in the way the API is structured or in the documentation.

I tried to be pretty clear about the API I expected so I’m not sure why there’s still confusion about that.

In any event, as I said this is mostly mulling over the API.  I’m happy to engage in that topic but I don’t understand the point of putting me on my back foot.

Thanks again for your responses.

Regards,
-Doug
