potential security

100 views
Skip to first unread message

ossdev

unread,
May 4, 2026, 3:30:07 PMMay 4
to icu-design
Hello,
I wanted to discuss potential security improvements to icu library. Specifically I wanted to look into adopting the Safe Buffers Programming Model to migrate from C-style buffers to C++ containers, including the use of -Wunsafe-buffer-usage.

Would the maintainers be interested in adopting safe buffers, I recall warnings showing in several thousands and the library could benefit from using safe buffers.

Sent with Proton Mail secure email.

Fredrik Roubert

unread,
May 4, 2026, 4:14:37 PMMay 4
to devtes...@gmail.com, icu-design
On Mon, May 4, 2026 at 9:30 PM ossdev <devtes...@gmail.com> wrote:

> Would the maintainers be interested in adopting safe buffers,

In general, yes, that's certainly a goal we'd like to work towards.
See for example some previous work in that direction done here:

https://unicode-org.atlassian.net/browse/ICU-22901
https://unicode-org.atlassian.net/browse/ICU-22696
https://unicode-org.atlassian.net/browse/ICU-22520
https://unicode-org.atlassian.net/browse/ICU-21289
https://unicode-org.atlassian.net/browse/ICU-21035
https://unicode-org.atlassian.net/browse/ICU-20803
https://unicode-org.atlassian.net/browse/ICU-20445
https://unicode-org.atlassian.net/browse/ICU-20158
https://unicode-org.atlassian.net/browse/ICU-13417

https://github.com/search?q=repo%3Aunicode-org%2Ficu+%22ICU-22901%22&type=commits
https://github.com/search?q=repo%3Aunicode-org%2Ficu+%22ICU-22696%22&type=commits
https://github.com/search?q=repo%3Aunicode-org%2Ficu+%22ICU-22520%22&type=commits
https://github.com/search?q=repo%3Aunicode-org%2Ficu+%22ICU-21289%22&type=commits
https://github.com/search?q=repo%3Aunicode-org%2Ficu+%22ICU-21035%22&type=commits
https://github.com/search?q=repo%3Aunicode-org%2Ficu+%22ICU-20803%22&type=commits
https://github.com/search?q=repo%3Aunicode-org%2Ficu+%22ICU-20445%22&type=commits
https://github.com/search?q=repo%3Aunicode-org%2Ficu+%22ICU-20158%22&type=commits
https://github.com/search?q=repo%3Aunicode-org%2Ficu+%22ICU-13417%22&type=commits

But ICU is a very old library that has made many promises in the past
that makes this work considerably more difficult than it would have
been with a more modern codebase.

First and foremost, there's a promise that the library should be fully
usable also when built without C++ exception handling together and a
promise that the library should always handle memory allocation
failures gracefully (reporting U_MEMORY_ALLOCATION_ERROR back to the
caller, not ignoring failures or terminating the process or throwing
exceptions), which unfortunately is fundamentally incompatible with
STL containers (which by design require exceptions to report memory
allocation failures). So it isn't possible to just replace existing
buffers with STL data structures like it would be possible to do in
most codebases.

So it seems unlikely to me that it'd be possible to make any large and
sweeping changes here, but if you were to pick some specific set of
buffers in ICU and devise a plan for how to replace them with
something safer (without breaking backward compatibility for callers)
then we'd sure be interested.

--
Fredrik Roubert
rou...@google.com

ossdev

unread,
May 10, 2026, 5:22:16 AM (11 days ago) May 10
to icu-design, Fredrik Roubert, icu-design, devtes...@gmail.com
Glad to find interest in adopting safe buffer practices. I spent the previous few days getting familiar with the codebase and also going through the attached commits. The icu library is fairly large and making large sweeping changes would be unfeasible rather going in smaller sets of unsafe buffer usage would be manageable. For example going through C style headers file source/common/cstring.h. The following are categorized as unsafe functions 

#define uprv_strcpy(dst, src) U_STANDARD_CPP_NAMESPACE  strcpy(dst, src)
#define uprv_strlen(str) U_STANDARD_CPP_NAMESPACE strlen(str)
#define uprv_strcmp(s1, s2) U_STANDARD_CPP_NAMESPACE strcmp(s1, s2)
#define uprv_strcat(dst, src) U_STANDARD_CPP_NAMESPACE strcat(dst, src)
#define uprv_strchr(s, c) U_STANDARD_CPP_NAMESPACE strchr(s, c)
#define uprv_strstr(s, c) U_STANDARD_CPP_NAMESPACE strstr(s, c)
#define uprv_strrchr(s, c) U_STANDARD_CPP_NAMESPACE strrchr(s, c)
#define uprv_strncpy(dst, src, size) U_STANDARD_CPP_NAMESPACE strncpy(dst, src, size)
#define uprv_strncmp(s1, s2, n) U_STANDARD_CPP_NAMESPACE strncmp(s1, s2, n)
#define uprv_strncat(dst, src, n) U_STANDARD_CPP_NAMESPACE strncat(dst, src, n)

e.g in locid.cpp
bool
Locale::operator==( const   Locale& other) const
{
    return uprv_strcmp(other.getName(), getName()) == 0;
}

Function 'strcmp' is unsafeclang(-Wunsafe-buffer-usage-in-libc-call)
macro uprv_strcmp
provided by "cstring.h"

I wanted to go through each of the defined macro and provide safe implementation also while preserving backward compatibility keeping the macro version as function, e.g
// #define uprv_strcpy(dst, src) U_STANDARD_CPP_NAMESPACE  strcpy(dst, src)
static
inline char* uprv_strcpy(char* dst, const char* src)
{
  #pragma clang unsafe_buffer_usage begin
  return U_STANDARD_CPP_NAMESPACE strcpy(dst, src);
  #pragma clang unsafe_buffer_usage end
}
#ifdef __cplusplus
U_NAMESPACE_USE
static
inline void uprv_strcpy(CharString &dst, const char* src, UErrorCode &status) {
    if (U_FAILURE(status)) return;
    if (dst == nullptr) {
        status = U_ILLEGAL_ARGUMENT_ERROR;
        return;
    }
    dst.clear();
    dst.append(src, status);
}
#endif /* __cplusplus */

Scope of work:
The defined macro gets converted into function and two versions are provided, one for backward compatibility and the other providing safe usage. Going through each of the defined macros the reference count is also high, e.g. uprv_strcpy is used in 59 files. When using unsafe version as is, the developer would get warning and get reminded to use safe the version.

The above changes are safe to implement and do not break backward-compatibility. If the changes make sense please respond and I am able to begin work on PR. 

ossdev

unread,
May 11, 2026, 8:29:51 AM (10 days ago) May 11
to icu-design, ossdev, Fredrik Roubert, icu-design, marku...@gmail.com
Dear ICU team & users,
The above proposal is awaiting discussion and approval. Ticket number does not exist in yet pending approval.
Please provide feedback by: next Monday, 2026-05-18
Designated API reviewer: Markus

Thanks

ossdev

unread,
May 15, 2026, 5:24:12 AM (6 days ago) May 15
to icu-design, ossdev, Fredrik Roubert, icu-design, marku...@gmail.com
Hi Markus,
From the earlier post, I gathered there is interest in implementing safe buffers just needs more direction as to how to proceed. I wanted to check if there's anything that could strengthen the Safe Buffers proposal from May 10, or if there's a particular format or template you'd prefer for API proposals like this one? Thanks.

ossdev

unread,
May 20, 2026, 2:02:13 AM (yesterday) May 20
to icu-design, ossdev, Fredrik Roubert, icu-design
Hello Fredrik & ICU team,
Any thoughts or questions on the proposal? I am able to open pull request in github. The proposal is suggesting to replace unsafe functions with safe versions using ICU classes and objects i.e ( CharString). 
Thanks

On Sunday, May 10, 2026 at 2:22:16 PM UTC+5 ossdev wrote:

Fredrik Roubert

unread,
9:04 AM (7 hours ago) 9:04 AM
to devtes...@gmail.com, icu-design
On Sun, May 10, 2026 at 11:22 AM ossdev <devtes...@gmail.com> wrote:

> inline void uprv_strcpy(CharString &dst, const char* src, UErrorCode &status) {

I don't understand: When would it be preferable to call this helper
function instead of directly calling CharString::copyFrom() or
CharString::append()?

--
Fredrik Roubert
rou...@google.com
Reply all
Reply to author
Forward
0 new messages