2025-09-09 13:50 UTC+0200 Przemyslaw Czerpak (druzus/at/poczta.onet.pl)
* include/hbdefs.h
+ added new types HB_WCHAR16 and HB_WCHAR32, existing type HB_WCHAR
is mapped to HB_WCHAR16 (just like before)
* include/hbapicdp.h
* src/harbour.def
* src/rtl/cdpapi.c
+ added new C functions for encoding and decoding UTF-8 strings using
HB_WCHAR32:
int hb_cdpU32CharToUTF8( char * szUTF8, HB_WCHAR32 wc );
HB_BOOL hb_cdpUTF8GetU32( const char * pSrc, HB_SIZE nLen,
HB_SIZE * pnIndex, HB_WCHAR32 * pWC );
HB_BOOL hb_cdpUTF8GetUCS( const char * pSrc, HB_SIZE nLen,
HB_SIZE * pnIndex, HB_WCHAR32 * pWC );
HB_BOOL hb_cdpUTF8GetU16( const char * pSrc, HB_SIZE nLen,
HB_SIZE * pnIndex, HB_WCHAR16 * pWC );
HB_BOOL hb_cdpUTF8Validate( const char * pSrc, HB_SIZE nLen );
They support full UCS and are much stricter about errors and invalid
UTF-8 encoding, e.g. overlong encodings are now rejected.
Invalid characters are translated to 0xFFFD and then, if that character
does not exist in the final CP, to the ASCII character '?'.
* declarations of the following UTF-8 C functions have been changed to
operate on HB_WCHAR32 instead of HB_WCHAR:
int hb_cdpUTF8CharSize( HB_WCHAR32 wc );
HB_WCHAR32 hb_cdpUTF8StringPeek( const char * pSrc, HB_SIZE nLen,
HB_SIZE nPos );
* the following C functions have been changed to internally operate on
HB_WCHAR32 instead of HB_WCHAR:
hb_cdpUTF8StringLength()
hb_cdpUTF8StringAt()
hb_cdpUTF8StringSubstr()
* the following C functions have been changed to use new hb_cdpUTF8GetU*()
instead of step-by-step decoding with hb_cdpUTF8ToU16NextChar():
hb_cdpStrToUTF8Disp()
hb_cdpUTF8AsStrLen()
hb_cdpUTF8ToStr()
hb_cdpStrToU16()
hb_cdpUtf8Char()
* use HB_CDP_ERROR_* macros to mark wrong encoding
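
The stricter decoding described above (overlong encodings rejected, invalid
input reported as 0xFFFD) can be illustrated with a minimal standalone sketch.
It is not the code from src/rtl/cdpapi.c: sketch_utf8_next() and
SKETCH_REPLACEMENT are hypothetical names, uint32_t stands in for HB_WCHAR32,
and the sketch assumes the Unicode range up to 0x10FFFF and skips one byte
after an error, which may differ from what the hb_cdpUTF8GetU*() functions do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_REPLACEMENT 0xFFFDu   /* U+FFFD REPLACEMENT CHARACTER */

/* Hypothetical helper, not part of the Harbour API.
 * Decodes one code point starting at src[ *idx ] and advances *idx.
 * Returns 1 for a valid sequence. For malformed input (or when *idx is
 * already at the end) it returns 0; in the malformed case it stores
 * U+FFFD in *out and skips exactly one byte. */
static int sketch_utf8_next( const char * src, size_t len,
                             size_t * idx, uint32_t * out )
{
   /* smallest code point that legitimately needs 1, 2, 3 or 4 bytes */
   static const uint32_t min_for_len[] = { 0, 0, 0x80, 0x800, 0x10000 };
   size_t i, n, k;
   unsigned char c;
   uint32_t wc = 0;
   int ok;

   assert( src != NULL && idx != NULL && out != NULL );

   i = *idx;
   if( i >= len )
   {
      *out = 0;
      return 0;                      /* nothing left to decode */
   }

   c = ( unsigned char ) src[ i ];
   if( c < 0x80 )                  { wc = c;        n = 1; }
   else if( ( c & 0xE0 ) == 0xC0 ) { wc = c & 0x1F; n = 2; }
   else if( ( c & 0xF0 ) == 0xE0 ) { wc = c & 0x0F; n = 3; }
   else if( ( c & 0xF8 ) == 0xF0 ) { wc = c & 0x07; n = 4; }
   else                            n = 0;   /* stray or invalid lead byte */

   ok = n != 0 && n <= len - i;      /* reject truncated sequences */
   for( k = 1; ok && k < n; ++k )
   {
      unsigned char cc = ( unsigned char ) src[ i + k ];
      if( ( cc & 0xC0 ) != 0x80 )
         ok = 0;                     /* not a continuation byte */
      else
         wc = ( wc << 6 ) | ( cc & 0x3F );
   }

   /* reject overlong forms, UTF-16 surrogates and values above 0x10FFFF */
   if( ok && ( wc < min_for_len[ n ] || wc > 0x10FFFF ||
               ( wc >= 0xD800 && wc <= 0xDFFF ) ) )
      ok = 0;

   if( ok )
   {
      *out = wc;
      *idx = i + n;
      return 1;
   }
   *out = SKETCH_REPLACEMENT;
   *idx = i + 1;
   return 0;
}
```

A caller would loop while the index is below the string length and, in a
later step, map 0xFFFD to '?' when the target CP has no such character.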
* src/rtl/cdpapihb.c
* the following UTF-8 PRG functions have been changed to operate on
HB_WCHAR32 instead of HB_WCHAR:
hb_utf8Chr()
hb_utf8Asc()
hb_utf8Poke()
hb_utf8Peek()
Other UTF-8 PRG functions have been adapted to HB_WCHAR32 by changes
in corresponding C functions.
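
The practical effect of moving from a 16-bit to a 32-bit character type is
that code points above 0xFFFF, which need four UTF-8 bytes, can be passed as
a single value. The sketch below shows the encoding side under the same
assumptions as a plain Unicode encoder: sketch_utf8_size() and
sketch_utf8_encode() are hypothetical names, uint32_t stands in for
HB_WCHAR32, and the range is limited to 0x10FFFF. It is not the Harbour
implementation of hb_cdpUTF8CharSize() or hb_cdpU32CharToUTF8().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of UTF-8 bytes needed for a code point
 * in the range 0..0x10FFFF. */
static int sketch_utf8_size( uint32_t wc )
{
   if( wc < 0x80 )
      return 1;
   if( wc < 0x800 )
      return 2;
   if( wc < 0x10000 )
      return 3;
   return 4;
}

/* Hypothetical helper: writes the UTF-8 form of wc into dst (at least
 * 4 bytes) and returns the number of bytes written. Surrogates and
 * values above 0x10FFFF are replaced by U+FFFD in this sketch. */
static int sketch_utf8_encode( char * dst, uint32_t wc )
{
   int n;

   assert( dst != 0 );

   if( wc > 0x10FFFF || ( wc >= 0xD800 && wc <= 0xDFFF ) )
      wc = 0xFFFD;

   n = sketch_utf8_size( wc );
   switch( n )
   {
      case 1:
         dst[ 0 ] = ( char ) wc;
         break;
      case 2:
         dst[ 0 ] = ( char ) ( 0xC0 | ( wc >> 6 ) );
         dst[ 1 ] = ( char ) ( 0x80 | ( wc & 0x3F ) );
         break;
      case 3:
         dst[ 0 ] = ( char ) ( 0xE0 | ( wc >> 12 ) );
         dst[ 1 ] = ( char ) ( 0x80 | ( ( wc >> 6 ) & 0x3F ) );
         dst[ 2 ] = ( char ) ( 0x80 | ( wc & 0x3F ) );
         break;
      default:
         dst[ 0 ] = ( char ) ( 0xF0 | ( wc >> 18 ) );
         dst[ 1 ] = ( char ) ( 0x80 | ( ( wc >> 12 ) & 0x3F ) );
         dst[ 2 ] = ( char ) ( 0x80 | ( ( wc >> 6 ) & 0x3F ) );
         dst[ 3 ] = ( char ) ( 0x80 | ( wc & 0x3F ) );
         break;
   }
   return n;
}
```

A value such as 0x1F600 does not fit in a 16-bit type at all, so it can only
be handled as one character once the interface takes a 32-bit value.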
* src/codepage/cp_utf8.c
* use new function hb_cdpUTF8GetU16() to decode UTF-8 strings in UTF8EX CP
* src/rtl/arc4.c
+ added new macro HB_NO_SYSCTL which allows disabling sysctl() in Linux
builds for GLIBC < 2.30
best regards
Przemek