
[ANN] UXStrings package available (UXS_20210207).


Blady

Feb 8, 2021, 6:22:16 AM
UXStrings is now available on Github with the whole API implemented
(version UXS_20210207 [1]).

The objectives are Unicode support and dynamic-length strings, which
are close to those of VSS [2] from AdaCore.

However, the UXStrings API is inspired by Ada.Strings.Unbounded in
order to minimize the work of adapting existing Ada source code.
Gnoga and Zanyblue have been adapted to UXString successfully, see the
Gnoga announcement [3].

This is a first proof-of-concept (POC) implementation. UTF-8 encoding
is chosen for the internal representation. The Strings_Edit [4] library
is used for UTF-8 encoding management.
It has not been tested intensively; this implementation is meant to
demonstrate the possible uses of UXString. A test program demonstrating
some features is also provided [5].

See the readme [6] for full details.

Comments are welcome, especially on the specification [7] ;-)

Enjoy, Pascal.

[1] https://github.com/Blady-Com/UXStrings/releases/tag/UXS_20210207
[2] https://github.com/AdaCore/VSS
[3] https://sourceforge.net/p/gnoga/mailman/message/37199377/
[4] http://www.dmitry-kazakov.de/ada/strings_edit.htm
[5] https://github.com/Blady-Com/UXStrings/blob/master/tests/test_uxstrings.adb
[6] https://github.com/Blady-Com/UXStrings/blob/master/readme.md
[7] https://github.com/Blady-Com/UXStrings/blob/master/src/uxstrings1.ads

Emmanuel Briot

Feb 11, 2021, 3:19:26 AM
There is clearly a need here, given the number of implementations out there. I had also implemented GNATCOLL.Strings 4 years ago, with similar goals to yours:
- Unicode support (via generic formal parameters and traits packages, so you can use UTF8, UTF16,... internally)
- unbounded strings (with optional copy-on-write)
- task safety (using traits to choose what kind of counter to use)
- performance (small-string optimization: no memory alloc for strings of 18 characters or less)
- extended API (all missing subprograms from Ada.Strings.Unbounded)
- extensive testing

I must admit I am not sure why AdaCore chose to write VSS instead of improving one of their string implementations (ada.strings.unbounded, gnatcoll.strings,...).
My initial idea had been that it would be possible to provide a nice generic package, highly configurable via traits, on top of which we could reimplement ada.strings.unbounded,
ada.strings.bounded,... but I left AdaCore before that could be accomplished.

I took a look at VSS and found the API confusing. Your UXString API is at least much clearer (if lacking doc at the moment :-)

I am hoping that the work on Alire (Ada package manager) will ultimately help us find one implementation that is good enough for everyone,
and could ultimately become part of the language.

Emmanuel

Blady

Feb 27, 2021, 4:14:24 AM
On 11/02/2021 at 09:19, Emmanuel Briot wrote:
> There is clearly a need here, given the number of implementations out there. I had also implemented GNATCOLL.Strings 4 years ago, with similar goals to yours:
> - Unicode support (via generic formal parameters and traits packages, so you can use UTF8, UTF16,... internally)
> - unbounded strings (with optional copy-on-write)
> - task safety (using traits to choose what kind of counter to use)
> - performance (small-string optimization: no memory alloc for strings of 18 characters or less)
> - extended API (all missing subprograms from Ada.Strings.Unbounded)
> - extensive testing
>
> I must admit I am not sure why AdaCore chose to write VSS instead of improving one of their string implementations (ada.strings.unbounded, gnatcoll.strings,...).
> My initial idea had been that it would be possible to provide a nice generic package, highly configurable via traits, on top of which we could reimplement ada.strings.unbounded,
> ada.strings.bounded,... but I left AdaCore before that could be accomplished.

I'm preparing an optimization for when the character set is reduced, so
that the internal structure adapts to the actual content.
But the memory management is poor and the set of APIs is very basic.
I'd be glad if you could help.

> I took a look at VSS and found the API confusing. Your UXString API is at least much clearer (if lacking doc at the moment :-)

Some documentation has been added in the form of comments on each API:
https://github.com/Blady-Com/UXStrings/commit/2bee0ab61841f5e319533b67d2747dda66aa9bd7#diff-90cde6014508061fab9d62e58b327815a954859e5da8a1fd655fa4e5854e7ac5

> I am hoping that the work on Alire (Ada package manager) will ultimately help us find one implementation that is good enough for everyone,
> and could ultimately become part of the language.

Alire registration is on the way:
https://github.com/alire-project/alire-index/pull/250

Pascal.

Blady

Mar 6, 2021, 1:13:27 PM
UXStrings is now available with Alire
(https://alire.ada.dev/crates/uxstrings). In your Alire project, just
add the UXStrings dependency:

% alr with uxstrings

You can then import the UXStrings package in your programs.

Pascal.

PS: for French readers, while registering UXStrings on Alire, I took the
opportunity to write a short how-to on Alire:
https://blady.pagesperso-orange.fr/a_savoir.html#alire


Blady

Apr 11, 2021, 4:45:59 AM
On 06/03/2021 at 19:13, Blady wrote:
> UXStrings is now available with Alire
> (https://alire.ada.dev/crates/uxstrings). In your Alire project, just
> add the UXStrings dependency:
>
> % alr with uxstrings
>
> You can then import the UXStrings package in your programs.

> PS: for French readers, while registering UXStrings on Alire, I took the
> opportunity to write a short how-to on Alire:
> https://blady.pagesperso-orange.fr/a_savoir.html#alire

Hello,

A second POC implementation of UXStrings is provided. Its source file
names end with the number 2, for instance "uxstrings2.ads".
https://github.com/Blady-Com/UXStrings/blob/master/src/uxstrings2.ads

A GNAT project file "uxstrings2.gpr" is provided with some naming
conventions for both packages UXStrings and UXStrings.Text_IO.

Some APIs have been added to support 7-bit ASCII encoding in both
UXStrings 1 and 2. ASCII is a subset of UTF-8, so the internal UTF-8
representation is unchanged.

However, unlike the UXStrings 1 implementation, the API now knows
whether the content is entirely ASCII. On the one hand, this allows
direct access to the position of a character without iterating over
UTF-8 characters, which saves time when the content is entirely ASCII.
On the other hand, whenever the content changes, the API checks whether
the new content is entirely ASCII, which costs time when the changes
are not entirely ASCII.
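
To illustrate the trade-off described above, here is a minimal sketch,
written in Python rather than Ada so that it stays self-contained. It is
not UXStrings code: the class and method names (Utf8Text, append,
char_at) are hypothetical, and the actual implementation in
uxstrings2.ads may well differ. It only shows the idea of caching an
"all ASCII" flag next to a UTF-8 buffer.

```python
class Utf8Text:
    """Toy string kept as UTF-8 bytes plus a cached 'all ASCII' flag.

    Hypothetical illustration only, not the UXStrings API.
    """

    def __init__(self, text: str = "") -> None:
        self._data = text.encode("utf-8")
        # Computed once here and updated on every change (write-side cost).
        self._ascii = self._data.isascii()

    @property
    def is_ascii(self) -> bool:
        return self._ascii

    def append(self, text: str) -> None:
        new = text.encode("utf-8")
        self._data += new
        # The new content has to be scanned to keep the flag up to date.
        self._ascii = self._ascii and new.isascii()

    def _next(self, pos: int) -> int:
        """Return the byte offset just past the character starting at pos."""
        if pos >= len(self._data):
            raise IndexError("character position out of range")
        pos += 1
        # Continuation bytes of a UTF-8 sequence look like 10xxxxxx.
        while pos < len(self._data) and (self._data[pos] & 0xC0) == 0x80:
            pos += 1
        return pos

    def char_at(self, index: int) -> str:
        """Return the character at 1-based position index."""
        if index < 1:
            raise IndexError("character position out of range")
        if self._ascii:
            # Fast path: one byte per character, so direct access works.
            return chr(self._data[index - 1])
        # General path: walk the UTF-8 sequences one character at a time.
        pos = 0
        for _ in range(index - 1):
            pos = self._next(pos)
        return self._data[pos:self._next(pos)].decode("utf-8")


text = Utf8Text("hello")
first = text.char_at(2)    # direct byte access, content is all ASCII
text.append(" caf\u00e9")  # flag is cleared, later lookups must iterate
last = text.char_at(10)    # walks the UTF-8 sequences
```

In this sketch a lookup takes constant time while the flag is set and
time proportional to the position otherwise; the price is the scan done
in append each time the content changes.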

English content such as program source files consists mostly of lines
that are entirely ASCII, but some lines may contain characters outside
the ASCII set. UXStrings deals with both.

Available on GitHub (https://github.com/Blady-Com/UXStrings) and also on
Alire (https://alire.ada.dev/crates/uxstrings.html).

Feedback on the actual time improvement in your real use cases is welcome.

Thanks, Pascal.