Fixed Size Strings.


Jake Arkinstall

Mar 7, 2019, 4:41:52 AM
to std-pr...@isocpp.org
Hi all,

I would like to propose a standardised fixed size string type, fixed_string<N> as a thin wrapper of char[N+1].

The standard does not guarantee that std::array<char, N+1> occupies exactly N+1 bytes (implementations are allowed to store extra data in it), which is why char[N+1] makes more sense than std::array for this purpose: the size should be exactly N+1 chars.

The name is debatable. "Fixed" has a number of meanings and I don't like the idea of someone thinking the objective is to "fix" something broken with std::string, but this name is common to many existing implementations. The +1 is also debatable. Given the number of functions in the wild that assume null termination, that extra char provides safety. Overwriting that final char is then classed as UB. Having non-null chars after the first null char is also UB. The primary focus should be on providing an intuitive interface that makes it easier to stick to defined behaviour.
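
To make the layout concrete, here is a minimal sketch of the kind of wrapper I mean. It is not the interface I intend to propose: the member name buf and the functions capacity(), size() and c_str() are placeholders chosen for illustration, and the truncating constructor is just one possible policy.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical sketch only; the member and function names are placeholders.
template <std::size_t N>
struct fixed_string {
    char buf[N + 1] = {};  // zero-filled, so buf[N] starts out as '\0'

    fixed_string() = default;

    // Copies at most N chars and stops at the first null in src, so the
    // buffer never holds a non-null char after its first null char.
    explicit fixed_string(std::string_view src) {
        src = src.substr(0, src.find('\0'));
        std::copy_n(src.data(), std::min(src.size(), N), buf);
    }

    static constexpr std::size_t capacity() { return N; }
    std::size_t size() const { return std::strlen(buf); }
    const char* c_str() const { return buf; }

    // Index N is the terminator and is deliberately not reachable here.
    char& operator[](std::size_t i) {
        assert(i < N);
        return buf[i];
    }
};

// Holds on mainstream ABIs; the standard does not strictly forbid padding.
static_assert(sizeof(fixed_string<15>) == 16, "expected exactly N + 1 chars");
```

With this sketch, fixed_string<4>("hello") stores "hell" followed by the terminator, and c_str() can be handed directly to any C function that expects null termination.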

I have worked with many 3rd party libraries involving low latency transport, and fixed size string implementations are to be found in the majority of those. I use fixed size strings in my own libraries, and I have also seen them in a variety of open source codebases. The benefits are the same as those std::array has over std::vector, or char[N] has over char*: the value is stack allocated and its length is encoded in its type. The upside of standardisation is to eventually introduce compatibility between such libraries, but the primary use for me is easier, type safe wrapping of C interfaces (e.g. sockets) without resorting to explicit use of C string functions.

Main benefits:

- Easy and efficient to transmit/read. A struct containing them can be reinterpret_cast to a char array and sent over a socket or written to a file directly, assuming the struct is packed. This is quite common when using UDP, for example, but currently through a char[N] interface (and with all the C functions that come with it).

- Efficient to process. The string data is held within the body of any struct holding it, so you benefit from cache locality and stack allocation. The max size is known, so a binary (lower bound style) search for the null char will suffice in larger strings (hence non-null chars after the first null char being UB).

- Type safety. There are many cases in which runtime size checking is unnecessary, such as conversion from smaller/larger fixed strings (via a fixed size memcpy rather than an strcpy, with conversion from larger strings truncating the input). Concatenation of a fixed_string<N> and a fixed_string<M> results in a fixed_string<N+M>, though this will require a search for the null char in the left operand.
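
The three points above can be sketched roughly as follows. Everything here is an assumption for illustration: the names buf, resize and wire_record are hypothetical, and I have used std::partition_point as the predicate form of a lower bound search. The search is only valid because of the rule that no non-null char may follow the first null char.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical aggregate: fixed_string<5> s{"abc"}; zero-fills the rest.
template <std::size_t N>
struct fixed_string {
    char buf[N + 1] = {};

    // Binary search for the first null char. Valid only because every char
    // after the first null is also null, so the buffer is partitioned.
    std::size_t size() const {
        assert(buf[N] == '\0');  // the final char must never be overwritten
        const char* first_null = std::partition_point(
            buf, buf + N, [](char c) { return c != '\0'; });
        return static_cast<std::size_t>(first_null - buf);
    }
};

// Conversion between sizes: one fixed-size memcpy, no runtime length check.
// A larger input is truncated to M chars; out.buf[M] stays '\0'.
template <std::size_t M, std::size_t N>
fixed_string<M> resize(const fixed_string<N>& in) {
    fixed_string<M> out;  // zero-filled by the default member initialiser
    constexpr std::size_t n = M < N ? M : N;
    std::memcpy(out.buf, in.buf, n);
    return out;
}

// Concatenation: the result type carries the combined capacity N + M.
template <std::size_t N, std::size_t M>
fixed_string<N + M> operator+(const fixed_string<N>& a,
                              const fixed_string<M>& b) {
    fixed_string<N + M> out;
    const std::size_t len = a.size();  // search in the left operand only
    std::memcpy(out.buf, a.buf, len);
    std::memcpy(out.buf + len, b.buf, M);  // b's trailing chars are all null
    return out;
}

// A record made only of fixed strings has no padding and can be copied or
// sent as raw bytes (checked here for mainstream ABIs).
struct wire_record {
    fixed_string<7> symbol;
    fixed_string<3> venue;
};
static_assert(sizeof(wire_record) == 12, "8 + 4 bytes, no padding expected");
static_assert(std::is_trivially_copyable<wire_record>::value,
              "safe to memcpy or write to a socket buffer");
```

For example, fixed_string<5>{"abc"} + fixed_string<3>{"de"} yields a fixed_string<8> holding "abcde", and resize<2> of that result holds "ab".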

The interface should have much of the functionality we enjoy with std::string. End iterators can come in two varieties: end(), which points at the first null char, and fixed_end(), which is begin() + N + 1; the full buffer can also be exposed as a view through the ranges interface.
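
As a rough illustration of the two end iterators (again with a hypothetical member name buf, and raw pointers standing in for whatever iterator type the real interface would use):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical sketch of the two iterator ranges.
template <std::size_t N>
struct fixed_string {
    char buf[N + 1] = {};

    const char* begin() const { return buf; }

    // Logical end: the first null char (buf + N if all N chars are used).
    const char* end() const {
        assert(buf[N] == '\0');  // terminator must be intact
        return std::find(buf, buf + N, '\0');
    }

    // Physical end: one past the whole N + 1 char buffer.
    const char* fixed_end() const { return buf + N + 1; }
};
```

A range-based for loop over fixed_string<6>{"abc"} then visits three chars, while [begin(), fixed_end()) always spans seven, which is the range you would hand to a socket or file write.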

I have an interface in mind, but I'm looking for criticism and further ideas to the above before proposing the interface itself. I'm aware that this idea is far from unique (that's actually the main reason I believe standardisation is beneficial) and there are a variety of tried and tested approaches.

Thanks, 
Jake

Andrew Tomazos

Mar 7, 2019, 5:20:53 AM
to std-pr...@isocpp.org, jake.ar...@gmail.com
Hi Jake,

I've put this on hold because I think we're going to get a constexpr std::string in the not too distant future. The main motivation was to have a compile-time string (formally, a string class of literal type) for reflection.

But if you want to pick it up I'd be happy to work with you.

Regards,
Andrew.



Jake Arkinstall

Mar 7, 2019, 5:53:14 AM
to Andrew Tomazos, std-pr...@isocpp.org
Hi Andrew,

Thanks for this, I wasn't aware of an existing proposal (and was shocked when a Google search revealed a few Stack Overflow posts and a couple of open source projects, but not much else).

The progress of constexpr functionality to dynamic allocation does take away a few of the potential benefits of fixed size strings vs std::string, though with the focus on fixed binary size and the benefits that come with it in terms of I/O and cache locality, I think a revival might be worthwhile. I'll give this a read and get back to you.

Thanks again,
Jake