[boost] Variadic append for std::string

Olaf van der Spek

Dec 27, 2016, 9:48:01 AM

to bo...@lists.boost.org

Hi,

One frequently needs to append stuff to strings, but the standard way
(s += "A" + "B" + to_string(42)) isn't optimal due to temporaries.
A variadic append() for std::string seems like the obvious solution.
It could support string_view (boost and std), integers, and maybe floats,
but without formatting options.
It could even be extensible: a user-defined type t would be supported by
providing a two-argument append(s, t) overload.

append(s, "A", "B", 42);

Would this be useful for the Boost String Algo lib?

--
Olaf

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Billy O'Neal (VC LIBS)

Dec 27, 2016, 12:25:53 PM

to bo...@lists.boost.org

Pretty please?

Also concat(stringy_things...)

Billy3

Yakov Galka

Dec 28, 2016, 7:20:28 PM

to bo...@lists.boost.org

On Tue, Dec 27, 2016 at 4:47 PM, Olaf van der Spek <m...@vdspek.org> wrote:

> One frequently needs to append stuff to strings, but the standard way
> (s += "A" + "B" + to_string(42)) isn't optimal due to temporaries.
>

Can't we already write it through (((s += "A") += "B") += to_string(42))?
This is the time I think that assignment operators, other than =, should
have had left associativity... pity they don't.

--
Yakov Galka
http://stannum.co.il/

Nat Goodspeed

Dec 28, 2016, 7:44:00 PM

to bo...@lists.boost.org

On Dec 28, 2016 7:20 PM, "Yakov Galka" <ybunga...@gmail.com> wrote:

> On Tue, Dec 27, 2016 at 4:47 PM, Olaf van der Spek <m...@vdspek.org> wrote:
>
>> One frequently needs to append stuff to strings, but the standard way
>> (s += "A" + "B" + to_string(42)) isn't optimal due to temporaries.
>
> Can't we already write it through (((s += "A") += "B") += to_string(42))?
> This is the time I think that assignment operators, other than =, should
> have had left associativity... pity they don't.

I think what's desired here is a two-pass approach in which append() or
concat() or whatever first figures out the final length required, allocates
that much storage, then appends into it with no further expansion.

When we say it should understand two kinds of string_view, etc., I assume
that the catch-all case for each arg would be "anything that can be treated
as a range of char_type."
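
[Editor's sketch] The two-pass idea described above could look roughly like this. It is a minimal illustration, not the proposed interface: the name append_all is hypothetical, it assumes C++17, and it only accepts arguments convertible to std::string_view. Arguments must not alias the target string, because reserve() may reallocate it.

```cpp
#include <cstddef>
#include <initializer_list>
#include <string>
#include <string_view>

// Hypothetical two-pass append (assumed name, not from the thread).
// Pass 1 measures, then one reserve(), then pass 2 copies.
template <class... Args>
std::string& append_all(std::string& s, const Args&... args) {
    // View every argument without copying its characters.
    const std::initializer_list<std::string_view> parts{std::string_view(args)...};

    std::size_t total = s.size();
    for (std::string_view p : parts) total += p.size();  // pass 1: final length

    s.reserve(total);                                     // at most one reallocation

    for (std::string_view p : parts) s.append(p);         // pass 2: no further growth
    return s;
}
```

A single-pass input range cannot be measured first, so it would need a separate fallback that appends element by element.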

Olaf van der Spek

Dec 29, 2016, 3:54:49 AM

to bo...@lists.boost.org

On Thu, Dec 29, 2016 at 1:19 AM, Yakov Galka <ybunga...@gmail.com> wrote:
> On Tue, Dec 27, 2016 at 4:47 PM, Olaf van der Spek <m...@vdspek.org> wrote:
>
>> One frequently needs to append stuff to strings, but the standard way
>> (s += "A" + "B" + to_string(42)) isn't optimal due to temporaries.
>>
>
> Can't we already write it through (((s += "A") += "B") += to_string(42))?
> This is the time I think that assignment operators, other than =, should
> have had left associativity... pity they don't.

We can, but it's ugly and I'd like to avoid the explicit to_string. It
also wouldn't allow the two-pass optimization to calculate the final
length before allocation.

--
Olaf

Olaf van der Spek

Dec 29, 2016, 3:56:33 AM

to bo...@lists.boost.org

On Thu, Dec 29, 2016 at 1:43 AM, Nat Goodspeed <n...@lindenlab.com> wrote:
> I think what's desired here is a two-pass approach in which append() or
> concat() or whatever first figures out the final length required, allocates
> that much storage, then appends into it with no further expansion.
>
> When we say it should understand two kinds of string_view, etc., I assume
> that the catch-all case for each arg would be "anything that can be treated
> as a range of char_type."

Sounds good.. but if you've got an input range (not readable twice)
then you can no longer apply the two-pass optimization.
The interface certainly allows it.

--
Olaf

Andrey Semashev

Dec 29, 2016, 8:54:07 AM

to bo...@lists.boost.org

On 12/29/16 11:54, Olaf van der Spek wrote:
> On Thu, Dec 29, 2016 at 1:19 AM, Yakov Galka <ybunga...@gmail.com> wrote:
>> On Tue, Dec 27, 2016 at 4:47 PM, Olaf van der Spek <m...@vdspek.org> wrote:
>>
>>> One frequently needs to append stuff to strings, but the standard way
>>> (s += "A" + "B" + to_string(42)) isn't optimal due to temporaries.
>>>
>>
>> Can't we already write it through (((s += "A") += "B") += to_string(42))?
>> This is the time I think that assignment operators, other than =, should
>> have had left associativity... pity they don't.
>
> We can, but it's ugly and I'd like to avoid the explicit to_string. It
> also wouldn't allow the two-pass optimization to calculate the final
> length before allocation.

I already mentioned in the std-proposals discussion that I don't think
formatting should be dealt with by std::string or a function named
append(). If formatting is to be involved I'd suggest creating a
formatting library, but at that point you should provide clear
advantages over the other formatting libraries we have in Boost.

Olaf van der Spek

Dec 31, 2016, 12:36:31 PM

to bo...@lists.boost.org

On Thu, Dec 29, 2016 at 2:53 PM, Andrey Semashev
<andrey....@gmail.com> wrote:
>>>> One frequently needs to append stuff to strings, but the standard way
>>>> (s += "A" + "B" + to_string(42)) isn't optimal due to temporaries.
>>>>
>>>
>>> Can't we already write it through (((s += "A") += "B") += to_string(42))?
>>> This is the time I think that assignment operators, other than =, should
>>> have had left associativity... pity they don't.
>>
>>
>> We can, but it's ugly and I'd like to avoid the explicit to_string. It
>> also wouldn't allow the two-pass optimization to calculate the final
>> length before allocation.
>
>
> I already mentioned in the std-proposals discussion that I don't think
> formatting should be dealt with by std::string or a function named
> append().

It'd be helpful if you include *why* you think so..

> If formatting is to be involved I'd suggest creating a formatting
> library, but at that point you should provide clear advantages over the
> other formatting libraries we have in Boost.

--
Olaf

Andrey Semashev

Dec 31, 2016, 6:21:57 PM

to bo...@lists.boost.org

On 12/31/16 20:36, Olaf van der Spek wrote:
> On Thu, Dec 29, 2016 at 2:53 PM, Andrey Semashev
> <andrey....@gmail.com> wrote:
>>
>> I already mentioned in the std-proposals discussion that I don't think
>> formatting should be dealt with by std::string or a function named
>> append().
>
> It'd be helpful if you include *why* you think so..

I've already explained my opinion in the std-proposals discussion. For
the sake of completeness, here's a short version:

It's simply not std::string's job to do the formatting, IMO. This class
should be nothing more than a container of characters (well, it is
slightly more now, but I don't consider that a good thing). I guess,
that's mostly because I believe one class should be responsible for
doing only one thing, and in case of std::string it's storing a string.
Adding formatting functionality to std::string would increase the class'
bloat in terms of interface and implementation and likely add new
dependencies. Though, this is probably not an argument against a
separate non-intrusive library.

Back to the proposal for Boost, I don't mind if there is a standalone
function or library that does the formatting, as long as it offers some
advantage over the existing libraries. The proposed function though
should not be named `append` IMO because it's not the primary thing the
function does. I would expect an `append` algorithm to be generic and
compatible with any container, i.e. something that does nothing more
than `c.insert(c.end(), x)` or `c.append(x)`, for `x` being every
argument in the list of arguments to be appended. I.e. this should work:

std::list< double > c{ 1.0, 2.0, 3.0 };
append(c, 10.0, 20.0, 30.0); // calls c.insert()

As well as this:

std::string s{ "Hello" };
append(s, ", world!", " Happy 2017! :)"); // calls s.append()

This, however:

append(s, 47);

should result not in appending "47" but in appending "/" (a character
with code 47). I can see how this could be confusing to someone, but
that is what you'd get from calling `s.insert()` manually, and what I'd
expect from a function called `append`.
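
[Editor's sketch] The container-only semantics described above can be written down in a few lines of C++17. This is an assumption-laden illustration, not code from the thread: has_append is a hypothetical detection trait, and the function simply prefers c.append(x) when that expression compiles and falls back to c.insert(c.end(), x) otherwise.

```cpp
#include <list>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical trait: true when c.append(x) is a valid expression.
template <class C, class T, class = void>
struct has_append : std::false_type {};

template <class C, class T>
struct has_append<C, T,
    std::void_t<decltype(std::declval<C&>().append(std::declval<const T&>()))>>
    : std::true_type {};

// Generic append without any formatting: only container operations.
template <class C, class... Args>
C& append(C& c, const Args&... args) {
    auto add_one = [&c](const auto& x) {
        using X = std::decay_t<decltype(x)>;
        if constexpr (has_append<C, X>::value)
            c.append(x);           // e.g. std::string with a string-like argument
        else
            c.insert(c.end(), x);  // one element, converted to the value type
    };
    (add_one(args), ...);
    return c;
}
```

With this sketch, append(s, 47) on a std::string goes through insert() and stores the single character with code 47, which matches the behaviour argued for above.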

If formatting is required I would prefer to be required to spell my
intent more clearly, like this:

print(s, 47);

or:

format(s) << 47;

Also, I'm not clear enough about the intended use cases of the proposed
library. Is the goal just to optimize memory allocation? Is that at all
possible when formatting is involved? Would it be better than snprintf
into a local buffer?

Does the library open new use cases? For example, someone suggested in
the std-proposals discussion something similar to this:

throw std::runtime_error(format(std::string()) << "Error " << 47);

(I wrapped the default-constructed std::string() into format(), because
I don't think overloading operator<< for std::string is an acceptable
approach for the same reasons I mentioned above.)

I think, something with one line capability like that would be useful.
Would the library allow something like this?

Would the library support targets other than std::string? E.g. would I
be able to format into an `std::array< char, 10 >`?

Olaf van der Spek

Jan 3, 2017, 5:14:51 AM

to bo...@lists.boost.org

On Sun, Jan 1, 2017 at 12:21 AM, Andrey Semashev
<andrey....@gmail.com> wrote:
> If formatting is required I would prefer to be required to spell my intent
> more clearly, like this:
>
> print(s, 47);

I'd expect print to output to cout.. wouldn't you?
sprint then?

> or:
>
> format(s) << 47;

I'd expect format to accept modifiers which this proposal explicitly
doesn't support.

> Also, I'm not clear enough about the intended use cases of the proposed
> library. Is the goal just to optimize memory allocation?

No, the goal is also to provide a better and simpler way to handle integers.

> Is that at all
> possible when formatting is involved?

Yes, as manually calling reserve beforehand is always possible.
How to optimally implement this is still an open question but that's
kind of an implementation detail.

> Would it be better than snprintf into
> a local buffer?

The resulting code certainly looks simpler to me.

> Does the library open new use cases? For example, someone suggested in the
> std-proposals discussion something similar to this:
>
> throw std::runtime_error(format(std::string()) << "Error " << 47);
>
> (I wrapped the default-constructed std::string() into format(), because I
> don't think overloading operator<< for std::string is an acceptable approach
> for the same reasons I mentioned above.)

I disagree.. I really don't see the benefit, especially for the user,
of the format wrapper.

operator<< would be a different proposal but throw
runtime_error(append(string(), "Error ", 47)); might work.

> I think, something with one line capability like that would be useful. Would
> the library allow something like this?
>
> Would the library support targets other than std::string?

Yes, probably.

> E.g. would I be
> able to format into an `std::array< char, 10 >`?

No, as array is fixed-size you can't append to it..
vector<char> might work though.

--
Olaf

Olaf van der Spek

Jan 3, 2017, 6:03:11 AM

to bo...@lists.boost.org

On Thu, Dec 29, 2016 at 1:19 AM, Yakov Galka <ybunga...@gmail.com> wrote:

> On Tue, Dec 27, 2016 at 4:47 PM, Olaf van der Spek <m...@vdspek.org> wrote:
>
>> One frequently needs to append stuff to strings, but the standard way
>> (s += "A" + "B" + to_string(42)) isn't optimal due to temporaries.
>>
>
> Can't we already write it through (((s += "A") += "B") += to_string(42))?
> This is the time I think that assignment operators, other than =, should
> have had left associativity... pity they don't.

<< does the trick:

s << "A" << "B" << 42;

std::string& operator<<(std::string& os, std::string_view v)
{
    return os += v;
}

std::string& operator<<(std::string& os, long long v)
{
    return os += std::to_string(v);
}

--
Olaf

Dominique Devienne

Jan 3, 2017, 7:17:57 AM

to bo...@lists.boost.org

On Thu, Dec 29, 2016 at 2:53 PM, Andrey Semashev <andrey....@gmail.com>
wrote:

> On 12/29/16 11:54, Olaf van der Spek wrote:
>> On Thu, Dec 29, 2016 at 1:19 AM, Yakov Galka <ybunga...@gmail.com> wrote:
>>> On Tue, Dec 27, 2016 at 4:47 PM, Olaf van der Spek <m...@vdspek.org> wrote:
>>>> One frequently needs to append stuff to strings, but the standard way
>>>> (s += "A" + "B" + to_string(42)) isn't optimal due to temporaries.
>>>
>>> Can't we already write it through (((s += "A") += "B") += to_string(42))?
>>> This is the time I think that assignment operators, other than =, should
>>> have had left associativity... pity they don't.
>>
>> We can, but it's ugly and I'd like to avoid the explicit to_string. It
>> also wouldn't allow the two-pass optimization to calculate the final
>> length before allocation.
>
> I already mentioned in the std-proposals discussion that I don't think
> formatting should be dealt with by std::string or a function named
> append(). If formatting is to be involved I'd suggest creating a
> formatting library, but at that point you should provide clear
> advantages over the other formatting libraries we have in Boost.

Or that exist elsewhere, e.g. https://github.com/fmtlib/fmt --DD

Andrey Semashev

Jan 3, 2017, 7:32:46 AM

to bo...@lists.boost.org

On 01/03/17 13:14, Olaf van der Spek wrote:
> On Sun, Jan 1, 2017 at 12:21 AM, Andrey Semashev
> <andrey....@gmail.com> wrote:
>> If formatting is required I would prefer to be required to spell my intent
>> more clearly, like this:
>>
>> print(s, 47);
>
> I'd expect print to output to cout.. wouldn't you?
> sprint then?

sprint works for me as well.

>> Also, I'm not clear enough about the intended use cases of the proposed
>> library. Is the goal just to optimize memory allocation?
>
> No, the goal is also to provide a better and simpler way to handle integers.
>
>> Is that at all
>> possible when formatting is involved?
>
> Yes, as manually calling reserve beforehand is always possible.
> How to optimally implement this is still an open question but that's
> kind of an implementation detail.

But you would have to either overallocate memory or perform the
formatting to determine its length. And while overallocating might be
possible for standard types such as integers and FP numbers (assuming
C-locale format), that does not seem possible for user's types. Or are
you not planning to support user-defined types?

>> I think, something with one line capability like that would be useful. Would
>> the library allow something like this?
>>
>> Would the library support targets other than std::string?
>
> Yes, probably.
>
>> E.g. would I be
>> able to format into an `std::array< char, 10 >`?
>
> No, as array is fixed-size you can't append to it..
> vector<char> might work though.

My intent was to format into a local/preallocated buffer, without any
additional allocations, but I assume that won't work because
`std::array` is lacking APIs for insertion. That probably means that you
have to define a concept of the possible target, what operations it must
support.

Olaf van der Spek

Jan 3, 2017, 7:46:12 AM

to bo...@lists.boost.org

On Tue, Jan 3, 2017 at 1:32 PM, Andrey Semashev
<andrey....@gmail.com> wrote:
>> Yes, as manually calling reserve beforehand is always possible.
>> How to optimally implement this is still an open question but that's
>> kind of an implementation detail.
>
>
> But you would have to either overallocate memory or perform the formatting
> to determine its length.

True

> And while overallocating might be possible for
> standard types such as integers and FP numbers (assuming C-locale format),
> that does not seem possible for user's types. Or are you not planning to
> support user-defined types?

Supporting such types in one big call to sprint would be nice but it
does complicate the proposal.
One could always call sprint(s, <udt>) 'manually'. Or maybe the
two-argument version could be the extension point.

Maybe a sprint_max_size(s, <udt>) could be used (if defined) to
estimate the size required.
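
[Editor's sketch] One way the size-estimate idea could work for built-in integers and string-like arguments is shown below. All names except sprint are hypothetical (max_text_size, write_one), the code assumes C++17, and other types such as floats or user-defined types are deliberately left out. The integer bound is the number of decimal digits the type can hold plus one character for a sign.

```cpp
#include <cstddef>
#include <limits>
#include <string>
#include <string_view>
#include <type_traits>

// Hypothetical helper: upper bound on the characters one argument needs.
template <class T>
std::size_t max_text_size(const T& v) {
    if constexpr (std::is_integral_v<T>) {
        // digits10 + 1 decimal digits at most, plus a sign for signed types.
        return std::numeric_limits<T>::digits10 + 1 + (std::is_signed_v<T> ? 1 : 0);
    } else {
        return std::string_view(v).size();  // exact for string-like arguments
    }
}

// Hypothetical helper: append one argument as text.
// Note: char counts as an integer here and is written as a number.
template <class T>
void write_one(std::string& s, const T& v) {
    if constexpr (std::is_integral_v<T>) s += std::to_string(v);
    else s += std::string_view(v);
}

// Over-allocates slightly, then appends without further growth.
template <class... Args>
std::string& sprint(std::string& s, const Args&... args) {
    s.reserve(s.size() + (max_text_size(args) + ... + 0));
    (write_one(s, args), ...);
    return s;
}
```

This trades a few wasted bytes per integer for not having to format twice, which is the overallocation option mentioned earlier in the thread.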

> My intent was to format into a local/preallocated buffer, without any
> additional allocations, but I assume that won't work because `std::array` is
> lacking APIs for insertion. That probably means that you have to define a
> concept of the possible target, what operations it must support.

Right, if we opt for a generic version. It would again complicate the
proposal though.

--
Olaf

Christof Donat

Jan 3, 2017, 8:19:32 AM

to bo...@lists.boost.org

Hi,

Am 01.01.2017 00:21, schrieb Andrey Semashev:
> throw std::runtime_error(format(std::string()) << "Error " << 47);

How would that differ from

throw std::runtime_error((std::ostringstream{} << "Error " << 47).str());

Christof

Olaf van der Spek

Jan 3, 2017, 8:20:32 AM

to bo...@lists.boost.org

On Tue, Jan 3, 2017 at 2:19 PM, Christof Donat <c...@okunah.de> wrote:
> Hi,
>
> Am 01.01.2017 00:21, schrieb Andrey Semashev:
>>
>> throw std::runtime_error(format(std::string()) << "Error " << 47);
>
>
> How would that differ from
>
> throw std::runtime_error((std::ostringstream{} << "Error " << 47).str());

Simpler syntax, better performance

--
Olaf

Christof Donat

Jan 3, 2017, 9:16:45 AM

to bo...@lists.boost.org

Hi,

Am 03.01.2017 14:20, schrieb Olaf van der Spek:
> On Tue, Jan 3, 2017 at 2:19 PM, Christof Donat <c...@okunah.de> wrote:
>> Am 01.01.2017 00:21, schrieb Andrey Semashev:
>>>
>>> throw std::runtime_error(format(std::string()) << "Error " << 47);
>>
>>
>> How would that differ from
>>
>> throw std::runtime_error((std::ostringstream{} << "Error " <<
>> 47).str());
>
> Simpler syntax, better performance

I see the chances for better performance, but for the syntax I don't
really see any remarkable improvements.

If performance matters, I'd try with boost::spirit::karma. The syntax
will be less concise, but I am not aware of a faster generic solution.

// assumes #include <boost/spirit/include/karma.hpp>, <iterator> and
// namespace karma = boost::spirit::karma;
std::string message;
message.reserve(10); // <- preallocate enough memory for the message
if( !karma::generate(std::back_inserter(message), "Error " << karma::uint_, 47) ) {
    // formatting the error message failed. throw something else.
}
// since this is the exit of the function, the compiler might apply
// copy elision.
throw std::runtime_error(message);

If you have multiple places like that in your code, I guess you'd like
to wrap it into a generic function, and then you have an API similar to
the "append()" proposal. Now I see how it might be useful, thanks. I
think append() should rely on karma generators then, instead of yet
another int-to-string implementation, because we already have five
dozen of those.

Christof

Olaf van der Spek

Jan 3, 2017, 9:23:49 AM

to bo...@lists.boost.org

On Tue, Jan 3, 2017 at 3:16 PM, Christof Donat <c...@okunah.de> wrote:
> Hi,
>
> Am 03.01.2017 14:20, schrieb Olaf van der Spek:
>>
>> On Tue, Jan 3, 2017 at 2:19 PM, Christof Donat <c...@okunah.de> wrote:
>>>
>>> Am 01.01.2017 00:21, schrieb Andrey Semashev:
>>>>
>>>>
>>>> throw std::runtime_error(format(std::string()) << "Error " << 47);
>>>
>>>
>>>
>>> How would that differ from
>>>
>>> throw std::runtime_error((std::ostringstream{} << "Error " <<
>>> 47).str());
>>
>>
>> Simpler syntax, better performance
>
>
> I see the chances for better performance, but for the syntax I don't really
> see
> any remarkable improvements.

The extra parentheses and the .str() part are annoying.. same goes for
boost::format.

How about this one?

throw std::runtime_error("Error "s << 47);

> If performance matters, I'd try with boost::spirit::karma. The syntax will

Performance matters but it's not the only thing that matters.
What solution do you think someone new to C++ understands better?

> If you have multiple places like that in your code, I guess, you'd like to
> wrap it
> into a generic function and you have a similar API to the "append()"
> proposal. Now
> I see, how it might be useful, thanks. I think, append() should rely on
> karma
> generators then, instead of yet another int to string implementation,
> because we
> only already have five dozens.

It's an implementation detail but yes, it might be useful.

--
Olaf

Christof Donat

Jan 3, 2017, 9:49:05 AM

to bo...@lists.boost.org

Hi,

Am 03.01.2017 15:23, schrieb Olaf van der Spek:
> On Tue, Jan 3, 2017 at 3:16 PM, Christof Donat <c...@okunah.de> wrote:
> The extra parentheses and the .str() part are annoying.. same goes for
> boost::format.

I see. For me that is not a big issue, but people are different.

> How about this one?
>
> throw std::runtime_error("Error "s << 47);

Uh. How does that work with

std::cout << "Error "s << 47;

Will that be

(std::cout << "Error "s) << 47;

or

std::cout << ("Error "s << 47);

Also I'd expect a std::string to behave like a stream then and try to
use e.g. manipulators. Maybe that would be acceptable with a different
operator. e.g. like in SQL:

throw std::runtime_error("Error "s || 47);

Now this is explicit:

std::cout << "Error "s || 47; // versus
std::cout << "Error "s << 47;

But then again this might behave surprisingly:

throw std::runtime_error("Error "s || 47 || 11);

I still don't feel comfortable with it.

>> If performance matters, I'd try with boost::spirit::karma. The syntax
>> will
>
> Performance matters but it's not the only thing that matters.
> What solution do you think someone new to C++ understands better?

I think, mixing the notion of strings and streams, but not for e.g.
manipulators would confuse people a lot. I am a big fan of overloading
operators and expression templates, wherever they improve the expression
of intent. In this particular case my gut feeling tells me, that it will
harm the expression of intent more often, than it will improve.

Christof

Andrey Semashev

Jan 3, 2017, 9:54:15 AM

to bo...@lists.boost.org

On 01/03/17 17:23, Olaf van der Spek wrote:
> On Tue, Jan 3, 2017 at 3:16 PM, Christof Donat <c...@okunah.de> wrote:
>> Hi,
>>
>> Am 03.01.2017 14:20, schrieb Olaf van der Spek:
>>>
>>> On Tue, Jan 3, 2017 at 2:19 PM, Christof Donat <c...@okunah.de> wrote:
>>>>
>>>> Am 01.01.2017 00:21, schrieb Andrey Semashev:
>>>>>
>>>>>
>>>>> throw std::runtime_error(format(std::string()) << "Error " << 47);
>>>>
>>>>
>>>>
>>>> How would that differ from
>>>>
>>>> throw std::runtime_error((std::ostringstream{} << "Error " <<
>>>> 47).str());
>>>
>>>
>>> Simpler syntax, better performance
>>
>>
>> I see the chances for better performance, but for the syntax I don't really
>> see
>> any remarkable improvements.
>
> The extra parentheses and the .str() part are annoying.. same goes for
> boost::format.
>
> How about this one?
>
> throw std::runtime_error("Error "s << 47);

Well, if we're using UDLs, we might as well use my `format` proposal or
a stream or Boost.Format behind the scenes.

throw std::runtime_error("Error "fmt << 47);
throw std::runtime_error("Error "strm << 47);
throw std::runtime_error("Error %d"fmt % 47);

Each of the UDL operators would create a wrapper that implements
formatting and is convertible to std::string. No need to infect
std::string itself with formatting.
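
[Editor's sketch] A wrapper of the kind described above might be sketched as follows. The class name text_builder is hypothetical, and the suffix is spelled _fmt here because literal suffixes without a leading underscore are reserved for the standard library. Only string-like arguments and integers are handled; this assumes C++17.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical wrapper: owns the buffer, does the formatting,
// and converts to std::string when the result is needed.
class text_builder {
public:
    explicit text_builder(std::string s) : buf_(std::move(s)) {}

    text_builder& operator<<(std::string_view v) { buf_ += v; return *this; }
    text_builder& operator<<(long long v) { buf_ += std::to_string(v); return *this; }

    operator std::string() const { return buf_; }

private:
    std::string buf_;
};

// User-defined literal that starts a builder from a string literal.
text_builder operator""_fmt(const char* p, std::size_t n) {
    return text_builder{std::string(p, n)};
}
```

With these definitions, `throw std::runtime_error("Error "_fmt << 47);` compiles as a one-liner, and std::string itself gains no new operators.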

Olaf van der Spek

Jan 3, 2017, 10:08:39 AM

to bo...@lists.boost.org

On Tue, Jan 3, 2017 at 3:48 PM, Christof Donat <c...@okunah.de> wrote:
> Hi,
>
> Am 03.01.2017 15:23, schrieb Olaf van der Spek:
>>
>> On Tue, Jan 3, 2017 at 3:16 PM, Christof Donat <c...@okunah.de> wrote:
>> The extra parentheses and the .str() part are annoying.. same goes for
>> boost::format.
>
>
> I see. For me that is not a big issue, but people are different.

I think the language still doesn't allow temporaries to bind to
non-const lvalue references (T&), so with operator<< taking
std::string& we might've cheered too soon.. :(
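
[Editor's sketch] The binding problem can be worked around with extra overloads that take the string by rvalue reference. The lvalue overloads below are the ones posted earlier in the thread; the rvalue overloads are an assumed addition for illustration, not part of the proposal. C++17 is assumed.

```cpp
#include <string>
#include <string_view>
#include <utility>

// Lvalue overloads, as posted earlier in the thread.
std::string& operator<<(std::string& os, std::string_view v) { return os += v; }
std::string& operator<<(std::string& os, long long v) { return os += std::to_string(v); }

// Assumed additions: a temporary such as "Error "s cannot bind to
// std::string&, but it can bind to std::string&&. The result is returned
// by value so the expression never refers to a dead temporary.
std::string operator<<(std::string&& os, std::string_view v) {
    os += v;
    return std::move(os);
}

std::string operator<<(std::string&& os, long long v) {
    os += std::to_string(v);
    return std::move(os);
}
```

Each step of a chained expression on a temporary then moves the buffer along instead of copying it.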

>> How about this one?
>>
>> throw std::runtime_error("Error "s << 47);
>
>
> Uh. How does that work with
>
> std::cout << "Error "s << 47;
>
> Will that be
>
> (std::cout << "Error "s) << 47;
>
> or
>
> std::cout << ("Error "s << 47);

We've got rules for that..
http://en.cppreference.com/w/cpp/language/operator_precedence

> Also I'd expect a std::string to behave like a stream then and try to
> use e.g. manipulators. Maybe that would be acceptable with a different
> operator. e.g. like in SQL:
>
> throw std::runtime_error("Error "s || 47);
>
> Now this is explicit:
>
> std::cout << "Error "s || 47; // versus

Why would you want to combine << for ostream and string? Just use the
ostream one in both cases.

I think << is quite elegant.

--
Olaf

Christof Donat

Jan 3, 2017, 10:55:48 AM

to bo...@lists.boost.org

Hi,

Am 03.01.2017 16:08, schrieb Olaf van der Spek:
>>> How about this one?
>>>
>>> throw std::runtime_error("Error "s << 47);
>>
>>
>> Uh. How does that work with
>>
>> std::cout << "Error "s << 47;
>>
>> Will that be
>>
>> (std::cout << "Error "s) << 47;
>>
>> or
>>
>> std::cout << ("Error "s << 47);
>
> We've got rules for that..
> http://en.cppreference.com/w/cpp/language/operator_precedence

I know, but you came up with C++ beginners. They already often get
confused with streams and bit shifts, mostly when they come to C++ from
C. Now we additionally mix expressions, where we can use manipulators
with those, where we can't. That will probably make things worse not
only for C++ beginners.

>> Also I'd expect a std::string to behave like a stream then and try to
>> use e.g. manipulators. Maybe that would be acceptable with a different
>> operator. e.g. like in SQL:
>>
>> throw std::runtime_error("Error "s || 47);
>>
>> Now this is explicit:
>>
>> std::cout << "Error "s || 47; // versus
>
> Why would you want to combine << for ostream and string? Just use the
> ostream one in both cases.

We just came across the idea, that appending to the string will be
faster than writing to a stream. Therefore it might be a good idea to
put together the string with appending and then hand the complete result
to the stream.

Anyway, the syntax I proposed would at least make the difference
explicit for the reader. The programmer expresses the intent to either
first concatenate the strings and then write to the stream or write to
the stream directly.

> I think << is quite elegant.

I feel very uneasy with it and I think I have presented quite some
reasoning why. Also I don't seem to be the only one in this discussion.
At least that should count as a strong indicator, that you need much
better reasons to back your proposal.

Christof

Hans Dembinski

Jan 4, 2017, 5:24:17 AM

to bo...@lists.boost.org

Hi Olaf,

>> I think << is quite elegant.
>
> I feel very uneasy with it and I think I have presented quite some
> reasoning why. Also I don't seem to be the only one in this discussion.
> At least that should count as a strong indicator, that you need much
> better reasons to back your proposal.

I agree with Christof. Also think about consistency within the C++ standard library. Consistency is good because it lets you apply the same thinking elsewhere instead of looking everything up in a reference. There should be a minimum of surprises in using a language and a library. Python does this very well; it is codified in the "There should be one - and preferably only one - obvious way to do it" rule.

For many years we have treated streams and containers as separate things. std::string is a container, std::ostringstream is a stream. They have separate responsibilities, and you should not mix them. Herb Sutter and other experts already use std::string as a prime example of a class with too many responsibilities (in the form of many member functions). As was said before, std::string should be a dynamic container of characters, nothing more.

Hans

Olaf van der Spek

Jan 11, 2017, 3:29:15 AM

to bo...@lists.boost.org

On Tue, Jan 3, 2017 at 4:55 PM, Christof Donat <c...@okunah.de> wrote:

> Hi,
>
> Am 03.01.2017 16:08, schrieb Olaf van der Spek:
>
>>>> How about this one?
>>>>
>>>> throw std::runtime_error("Error "s << 47);
>>>>
>>>
>>>
>>> Uh. How does that work with
>>>
>>> std::cout << "Error "s << 47;
>>>
>>> Will that be
>>>
>>> (std::cout << "Error "s) << 47;
>>>
>>> or
>>>
>>> std::cout << ("Error "s << 47);
>>>
>>
>> We've got rules for that..
>> http://en.cppreference.com/w/cpp/language/operator_precedence
>>
>
> I know, but you came up with C++ beginners. They already often get confused
> with streams and bit shifts, mostly when they come to C++ from C. Now we
> additionally mix expressions, where we can use manipulators with those,
> where we can't. That will probably make things worse not only for C++
> beginners.

Maybe, maybe not.
Overloading yet another operator for a similar (but not equal) purpose is
problematic too.

>> I think << is quite elegant.
>
>
> I feel very uneasy with it and I think I have presented quite some
> reasoning why.

You have, but I don't think we have a significantly better alternative.

> Also I don't seem to be the only one in this discussion.
> At least that should count as a strong indicator, that you need much
> better reasons to back your proposal.

--
Olaf

Olaf van der Spek

Jan 11, 2017, 3:32:10 AM

to bo...@lists.boost.org

On Wed, Jan 4, 2017 at 11:28 AM, Hans Dembinski <hans.de...@gmail.com>
wrote:

> Hi Olaf,
>
> >> I think << is quite elegant.
> >
> > I feel very uneasy with it and I think I have presented quite some
> > reasoning why. Also I don't seem to be the only one in this discussion.
> > At least that should count as a strong indicator, that you need much
> > better reasons to back your proposal.
>
> I agree with Christof. Also think about consistency within the C++
> standard library. Consistency is good, because it allows you to apply the
> same thinking elsewhere, instead of looking up everything in a reference.
> There should be a minimum of surprises in using a language and a library.
> Python does this very well, it is codified in the "There should be one -
> and preferably only one - obvious way to do it" rule.
>
> Since many years we have established streams and containers as separate
> things. std::string is a container, std::ostringstream is a stream. They
> have separate responsibilities. You should not mix these. Herb Sutter and
> other experts already use std::string as a prime example of a class with
> too many responsibilities (in form of many member functions). Like it was
> said before, std::string should be a dynamic container of characters,
> nothing more.
>

I agree with Herb and that's why those operators are NOT member functions..
Note that the proposed syntax could also support other containers.

--
Olaf

Christof Donat

Jan 11, 2017, 4:02:54 AM

to bo...@lists.boost.org

Hi,

Am 11.01.2017 09:28, schrieb Olaf van der Spek:
> On Tue, Jan 3, 2017 at 4:55 PM, Christof Donat <c...@okunah.de> wrote:
>> I know, but you came up with C++ beginners. They already often get
>> confused with streams and bit shifts, mostly when they come to C++
>> from C. Now we additionally mix expressions, where we can use
>> manipulators with those, where we can't. That will probably make
>> things worse not only for C++ beginners.
>
> Maybe, maybe not.
> Overloading yet another operator for a similar (but not equal) purpose
> is problematic too.

That is true. Actually I think, the best option is a different class,
that behaves just like a stream. Then it'll support manipulators, etc.
No one will be surprised, that strings behave similar to streams, but
not really the same, etc. We already have that class in the standard
library: std::ostringstream.

BTW: boost::format follows that same pattern as well. It just doesn't
use the shift operator to prevent confusion with stream.

>> I think << is quite elegant.
>>>
>>
>> I feel very uneasy with it and I think I have presented quite some
>> reasoning why.
>
> You have, but I don't think we have a significantly better alternative.

The significantly better alternative is std::ostringstream. I think our
debate showed very nicely, that all the options, we have come up with to
simplify the concatenation of strings are significantly inferior in
total, though we could find examples, where they slightly improve
readability.

Christof

Olaf van der Spek

Jan 11, 2017, 10:03:55 AM

to bo...@lists.boost.org

Interface is important, as is performance.. and performance of
ostringstream sucks..

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0067r1.html
https://gist.github.com/anonymous/7700052

--
Olaf

Christof Donat

Jan 11, 2017, 11:46:14 AM

to bo...@lists.boost.org

Hi,

Am 11.01.2017 16:03, schrieb Olaf van der Spek:
> On Wed, Jan 11, 2017 at 10:02 AM, Christof Donat <c...@okunah.de> wrote:
>> The significantly better alternative is std::ostringstream. I think
>> our
>> debate showed very nicely, that all the options, we have come up with
>> to
>> simplify the concatenation of strings are significantly inferior in
>> total,
>> though we could find examples, where they slightly improve
>> readability.
>
> Interface is important, as is performance.. and performance of
> ostringstream sucks..

Well, but obviously up to now we didn't manage to come up with a
solution, that has a similarly simple, maybe even better interface and
allows for faster implementations.

I'll try again:

Basically the idea of that kind of interface should work as well for a
high performance solution. Let's start with this syntax:

auto myString = (cat() + "Hello"s + " "s + "World! "s + 42).str();

"cat" would return an object that collects all the values by reference,
and concatenates them in one go in the call to str(). Then when str()
runs, the information about the string length can be calculated, or at
least estimated correctly and no data needs to be copied unnecessarily.
I am sure, you see the structural similarity to the usage of
std::ostringstream as well. I just chose to use the operator +, because

1. we don't have stream manipulators here, so we don't want people to
think, that we are talking about streams and
2. strings already can be concatenated with operator +. Therefore this
is an intuitive and not very surprising interface for string
concatenation.

People might just think, they can add numbers to strings as well without
a "cat"-object. There is some room for confusion, but it is much less,
than people thinking, strings are a kind of streams. Maybe "cat" is not
the best name for this function template, it should express, that here
the concatenation includes conversions.
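
To make the idea concrete, here is a minimal sketch of such a cat-object. It is only an illustration under assumptions: it needs C++17, the names cat_expr, size_hint and write_piece are made up for this example, and only std::string and int pieces are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <string>
#include <tuple>

namespace cat_sketch {

// Upper bound on the number of characters a piece needs (hypothetical helper).
inline std::size_t size_hint(const std::string& s) { return s.size(); }
inline std::size_t size_hint(int) { return std::numeric_limits<int>::digits10 + 2; }

// Appends the textual form of a piece to the output (hypothetical helper).
inline void write_piece(std::string& out, const std::string& s) { out += s; }
inline void write_piece(std::string& out, int v) { out += std::to_string(v); }

// Collects references to all operands; nothing is copied until str().
template <class... Parts>
class cat_expr {
public:
    explicit cat_expr(std::tuple<const Parts&...> parts) : parts_(parts) {}

    // Each operator+ only extends the tuple of references by one element.
    template <class T>
    cat_expr<Parts..., T> operator+(const T& next) const {
        return cat_expr<Parts..., T>(std::tuple_cat(parts_, std::tie(next)));
    }

    // Reserves once, based on the size hints, then appends every piece.
    std::string str() const {
        std::string out;
        std::apply(
            [&out](const auto&... piece) {
                out.reserve((std::size_t{0} + ... + size_hint(piece)));
                (write_piece(out, piece), ...);
            },
            parts_);
        return out;
    }

private:
    std::tuple<const Parts&...> parts_;
};

inline cat_expr<> cat() { return cat_expr<>(std::tuple<>{}); }

// Usage with the syntax proposed above.
inline std::string example() {
    using namespace std::string_literals;
    return (cat() + "Hello"s + " "s + "World! "s + 42).str();
}

} // namespace cat_sketch
```

Every operator + only grows a tuple of references; the single reserve() and all copying happen in str(). The references stay valid only until the end of the full expression, so in this sketch a cat-object must not be stored in a variable.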

We also need an extensible way to define these conversions, because we
also will want to add e.g. complex<myBigRelativeType, myBigRelativeType>
to a string. We also will want to define the format, because sometimes,
we'll want to concat a number in decimal representation, sometimes in
hexadecimal, octal, etc. I think, these conversion functions should
write to a string view. e.g.:

template <> // the default conversion for this type
void boost::cat::convert<complex<...>>(std::string_view& output,
size_t max_length,
const complex<...>& data);

We could use parameters to cat to define conversion functions.

auto myString = (cat(hex<int>) + 42).str();

I am not sure now, if that syntax is so great to define the conversion
functions. We'll have to discuss.

Of course there should be an optimized default conversion function for
cat-objects as well:

auto myString = (cat() + "Hello"s + " "s + "World! "s + 42 +
(cat(hex<int>) + " "s + 42)).str();

Here the outer cat-object would "steal" the values from the inner
cat-object and use its conversion functions directly with a view on the
big result string.

I understand, that the syntax is not 100% what you try to achieve, but I
still think, that conversion to string should not be directly done on a
string object. This specific object to do concatenation and conversion
actually is the key to gain the maximum possible performance, because it
can delay the actual concatenation and conversion to a point, where all
necessary information is available.

Christof

Roberto Hinz

Jan 11, 2017, 12:41:01 PM

to bo...@lists.boost.org

Hi,

I've been working in a format library that contains a function template
called appendf that can do this:

std::string str("blabla");
boost::stringify::appendf(str) () (" AAA ", 25, " BBB ", {255, "x"});
assert(str == "blabla AAA 25 BBB ff"); // (255 formatted in hexadecimal)

This library - I call it Boost.Stringify - is in a very early stage of
development and I thought it would be too premature to mention it now. But
given the response in this thread so far, I changed my mind. So I will
soon publish it on github and I will start to write a brief documentation
so that I can soon present it here to gauge interest.

Roberto Hinz

Jan 11, 2017, 2:04:23 PM

to bo...@lists.boost.org

Sorry for top posting in the previous message. It wasn't intentional.
Anyway, you can take a look at the source:
https://github.com/robhz786/stringify

Although there is no documentation, the unit tests and performance tests
provide some usage examples.

I think it solves this variadic string append perfectly. However it will
take time to get ready.
As I said, perhaps it is premature to gauge interest. Still, any interest?

This is a C++14 format library that:

- has good performance
- allows the user to extend input types and output types. And when adding
  a new input type, they can create formatting options specifically for this
  new input type.
- is highly customizable. The user will be able to customize, for instance:
  - how the width is calculated. It can be simply the length of the
    string, or it can be the count of Unicode code points. Or it can be even
    more sophisticated.
  - the numeric digits. For instance, to use Arabic digits.
- supports all character types.
- is UTF friendly. For instance, while in std::ostream and others the fill
  must be a single char, in Boost.Stringify it is a char32_t. If you are
  writing to char*, it is converted into UTF-8 by default, but you can
  customize that.
- currently can write to std::basic_string and char*. But it also supports
  FILE* and std::basic_streambuf.
- is not locale sensitive. This is not necessarily always an advantage. But
  it would be nice to have one format that is totally not locale sensitive,
  and thus ensures that it will always write exactly what the user expects.

Note: I did not test it in Visual Studio. ( gcc and clang only )

Olaf van der Spek

Jan 12, 2017, 2:06:00 PM

to bo...@lists.boost.org

On Wed, Jan 11, 2017 at 5:45 PM, Christof Donat <c...@okunah.de> wrote:

> Hi,
>
> Am 11.01.2017 16:03, schrieb Olaf van der Spek:
>
>> On Wed, Jan 11, 2017 at 10:02 AM, Christof Donat <c...@okunah.de> wrote:
>>
>>> The significantly better alternative is std::ostringstream. I think our
>>> debate showed very nicely, that all the options, we have come up with to
>>> simplify the concatenation of strings are significantly inferior in
>>> total,
>>> though we could find examples, where they slightly improve readability.
>>>
>>
>> Interface is important, as is performance.. and performance of
>> ostringstream sucks..
>>
>
> Well, but obviously up to now we didn't manage to come up with a solution,
> that has a similar simple, maybe even better interface and allows for
> faster implementations.
>
> I'll try again:
>
> Basically the Idea of that kind of interface should work as well for a
> high performance solution. Let's start with this syntax:
>
> auto myString = (cat() + "Hello"s + " "s + "World! "s + 42).str();
>

I like fmt's syntax much more:

string s = fmt::format("The answer is {}", 42);

http://fmtlib.net/latest/index.html

I really don't get why we should bother with the (cat(...)).str() bit.

If you want that kind of formatting I'd really go for
http://fmtlib.net/latest/syntax.html

> I am not sure now, if that syntax is so great to define the conversion
> functions. We'll have to discuss.
>
> Of course there should be an optimized default conversion function for
> cat-objects as well:
>
> auto myString = (cat() + "Hello"s + " "s + "World! "s + 42 +
> (cat(hex<int>) + " "s + 42)).str();
>
> Here the outer cat-object would "steal" the values from the inner
> cat-object and use its conversion functions directly with a view on the big
> result string.
>
> I understand, that the syntax is not 100% what you try to achieve, but I
> still think, that conversion to string should not be directly done on a
> string object. This specific object to do concatenation and conversion
> actually is the key to gain the maximum possible performance, because it
> can delay the actual concatenation and conversion to a point, where all
> necessary information is available.

True, but then we're back again to an append / sprint style function.

--
Olaf

Christof Donat

Jan 13, 2017, 10:30:33 AM

to bo...@lists.boost.org

Hi,

Am 12.01.2017 20:05, schrieb Olaf van der Spek:
> On Wed, Jan 11, 2017 at 5:45 PM, Christof Donat <c...@okunah.de> wrote:
>> Basically the Idea of that kind of interface should work as well for a
>> high performance solution. Let's start with this syntax:
>>
>> auto myString = (cat() + "Hello"s + " "s + "World! "s + 42).str();
>>
>
> I like fmt's syntax much more:
>
> string s = fmt::format("The answer is {}", 42);
>
>
> http://fmtlib.net/latest/index.html
>
> I really don't get why we should bother with the (cat(...)).str() bit.

Format will have to parse the format string at runtime and then choose
dynamically the function to stringify the number. The cat() approach can
be implemented to do that choice at compile time. So no string parsing
and no dynamic choice of formatting functions. We were talking about
performance in this part of the thread, weren't we?

Christof

Peter Dimov

Jan 13, 2017, 11:30:33 AM

to bo...@lists.boost.org

Christof Donat wrote:
> Am 12.01.2017 20:05, schrieb Olaf van der Spek:
> > I like fmt's syntax much more:
> >
> > string s = fmt::format("The answer is {}", 42);
> >
> >
> > http://fmtlib.net/latest/index.html

fmtlib looks pretty cool. In fact it's basically a superset of my private
library. Some of the design decisions aren't the same but the overall style
is very similar.

> Format will have to parse the format string at runtime and then chose
> dynamically the function to stringify the number.

It does, of course, parse the format string, but I'm not sure what you mean
by "choose function dynamically." The above basically does something like

os << "The answer is ";
os << 42;
return os.str();

using one virtual call for the os << 42 part.

damie...@lecbna.org

Jan 13, 2017, 6:24:23 PM

to bo...@lists.boost.org

What about the typesafe sprintf from Abel regarding this discussion? A library based on it could forward to either a compile-time or a runtime version and benefit from the ease of the printf syntax.

http://abel.web.elte.hu/mpllibs/safe_printf/sprintf.html

I think the API of sprintf is simple and good; it just should be made typesafe and overloaded on a string concept, like a formatted print algorithm.

Any other new API will fail like boost.format did. boost.format is good but hasn't gained much traction, because its added value, weighed against its complexity, doesn't make it better than sprintf in usage.

I mean: str(boost::format("%1%") % myvar) is not so much cooler than stream formatting or sprintf. So why not stick with a modernized sprintf for formatting?
--
Damien Buhl
Alias daminetreg

Christof Donat

Jan 15, 2017, 8:40:30 AM

to bo...@lists.boost.org

Hi,

Am 13.01.2017 17:29, schrieb Peter Dimov:
> Christof Donat wrote:
>> Am 12.01.2017 20:05, schrieb Olaf van der Spek:
>> > I like fmt's syntax much more:
>> >
>> > string s = fmt::format("The answer is {}", 42);
>> >
>> >
>> > http://fmtlib.net/latest/index.html
>
> fmtlib looks pretty cool. In fact it's basically a superset of my
> private library. Some of the design decisions aren't the same but the
> overall style is very similar.
>
>> Format will have to parse the format string at runtime and then chose
>> dynamically the function to stringify the number.
>
> It does, of course, parse the format string, but I'm not sure what you
> mean by "choose function dynamically." The above basically does
> something like
>
> os << "The answer is ";
> os << 42;
> return os.str();
>
> using one virtual call for the os << 42 part.

A call to a virtual function is a dynamic choice of the function to
call.

A non virtual call can be inlined. Even if it is not, it is simply an
absolute jump, while a virtual call is at least loading the vtable
pointer and loading the pointer to the function from said vtable, before
you can jump to the function. But that is not the worst part of virtual
function calls. With non virtual calls, the branch prediction will
always be correct, the upcoming instructions will already be loaded and
fed into the pipeline, and the code runs almost as fast as the inlined
version (ignoring parameter passing here, which is where inlining really
pays off).

Virtual function calls give the branch prediction only a statistical
chance to choose correctly. When it fails, the instruction cache has to
be filled with instructions from the correct branch, while the
prefetched instructions, that already partially executed in the
pipeline, have to be rolled back. Depending on your hardware, that might
cost up to several dozens of clock cycles, in case of a cache miss.

We were talking about performance here. Therefore I tried to come up
with a syntax, that can be implemented without virtual function calls as
much as possible and avoids parsing format strings. Also that syntax
should allow for creating the result in one go, in order to avoid
unnecessary reallocations. Keeping that in mind, I think, my proposal is
not the worst possible, though, of course, someone might come up with a
better one.

Libraries like boost::format have their strengths, when simply
concatenating information does not fit the needs. E.g. for
internationalized text output. I like boost::format a lot, but speed
actually is not its focus.

auto translate(const std::string& key) -> std::string;

// ...
// will translate to e.g. "%2% Bytes werden von der Datei '%1%' belegt."
std::cerr << boost::format(translate("file '%1%' contains %2% bytes"s))
% filename % filesize << std::endl;

On the other hand e.g. for writing a set of values as CSV, my proposed
interface will produce faster code:

for(const auto& line : all_lines)
out << (cat() + line<1> + ',' + line<2> + ',' + line<3> + ',' +
line<4> + ',' + line<5> + '\n').str();

Christof

Damien Buhl

Jan 15, 2017, 3:42:31 PM

to bo...@lists.boost.org

On 15/01/2017 14:39, Christof Donat wrote:
> We were talking about performance here. Therefore I tried to come up
> with a syntax, that can be implemented without virtual function calls
> as much as possible and avoids parsing format strings. Also that
> syntax should allow for creating the result in one go, in order to
> avoid unnecessary reallocations. Keeping that in mind, I think, my
> proposal is not the worst possible, though, of course, someone might
> come up with a better one.
>

Yes sorry I've been distracted by the discussion about formatting,
because indeed your API idea looks like a good choice when it comes to
the initial problem of this discussion: appending / concatenating.
Users might find a variadic concat(...) API more intuitive, though.

But to close the digression about formatting: with something like Ábel
Sinkovics's snprintf, in principle you only pay for the parsing at
compile time, and at runtime you have your string formatted with no or
less dynamic allocation.
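
As a small illustration of paying for the parsing at compile time, the sketch below counts the conversions of a printf-style format string in a constexpr function, so that a mismatch with the number of arguments can be rejected by static_assert. This is a toy under assumptions, not how safe_printf or Metaparse work internally: count_conversions and arity_matches are hypothetical names, argument types are not checked, and `*` width arguments are ignored.

```cpp
#include <cassert>
#include <cstddef>

namespace fmtcheck_sketch {

// Counts the conversions ("%d", "%s", ...) in a printf-style format string.
constexpr std::size_t count_conversions(const char* fmt) {
    std::size_t count = 0;
    for (std::size_t i = 0; fmt[i] != '\0'; ++i) {
        if (fmt[i] != '%') continue;
        if (fmt[i + 1] == '\0') break;             // stray '%' at the very end
        if (fmt[i + 1] == '%') { ++i; continue; }  // "%%" is a literal percent sign
        ++count;
    }
    return count;
}

// True if the number of arguments equals the number of conversions.
template <class... Args>
constexpr bool arity_matches(const char* fmt, const Args&...) {
    return count_conversions(fmt) == sizeof...(Args);
}

// Both checks are evaluated by the compiler; nothing is parsed at runtime.
static_assert(count_conversions("%s has %d bytes") == 2, "two conversions expected");
static_assert(arity_matches("%s has %d bytes", "a.txt", 42), "argument count mismatch");

} // namespace fmtcheck_sketch
```

A real library would also have to map each conversion to the type of the matching argument, which is the part that needs heavier metaprogramming.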

> Libraries like boost::format have their strengths, when simply
> concatenating information does not fit the needs. E.g. for
> internationalized text output. I like boost::format a lot, but speed
> actually is not its focus.

I personally use boost::format a lot, but I think boost::format would be
easier and shorter to use if it were callable as `std::string
snprintf(format_str, args...)`. And it would be awesome if format_str,
when already known at compile time, were handled with Metaparse.
Sorry for getting off-topic. :D

--
Damien Buhl

Olaf van der Spek

Jan 16, 2017, 3:54:56 AM

to bo...@lists.boost.org

On Sun, Jan 15, 2017 at 9:42 PM, Damien Buhl <damie...@lecbna.org> wrote:

> On 15/01/2017 14:39, Christof Donat wrote:
> > We were talking about performance here. Therefore I tried to come up
> > with a syntax, that can be implemented without virtual function calls
> > as much as possible and avoids parsing format strings. Also that
> > syntax should allow for creating the result in one go, in order to
> > avoid unnecessary reallocations. Keeping that in mind, I think, my
> > proposal is not the worst possible, though, of course, someone might
> > come up with a better one.
> >
> Yes sorry I've been distracted by the discussion about formatting,
> because indeed your API idea looks like a good choice when it come to
> the initial problem of this discussion : appending / concatenating.
> While users might find more intuitive a variadic concat(...) api.
>
> But to close the ellipsis about formatting : with something like Abel
> Sincovick's snprintf in principle you only pay for the parsing at
> compile-time and at runtime you have your string formatted with no/less
> dynamic allocation.
>

http://abel.web.elte.hu/mpllibs/safe_printf/snprintf.html

It appears to only do checking at compile-time and then forwards to
sprintf..

>
> > Libraries like boost::format have their strengths, when simply
> > concatenating information does not fit the needs. E.g. for
> > internationalized text output. I like boost::format a lot, but speed
> > actually is not its focus.
> I personally uses boost::format alot, but I think boost::format would be
> easier and shorter to use if it would be callable with an `std::string
> snprintf(format_str, args...)`. And it would be awesome that format_str
> would be, if already known at compile time handled with Metaparse.
> Sorry for getting off-topic. :D
>

Can't boost::format be updated to a variadic variant?

--
Olaf

Christof Donat

Jan 16, 2017, 4:57:41 AM

to bo...@lists.boost.org

I think so. Something like this should be possible, of course:

#include <boost/format.hpp>
#include <string>

// Base case: no arguments left to feed.
inline auto format_args_impl(boost::format& f) -> boost::format& {
    return f;
}

// operator% modifies the format object, so it is passed by non-const reference.
template<typename Arg1, typename ... Args>
auto format_args_impl(boost::format& f, const Arg1& arg1, const Args& ... args)
        -> boost::format& {
    return format_args_impl(f % arg1, args...);
}

template<typename ... Args>
auto format_args(const std::string& f, const Args& ... args) -> std::string {
    boost::format fmt(f);  // the constructor is explicit
    return format_args_impl(fmt, args...).str();
}

// ...
using namespace std::string_literals;
auto first_which = format_args("When shall we %1% meet again? In %2%, %3%, or in %4%?"s,
                               3, "thunder"s, "lightning"s, "rain"s);

I have not tried to compile this, so please just take it as a sketch. It
inherits most of the advantages and disadvantages of boost::format, of
course.

Christof

Richard Hodges

Jan 16, 2017, 11:08:56 AM

to bo...@lists.boost.org

Sorry to chime in so late in the discussion.

What about a syntax similar to this?

int main()
{
auto s = join("Hello ", ", World.", " The hex for ", 58, " is ", std::hex, 58);
std::cout << s << std::endl;

s = join(separator(" : "), "a", "b", std::hex, 200 , std::quoted("banana"));
std::cout << s << std::endl;

}

Which would produce the following output:

Hello , World. The hex for 58 is 3a
a : b : c8 : "banana"

sample implementation (io manipulators may be incomplete, some efficiency gains could be made by re-implementing ostringstream more cleverly):

#include <sstream>
#include <iostream>
#include <iomanip>
#include <type_traits>
#include <utility>

namespace detail {
template<class SepStr>
struct separator_object
{
template<class T>
std::ostream& operator ()(std::ostream& s, T&& t) const
{
return s << sep << t;
}

//
// other iomanp specialisations here
//
std::ostream& operator ()(std::ostream& s, std::ios_base&(*t)(std::ios_base&)) const
{
t(s);
return s;
}

SepStr const& sep;
};

struct no_separator_object
{
template<class T>
std::ostream& operator ()(std::ostream& s, T&& t) const
{
return s << t;
}
};

template<class Separator, class String, class...Rest>
auto join(Separator&& sep, String&& s, Rest&&...rest)
{
std::ostringstream ss;
ss << s;
using expand = int [];
void(expand{0,
((sep(ss, rest)), 0)...
});
return ss.str();
};

}

template<class Sep>
static constexpr auto separator(Sep const& sep)
{
using sep_type = std::remove_const_t<std::remove_reference_t<Sep>>;
return detail::separator_object<sep_type> { sep };
}

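// The separator object is taken by value so that overload resolution
// prefers this overload over the generic join(String&&, Rest&&...).
template<class SepObject, class String, class...Rest>
auto join(detail::separator_object<SepObject> sep, String&& s, Rest&&...rest)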
{
return detail::join(sep,
std::forward<String>(s),
std::forward<Rest>(rest)...);
};

template<class String, class...Rest>
auto join(String&& s, Rest&&...rest)
{
return detail::join(detail::no_separator_object(),
std::forward<String>(s),
std::forward<Rest>(rest)...);
};

int main()
{
auto s = join("Hello ", ", World.", " The hex for ", 58, " is ", std::hex, 58);
std::cout << s << std::endl;

s = join(separator(" : "), "a", "b", std::hex, 200 , std::quoted("banana"));
std::cout << s << std::endl;
}

Damien Buhl

Jan 16, 2017, 3:24:18 PM

to bo...@lists.boost.org

On 16/01/2017 09:54, Olaf van der Spek wrote:
> http://abel.web.elte.hu/mpllibs/safe_printf/snprintf.html
>
> It appears to only do checking at compile-time and then forwards to
> sprintf..

Yes the current implementation, but based on the expression template
resulting from the parsing, one could easily produce at compile-time
efficient code for formatting.

But I wasn't clear enough, I just wanted to tell that with the
underlying library metaparse, one can do the compile-time dispatching.
Naturally I think the compile-time cost would surely be higher than the
cat() solution.

--
--
Damien Buhl

Olaf van der Spek

Jan 18, 2017, 3:07:20 AM

to bo...@lists.boost.org

On Mon, Jan 16, 2017 at 11:41 AM, Richard Hodges <hodg...@gmail.com> wrote:

> Sorry to chime in so late in the discussion.
>
> What about a syntax similar to this?
>
> int main()
> {
> auto s = join("Hello ", ", World.", " The hex for ", 58, " is ",
> std::hex, 58);
> std::cout << s << std::endl;
>
> s = join(separator(" : "), "a", "b", std::hex, 200 ,
> std::quoted("banana"));
> std::cout << s << std::endl;
>
> }
>
> Which would produce the following output:
>
> Hello , World. The hex for 58 is 3a
> a : b : c8 : "banana"
>

The syntax is fine but it's missing an appending variant, like append(s,
"A", "B", 42);
This variant is important as it (also) allows you to reuse existing storage.
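
For reference, a minimal sketch of what such an appending variant could look like. It assumes C++17; piece_size and append_piece are hypothetical helper names, only string-like pieces and integers are handled, and a char argument would be treated as a number.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

namespace append_sketch {

// String-like pieces (literals, std::string, std::string_view) are copied as-is.
inline std::size_t piece_size(std::string_view s) { return s.size(); }
inline void append_piece(std::string& out, std::string_view s) {
    out.append(s.data(), s.size());
}

// Integers: 20 characters are enough for any 64-bit signed value.
inline std::size_t piece_size(long long) { return 20; }
inline void append_piece(std::string& out, long long v) { out += std::to_string(v); }

// Appends all pieces to an existing string, growing it at most once per call.
template <class... Args>
std::string& append(std::string& out, const Args&... args) {
    out.reserve(out.size() + (std::size_t{0} + ... + piece_size(args)));
    (append_piece(out, args), ...);
    return out;
}

} // namespace append_sketch
```

Because the target is passed in, a caller can clear() and refill the same string in a loop and keep its capacity, which is the storage reuse mentioned above. Note that std::to_string still creates a temporary for each integer in this sketch.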

Richard Hodges

Jan 18, 2017, 5:38:26 AM

to bo...@lists.boost.org

Ok. here's my second attempt.

It has 2 improvements:

1. allows appending to an existing string by specifying the target std::string&:
join(onto(s), [optional separator("xxx"),] parts...);
2. replaces use of std::ostringstream with a (very simple!) version that
does string appending.

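#include <sstream>
#include <iostream>
#include <iomanip>
#include <functional>  // std::reference_wrapper
#include <memory>      // std::addressof
#include <string>
#include <type_traits>
#include <utility>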

struct string_ref_buffer
: std::streambuf
{
using inherited = std::streambuf;
using char_type = inherited::char_type;
using char_traits = std::char_traits<char_type>;

int overflow(int c) override
{
if (c != char_traits::eof()) {
buffer_.push_back(c);
}
return char_traits::not_eof(c);
}

string_ref_buffer(std::string& buffer)
: buffer_(buffer)
{

}

const std::string& str() const & { return buffer_; }
std::string&& str()&& { return std::move(buffer_); }

std::string& buffer_;
std::size_t inpos_ = 0;
std::size_t outpos_ = 0;
};

namespace detail {
template<class SepStr>
struct separator_object
{
template<class T>
std::ostream& operator ()(std::ostream& s, T&& t) const
{
return s << sep << t;
}

//
// other iomanp specialisations here
//
std::ostream& operator ()(std::ostream& s,
std::ios_base&(*t)(std::ios_base&)) const
{
t(s);
return s;
}

SepStr const& sep;
};

struct no_separator_object
{
template<class T>
std::ostream& operator ()(std::ostream& s, T&& t) const
{
return s << t;
}
};

template<class Target, class Separator, class...Rest>
auto& join_onto(Target&& target, Separator&& sep, Rest&&...rest)
{
string_ref_buffer sbuf { target.str() };
std::ostream ss(std::addressof(sbuf));

using expand = int [];
void(expand{0,
((sep(ss, rest)), 0)...
});

// return the target string; a reference to the wrapper would dangle once join() returns
return target.str();
};

template<class Separator, class String, class...Rest>
auto join(Separator&& sep, String&& s, Rest&&...rest)
{

std::string result {};
string_ref_buffer sbuf { result };
std::ostream ss(std::addressof(sbuf));

ss << s;
using expand = int [];
void(expand{0,
((sep(ss, rest)), 0)...
});

return result;
};

template<class String>
struct onto_type
{
String& str() { return target_.get(); }
std::reference_wrapper<String> target_;
};

}

template<class String>
auto onto(String& target)
{
return detail::onto_type<String> { target };
}

template<class Sep>
static constexpr auto separator(Sep const& sep)
{
using sep_type = std::remove_const_t<std::remove_reference_t<Sep>>;
return detail::separator_object<sep_type> { sep };
}

template<class SepObject, class String, class...Rest>
decltype(auto) join(detail::separator_object<SepObject> sep, String&& s, Rest&&...rest)
{
return detail::join(sep,
std::forward<String>(s),
std::forward<Rest>(rest)...);
};

template<class String, class...Rest>
decltype(auto) join(String&& s, Rest&&...rest)
{
return detail::join(detail::no_separator_object(),
std::forward<String>(s),
std::forward<Rest>(rest)...);
};

template<class Target, class SepObject, class String, class...Rest>
decltype(auto) join(detail::onto_type<Target> target,
detail::separator_object<SepObject> sep, String&& s, Rest&&...rest)
{
return detail::join_onto(target,
sep,
std::forward<String>(s),
std::forward<Rest>(rest)...);
};

template<class Target, class String, class...Rest>
decltype(auto) join(detail::onto_type<Target> target, String&& s, Rest&&...rest)
{
return detail::join_onto(target,
detail::no_separator_object(),
std::forward<String>(s),
std::forward<Rest>(rest)...);
};

int main()
{
auto s= std::string("foo");

s = join("Hello ", ", World.", " The hex for ", 58, " is ", std::hex, 58);
std::cout << s << std::endl;

s = join(separator(" : "), "a", "b", std::hex, 200 , std::quoted("banana"));
std::cout << s << std::endl;

join(onto(s), separator(", "), "funky", "chicken");
join(onto(s), "=====");

std::cout << s << std::endl;
}

expected output:

Hello , World. The hex for 58 is 3a
a : b : c8 : "banana"
a : b : c8 : "banana", funky, chicken=====

On 18 January 2017 at 09:53, Richard Hodges <hodg...@gmail.com> wrote:

> That's pretty straightforward with another overload:
>
> auto& s = join(to(y), separator(", "), "A", "b", 42);
>
> where to(y) is something like
>
> template<class String>
> struct to_existing_type {
> String& get() { return s_; }
> String& s_;
> };
>
> template<class String>
> auto to(String& s)
> {
> return to_existing_type<String>{s};
> }
>
> With a bit of template unwrapping, we could imagine something like this:
>
> join(to(x), 2, 3, to(y), "foo", "bar", create(), "baz", 42);
>
> which would return a tuple:
>
> std::tuple<std::string&, std::string&, std::string>
>
> in c++17 this would allow:
>
> auto&& [x, y, z] = join(to(x), 2, 3, to(y), "foo", "bar", create(), "baz",
> 42);
>
> But this may be taking it a bit far... What do you think?

Richard Hodges

Jan 18, 2017, 5:38:55 AM

to bo...@lists.boost.org

That's pretty straightforward with another overload:

auto& s = join(to(y), separator(", "), "A", "b", 42);

where to(y) is something like

template<class String>
struct to_existing_type {
String& get() { return s_; }
String& s_;
};

template<class String>
auto to(String& s)
{
return to_existing_type<String>{s};
}

With a bit of template unwrapping, we could imagine something like this:

join(to(x), 2, 3, to(y), "foo", "bar", create(), "baz", 42);

which would return a tuple:

std::tuple<std::string&, std::string&, std::string>

in c++17 this would allow:

auto&& [x, y, z] = join(to(x), 2, 3, to(y), "foo", "bar", create(), "baz",
42);

But this may be taking it a bit far... What do you think?

On 18 January 2017 at 09:06, Olaf van der Spek <m...@vdspek.org> wrote:

Hans Dembinski

Jan 18, 2017, 6:12:59 AM

to bo...@lists.boost.org

Hi Richard,

> On 18 Jan 2017, at 10:59, Richard Hodges <hodg...@gmail.com> wrote:
>
> Ok. here's my second attempt.

I really like the expressive syntax of join(…). :) The implementation looks fine, too, but I am not an expert. In any case, I would love to use this.

Best regards,
Hans

Christof Donat

Jan 18, 2017, 10:12:16 AM

to bo...@lists.boost.org

Hi,

Am 18.01.2017 10:59, schrieb Richard Hodges:
> int main()
> {
> auto s= std::string("foo");
>
> s = join("Hello ", ", World.", " The hex for ", 58, " is ",
> std::hex, 58);
> std::cout << s << std::endl;
>
> s = join(separator(" : "), "a", "b", std::hex, 200 ,
> std::quoted("banana"));
> std::cout << s << std::endl;
>
> join(onto(s), separator(", "), "funky", "chicken");
> join(onto(s), "=====");
> std::cout << s << std::endl;
> }

I do like the idea, but not the naming. With a function name "join" I'd
expect to be able to pass an iterator range and have all its elements
concatenated into a string with a defined separator like this:

auto my_number = std::vector<int>{1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
std::cout << join(std::begin(my_number), std::end(my_number), ", ") <<
std::endl;

expected output:

1, 2, 3, 4, 5, 6, 7, 8, 9, 10

maybe the name cat(), or concatenate() would fit better here.

What I am also not 100% happy with is the reuse of iostream manipulators.
Even in the iostream library, their range of effect is not clearly defined.
Another issue is that the way they work is pretty complicated to
reuse in a high performance implementation.

How about an API like this:

s = concat("Hello ", ", World.", " The hex for ", 58, " is ",
concat_lib::hex(58));
s = concat(separator(" : "), "a", "b", concat_lib::hex(100),
std::quoted("banana"));

// will return a reference to s I guess
concat(onto(s), separator(", "), "funky", "chicken");
concat(onto(s), "=====");

This would go well with what I'd expect with the function name join():

s = join(std::begin(my_number), std::end(my_number));
// -> "12345678910"
s = join(separator(", "), std::begin(my_number), std::end(my_number));
// -> "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"
join(onto(s), separator(" - "), std::begin(my_number), std::end(my_number));
// -> "1, 2, 3, 4, 5, 6, 7, 8, 9, 101 - 2 - 3 - 4 - 5 - 6 - 7 - 8 - 9 - 10"

Now, when we put it like that, there is no way to let the functions
"steal" work from each other. Take this, for example:

concat(onto(s),
       join(separator(", "), std::begin(my_number), std::end(my_number)), " ",
       join(separator(" - "), std::begin(my_number), std::end(my_number)));

This will first execute the two join() calls, which return strings that
are then passed to concat(). Instead, join() and concat() could return
objects that can be converted to std::string. Then concat() can make
the result of join() render its output into a common buffer.

auto s = concat(join(separator(", "), std::begin(my_number),
std::end(my_number)), " ",
join(separator(" - "), std::begin(my_number),
std::end(my_number))).str();
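
A minimal sketch of that factory idea, under stated assumptions: concat_factory, as_piece, piece_size and piece_append are hypothetical names (not an existing Boost interface), and only string-like arguments and nested factories are handled. It tries to show how the outermost call could reserve once and let nested factories append into the same buffer:

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <tuple>
#include <type_traits>
#include <utility>

// All names here are hypothetical; this is not an existing Boost API.
template <typename... Parts> class concat_factory;

// Plain text: the size is known up front, rendering is a single append.
inline std::size_t piece_size(std::string_view text) { return text.size(); }
inline void piece_append(std::string& out, std::string_view text) { out.append(text); }

// Nested factories take part in the same protocol (defined further down).
template <typename... Parts>
std::size_t piece_size(const concat_factory<Parts...>& f);
template <typename... Parts>
void piece_append(std::string& out, const concat_factory<Parts...>& f);

template <typename... Parts>
class concat_factory {
public:
    explicit concat_factory(Parts... parts) : parts_(std::move(parts)...) {}

    // Number of characters the rendered result needs.
    std::size_t size() const {
        return std::apply(
            [](const auto&... p) { return (std::size_t{0} + ... + piece_size(p)); },
            parts_);
    }

    // Render at the end of an existing string.
    void append_to(std::string& out) const {
        std::apply([&out](const auto&... p) { (piece_append(out, p), ...); }, parts_);
    }

    // Allocate once, then let every piece write into the same buffer.
    std::string str() const {
        std::string out;
        out.reserve(size());
        append_to(out);
        return out;
    }

private:
    std::tuple<Parts...> parts_;
};

template <typename... Parts>
std::size_t piece_size(const concat_factory<Parts...>& f) { return f.size(); }
template <typename... Parts>
void piece_append(std::string& out, const concat_factory<Parts...>& f) { f.append_to(out); }

// String-like arguments are stored as views, nested factories by value.
template <typename T>
auto as_piece(T&& value) {
    if constexpr (std::is_convertible_v<T&&, std::string_view>) {
        return std::string_view(value);
    } else {
        return std::forward<T>(value);
    }
}

template <typename... Args>
auto concat(Args&&... args) {
    return concat_factory<decltype(as_piece(std::forward<Args>(args)))...>(
        as_piece(std::forward<Args>(args))...);
}
```

With this, concat("a", concat("b", "c"), "d").str() performs one reserve() for the whole result. Because text is held as views, the factory has to be consumed within the same full expression in which any temporary strings were passed to it.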

Christof

Richard Hodges

Jan 18, 2017, 11:30:54 AM1/18/17

to bo...@lists.boost.org

I agree re naming. concat() or concatenate() seems a better fit.

I also share your unease re formatting. I have of course shoehorned the
ostream formatters in here without thinking it through deeply (there is
also the issue of the separator getting formatted, so any sequence of
formatters must be executed *after* the separator has been applied, and
formatters really ought to restore previous state).

I think it's reasonable to develop a series of lightweight function object
factories that provide trivial formatting objects in the same way that
std::quoted does for std::ostream.
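
As a rough illustration of such a lightweight formatting object, here is a sketch that avoids ostream entirely. Everything in namespace sketch (hex_value, hex, append_piece, concat) is a hypothetical name chosen for this example, and only text, int and hex-wrapped non-bool integers are supported:

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

namespace sketch {  // hypothetical names, for illustration only

// Carries a value together with the request to print it in hexadecimal.
template <typename Int>
struct hex_value { Int value; };

template <typename Int>
hex_value<Int> hex(Int value) {
    static_assert(std::is_integral_v<Int>, "hex() expects an integer");
    return {value};
}

inline void append_piece(std::string& out, const std::string& text) { out += text; }
inline void append_piece(std::string& out, const char* text) { out += text; }
inline void append_piece(std::string& out, int value) { out += std::to_string(value); }

// The formatting applies to this one value only; no state is left behind.
template <typename Int>
void append_piece(std::string& out, hex_value<Int> h) {
    using U = std::make_unsigned_t<Int>;
    U v = static_cast<U>(h.value);
    char digits[2 * sizeof(U)];  // two hex digits per byte is always enough
    std::size_t n = 0;
    do {
        digits[n++] = "0123456789abcdef"[v & 0xF];
        v >>= 4;
    } while (v != 0);
    while (n > 0) out += digits[--n];  // digits were produced in reverse
}

template <typename... Args>
std::string concat(const Args&... args) {
    std::string out;
    (append_piece(out, args), ...);
    return out;
}

}  // namespace sketch
```

Since the wrapper travels with its value, sketch::concat(sketch::hex(58), " is hex for ", 58) renders the first number in hex and the second in decimal without any formatter having to be reset.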

In that case, it is perfectly reasonable to do away with the entire
atrocious ostream performance bottleneck altogether (perhaps there are
questions though, about formatting options and locales?).

Totally agree with returning a string factory. That makes perfect sense.
onto(x) could return the correct kind of wrapper, depending on the argument
type of x. So it could cope with x being for example, std::string&,
std::string const&, std::string&& or std::ostream&.

As an observation, expressing the join as an iterator pair lends itself to
being implemented in terms of std::copy(first, last,
formatting_iterator<...>).
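
A sketch of what that could look like, assuming a hypothetical output iterator called separated_writer that stands in for formatting_iterator<...> and writes to a std::ostream purely for brevity:

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical output iterator: writes the separator before every element
// except the first one.
class separated_writer {
public:
    using iterator_category = std::output_iterator_tag;
    using value_type = void;
    using difference_type = std::ptrdiff_t;
    using pointer = void;
    using reference = void;

    separated_writer(std::ostream& out, std::string separator)
        : out_(&out), separator_(std::move(separator)) {}

    // Assignment through the iterator performs the actual output.
    template <typename T>
    separated_writer& operator=(const T& value) {
        if (!first_) *out_ << separator_;
        first_ = false;
        *out_ << value;
        return *this;
    }

    // Dereference and increment are no-ops, as with std::ostream_iterator.
    separated_writer& operator*() { return *this; }
    separated_writer& operator++() { return *this; }
    separated_writer& operator++(int) { return *this; }

private:
    std::ostream* out_;
    std::string separator_;
    bool first_ = true;
};

template <typename Iterator>
std::string join(const std::string& separator, Iterator first, Iterator last) {
    std::ostringstream out;
    std::copy(first, last, separated_writer(out, separator));
    return out.str();
}
```

Unlike std::ostream_iterator with a delimiter, this writer does not emit a trailing separator after the last element.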

I think this is good for containers, but for a series of disjoint types, or
for joining words (as opposed to letters), you'd still need some templatery.

boost::range springs to mind as a reasonable helper for expressing to
concat (or join) that you want to treat each element of a container.

Lee Clagett

Jan 18, 2017, 10:19:39 PM1/18/17

to bo...@lists.boost.org

On Mon, 16 Jan 2017 21:23:58 +0100
Damien Buhl <damie...@lecbna.org> wrote:

> On 16/01/2017 09:54, Olaf van der Spek wrote:
> > http://abel.web.elte.hu/mpllibs/safe_printf/snprintf.html
> >
> > It appears to only do checking at compile-time and then forwards to
> > sprintf..
> Yes the current implementation, but based on the expression template
> resulting from the parsing, one could easily produce at compile-time
> efficient code for formatting.
>

I have a project http://code.leeclagett.com/prima (currently just a
redirect to github) which generates a spirit::karma expression from a
C-string literal using metaparse. The functions are closely modeled on
the C format functions, but take an output iterator or a std::ostream
instead. The functions are also constexpr objects, so they should work
with the Fit library. For some reason the README.md does not mention
the `fprintf` function, but it is currently in the repo too (the
thread-safe variant has not been pushed out yet). There are a decent
number of tests that compare the results against the system C
formatting functions.

The compile times are not great; metaparse, proto, and spirit v2 are a
rough combination. And the documentation is not 100% accurate - a
number of format string errors are not trapped at compile-time yet. I'm
hoping to get a nicer error reporting system, currently the wrong type
can yield an impenetrable error from within spirit.

The "backend" is currently configurable, and if you can make sense of
my "IR" from the metaparse frontend an output system that does not use
spirit could be written. It would likely result in faster compile
times, but I didn't want to write a floating point generator since there
was much to experiment with on the interface and format string parsing
side. Perhaps something to do if there is interest ...

> But I wasn't clear enough, I just wanted to tell that with the
> underlying library metaparse, one can do the compile-time dispatching.
> Naturally I think the compile-time cost would surely be higher than
> the cat() solution.
>

Lee

Christof Donat

Jan 19, 2017, 8:36:09 AM1/19/17

to bo...@lists.boost.org

Hi,

Am 18.01.2017 17:30, schrieb Richard Hodges:
> Totally agree with returning a string factory. That makes perfect
> sense.
> onto(x) could return the correct kind of wrapper, depending on the
> argument
> type of x. So it could cope with x being for example, std::string&,
> std::string const&, std::string&& or std::ostream&.

How about this:

auto s = concat(1, " ", 2).str(); // -> s = "1 2"
concat(" ", 3).append_to(s); // -> s = "1 2 3"
// reuse preallocated memory
concat(4, " ", 5).overwrite(s); // -> s = "4 5"

overwrite() could also take a std::string_view with C++17, or const
char* and size_t with earlier versions of the standard library.

> As an observation, expressing the join as an iterator pair lends itself
> to
> being implemented in terms of std::copy(first, last,
> formatting_iterator<...>).

That definitely is a possible implementation, yes. Though, of course,
having join(), concat() and maybe other functions return string
factories instead of strings enables generating the whole string in a
single buffer without copying.

concat(join(...), " ", 42, " - ", join(...),
       format("format string %1% %2% %3%", a, b, 42)).str();

concat() will allocate a long enough string and call overwrite() on the
results of join() and format() with string views on that string.

Taking everything together, a string factory will have at least an
interface like this:

template<typename T>
constexpr bool string_factory() {
    return requires(T a, std::string& s, std::string_view v, const char* p, size_t len) {
        // estimate necessary memory to render the string
        { a.size() } const -> size_t;

        // render and return result
        { a.str() } const -> std::string;

        // render at the end of an existing string - return the number of generated chars
        { a.append_to(s) } -> size_t;

        // render into an existing string, reusing its preallocated memory
        { a.overwrite(s) } -> size_t;

        // render into a string view
        { a.overwrite(v) } -> size_t;

        // render into a character buffer
        { a.overwrite(p, len) } -> size_t;
    };
}

I hope, the constraint syntax is more or less correct. I haven't used
constraints in real code up to now ;-)

> I think this is good for containers, but for a series of disjoint
> types, or
> for joining words (as opposed to letters), you'd still need some
> templatery.

Yes, sure. You might also want to have something like this:

std::tuple<std::string, int, double> my_tuple{"asdf", 42, 2.5};
join(separator(", "), my_tuple);
// -> "asdf, 42, 2.5"

I think, using C++17's std::apply(), it should be a straightforward
wrapper around concat().
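
A small sketch of the std::apply() approach, assuming a free function with the hypothetical name join_tuple and using std::ostringstream only to keep the example short:

```cpp
#include <sstream>
#include <string>
#include <tuple>

// Hypothetical helper: joins the elements of a tuple with a separator.
template <typename... Ts>
std::string join_tuple(const std::string& separator, const std::tuple<Ts...>& values) {
    std::ostringstream out;
    std::apply(
        [&](const auto&... elems) {
            bool first = true;
            // Write one element, preceded by the separator unless it is the first.
            auto write = [&](const auto& elem) {
                if (!first) out << separator;
                first = false;
                out << elem;
            };
            (write(elems), ...);  // C++17 fold over the comma operator
        },
        values);
    return out.str();
}
```

std::apply() unpacks the tuple into a parameter pack, so the body is the same variadic code a concat() would use.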

> boost::range springs to mind as a reasonable helper for expressing to
> concat (or join) that you want to treat each element of a container.

Yes, when I wrote about my idea of join(), I thought of ranges as well.
With range adaptors, that will make for a very powerful and, by the way,
fast library to generate strings:

format("file names and sizes:\n%1%\n",
       join(separator('\n'),
            my_files |
            range::transformed([](const std::filesystem::path& f) -> auto {
                return concat(separator(": "),
                              f.filename(),
                              std::filesystem::file_size(f));
            }))).str();

format().str() will ask the join() string factory to render into the
preallocated buffer. Then join() will walk through its range and find
that it has a range of string factories, returned by concat(). Therefore it
asks every string factory to render into the given buffer.

We "just" need format(), join(), concat(), and the corresponding string
factories.

Richard Hodges

Jan 19, 2017, 8:53:56 AM1/19/17

to bo...@lists.boost.org

I think this is looking good.

Some suggested separator flavours...

join(separator(", "), ...) -- results in "a, b, c"
join(prefix(", "), ...) -- results in ", a, b, c" (for appending to
an existing list)
join(suffix(", "), ...) -- results in "a, b, c, "

and the start of an idea...

join(wrapped_sequence(range, "[", "]", ",", " ")) -- results in "[ a,
b, c ]" or "[ ]" if the sequence is empty.

The last one would allow automatic generation of JSON...

Hans Dembinski

Jan 19, 2017, 9:12:50 AM1/19/17

to bo...@lists.boost.org

Hi Christof and Richard,

> On 18 Jan 2017, at 16:11, Christof Donat <c...@okunah.de> wrote:
> I do like the idea, but not the naming. With a function name "join" I'd expect to be able to pass an iterator range and have all its element concatenated into a string with a defined separator like this:

please don't change the name. I don't see why "join" implies iterators. "join" is a standard name for similar functions which join strings in other languages, e.g.

Python
Perl
C#
Java
JavaScript

There is also the Qt library which has a QStringList::join method.

"join" is well established and a short word, too, certainly better than "concat", which has to be abbreviated to not be too long.

I have more arguments. The word "concatenate" sounds awkward and artificial. "join" is a word you use in daily conversations, "concatenate" is not. Google search for "join" yields 3e9 hits, Google search for "concatenate" yields 3e6 hits, so you can say "join" is about 1000x more common. I personally hate technical jargon in any field. Language was invented to include, not to exclude.

Finally, "concatenate" in other programming contexts usually means that you append one collection to another collection. This is very different from what "join" does, which is piecing many individual strings and string-converted arguments together. That's clearly a "joining" operation.

Best regards,
Hans

Christof Donat

Jan 19, 2017, 9:15:15 AM1/19/17

to bo...@lists.boost.org

Hi,

Am 19.01.2017 14:53, schrieb Richard Hodges:
> and the start of an idea...
>
> join(wrapped_sequence(range, "[", "]", ",", " ")) -- results in
> "[ a,
> b, c ]" or "[ ]" if the sequence is empty.
>
> The last one would allow automatic generation of JSON...

I think that is a great idea. Maybe we'd prefer a more range-like API
here:

join(separator(", "), range | wrapped("[", "]"));

The range expression then of course returns a range of string
factories. Only the first and the last one do more than just pass
through to the underlying type.

Christof Donat

Jan 19, 2017, 10:27:39 AM1/19/17

to bo...@lists.boost.org

Am 19.01.2017 15:12, schrieb Hans Dembinski:
> Hi Christof and Richard,
>
>> On 18 Jan 2017, at 16:11, Christof Donat <c...@okunah.de> wrote:
>> I do like the idea, but not the naming. With a function name "join"
>> I'd expect to be able to pass an iterator range and have all its
>> element concatenated into a string with a defined separator like this:
>
> please don't change the name. I don't see why "join" implies
> iterators. "join" is a standard name for similar functions which join
> strings in other languages, e.g.
>
> Python

https://docs.python.org/3/library/stdtypes.html#str.join

", ".join(my_numbers)

versus

"Hello" + ", " + "World"

or - a bit awkward

", ".join(["Hello", "World"])

or

"Hello".append(", ").append("World")

> Perl

join($sep, @array)

versus

join($sep, $a, $b, $c, $d)

> C#

https://msdn.microsoft.com/de-de/library/57a79xd0(v=vs.110).aspx

String.Join(sep, string_arr);

versus

String.Concat(a, b);

> Java

https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#join-java.lang.CharSequence-java.lang.Iterable-

String.join(sep, some_iterable);

versus

String.join(sep, a, b, c, d);

> JavaScript

https://developer.mozilla.org/de/docs/Web/JavaScript/Reference/Global_Objects/Array/join

my_array.join(sep)

versus

"".concat(a, b, c, d);

While all the languages you named support the idea of join() working on
iterables, only Java and Perl reuse the name for parameter lists as
well. Python can be shoehorned into concatenating with join(). Where I
proposed to use the word concat(), Python uses the operator + or
append(), while C# and JavaScript call that concat().

Actually consistency with the experience from other languages is exactly
my concern here. We already have operator + and operator += for string
concatenation, but they are not the most efficient solutions.

> There is also the Qt library which has a QStringList::join method.

https://doc.qt.io/qt-5/qstringlist.html#join

Yes. Again, just like I proposed to use that word. Thanks for making my
point. For concatenation of QStrings you'd use append(), push_back() or
operator +:

s.append("Hello").append(", ").append("World");
s.push_back("Hello").push_back(", ").push_back("World");
s += "Hello" + ", " + "World";

> "join" is well established and a short word, too, certainly better
> than "concat", which has to be abbreviated to not be too long.

Yes, join is well established for a specific use case. I just ask not to
use it for a different one. Instead my proposal includes a join()
function, that works just like you'd expect from the established word.

> I have more arguments. The word "concatenate" sounds awkward and
> artificial. "join" is a word you use in daily conversations,
> "concatenate" is not. Google search for "join" yields 3e9 hits, Google
> search for "concatenate" yields 3e6 hits, so you can say "join" is
> about 1000x more common. I personally hate technical jargon in any
> field. Language was invented to include, not to exclude.

There are two distinct concepts, so we'll have to either use two distinct
words or resort to overloading the same one, as Java and Perl do. Since I
like to call different things differently, I'd propose to go the way C#
and JavaScript go.

> Finally, "concatenate" in other programming contexts usually means
> that you append one collection to another collection.

See above.

To make things clear, a naive implementation of join that only supports
strings could look like this:

template<typename Iterator>
auto join(const std::string& separator, Iterator i, Iterator end) -> std::string {
    auto rval = std::ostringstream{};

    auto addSeparator = false;
    for(; i != end; ++i) {
        if( addSeparator ) rval << separator;
        else addSeparator = true;

        rval << *i;
    }

    return rval.str();
}

A naive implementation of concat(), could look like this:

void concat_impl(std::ostringstream&) {
    // end of recursion: nothing left to write
}

template <typename FirstArg, typename ... Args>
void concat_impl(std::ostringstream& out, FirstArg&& first_arg, Args&& ... args) {
    out << std::forward<FirstArg>(first_arg);
    concat_impl(out, std::forward<Args>(args)...);
}

template <typename ... Args>
auto concat(Args&& ... args) -> std::string {
    auto out = std::ostringstream{};
    concat_impl(out, std::forward<Args>(args)...);
    return out.str();
}

Both implementations are untested. They are just here to emphasize my
point.

Christof

Bjorn Reese

Jan 19, 2017, 12:02:00 PM1/19/17

to bo...@lists.boost.org

On 01/19/2017 02:53 PM, Richard Hodges wrote:

> join(wrapped_sequence(range, "[", "]", ",", " ")) -- results in "[ a,
> b, c ]" or "[ ]" if the sequence is empty.
>
> The last one would allow automatic generation of JSON...

Except that "[ a, b, c ]" is not valid JSON.

Richard Hodges

Jan 19, 2017, 1:26:50 PM1/19/17

to bo...@lists.boost.org

> Except that "[ a, b, c ]" is not valid JSON.

You're right, a JSON adapter would need to enquote strings, write bools as
alphas and have access to an ADL-aware function to turn objects into tuples
of NVP generators. That shouldn't be too much of an issue.
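
A rough sketch of the first two requirements (quoting strings and writing bools as alphas), assuming hypothetical helpers json_value and json_array. It escapes only quotes and backslashes, so it is not a complete JSON writer, and the object-to-tuple part is left out entirely:

```cpp
#include <string>

// Hypothetical helpers, for illustration only.
inline std::string json_value(bool b) { return b ? "true" : "false"; }
inline std::string json_value(int i) { return std::to_string(i); }

inline std::string json_value(const std::string& s) {
    std::string out = "\"";
    for (char c : s) {
        if (c == '"' || c == '\\') out += '\\';  // minimal escaping only
        out += c;
    }
    out += '"';
    return out;
}

// Without this overload a string literal would select the bool overload,
// because pointer-to-bool is a standard conversion.
inline std::string json_value(const char* s) { return json_value(std::string(s)); }

template <typename... Ts>
std::string json_array(const Ts&... values) {
    std::string out = "[";
    bool first = true;
    auto add = [&](const std::string& text) {
        if (!first) out += ",";
        first = false;
        out += text;
    };
    (add(json_value(values)), ...);
    out += "]";
    return out;
}
```

The per-type json_value() overloads play the role of the adapter: the joining logic itself stays the same as for a plain separator.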

Hans Dembinski

Jan 20, 2017, 4:41:27 AM1/20/17

to bo...@lists.boost.org

Dear Christof,

you did a much more thorough (and correct) job on researching the use of "join" in other languages, while mine was really quick. I respect that and feel embarrassed that I didn't do a better job myself.

> To make things clear, a naive implementation of join, that only supports strings could look like this:
>
> template<typename Iterator>
> auto join(const std::string& separator, Iterator i, Iterator end) -> std::string {
> auro rval = std::ostringstream{};
>
> auto addSeparator = false;
> for(; i != end; ++i) {
> if( addSeparator ) rval << separator;
> else addSeparator = true;
>
> rval << *i;
> }
>
> return rval.str();
> }
>
>
> A naive implementation of concat(), could look like this:
>
> auto concat_impl() -> std::ostringstream {
> return {};
> }
>
> template <typename LastArg, typename ... Args>
> auto concat_impl(Args&& ... args, LastArg&& last_arg) -> std::ostringstream {
> return concat_impl(std::forward(args)...) << last_arg;
> }
>
> template <typename ... Args>
> auto concat(Args&& ... args) -> std::string {
> return concat_impl(std::forward(args)...).str();
> }
>
>
> Both implementations are untested. They are just here to emphasize my point.

I still think that "join" is a better name for what you call "concat", as the function is literally joining various pieces and that name does not require an abbreviation. What you call "join" could be called "join_sequence".

Best regards,
Hans

Christof Donat

Jan 20, 2017, 2:03:40 PM1/20/17

to bo...@lists.boost.org

Hi,

Am 20.01.2017 10:41, schrieb Hans Dembinski:
> I still think that "join" is a better name for what you call "concat",
> as the function is literally joining various pieces and that name does
> not require an abbreviation. What you call "join" could be called
> "join_sequence".

I think we should try and go for the least surprise. People coming from
other languages will probably associate join() with what you call
"join_sequence", while they will expect a name like concat(), operator
+, append(), etc. for what you prefer to call "join".

So the least surprise is, I still think, with concat(a, b, c, d) and
join(separator, sequence), just as with C# and JavaScript. With Qt and
Python you use append() or operator + instead of concat(), but join() is
consistent with them as well. The advantage of concat() over
operator + on strings in C++ is that concat() can easily be implemented
in a much more efficient way, with fewer realloc()s and therefore less
data copying. At least when we don't want to have an API like this:

auto my_new_str = (cat() + "Hello" + ", " + "World" + "!").str();

Here cat() would return a string factory type that collects all the
stuff that you "add" to it. In str() it allocates a string big enough
for all the content and renders into it. Now compare that to

auto my_new_str = concat("Hello", ", ", "World", "!").str();

Judge for yourself, which one is easier for the user to understand.

Christof

Hans Dembinski

Jan 23, 2017, 3:55:17 AM1/23/17

to bo...@lists.boost.org

> On 20 Jan 2017, at 20:03, Christof Donat <c...@okunah.de> wrote:
>
> So the least surprise is, I still think, with concat(a, b, c, d) and join(separator, sequence), just as with C#, JavaScript. With Qt and Python you use append() or operator + instead of concat(), but join() is consistent with them as well. The advantage of concat() over the operator + on strings in C++ is, that concat() can easily be implemented in a much more efficient way with less realloc()s and therefore less data copying. At lesst, when we don't want to have an API like this:

I agree with the principle of least surprise, it is something I am also applying. :) I think that "join" is less surprising, but it looks like the evidence is against me, so I stand down.

> auto my_new_str = (cat() + "Hello" + ", " + "World" + "!").str();
>
> Here cat() would return a string factory type, that collects all the stuff, that you "add" to it. In str() it allocates a string big enough for all the content and renders in there. Well compared to
>
>
> auto my_new_str = concat("Hello", ", ", "World", "!").str();
>
> Judge for yourself, which one is easier for the user to understand.

Oh, I never questioned this. I see the obvious technical and stylistic advantages of the latter. I am just arguing about names.

Your last proposal was without the ".str()", which made more sense:

auto my_new_str = concat("Hello", ", ", "World", "!");

I hope this was just a typo.

Christof Donat

Jan 23, 2017, 5:23:45 AM1/23/17

to bo...@lists.boost.org

Hi,

Am 23.01.2017 09:55, schrieb Hans Dembinski:
> On 20 Jan 2017, at 20:03, Christof Donat <c...@okunah.de> wrote:
>> auto my_new_str = concat("Hello", ", ", "World", "!").str();
>>

>> [...]

>
> Your last proposal was without the ".str()", which made more sense:
>
> auto my_new_str = concat("Hello", ", ", "World", "!");
>
> I hope this was just a typo.

Actually no, it wasn't. The idea is, that functions like concat(),
join() and format() return string factories, instead of strings. This is
an advantage, when you combine calls to these functions:

auto my_new_str =
    concat("Hello ",
           join(", ", std::begin(my_nums), std::end(my_nums)),
           format(" the file %1% contains %2% bytes", filename, filesize)).str();

As you see, I only call str() on the return value of concat(). We can
make the string factories write to a preallocated buffer. Here the
concat factory will allocate enough memory and ask the join string
factory and the format string factory to write directly to that buffer.
If join() and format() returned strings, we'd have to copy them in
concat().

An alternative to a function like str() could be an implicit conversion
operator to std::string. Then we would write:

auto my_new_str = std::string{concat("Hello ",
                                     join(", ", std::begin(my_nums), std::end(my_nums)),
                                     format(" the file %1% contains %2% bytes", filename, filesize))};

or

std::string my_new_str{concat("Hello ",
                              join(", ", std::begin(my_nums), std::end(my_nums)),
                              format(" the file %1% contains %2% bytes", filename, filesize))};

Again we pass two string factories to concat(). Therefore it can
allocate enough memory and ask these factories to write there. The only
difference is, that all that does not happen in a call to str(), but in
a call to operator std::string (). I am not completely opposed to
implicit conversions, but in this case, an explicit function call feels
better to me. Sorry, that I don't have a better reason at the moment.

No matter if we use str() or an implicit conversion, combining these
calls also lets us use tag parameters for formatting details:

format("%1% is %2% and %3%", 42,
       concat(format::hex<int>, 42, " in hex"),
       concat(format::oct<int>, 42, " in oct")).str();

Unlike the iostream manipulators, these formatting instructions have
a defined scope. And still we don't have to copy temporary strings,
because we pass string factories.

Christof

Hans Dembinski

Jan 23, 2017, 10:32:49 AM1/23/17

to bo...@lists.boost.org

> On 23 Jan 2017, at 11:23, Christof Donat <c...@okunah.de> wrote:
>
> Hi,
>
> Am 23.01.2017 09:55, schrieb Hans Dembinski:
>> On 20 Jan 2017, at 20:03, Christof Donat <c...@okunah.de> wrote:
>>> auto my_new_str = concat("Hello", ", ", "World", "!").str();
>>> [...]
>> Your last proposal was without the ".str()", which made more sense:
>> auto my_new_str = concat("Hello", ", ", "World", "!");
>> I hope this was just a typo.
>
> Actually no, it wasn't. The idea is, that functions like concat(), join() and format() return string factories, instead of strings. This is an advantage, when you combine calls to these functions:
>
> auto my_new_str =
> concat("Hello ",
> join(", ",std::begin(my_nums), std::end(my_nums)),
> format(" the file %1% contains %2% bytes", filename, filesize)).str();

It makes sense to me for "join" to return a string factory, because it is likely to be nested in "concat". But I don't see the practical case of nested "concat" calls, at least it is not going to be a common pattern in the need of optimising.

If "concat" is the outer layer anyway, I would return a std::string directly for convenience. It is easy to forget the trailing .str() and it does not look elegant.

Christof Donat

Jan 23, 2017, 11:32:37 AM1/23/17

to bo...@lists.boost.org

Hi,

Am 23.01.2017 16:32, schrieb Hans Dembinski:
>> On 23 Jan 2017, at 11:23, Christof Donat <c...@okunah.de> wrote:
>> auto my_new_str =
>> concat("Hello ",
>> join(", ",std::begin(my_nums), std::end(my_nums)),
>> format(" the file %1% contains %2% bytes", filename,
>> filesize)).str();
>
> It makes sense to me for "join" to return a string factory, because it
> is likely to be nested in "concat". But I don't see the practical case
> of nested "concat" calls, at least it is not going to be a common
> pattern in the need of optimising.

There are several use cases:

1. scope for formatting tags:

concat(format::hex<int>, 42, " is hex for ", concat(42)).str();

Here the inner concat will convert the 42 to its decimal representation,
while the outer one converts the first 42 to its hex representation.

2. concat() in calls to format():

format("%|1$40t|%2%", concat(first_name, " ", last_name),
phone_number).str();

format().str() will allocate the buffer and ask the concat string
factory to write into it.

3. results from concat() in a boost::range that is passed to join():

join(separator("\n"),
     my_files | transformed([](const std::filesystem::path& f) -> auto {
         return concat(f.filename(), ": ",
                       std::filesystem::file_size(f));
     })).str();

join().str() will ask every concat string factory to render directly
into the common buffer.

> If "concat" is the outer layer anyway, I would return a std::string
> directly for convenience. It is easy to forget the trailing .str() and
> it does not look elegant.

Of course better proposals are welcome :-) Would you prefer the implicit
conversion? If so, why?

Christof

Olaf van der Spek

Jan 23, 2017, 1:27:08 PM1/23/17

to bo...@lists.boost.org

On Mon, Jan 23, 2017 at 5:32 PM, Christof Donat <c...@okunah.de> wrote:

> Hi,
>
> Am 23.01.2017 16:32, schrieb Hans Dembinski:
>
>> On 23 Jan 2017, at 11:23, Christof Donat <c...@okunah.de> wrote:
>>> auto my_new_str =
>>> concat("Hello ",
>>> join(", ",std::begin(my_nums), std::end(my_nums)),
>>> format(" the file %1% contains %2% bytes", filename,
>>> filesize)).str();
>>>
>>
>> It makes sense to me for "join" to return a string factory, because it
>> is likely to be nested in "concat". But I don't see the practical case
>> of nested "concat" calls, at least it is not going to be a common
>> pattern in the need of optimising.
>>
>
> There is several usecases:
>
> 1. scope for formatting tags:
>
> concat(format::hex<int>, 42, " is hex for ", concat(42)).str();
>
> Here the inner concat will convert the 42 to its decimal representation,
> while the outer one converts the first 42 to its hex representation.
>

Wouldn't concat(hex(42), " is hex for", 42) make more sense?

>
> 2. concat() in calls to format():
>
> format("%|1$40t|%2%", concat(first_name, " ", last_name),
> phone_number).str();
>

Why not fold the name concat into the format string?

> format().str() will allocate the buffer and ask the concat string factory
> to write into it.
>
> 3. results from concat() in a boost::range that is passed to join():
>
> join(separator("\n"),
> my_files | transformed([](const std::filesystem::path& f) -> auto {
> return concat(f.filename, ": ",
> std::filesystem::file_size(f));
> })).str();
>
> join().str() will ask every concat string factory to render directly into
> the common buffer.
>
> If "concat" is the outer layer anyway, I would return a std::string
>> directly for convenience. It is easy to forget the trailing .str() and
>> it does not look elegant.
>>
>
> Of course better proposals are welcome :-) Would you prefer the implicit
> conversion? If so, why?

Implicit is problematic with auto..

--
Olaf

Richard Hodges

Jan 23, 2017, 1:44:10 PM1/23/17

to bo...@lists.boost.org

> Why not fold the name concat into the format string?

Because format strings are evil. They cause errors that can only be
detected at runtime, and that is not a Good Thing (tm).

Gavin Lambert

Jan 23, 2017, 6:16:02 PM1/23/17

to bo...@lists.boost.org

On 24/01/2017 07:26, Olaf van der Spek wrote:
>> 1. scope for formatting tags:
>>
>> concat(format::hex<int>, 42, " is hex for ", concat(42)).str();
>>
>> Here the inner concat will convert the 42 to its decimal representation,
>> while the outer one converts the first 42 to its hex representation.
>
> Wouldn't concat(hex(42), " is hex for", 42) make more sense?

+1. If you like persistent formatting states (and all the unexpected
fun they cause when you forget to cancel them), use iostreams instead.
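
For reference, a small example of the persistent state in question, using only standard iostream facilities (the two function names are made up for this illustration):

```cpp
#include <ios>
#include <sstream>
#include <string>

// std::hex stays in effect for every later integer written to the stream.
inline std::string sticky_hex() {
    std::ostringstream out;
    out << std::hex << 58 << " " << 58;
    return out.str();  // "3a 3a"
}

// The state has to be cancelled by hand to get decimal output again.
inline std::string reset_hex() {
    std::ostringstream out;
    out << std::hex << 58 << " " << std::dec << 58;
    return out.str();  // "3a 58"
}
```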

>> 3. results from concat() in a boost::range that is passed to join():
>>
>> join(separator("\n"),
>> my_files | transformed([](const std::filesystem::path& f) -> auto {
>> return concat(f.filename, ": ",
>> std::filesystem::file_size(f));
>> })).str();

Maybe I missed something, but what was the intended distinction between
concat() and join()?

To me, as vocabulary words, concat() implies "concatenate without
separator" and join() implies "concatenate with separator"; as such it
seems unnecessary to explicitly decorate the separator here since it's a
required parameter.

(Both should accept either variadics or ranges, if that's not too
complicated to arrange, though it's likely for join() to be used more
often on ranges; concat() might be more evenly split, though probably
leaning toward variadics.)

Although I suppose
http://www.boost.org/doc/libs/1_63_0/doc/html/string_algo/reference.html#header.boost.algorithm.string.join_hpp
and
http://www.boost.org/doc/libs/1_63_0/libs/range/doc/html/range/reference/utilities/join.html
might not entirely agree on this vocabulary...

(Loads the bikeshed on the back of a bike and rides away.)

>>> If "concat" is the outer layer anyway, I would return a std::string
>>> directly for convenience. It is easy to forget the trailing .str() and
>>> it does not look elegant.
>>
>> Of course better proposals are welcome :-) Would you prefer the implicit
>> conversion? If so, why?
>
> Implicit is problematic with auto..

While that's true, I think the flexibility of returning a factory from
concat() is more useful than the discomfort of either remembering the
str() or using std::string explicitly as the type instead of auto (or
using auto combined with explicit std::string construction).
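
To illustrate the auto issue with an implicit conversion, here is a sketch built on a hypothetical lazy_text type and lazy_concat() function; neither is a proposed interface:

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical lazy result type with an implicit conversion to std::string.
struct lazy_text {
    std::string left;
    std::string right;
    operator std::string() const { return left + right; }
};

inline lazy_text lazy_concat(std::string a, std::string b) {
    return lazy_text{std::move(a), std::move(b)};
}

inline std::string demo() {
    // auto deduces the factory type; no string has been built yet.
    auto deduced = lazy_concat("Hello, ", "World");
    static_assert(std::is_same_v<decltype(deduced), lazy_text>,
                  "auto keeps the factory type");
    (void)deduced;

    // Naming std::string explicitly triggers the conversion.
    std::string converted = lazy_concat("Hello, ", "World");
    return converted;
}
```

A factory that stores references or views instead of copies would additionally risk dangling once auto keeps it alive past the end of the expression, which is one reason an explicit str() call is easier to reason about.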

Jeff Garland

Jan 23, 2017, 8:59:43 PM1/23/17

to bo...@lists.boost.org

So let me just say that this is a bike shed topic -- I was in the shed
a decade ago on the subject -- but hey, why not :) I dropped the proposal
below because it was clear to me that no agreement was possible on the topic.
The proposal isn't variadic because that feature didn't exist at the
time, but you can see how it could trivially be changed to be -- in fact, it is
begging to be variadic. It brings together many of the discussed libraries
(format, string_algo, regex) into an interface that is simple and clear. It
allows for both formatting and appending. An extremely small sample:

super_string s(" (456789) [123] 2006-10-01 abcdef ");
s.to_upper();
cout << s << endl;

s.trim(); //lop off the whitespace on both sides
cout << s << endl;

double dbl = 1.23456;
s.append(dbl); //append any streamable type
s+= " ";
cout << s << endl;

date d(2006, Jul, 1);
s.insert_at(28, d); //insert any streamable type
cout << s << endl;

super_string s;
double dbl = 1.123456789;
int i = 1000;
s.append_formatted(dbl, i, dbl, i, "a string", "%-7.2f %-7d %-7.2f %-7d %s");
//s == "1.12 1000 1.12 1000 a string"

//other overloadings available with less parameters
super_string s1;
s1.append_formatted(dbl, "This is the value: %-7.2f");
//s1 == "This is the value: 1.12"

main page
http://www.crystalclearsoftware.com/libraries/super_string/index.html

My justification for *why* I did this as a type against all the *standard
judgement* of the c++ experts
http://www.crystalclearsoftware.com/libraries/super_string/index.html#why_type

My original post in 2006
http://lists.boost.org/Archives/boost/2006/07/107087.php

Jeff

On Mon, Jan 23, 2017 at 4:14 PM, Gavin Lambert <gav...@compacsort.com>
wrote:

Christof Donat

Jan 24, 2017, 4:07:17 AM

to bo...@lists.boost.org

Hi,

Am 23.01.2017 19:26, schrieb Olaf van der Spek:
> On Mon, Jan 23, 2017 at 5:32 PM, Christof Donat <c...@okunah.de> wrote:
>> 1. scope for formatting tags:
>>
>> concat(format::hex<int>, 42, " is hex for ", concat(42)).str();
>>
>> Here the inner concat will convert the 42 to its decimal
>> representation,
>> while the outer one converts the first 42 to its hex representation.
>
> Wouldn't concat(hex(42), " is hex for", 42) make more sense?

That is a valid approach for concat() and format(), but suboptimal for
join(). Think of this example:

join(hex<int>, separator(" "), my_nums);

Here all the numbers are converted to their hex representation. With
your approach this would look like:

join(separator(" "),
     my_nums | transformed([](int i) {
         return hex(i);
     }));

That is much more difficult to understand.

Since tag parameters that define the conversion are, to me, the superior
choice for join(), I think we should use them for concat() and format() as
well, for consistency.

>> 2. concat() in calls to format():
>>
>> format("%|1$40t|%2%", concat(first_name, " ", last_name),
>> phone_number).str();
>
> Why not fold the name concat into the format string?

In this example I want the full name to take up 40 characters, no matter
if the first name is long or short. With format strings as used by
boost::format I don't know how that could be achieved, and extending the
format language, of course, makes the interpreter slower and more
complex.

>>> If "concat" is the outer layer anyway, I would return a std::string
>>> directly for convenience. It is easy to forget the trailing .str()
>>> and
>>> it does not look elegant.
>>>
>>
>> Of course better proposals are welcome :-) Would you prefer the
>> implicit
>> conversion? If so, why?
>
> Implicit is problematic with auto..

That is one of the reasons I prefer the explicit str() function. It
also fits well with other factory functions we might want to have,
e.g. concat(...).append_to(my_string), or
concat(...).overwrite(my_string).
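
A minimal sketch of such a factory, assuming C++17. The names
concat_factory and append_piece are hypothetical, the arguments are stored
by value for simplicity, and only string-like and arithmetic arguments are
handled; it illustrates the str() / append_to() / overwrite() idea and is
not a proposed implementation:

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <tuple>
#include <type_traits>
#include <utility>

namespace sketch {

// Append one piece: string-like values are appended directly,
// everything else is assumed to be arithmetic and uses std::to_string.
template <class T>
void append_piece(std::string& out, const T& value) {
    if constexpr (std::is_convertible_v<const T&, std::string_view>) {
        out.append(std::string_view(value));
    } else {
        out.append(std::to_string(value));
    }
}

// Hypothetical factory: holds the arguments, produces text on request.
template <class... Args>
class concat_factory {
public:
    explicit concat_factory(Args... args) : args_(std::move(args)...) {}

    // Allocate a new string and return it.
    std::string str() const {
        std::string result;
        append_to(result);
        return result;
    }

    // Append to an existing string.
    void append_to(std::string& target) const {
        std::apply([&target](const auto&... piece) {
            (append_piece(target, piece), ...);
        }, args_);
    }

    // Replace the contents of an existing string, reusing its memory.
    void overwrite(std::string& target) const {
        target.clear();
        append_to(target);
    }

private:
    std::tuple<Args...> args_;
};

template <class... Args>
auto concat(Args&&... args) {
    return concat_factory<std::decay_t<Args>...>(std::forward<Args>(args)...);
}

inline void example() {
    std::string s = "x=";
    concat("A", "B", 42).append_to(s);
    assert(s == "x=AB42");
    assert(concat("n: ", 7).str() == "n: 7");
}

}  // namespace sketch
```

Because the factory does no work until one of the three members is called,
the caller decides whether a new string is allocated at all.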

Christof

Richard Hodges

Jan 24, 2017, 4:08:26 AM

to bo...@lists.boost.org

For what it's worth, in my view join and/or concat should return a
generator.

To me the idiomatic way to convert it to a string would be an ADL
function to_string(generator) -> std::string.

The generator itself should be streamable to at least a basic_ostream via
ADL operator<<

It seems to me that since c++17 is going to have string_view, and boost
already does, then there should be a to_string_view free function to return
the view of a temporary. The result will be that string construction is
then only necessary when the user wants it.

basic ADL interface something like this:

#include <memory>
#include <iostream>
#include <string>
#include <experimental/string_view>

namespace boost {
template<class Char, class Traits = std::char_traits<Char>,
         class Allocator = std::allocator<Char>>
struct join_engine_type {
    // implementation of joiner here
};

template<class Char, class Traits, class Allocator>
auto operator<<(std::basic_ostream<Char, Traits>& os,
                const join_engine_type<Char, Traits, Allocator>& eng)
-> std::basic_ostream<Char, Traits>&
{
    // implementation here
    return os;
}

// The functions below return by value: a reference to a string created
// inside the function would dangle.
template<class Char, class Traits, class Allocator>
auto allocate_string(const join_engine_type<Char, Traits, Allocator>& eng)
-> std::basic_string<Char, Traits, Allocator>
{
    // note that the allocator is copied, so if the joiner has a memory pool
    // allocator, the strings will share the memory pool
    // implementation here
}

template<class Char, class Traits, class Allocator>
auto to_string(const join_engine_type<Char, Traits, Allocator>& eng)
-> std::basic_string<Char, Traits>
{
    // note that the standard allocator is used. Strings are now
    // independent.
    // implementation here
}

template<class Char, class Traits, class Allocator>
auto to_string_view(const join_engine_type<Char, Traits, Allocator>& eng)
-> std::experimental::basic_string_view<Char, Traits>
{
    // note that no string is constructed. The view refers to the engine's
    // own buffer and is only valid while eng is alive.
    // implementation here
}
}

Furthermore, since the entire web is now (thankfully) UTF8, I strongly feel
that there should be a utf8 version, which accepts wide and narrow strings
and emits them correctly.

Christof Donat

Jan 24, 2017, 4:31:18 AM

to bo...@lists.boost.org

Hi,

Am 24.01.2017 00:14, schrieb Gavin Lambert:
> On 24/01/2017 07:26, Olaf van der Spek wrote:
>>> 1. scope for formatting tags:
>>>
>>> concat(format::hex<int>, 42, " is hex for ", concat(42)).str();
>>>
>>> Here the inner concat will convert the 42 to its decimal
>>> representation,
>>> while the outer one converts the first 42 to its hex representation.
>>
>> Wouldn't concat(hex(42), " is hex for", 42) make more sense?
>
> +1. If you like persistent formatting states (and all the unexpected
> fun they cause when you forget to cancel them), use iostreams instead.

My proposal does not have anything similar to iostream manipulators. The
conversion tags always extend to the whole concat parameter list, but
not to enclosed calls to concat. There is no change in behavior that
depends on the position of the conversion tag in the parameter list.
Actually I think, we should be able to put all the conversion tags at
the beginning of the parameter list.

>>> 3. results from concat() in a boost::range that is passed to join():
>>>
>>> join(separator("\n"),
>>> my_files | transformed([](const std::filesystem::path& f) ->
>>> auto {
>>> return concat(f.filename, ": ",
>>> std::filesystem::file_size(f));
>>> })).str();
>
> Maybe I missed something, but what was the intended distinction
> between concat() and join()?

The distinction is the same as in many other languages: join() works on
sequences, while concat() works on its parameters.
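
A minimal sketch of that distinction, assuming C++17 and string-like
elements only. The signatures are simplified for illustration: the
separator is a plain defaulted parameter here, not the separator() tag
discussed in this thread:

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace sketch {

// concat(): works on its parameter list.
template <class... Args>
std::string concat(const Args&... args) {
    std::string out;
    (out.append(args), ...);  // each argument must be string-like
    return out;
}

// join(): works on a sequence; an empty separator puts the elements
// directly in a row.
template <class Range>
std::string join(const Range& range, const std::string& separator = "") {
    std::string out;
    bool first = true;
    for (const auto& item : range) {
        if (!first) out += separator;
        out += item;
        first = false;
    }
    return out;
}

inline void example() {
    const std::vector<std::string> words{"a", "b", "c"};
    assert(concat("a", std::string("b"), "c") == "abc");
    assert(join(words, ", ") == "a, b, c");
    assert(join(words) == "abc");
}

}  // namespace sketch
```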

> To me, as vocabulary words, concat() implies "concatenate without
> separator" and join() implies "concatenate with separator"; as such it
> seems unnecessary to explicitly decorate the separator here since it's
> a required parameter.

I see. The approach we had here up to now was to define formatting
specifics as tag parameters at the beginning of the parameter list. If
you give no separator to join(), it just puts the representations of
the sequence's objects in a row without any space in between. In Python
the equivalent would be to join on the empty string:
"".join(mySequence).

Then, of course, it is reasonable to use the same tags for join() and
concat(), and actually for format() as well, although separator() is
unused there.

I agree, that your definition is mostly consistent with what I found for
other languages as well, but most of them also only use join() on
sequences and concat() or equivalents on parameter lists. In languages,
where join() is used on parameter lists as well, like Java or Perl,
there is no other concatenation function. I guess, concat() usually has
no separator in most languages, because you can easily put it in as
additional parameters. With my proposal, it just comes for free.

> (Both should accept either variadics or ranges, if that's not too
> complicated to arrange, though it's likely for join() to be used more
> often on ranges; concat() might be more evenly split, though probably
> leaning toward variadics.)

That part is not consistent with what I found in other languages. Only
Perl and Java (from the list of languages I checked) use join() for
both use cases, but they then don't have anything else that is like
concat(). All the others distinguish between concatenating a few
parameters and joining a sequence.

>>>> If "concat" is the outer layer anyway, I would return a std::string
>>>> directly for convenience. It is easy to forget the trailing .str()
>>>> and
>>>> it does not look elegant.
>>>
>>> Of course better proposals are welcome :-) Would you prefer the
>>> implicit
>>> conversion? If so, why?
>>
>> Implicit is problematic with auto..
>
> While that's true, I think the flexibility of returning a factory from
> concat() is more useful than the discomfort of either remembering the
> str() or using std::string explicitly as the type instead of auto (or
> using auto combined with explicit std::string construction).

I agree. Would you prefer str(), or implicit conversion?

Christof

Christof Donat

Jan 24, 2017, 4:36:36 AM

to bo...@lists.boost.org

Hi,

Am 24.01.2017 10:07, schrieb Richard Hodges:
> For what it's worth, in my view join and/or concat should return a
> generator.
>
> To me the idiomatic way to convert it to a string would be and ADL
> function to_string(generator)
> -> std::string.

Then why not use the std::string constructor instead? That would be the
case with the implicit conversion:

auto generator = concat(...);
auto my_str = std::string{generator};

> The generator itself should be streamable to at least a basic_ostream
> via
> ADL operator<<

Yes, that looks like a good idea to me as well. The simple and straight
forward implementation (using the std::string constructor) might be:

auto operator << (std::basic_ostream<...>& stream, generator g)
-> std::basic_ostream<...>& {

Christof Donat

Jan 24, 2017, 4:45:33 AM

to bo...@lists.boost.org

Hi,

Sorry for the unfinished mail. I just was unable to handle my mail
client correctly :-(

Am 24.01.2017 10:07, schrieb Richard Hodges:

> It seems to me that since c++17 is going to have string_view, and boost
> already does, then there should be a to_string_view free function to
> return
> the view of a temporary.

But where will the temporary live then? Someone will have to free the
memory.

> basic ADL interface something like this:

I didn't get that part. What exactly does ADL stand for?

> Furthermore, since the entire web is now (thankfully) UTF8, I strongly
> feel
> that there should be a utf8 version, which accepts wide and narrow
> strings
> and emits them correctly.

UTF-8 is 8 bits only. Just some characters take more than a single byte.
See https://en.wikipedia.org/wiki/UTF-8

Anyway, I agree, that we should have a wide char version as well for
UTF-16 support.

Christof

Richard Hodges

Jan 24, 2017, 5:30:21 AM

to bo...@lists.boost.org

Responses inline.

On 24 January 2017 at 10:45, Christof Donat <c...@okunah.de> wrote:

> Hi,
>
> Sorry for the unfinished mail. I just was unable to handle my mail client
> correctly :-(
>
> Am 24.01.2017 10:07, schrieb Richard Hodges:
>
>> It seems to me that since c++17 is going to have string_view, and boost
>> already does, then there should be a to_string_view free function to
>> return
>> the view of a temporary.
>>
>
> But where will the temporary live then? Someone will have to free the
> memory.

Imagine a function:

void foo(std::string_view s);

which we then call with:

foo(join("the answer to life, the universe and everything is: ", hex(42)));

The generator returned by join would stay alive until the end of the
function foo, so there would be no need to construct a string, only to take
a string_view of it. We could use the string_view implicit in the joiner
object. This saves us an allocation and a copy.
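
A minimal sketch of the lifetime argument, assuming C++17. The joiner
class and its owned buffer are hypothetical, and this simplified version
still allocates one buffer inside the joiner; the point shown is only that
the temporary outlives the call:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>

namespace sketch {

// Hypothetical generator that owns the joined characters.
class joiner {
public:
    explicit joiner(std::string buffer) : buffer_(std::move(buffer)) {}
    std::string_view view() const { return buffer_; }

private:
    std::string buffer_;
};

inline joiner join(std::string_view a, std::string_view b) {
    std::string buffer;
    buffer.reserve(a.size() + b.size());
    buffer.append(a).append(b);
    return joiner(std::move(buffer));
}

// Free function; the returned view is valid only while `j` is alive.
inline std::string_view to_string_view(const joiner& j) { return j.view(); }

inline std::size_t foo(std::string_view s) { return s.size(); }

inline std::size_t example() {
    // The temporary joiner lives until the end of the full expression,
    // that is, until foo() has returned, so the view does not dangle.
    return foo(to_string_view(join("answer: ", "2a")));
}

}  // namespace sketch
```

Storing the view instead, as in
auto v = to_string_view(join(...));
would leave v dangling once the statement ends, which is the flip side of
this design.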

>
> basic ADL interface something like this:
>>
>
> I didn't get that part. What exactly does ADL stand for?

ADL stands for Argument Dependent Lookup. It means that when you call a
free function, the namespaces of its arguments are searched for that
function. This means that you can write operator<<, to_string, hash_code
etc in the namespace of your custom object and the compiler will select the
correct one.

It's used a great deal in boost for customisation of structures like
boost::hash
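
A minimal sketch of ADL at work; the namespaces lib and app and the
function describe are hypothetical names chosen for illustration:

```cpp
#include <cassert>
#include <string>

namespace lib {

struct generator {
    std::string text;
};

// Lives in the same namespace as generator, so ADL can find it.
inline std::string to_string(const generator& g) { return g.text; }

}  // namespace lib

namespace app {

template <class T>
std::string describe(const T& value) {
    using std::to_string;     // fallback for built-in arithmetic types
    return to_string(value);  // unqualified call: ADL also searches T's namespace
}

inline void example() {
    assert(describe(lib::generator{"abc"}) == "abc");  // picks lib::to_string
    assert(describe(42) == "42");                      // picks std::to_string
}

}  // namespace app
```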

>
>
> Furthermore, since the entire web is now (thankfully) UTF8, I strongly feel
>> that there should be a utf8 version, which accepts wide and narrow strings
>> and emits them correctly.
>>
>
> UTF-8 is 8 bits only. Just some characters take more than a single byte.
> See https://en.wikipedia.org/wiki/UTF-8
>
> Anyway, I agree, that we should have a wide char version as well for
> UTF-16 support.

I have no problem with a u16 version (UTF-16) for the Windows crowd and a
u8 version (UTF-8) for everyone else. The standard supports this idea in
its Unicode specialisations of std::basic_string: std::u16string and
std::u32string. A UTF-8 string is just a std::basic_string<char> with a
UTF-aware traits type.

Hans Dembinski

Jan 24, 2017, 5:52:01 AM

to bo...@lists.boost.org

Hi Christof

> Would you prefer str(), or implicit conversion?

Please, no implicit conversion. I don't like .str() but it is better than implicit conversion. Implicit conversion is confusing, especially in the context of templates and auto.

I quote the Zen of Python: "Explicit is better than implicit".
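
A minimal sketch of why implicit conversion and auto interact badly. The
factory type below is hypothetical and offers both an implicit conversion
and an explicit str() so the two can be compared side by side:

```cpp
#include <cassert>
#include <string>
#include <type_traits>

namespace sketch {

// Hypothetical factory with both conversion styles.
struct factory {
    std::string text;
    operator std::string() const { return text; }  // implicit conversion
    std::string str() const { return text; }       // explicit member
};

inline factory concat(const std::string& a, const std::string& b) {
    return factory{a + b};
}

inline void example() {
    std::string s1 = concat("a", "b");  // the conversion is applied
    auto s2 = concat("a", "b");         // deduces factory, not std::string
    auto s3 = concat("a", "b").str();   // explicit: always std::string

    static_assert(std::is_same<decltype(s2), factory>::value,
                  "auto keeps the factory type");
    static_assert(std::is_same<decltype(s3), std::string>::value,
                  "str() yields a string");
    assert(s1 == "ab");
    assert(s2.text == "ab");
    assert(s3 == "ab");
}

}  // namespace sketch
```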

> There is several usecases:
> […]

Ok, you made a point. Also, it follows the principle of least surprise if concat and friends all consistently return string factories.

> Am 24.01.2017 00:14, schrieb Gavin Lambert:
>> On 24/01/2017 07:26, Olaf van der Spek wrote:
>>>> 1. scope for formatting tags:
>>>> concat(format::hex<int>, 42, " is hex for ", concat(42)).str();
>>>> Here the inner concat will convert the 42 to its decimal representation,
>>>> while the outer one converts the first 42 to its hex representation.
>>> Wouldn't concat(hex(42), " is hex for", 42) make more sense?
>> +1. If you like persistent formatting states (and all the unexpected
>> fun they cause when you forget to cancel them), use iostreams instead.
>
> My proposal does not have anything similar to iostream manipulators. The conversion tags always extend to the whole concat parameter list, but not to enclosed calls to concat. There is no change in behavior that depends on the position of the conversion tag in the parameter list. Actually I think, we should be able to put all the conversion tags at the beginning of the parameter list.

+1 for concat(hex(42), " is hex for", 42)

If format::hex<int> is allowed to be in any position and still affect the whole string, that is not at all intuitive. What happens when you have two conflicting formatting tags in the argument list? Which one wins? Hard to judge for the user without looking into the reference manual. A design is intuitive if you don't need to look up stuff in the reference manual all the time. There is the principle of least surprise again.

If you use template magic to detect the conflicting formatting tags and raise a compile time error, you are still left with the natural expectation of users that positions in "concat" matter. They matter for all the other arguments, so why shouldn't they matter for the formatting tags? It is breaking an implicit rule. Let's get rid of them to avoid the whole mess.

As other people pointed out, formatting tags should be left to streams, where they make sense. They don't make sense in concat, because the mental model for "concat" is "function call", not "stream". The most intuitive approach in this case is to use explicit converters, unary functions like hex(42). Everyone who reads that can immediately understand what it does and that the call to hex(…) does not affect the other arguments in concat. Of course, hex(42) would return a string factory.
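
A minimal sketch of the explicit-converter style, assuming C++17. The names hex_value and to_piece are hypothetical, and for brevity hex() returns a small wrapper that is turned into a std::string right away, not a full string factory:

```cpp
#include <cassert>
#include <sstream>
#include <string>

namespace sketch {

// Hypothetical wrapper returned by hex(); it affects only its own argument.
struct hex_value {
    unsigned long long value;
};
inline hex_value hex(unsigned long long v) { return hex_value{v}; }

// One overload per supported argument kind.
inline std::string to_piece(const std::string& s) { return s; }
inline std::string to_piece(const char* s) { return s; }
inline std::string to_piece(int v) { return std::to_string(v); }
inline std::string to_piece(hex_value h) {
    std::ostringstream os;
    os << std::hex << h.value;
    return os.str();
}

template <class... Args>
std::string concat(const Args&... args) {
    std::string out;
    (out.append(to_piece(args)), ...);
    return out;
}

inline void example() {
    // Only the wrapped argument is printed in hex; the plain 42 stays decimal.
    assert(concat(hex(42), " is hex for ", 42) == "2a is hex for 42");
}

}  // namespace sketch
```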

> join(hex<int>, separator(" "), my_nums);
>
> Here all the numbers are converted to their hex representation. With your approach this would look like:
>
> join(separator(" "),
> my_nums | transform([](int i) -> int {
> return hex(i);
> }));
>
> That is much more difficult to understand.

I don't see how that follows. You are free to write an overload for "join" which accepts a unary function as the first argument, which is then applied to all the values in the range. If hex<int> is a unary function, the first version makes even more sense.
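
A minimal sketch of such an overload. The helper to_hex is a hypothetical stand-in for hex<int>, and the separator is passed as a plain string here:

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical unary converter standing in for hex<int>.
inline std::string to_hex(int v) {
    std::ostringstream os;
    os << std::hex << v;
    return os.str();
}

// join(f, separator, range): apply the unary function to every element.
template <class UnaryFunction, class Range>
std::string join(UnaryFunction convert, const std::string& separator,
                 const Range& range) {
    std::string out;
    bool first = true;
    for (const auto& item : range) {
        if (!first) out += separator;
        out += convert(item);
        first = false;
    }
    return out;
}

inline void example() {
    const std::vector<int> nums{10, 255, 16};
    assert(join(to_hex, " ", nums) == "a ff 10");
    // Several conversions can be composed with a lambda.
    assert(join([](int n) { return "0x" + to_hex(n); }, ",", nums)
           == "0xa,0xff,0x10");
}

}  // namespace sketch
```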

The second version I reject regardless, because it uses an overloaded operator |. Operator overloads are ambiguous outside the realm of mathematics. We want to follow the principle of least surprise, and this is surprising syntax.

Hans

Christof Donat

Jan 24, 2017, 6:17:32 AM

to bo...@lists.boost.org

Hi,

Am 24.01.2017 11:29, schrieb Richard Hodges:
> On 24 January 2017 at 10:45, Christof Donat <c...@okunah.de> wrote:
>> Sorry for the unfinished mail. I just was unable to handle my mail
>> client
>> correctly :-(
>>
>> Am 24.01.2017 10:07, schrieb Richard Hodges:
>>
>>> It seems to me that since c++17 is going to have string_view, and
>>> boost
>>> already does, then there should be a to_string_view free function to
>>> return
>>> the view of a temporary.
>>
>> But where will the temporary live then? Someone will have to free the
>> memory.
>
> Imagine a function:
>
> void foo(std::string_view s);
>
> which we then call with:
>
> foo(join("the answer to life, the universe and everything is: ",
> hex(42)));

Did you mean

foo(concat(hex<int>,
           "the answer to life, the universe and everything is: ", 42));

? SCNR.

> The generator returned by join would stay alive until the end of the
> function foo, so there would be no need to construct a string, only to
> take
> a string_view of it. We could use the string_view implicit in the
> joiner
> object. This saves us an allocation and a copy.

I see. So the string would live inside the string factory as a member
object, when we implicitly convert to a string_view. With C++17
std::string will have an implicit conversion operator to
std::string_view. So this will be sufficient:

foo(std::string{concat(
    "the answer to life, the universe and everything is: ", hex(42))});

>> basic ADL interface something like this:
>>
>> I didn't get that part. What exactly does ADL stand for?
>
> ADL stands for Argument Dependent Lookup. It means that when you call a
> free function, the namespaces of its arguments are searched for that
> function. This means that you can write operator<<, to_string,
> hash_code
> etc in the namespace of your custom object and the compiler will select
> the
> correct one.

Ah, I see. Thank you. I was aware of the mechanism, but not of the name.

>>> Furthermore, since the entire web is now (thankfully) UTF8, I
>>> strongly feel
>>> that there should be a utf8 version, which accepts wide and narrow
>>> strings
>>> and emits them correctly.
>>
>> UTF-8 is 8 bits only. Just some characters take more than a single
>> byte.
>> See https://en.wikipedia.org/wiki/UTF-8
>>
>> Anyway, I agree, that we should have a wide char version as well for
>> UTF-16 support.
>
> I have no problem with a u16 version (UTF-16) for the windows crowd and
> a
> u8 version (UTF-8) for everyone else. The standard supports this idea
> in
> its unicode specialisations of std::string - std::u16string and
> std::u32string. A utf-8 string is just a std::basic_string<char> with
> utf-aware traits type.

Yes, but I don't get why you want wide string versions for
UTF-8 support. Is it about converting wide strings to UTF-8? Like this:

std::string{concat(my_wide_string)};

Richard Hodges

Jan 24, 2017, 6:52:58 AM

to bo...@lists.boost.org

Answers to answers inline :)

On 24 January 2017 at 12:17, Christof Donat <c...@okunah.de> wrote:

> Hi,
>
> Am 24.01.2017 11:29, schrieb Richard Hodges:
>
>> On 24 January 2017 at 10:45, Christof Donat <c...@okunah.de> wrote:
>>
>>> Sorry for the unfinished mail. I just was unable to handle my mail client
>>> correctly :-(
>>>
>>> Am 24.01.2017 10:07, schrieb Richard Hodges:
>>>
>>> It seems to me that since c++17 is going to have string_view, and boost
>>>> already does, then there should be a to_string_view free function to
>>>> return
>>>> the view of a temporary.
>>>>
>>>
>>> But where will the temporary live then? Someone will have to free the
>>> memory.
>>>
>>
>> Imagine a function:
>>
>> void foo(std::string_view s);
>>
>> which we then call with:
>>
>> foo(join("the answer to life, the universe and everything is: ",
>> hex(42)));
>>
>
> Did you mean
>
> foo(concat(hex<int>, "the answer to life, the universe and everything is:
> ", 42));
>

I probably meant:
foo(to_string_view(concat(
    "the answer to life, the universe and everything is: ", hex(42))));
or
foo(to_string_view(join(separator(' '),
    "the answer to life, the universe and everything is:", hex(42))));

The idea is to avoid the construction of any unnecessary string
objects. The generator already contains a buffer (or could), so it seems
wasteful to me to create a string temporary simply to view its buffer.

I personally prefer the limited-scope manipulators. They feel more portable
and less surprising when, for example, refactoring or merging code.

> ? SCNR.
>
> The generator returned by join would stay alive until the end of the
>> function foo, so there would be no need to construct a string, only to
>> take
>> a string_view of it. We could use the string_view implicit in the joiner
>> object. This saves us an allocation and a copy.
>>
>
> I see. So the string would live inside the string factory as a member
> object, when we implicitly convert to a string_view. With C++17 std::string
> will have an implicit conversion operator to std::string_view. So this will
> be sufficient:
>

A string, a string-like buffer, or a reference to a string. I feel that the
generator should be able to work on a supplied string reference so that it
can be used to extend an existing string without reallocations or copies if
required.

Maybe to_utf8(concat(...)); would be better. It's again explicit and could
be given options to control behaviour. It also decouples the concept of
UTF8 from the concept of concatenation. This adheres more to the c++ way of
only paying for what you need.

Christof Donat

Jan 24, 2017, 6:55:07 AM

to bo...@lists.boost.org

Hi,

Am 24.01.2017 11:52, schrieb Hans Dembinski:
>> Would you prefer str(), or implicit conversion?
>
> Please, no implicit conversion. I don't like .str() but it is better
> than implicit conversion. Implicit conversion is confusing, especially
> in the context of templates and auto.

I didn't have good reasons, but this is in line with my gut feeling. I
totally agree.

> If format::hex<int> is allowed to be in any position and still affect
> the whole string, that is not at all intuitive.

My idea was to allow those formatting tags only at the beginning of the
parameter list. Then separator("sadf") is just another formatting tag.

> What happens when you
> have two conflicting formatting tags in the argument list? Which one
> wins?

OK, here you made a point. In a specification I'd leave the behavior
undefined. In an actual implementation I'd have the later ones override
the earlier ones, because I expect that to be easy to implement. We
could also try to prevent that with metaprogramming, but I think that
is not worth the effort. Leaving it unspecified in the specification
still leaves us the option to do that later.

>> join(hex<int>, separator(" "), my_nums);
>>
>> Here all the numbers are converted to their hex representation. With
>> your approach this would look like:
>>
>> join(separator(" "),
>> my_nums | transform([](int i) -> int {
>> return hex(i);
>> }));
>>
>> That is much more difficult to understand.
>
> I don't see how that follows. You are free to write an overload for
> "join" which accepts an unary function as the first argument, which is
> then applied to all the values in the range. If hex<int> is an unary
> function, the first version makes even more sense.

I think that composes less elegantly with boost::range or ranges::v3.
Maybe we could lean towards their adaptor APIs like this:

join(separator(" "), my_nums | hex<int>());

Still, separator("...") is a bit out of the picture now. Up to now I
thought that it is some kind of formatting information, and therefore I
handled it like other formatting tags. Actually I begin to like the idea
of formatting functions that return string factories. That fits very
nicely with the rest of the API and makes concat(), join() and format()
simpler.

A completely different approach, inspired by Python:

separator(" ").join(my_nums | hex<int>());

Then we add free functions like this:

template<typename Sequence>
auto join(const Sequence& seq) {

}

> The second version I reject regardless, because it uses an overloaded
> operator |.

That is the adaptor API from boost::range and is the same in ranges::v3.
It's not my invention.

Christof

Richard Hodges

Jan 24, 2017, 7:12:58 AM

to bo...@lists.boost.org

>
> The second version I reject regardless, because it uses an overloaded
> operator |.
>

> That is the adaptor API from boost::range and is the same in ranges::v3.
> It's not my invention.

I must admit that I share Hans' distaste of the pipe overloads. I find them
harder to read than a sequence of function calls, and to me they are not
immediately expressive.

I agree that explicit is better than implicit.

I also strongly feel that .str() is an error, since it couples the concepts
of strings with joiners, or concatenators. Strings have allocators and
traits. I feel that a free function is more decoupled, and could even go in
a separate header file to reduce complexity.

In summary,

to_string(x) is better than x.str() is better than implicit conversion.

It is my view that this little interface change would make boost::format
more std-like and consistent. It would also make template programming more
convenient, since template expansion works much more nicely with ADL than
with member functions. The ADL free functions act as glue for objects that
lack a .str() member.

e.g:

#include <boost/format.hpp>
#include <iostream>
#include <string>

namespace boost {
    // free function found via ADL; forwards to the existing member
    auto to_string(const boost::format& f) -> std::string
    {
        return f.str();
    }
}

int main()
{
    auto s = to_string(boost::format("%1%") % "hello");
    std::cout << s << std::endl;
}

I'll make a suggestion about this in the boost::format forum.

Christof Donat

Jan 24, 2017, 7:27:12 AM

to bo...@lists.boost.org

Hi,

Am 24.01.2017 12:52, schrieb Richard Hodges:
> On 24 January 2017 at 12:17, Christof Donat <c...@okunah.de> wrote:
>> Am 24.01.2017 11:29, schrieb Richard Hodges:
>>> Imagine a function:
>>>
>>> void foo(std::string_view s);
>>>
>>> which we then call with:
>>>
>>> foo(join("the answer to life, the universe and everything is: ",
>>> hex(42)));
>>>
>>
>> Did you mean
>>
>> foo(concat(hex<int>, "the answer to life, the universe and everything
>> is:
>> ", 42));
>>
>
> I probably meant:
> foo(to_string_view(concat("the answer to life, the universe and
> everything
> is: ", hex(42))));
> or
> foo(to_string_view(join(separator(' '), "the answer to life, the
> universe
> and everything is:", hex(42))));
>
>
> The idea being to avoid the construction of any un-necessary string
> objects. The generator already contains a buffer (or could) so it seems
> wasteful to me to create a string temporary simply to view its buffer.

The idea was that the generator does not hold a buffer; instead, the
function that actually executes it instantiates a std::string. When that
is returned, either the compiler elides the copy or the string is moved
out. So in reality the waste is minimal, if not nonexistent.

>>> The generator returned by join would stay alive until the end of the
>>> function foo, so there would be no need to construct a string, only
>>> to
>>> take
>>> a string_view of it. We could use the string_view implicit in the
>>> joiner
>>> object. This saves us an allocation and a copy.
>>>
>>
>> I see. So the string would live inside the string factory as a member
>> object, when we implicitly convert to a string_view. With C++17
>> std::string
>> will have an implicit conversion operator to std::string_view. So this
>> will
>> be sufficient:
>
> A string, a string-like buffer, or a reference to a string. I feel that
> the
> generator should be able to work on a supplied string reference so that
> it
> can be used to extend an existing string without reallocations or
> copies if
> required.

Yes, that is possible, when the function, that executes the generator is
responsible for the buffer. Then you can have e.g.

concat(...).str();        // allocate a string and return it
concat(...).append_to(s); // append to an existing string
concat(...).replace(s);   // write over an existing string, reusing its memory

I think, we all agree here, that implicit conversion is not the way to
go. So my current proposal still is .str(), and the like. You propose
free functions instead.

>> Yes, but I don't get why you want wide string versions for
>> UTF-8-support.
>> I sit about converting wide string to utf8? Like this:
>>
>> std::string{concat(my_wide_string)};
>
> Maybe to_utf8(concat(...)); would be better.

Uh, coming back to the formatting tags and my currently preferred
syntax:

concat(utf_8, ...).str();

If you need additional options for utf_8, it can be a function call:

concat(utf_8(more, options), ...).str();

> It's again explicit and could
> be given options to control behaviour. It also decouples the concept of
> UTF8 from the concept of concatenation. This adheres more to the c++
> way of
> only paying for what you need.

The functions that execute the string factories should, in my opinion,
only care whether they have enough memory, and let the factories care
about the content. Therefore I think that the question of character
encoding should be dealt with in the factories.

I don't see how the question of character encoding can be decoupled
from the concept of converting arbitrary objects to strings. The
converter has to have a way to encode its result.

Richard Hodges

Jan 24, 2017, 8:00:46 AM

to bo...@lists.boost.org

It's almost looking like conversion to utf8, wide strings, strings or
string_views should be a filter rather than an option.

I agree that the factory hierarchy should be as static as possible, so that
it's trivially copyable.

Thinking on that, it seems to me that the act of joining or concatenating
is the result of applying an output filter (e.g. .str() or to_string()) to
a sequence of input filters.

Or put another way, executing a sequence of input filters with their
outputs set to some output filters.

While I don't like the idea of pipes, the following structure seems to
model the process:

auto s = some_string();
join(a, b, c) | to_utf8() | append(s);

Another way to express this is:

join(a, b, c).apply(to_utf8()).apply(append(s));

This is not dissimilar to the architecture of the boost::iostream library
(although that library is polymorphic, join can be generic).
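
A minimal sketch of the output-filter idea, assuming C++17. The types
joined, append_filter and create_string_filter are hypothetical, join()
here accepts only string-like arguments, and no encoding filter such as
to_utf8() is modelled:

```cpp
#include <cassert>
#include <string>

namespace sketch {

// Hypothetical result of join(): here simply the joined text.
struct joined {
    std::string text;
};

template <class... Args>
joined join(const Args&... args) {
    joined result;
    (result.text.append(args), ...);  // arguments must be string-like
    return result;
}

// Output filter that appends to an existing string.
struct append_filter {
    std::string* target;
};
inline append_filter append(std::string& s) { return append_filter{&s}; }

// Output filter that creates a new string.
struct create_string_filter {};
inline create_string_filter create_string() { return create_string_filter{}; }

inline std::string& operator|(const joined& j, append_filter f) {
    f.target->append(j.text);
    return *f.target;
}

inline std::string operator|(const joined& j, create_string_filter) {
    return j.text;
}

inline void example() {
    std::string s = "log: ";
    join("a", "b", "c") | append(s);
    assert(s == "log: abc");

    auto t = join("x", "y") | create_string();
    assert(t == "xy");
}

}  // namespace sketch
```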

Other options spring to mind:

join(a, b, c) | widen() | prepend(some_wide_string);
auto s = separator(" : ") | join(a, b, as_hex(fixed(4), c), std::quoted(d))
| create_string();

example output might be:

foo : bar : 003a : "baz"

alternative syntax:

auto s = separator(" : ").join(a, b, as_hex(fixed(4), c), std::quoted(d)
).apply(create_string());

Note that I am still resisting the idea of .str() as a member function. If
the joiner or concatenation object exports begin() and end(), it's
unnecessary, because the object returned by create_string() (or similar)
can use the iterators.

Having iterators also means that the attributed factory can be used as a
source in std::copy, std::transform, std::for_each etc.

Whether the factory should simply export input iterators or
random-access iterators will depend on how much state we'd want the
factory to carry.

For now, I think input_iterators are sufficient.
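
To make the iterator idea concrete, here is a small sketch of a concatenation object that exports begin() and end() as input iterators over its characters, plus a free create_string() that only uses those iterators. The names concat_view and create_string are assumptions for illustration; the parts are held as std::string_view, so the referenced text must outlive the view.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

namespace sketch {  // hypothetical names throughout

class concat_view {
    std::vector<std::string_view> parts_;

public:
    explicit concat_view(std::vector<std::string_view> parts)
        : parts_(std::move(parts)) {}

    class iterator {
        const std::vector<std::string_view>* parts_ = nullptr;
        std::size_t part_ = 0;  // index of the current part
        std::size_t pos_ = 0;   // offset inside the current part

        // Step over exhausted (or empty) parts.
        void skip_exhausted() {
            while (part_ < parts_->size() && pos_ == (*parts_)[part_].size()) {
                ++part_;
                pos_ = 0;
            }
        }

    public:
        using iterator_category = std::input_iterator_tag;
        using value_type = char;
        using difference_type = std::ptrdiff_t;
        using pointer = const char*;
        using reference = char;

        iterator() = default;
        iterator(const std::vector<std::string_view>* parts, std::size_t part)
            : parts_(parts), part_(part) { skip_exhausted(); }

        char operator*() const { return (*parts_)[part_][pos_]; }
        iterator& operator++() { ++pos_; skip_exhausted(); return *this; }
        iterator operator++(int) { iterator old = *this; ++*this; return old; }

        friend bool operator==(const iterator& a, const iterator& b) {
            return a.part_ == b.part_ && a.pos_ == b.pos_;
        }
        friend bool operator!=(const iterator& a, const iterator& b) {
            return !(a == b);
        }
    };

    iterator begin() const { return iterator(&parts_, 0); }
    iterator end() const { return iterator(&parts_, parts_.size()); }
};

// Needs nothing from the source except begin() and end().
template <class Range>
std::string create_string(const Range& r) {
    std::string out;
    std::copy(r.begin(), r.end(), std::back_inserter(out));
    return out;
}

}  // namespace sketch
```

Because create_string() is written against the iterators alone, the same concat_view can equally feed std::copy, std::transform or std::for_each.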

Hans Dembinski

Jan 24, 2017, 11:27:29 AM

to bo...@lists.boost.org

> On 24 Jan 2017, at 12:54, Christof Donat <c...@okunah.de> wrote:
>
> That is the adaptor API from boost::range and is the same in ranges::v3. It's not my invention.

I didn't have time to look into boost::range, yet, so I didn't recognise this. Well that's a pity. :((

> I think, that composes less elegantly with boost::range or ranges::v3. Maybe we could lean towards their adaptor APIs like this:
>
> join(separator(" "), my_nums | hex<int>());

If you use the design with the unary function in the beginning, it works for boost::range and classic iterators. You can still compose several unary functions by using a lambda.

> OK, here you made a point. In a specification I'd leave the behavior undefined. In an actual implementation I'd have the later ones overwrite the previous ones, because I expect that to be easy to implement. We could also try and prevent that with meta programming, but I think, that is not worth the effort. Leaving it unspecified in the specification, still leaves us the option to do that later.

If you can catch the error at compile time, so that it costs nothing at runtime, it is certainly worth the effort.

> Still separator("...") is a bit out of the picture now. Up to now I thought, that it is some kind of formatting information and therefore I handled it like other formatting tags. Actually I begin to like the idea of formatting functions, that return string factories. That fits very nicely with the rest of the API and makes concat(), join() and format() simpler.
>
> A completely different approach, inspired by Python:
>
> separator(" ").join(my_nums | hex<int>());

-1. It looks artificial in C++. In Python it is okay, because it creates a nice symmetry with .split(…). Here, there is no symmetry with .split. And in any case, both should be methods of std::string then. For a user it will feel quite arbitrary that he/she has to use separator(" ").join(…) instead of the simpler std::string(" ").join(…).

Christof Donat

Jan 25, 2017, 11:23:29 AM

to bo...@lists.boost.org

Hi,

Am 24.01.2017 13:12, schrieb Richard Hodges:
>> The second version I reject regardless, because it uses an overloaded
>> operator |.
>
>> That is the adaptor API from boost::range and is the same in
>> ranges::v3.
> It's not my invention.
>
> I must admit that I share Hans' distaste of the pipe overloads.

As I said, this is not my invention. That notation is used with
boost::ranges and ranges::v3 and has good chances to be part of the C++
standard library in future. At least the committee seems to discuss
ranges::v3.

> I also strongly feel that .str() is an error, since it couples the
> concepts
> of strings with joiners, or concatenators. Strings have allocators and
> traits. I feel that a free function is more decoupled, and could even
> go in
> a separate header file to reduce complexity.

The allocator could be given with a parameter :

concat(...).str(my_allocator);

We could have a character type as template parameter, which might
default to char:

concat(...).str<wchar_t>(my_allocator);

The whole discussion started with a question for a concat() like
function, that would return std::string. Therefore I don't feel too
embarrassed, that up to now we came up with stuff, that is coupled to
produce strings.

> to_string(x) is better than x.str() is better than implicit conversion.

Maybe it is just my strong distaste of the name to_string(), which at
the moment is just a feeling, no good reasons. It just feels clumsy to
me, while .str() is in line with regular expression matches, and
stringstreams in the standard library.

Christof Donat

Jan 25, 2017, 11:50:12 AM

to bo...@lists.boost.org

Hi,

Am 24.01.2017 17:27, schrieb Hans Dembinski:
>> On 24 Jan 2017, at 12:54, Christof Donat <c...@okunah.de> wrote:
>> I think, that composes less elegantly with boost::range or ranges::v3.
>> Maybe we could lean towards their adaptor APIs like this:
>>
>> join(separator(" "), my_nums | hex<int>());
>
> If you use the design with the unary function in the beginning, it
> works for boost::range and classic iterators. You can still compose
> several unary functions by using a lambda.

Sure, but a lambda is still more noise than a list of range adaptors (in
ranges::v3 they are called "views").

join(separator("\n"), my_bytes | hex<int>(fixed_size(2)) |
view::chunk(16) | join(separator(" "))).str();

Here I have added another, new idea. When join is called without a
range, it returns a range view/adaptor that expects to iterate over a
range of ranges and will return a range of string factories. The above
code would produce multiple lines with 16 hex representations of bytes
each.

>> OK, here you made a point. In a specification I'd leave the behavior
>> undefined. In an actual implementation I'd have the later ones
>> overwrite the previous ones, because I expect that to be easy to
>> implement. We could also try and prevent that with meta programming,
>> but I think, that is not worth the effort. Leaving it unspecified in
>> the specification, still leaves us the option to do that later.
>
> If you can catch the error at compile time, so that it costs
> nothing at runtime, it is certainly worth the effort.

The issue is, that doing that is a lot of work. If we really want to do
stuff and not just talk about, what could possibly be done, I'd propose
to not implement it in the first release, but keep the behavior
undefined in the specification, so that it can be implemented later.

>> A completely different approach, inspired by Python:
>>
>> separator(" ").join(my_nums | hex<int>());
>
> -1. It looks artificial in C++. In Python it is okay, because it
> creates a nice symmetry with .split(…). Here, there is no symmetry
> with .split. And in any case, both should be methods of std::string then.
> For a user it will feel quite arbitrary that he/she has to use
> separator(" ").join(…) instead of the simpler
> std::string(" ").join(…).

The latter is, in my opinion, even worse, because I really think,
std::string should not have many member functions. But yes, the syntax I
proposed here is not really intuitive in C++. Let's just forget about
it.

Christof

Christof Donat

Jan 25, 2017, 12:19:41 PM

to bo...@lists.boost.org

Hi,

Am 24.01.2017 14:00, schrieb Richard Hodges:
> It's almost looking like conversion to utf8, wide strings, strings or
> string_views should be a filter rather than an option.

A filter always needs an input encoding and an output encoding. If we
want to have filters to choose the output encoding between UTF-8,
Latin 1, etc., what is going to be the input encoding?

> Note that I am still resisting the idea of .str() as a member function.
> If
> the joiner or concatenation object exports begin() and end(), it's
> un-necessary, because the object returned by create_string() (or
> similar)
> can use the iterators.

I like that idea, though I still don't like the name to_string().

How about this:

auto s = generate<std::string>(concat(...));
auto ws = generate<std::wstring>(concat(...));

generate<std::string>(s, concat(...)); // returns reference to s for consistency

auto v = generate<std::vector<int>>(any_other_range); // actually will just copy

A very naive Implementation might look like this:

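#include <algorithm> // std::copy
#include <iterator>  // std::back_inserter, std::begin, std::end

template<typename ReturnT, typename Sequence>
auto generate(ReturnT& r, const Sequence& seq) -> ReturnT& {
    r.clear();
    r.reserve(seq.size()); // allocate the complete buffer upfront
    std::copy(std::begin(seq), std::end(seq), std::back_inserter(r));
    return r;
}
template<typename ReturnT, typename Sequence>
auto generate(const Sequence& seq) -> ReturnT {
    ReturnT r;
    generate(r, seq);
    return r;
}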

The word "generate" is very generic, therefore I'm still not really
happy with it.

The string factories, of course need more than only begin() and end().
They also have to provide a way to estimate the size of the output,
because we want to be able to allocate the complete buffer upfront. In
the naive implementation I called that size().

> Having iterators also means that the attributed factory can be used as
> a
> source in std::copy, std::transform, std::for_each etc.

.. and in range expressions :-)

Richard Hodges

Jan 25, 2017, 12:37:58 PM

to bo...@lists.boost.org

> The string factories, of course need more than only begin() and end().
They also have to provide a way to estimate the size of the output, because
we want to be able to allocate the complete buffer upfront. In the naive
implementation I called that size().

If you have size(), it means you know the size because the conversion work
has already been done. If you know the size, then the iterators returned by
begin() and end() can be of category std::random_access_iterator_tag, in
which case std::distance(begin(), end()) is both trivial and yields the size.

This is a long way of saying that if you have size(), you can't have the
simpler initial option of using forward-only iterators.

The practical fallout of this is that either the filter objects will have
to separate the concerns of size computation and formatting (they are very
much related), or the filter objects will have to carry some mutable state,
even when const (this is done, for example, in Google protocol buffers). The
latter option makes them a little less flexible.

Whether all this is worth it in terms of the performance gain of being able
to pre-compute size and therefore avoid un-necessary memory allocations...
I don't know. I think on balance it might.

Christof Donat

Jan 25, 2017, 12:44:48 PM

to bo...@lists.boost.org

Hi,

Am 25.01.2017 18:37, schrieb Richard Hodges:
>> The string factories, of course need more than only begin() and end().
>> They also have to provide a way to estimate the size of the output,
>> because
>> we want to be able to allocate the complete buffer upfront. In the
>> naive
>> implementation I called that size().
>
> If you have size(), it means you know the size because the conversion
> work
> has already been done.

I think, it is fine, if it returns a reasonable upper bound, like e.g. 4
for a 16 bit value in hex representation. It is just meant, to make
sure, that we can allocate enough memory upfront, and avoid calls to
realloc().

Gavin Lambert

Jan 25, 2017, 5:52:19 PM

to bo...@lists.boost.org

On 26/01/2017 06:44, Christof Donat wrote:
>> If you have size(), it means you know the size because the conversion
>> work has already been done.
>
> I think, it is fine, if it returns a reasonable upper bound, like e.g. 4
> for a 16 bit value in hex representation. It is just meant, to make
> sure, that we can allocate enough memory upfront, and avoid calls to
> realloc().

Perhaps spell it as capacity() instead, to indicate the *possible*
(worst-case) output size of the factory.

It's not exactly the same as the meaning in containers, but it's close
enough that people would understand it better, I think.

max_size() is another possibility.
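
A sketch of how such an upper bound could be used, assuming a hypothetical factory type hex16 with a max_size() member and an append_to() member (both names are illustrations, not part of any proposal in this thread). The bound is known without doing the conversion, so concat() can reserve once and let each factory write its actual, possibly shorter, output.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

namespace sketch {  // hypothetical names throughout

// Formats a 16-bit value as upper-case hex without leading zeros.
class hex16 {
    std::uint16_t value_;

public:
    explicit hex16(std::uint16_t v) : value_(v) {}

    // Worst-case output length: four hex digits for 16 bits.
    static constexpr std::size_t max_size() { return 4; }

    // Appends the actual digits, between one and four characters.
    void append_to(std::string& out) const {
        static const char digits[] = "0123456789ABCDEF";
        bool started = false;
        for (int shift = 12; shift >= 0; shift -= 4) {
            unsigned nibble = (value_ >> shift) & 0xF;
            if (nibble != 0 || started || shift == 0) {
                out.push_back(digits[nibble]);
                started = true;
            }
        }
    }
};

// Reserve once from the summed upper bounds, then let each factory write.
template <class... Factories>
std::string concat(const Factories&... fs) {
    std::string out;
    out.reserve((std::size_t{0} + ... + fs.max_size()));
    (fs.append_to(out), ...);
    return out;
}

}  // namespace sketch
```

The result may be shorter than the reserved capacity, which is the price of not running the conversion twice.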

Gavin Lambert

Jan 25, 2017, 6:00:27 PM

to bo...@lists.boost.org

On 26/01/2017 05:23, Christof Donat wrote:
>> to_string(x) is better than x.str() is better than implicit conversion.
>
> Maybe it is just my strong distaste of the name to_string(), which at
> the moment is just a feeling, no good reasons. It just feels clumsy to
> me, while .str() is in line with regular expression matches, and
> stringstreams in the standard library.

Clumsy or not, it's in the standard now. [1]

str() is shorter, of course. I don't see any particular reason why you
can't provide all of them, though (even the implicit conversion), as
different things are going to feel more natural to different people,
particularly in different contexts.

[1] http://en.cppreference.com/w/cpp/string/basic_string/to_string

Christof Donat

Jan 26, 2017, 4:07:15 AM

to bo...@lists.boost.org

Hi,

Am 25.01.2017 23:50, schrieb Gavin Lambert:
> On 26/01/2017 06:44, Christof Donat wrote:
>>> If you have size(), it means you know the size because the conversion
>>> work has already been done.
>>
>> I think, it is fine, if it returns a reasonable upper bound, like e.g.
>> 4
>> for a 16 bit value in hex representation. It is just meant, to make
>> sure, that we can allocate enough memory upfront, and avoid calls to
>> realloc().
>
> Perhaps spell it as capacity() instead, to indicate the *possible*
> (worst-case) output size of the factory.
>
> It's not exactly the same as the meaning in containers, but it's close
> enough that people would understand it better, I think.
>
> max_size() is another possibility.

I like max_size().

Christof

Hans Dembinski

Jan 26, 2017, 4:41:02 AM

to bo...@lists.boost.org

>> Note that I am still resisting the idea of .str() as a member function. If
>> the joiner or concatenation object exports begin() and end(), it's
>> un-necessary, because the object returned by create_string() (or similar)
>> can use the iterators.
>
> I like that idea, though I still don't like the name to_string().
>
> How about this:
>
> auto s = generate<std::string>(concat(...));
> auto ws = generate<std::wstring>(concat(...));

Why don't you add explicit conversion operators to the string factory

explicit operator std::string() { … }
explicit operator std::wstring() { … }

Explicit conversion operators were added in C++11 for exactly this situation. The call is then

auto s = static_cast<std::string>(concat(…));

etc.

Otherwise I don't care if you add .str() or .to_string() members. I would prefer writing

auto s = concat(…).str();

over

auto s = static_cast<std::string>(concat(…));

But both should be supported.
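
A small sketch of what supporting both could look like. The class name factory is hypothetical, and the wide conversion is a naive widening that is only meaningful for ASCII. The two static_asserts spell out what "explicit" buys: the conversion is available on request, but never happens silently.

```cpp
#include <string>
#include <type_traits>
#include <utility>

namespace sketch {  // hypothetical names throughout

// String factory holding already-joined text.
class factory {
    std::string text_;

public:
    explicit factory(std::string text) : text_(std::move(text)) {}

    // Explicit conversion operators (C++11): usable in casts and in
    // direct-initialisation, never chosen implicitly.
    explicit operator std::string() const { return text_; }
    explicit operator std::wstring() const {
        // Naive widening, for illustration only.
        return std::wstring(text_.begin(), text_.end());
    }

    // The member spelling simply forwards to the conversion.
    std::string str() const { return static_cast<std::string>(*this); }
};

// Explicit construction works, implicit conversion does not.
static_assert(std::is_constructible<std::string, factory>::value, "");
static_assert(!std::is_convertible<factory, std::string>::value, "");

}  // namespace sketch
```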

Richard Hodges

Jan 26, 2017, 4:48:23 AM

to bo...@lists.boost.org

> I don't see any particular reason why you can't provide all of them,
though (even the implicit conversion), as different things are going to
feel more natural to different people, particularly in different contexts.

There's a problem with implicit conversion.

Imagine the function:

void foo(std::string const& s);

Then we have a "joining engine" with various useful implicit conversions:

struct engine {
    operator std::string const& () const;
    operator std::string_view () const;
    operator const char*() const;
};

We have the code:

foo(join(...));

And all is well. The string conversion is chosen.

Then because of user performance concerns around the construction of
strings, the author of the foo library adds some useful overloads:

void foo(std::string_view s);
void foo(const char* s);

Now the above code will not compile, because there are 3 ambiguous
overloads. So users are forced to modify their code to:

foo(std::string_view(join(...)));

or

foo(to_string_view(join(...)));

or, much as I dislike the idea,

foo(join(...).strview());

The moral of the story is that if we are going to provide conversion
operators, they need to be explicit anyway.

If you're going to have explicit conversion operators, it is just as
expressive to have the to_xxx free functions. It's no more typing, because
what you lose in typing to_, you gain in not having to prefix the type with
std:: (no-one uses using namespace std; here, right? It doesn't play well
with boost).

In return, you get decoupling.

This means that if you later want to convert the formatted text to a vector
(say) or a vector of terms, you can simply provide the free function
overloads to_vector(), to_vector_of_strings() and so on.
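
As an illustration of that decoupling, a sketch with a hypothetical joiner that knows nothing about its output formats. Each to_xxx function is a separate free function built on the joiner's public read-only interface. The accessor names separator() and terms() are assumptions, and a real joiner would hold arbitrary values rather than strings.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

namespace sketch {  // hypothetical names throughout

class joiner {
    std::string sep_;
    std::vector<std::string> terms_;

public:
    joiner(std::string sep, std::vector<std::string> terms)
        : sep_(std::move(sep)), terms_(std::move(terms)) {}

    const std::string& separator() const { return sep_; }
    const std::vector<std::string>& terms() const { return terms_; }
};

// One free function per target; adding another later does not touch joiner.
inline std::string to_string(const joiner& j) {
    std::string out;
    for (std::size_t i = 0; i < j.terms().size(); ++i) {
        if (i != 0) out += j.separator();
        out += j.terms()[i];
    }
    return out;
}

inline std::vector<char> to_vector(const joiner& j) {
    const std::string text = to_string(j);
    return std::vector<char>(text.begin(), text.end());
}

inline const std::vector<std::string>& to_vector_of_strings(const joiner& j) {
    return j.terms();
}

}  // namespace sketch
```

Because the functions live in the joiner's namespace, an unqualified to_string(j) is found by ADL.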

Template specialisations of free functions are always a bad idea - they
don't play nicely with ADL. So I would not recommend
convert<std::string>(join(...)) etc.

R

Hans Dembinski

Jan 26, 2017, 4:55:30 AM

to bo...@lists.boost.org

> On 26 Jan 2017, at 10:47, Richard Hodges <hodg...@gmail.com> wrote:
>
> There's a problem with implicit conversion.

> […]

Thank you Richard for this nice example, which illustrates the problem with implicit conversion. :)

It would be nice if there was a wiki for C++ with items like this ("don't do this and why") so that these explanations do not have to be written again and again.

Olaf van der Spek

Jan 26, 2017, 4:55:44 AM

to bo...@lists.boost.org

On Thu, Jan 26, 2017 at 10:41 AM, Hans Dembinski <hans.de...@gmail.com>
wrote:

>
> >> Note that I am still resisting the idea of .str() as a member function.
> If
> >> the joiner or concatenation object exports begin() and end(), it's
> >> un-necessary, because the object returned by create_string() (or
> similar)
> >> can use the iterators.
> >
> > I like that idea, though I still don't like the name to_string().
> >
> > How about this:
> >
> > auto s = generate<std::string>(concat(...));
> > auto ws = generate<std::wstring>(concat(...));
>
> Why don't you add explicit conversion operators to the string factory
>
> explicit operator std::string() { … }
> explicit operator std::wstring() { … }
>
> explicit type conversion was exactly added for this situation in C++11.
> The call is then
>
> auto s = static_cast<std::string>(concat(…));
>
> etc.
>
> Otherwise I don't care if you add .str() or .to_string() members. I would
> prefer writing
>
> auto s = concat(…).str();
>

>
> over
>
> auto s = static_cast<std::string>(concat(…));
>

auto s = std::string(concat(…));

Or even

std::string s(concat(…));

?
Why bother with the static_cast?

--
Olaf

Christof Donat

Jan 26, 2017, 7:21:39 AM

to bo...@lists.boost.org

Am 25.01.2017 23:55, schrieb Gavin Lambert:
> On 26/01/2017 05:23, Christof Donat wrote:
>>> to_string(x) is better than x.str() is better than implicit
>>> conversion.
>>
>> Maybe it is just my strong distaste of the name to_string(), which at
>> the moment is just a feeling, no good reasons. It just feels clumsy to
>> me, while .str() is in line with regular expression matches, and
>> stringstreams in the standard library.
>
> Clumsy or not, it's in the standard now. [1]

Just as .str() is - in stringstream and in regular expression matches
:-)

The issue is, that the way, to_string() was explained, it seemed obvious
to me, that it should not only be able to produce strings, but also e.g.
wstrings. The original proposal was, to use a different name for every
potential output:

to_string(concat(...));
to_wstring(concat(...));

I'd prefer a more generic approach, because this does not give us many
more possibilities than .str(). The worst proposal was to_utf8(),
because then the string factories have to provide some representation,
that can be converted to any character encoding later. I'd really prefer
to set the character encoding on the string factory.

Therefore I have another proposal:

to<std::string>(concat(...));
to<std::wstring>(concat(...));
to<std::string>(concat(...).encoding(utf8));
to<std::string>(concat(...).encoding(latin1));
to<std::wstring>(concat(...).encoding(utf16));
to<std::string>(concat(...).encoding(utf16)); // error!
to<std::vector<char>>(concat(...));
to<QString>(concat(...));

or alternatively:

concat(...).to<std::string>();
concat(...).to<std::wstring>();
concat(...).encoding(utf8).to<std::string>();
concat(...).encoding(latin1).to<std::string>();
concat(...).encoding(utf16).to<std::wstring>();
concat(...).encoding(utf16).to<std::string>(); // error!
concat(...).to<std::vector<char>>();
concat(...).to<QString>();

Both versions can have overloads, to reuse existing result values. In
that case actually we don't need the template parameter, because it can
be derived from the parameters type.

to(s, concat(...));
to(ws, concat(...));
to(s, concat(...).encoding(utf8));
to(s, concat(...).encoding(latin1));
to(ws, concat(...).encoding(utf16));
to(v, concat(...));
to(qs, concat(...));

or:

concat(...).to(s);
concat(...).to(ws);
concat(...).encoding(utf8).to(s);
concat(...).encoding(latin1).to(s);
concat(...).encoding(utf16).to(ws);
concat(...).to(v);
concat(...).to(qs);

I prefer the second version. It reads like an English sentence. We can
have the compiler check, that encoding(utf16) does not allow
to<std::string>(). We just have to make sure, it returns a type, that
only implements to() for wide character types.
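
One way that compile-time check could be arranged, as a sketch under stated assumptions: the encoding is a tag type on the factory, each tag names its code unit, and to<Target>() only takes part in overload resolution when Target stores that code unit. All names are hypothetical. The sketch uses char16_t and std::u16string for UTF-16 rather than std::wstring, because the width of wchar_t differs between platforms.

```cpp
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

namespace sketch {  // hypothetical names throughout

struct utf8_t {};
struct utf16_t {};

// Code unit associated with each encoding tag.
template <class Encoding> struct code_unit;
template <> struct code_unit<utf8_t>  { using type = char; };
template <> struct code_unit<utf16_t> { using type = char16_t; };

// Factory parameterised on its output encoding.
template <class Encoding>
class encoded_factory {
    using unit = typename code_unit<Encoding>::type;
    std::basic_string<unit> text_;

public:
    explicit encoded_factory(std::basic_string<unit> text)
        : text_(std::move(text)) {}

    // Only available when Target stores the matching code unit.
    template <class Target,
              class = std::enable_if_t<
                  std::is_same<typename Target::value_type, unit>::value>>
    Target to() const {
        return Target(text_.begin(), text_.end());
    }
};

// Detects whether factory.to<Target>() is well-formed.
template <class Factory, class Target, class = void>
struct can_convert_to : std::false_type {};

template <class Factory, class Target>
struct can_convert_to<
    Factory, Target,
    std::void_t<decltype(std::declval<const Factory&>().template to<Target>())>>
    : std::true_type {};

static_assert(can_convert_to<encoded_factory<utf8_t>, std::string>::value, "");
static_assert(can_convert_to<encoded_factory<utf8_t>, std::vector<char>>::value, "");
static_assert(can_convert_to<encoded_factory<utf16_t>, std::u16string>::value, "");
// The "error!" case: UTF-16 output into a narrow string is rejected.
static_assert(!can_convert_to<encoded_factory<utf16_t>, std::string>::value, "");

}  // namespace sketch
```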

> str() is shorter, of course. I don't see any particular reason why
> you can't provide all of them, though (even the implicit conversion),
> as different things are going to feel more natural to different
> people, particularly in different contexts.

Yes. With the above proposal, we can easily implement all of them:

class string_factory {
public:
    // ...

    template<typename TargetType>
    auto to() -> TargetType {...}
    template<typename TargetType>
    auto to(TargetType& t) -> TargetType& {...}

    auto str() -> std::string { return to<std::string>(); }
    auto wstr() -> std::wstring { return to<std::wstring>(); }
    auto str(std::string& s) -> std::string& { return to(s); }
    auto wstr(std::wstring& ws) -> std::wstring& { return to(ws); }

    operator std::string() { return str(); }
    operator std::wstring() { return wstr(); }
};

template <typename StringFactory>
auto to_string(StringFactory& factory) -> std::string {
    return factory.str();
}

template <typename StringFactory>
auto to_wstring(StringFactory& factory) -> std::wstring {
    return factory.wstr();
}

template <typename StringFactory>
auto to_string(std::string& s, StringFactory& factory) -> std::string& {
    return factory.str(s);
}

template <typename StringFactory>
auto to_wstring(std::wstring& ws, StringFactory& factory) -> std::wstring& {
    return factory.wstr(ws);
}

Christof

Christof Donat

Jan 26, 2017, 8:17:08 AM

to bo...@lists.boost.org

Hi,

Am 26.01.2017 10:41, schrieb Hans Dembinski:
>>> Note that I am still resisting the idea of .str() as a member
>>> function. If
>>> the joiner or concatenation object exports begin() and end(), it's
>>> un-necessary, because the object returned by create_string() (or
>>> similar)
>>> can use the iterators.
>>
>> I like that idea, though I still don't like the name to_string().
>>
>> How about this:
>>
>> auto s = generate<std::string>(concat(...));
>> auto ws = generate<std::wstring>(concat(...));
>
> Why don't you add explicit conversion operators to the string factory
>
> explicit operator std::string() { … }
> explicit operator std::wstring() { … }
>
> explicit type conversion was exactly added for this situation in
> C++11. The call is then
>
> auto s = static_cast<std::string>(concat(…));

That is the worst we had up to now. Even an implicit conversion is
better, which is not at all my favourite as well.

Christof

Christof Donat

Jan 26, 2017, 8:28:20 AM

to bo...@lists.boost.org

Hi,

Am 26.01.2017 10:47, schrieb Richard Hodges:
>> I don't see any particular reason why you can't provide all of them,
>> though (even the implicit conversion), as different things are going
>> to
>> feel more natural to different people, particularly in different
>> contexts.
>
> There's a problem with implicit conversion.
>
> Imagine the function:
>
> void foo(std::string const& s);

[...]
> foo(join(...));
[...]

> void foo(std::string_view s);
> void foo(const char* s);
>
> Now the above code will not compile, because there are 3 ambiguous
> overloads.

But that is only an issue for users, who have relied on the implicit
conversion upfront. You can always use an implicit conversion explicitly
as well, of course, and implicit conversion will be just one of multiple
options, if we add it.

> The moral of the story is that if we are going to provide conversion
> operators, they need to be explicit anyway.

Up to now, you haven't made a point here. Just those users, who decide
to rely on implicit conversion, instead of one of the explicit variants,
have to cope with the downsides of implicit conversion. Others don't. I
think that fits well with "don't pay for what you don't use".

> Template specialisations of free functions are always a bad idea - they
> don't play nicely with ADL. So I would not recommend
> convert<std::string>(join(...)) etc.

Please elaborate more on that. I can't see it.

Christof

Hans Dembinski

Jan 26, 2017, 9:03:43 AM

to bo...@lists.boost.org

> Am 26.01.2017 10:47, schrieb Richard Hodges:
>>> I don't see any particular reason why you can't provide all of them,
>>> though (even the implicit conversion), as different things are going to
>>> feel more natural to different people, particularly in different contexts.
>> There's a problem with implicit conversion.
>> Imagine the function:
>> void foo(std::string const& s);
> [...]
>> foo(join(...));
> [...]
>> void foo(std::string_view s);
>> void foo(const char* s);
>> Now the above code will not compile, because there are 3 ambiguous
>> overloads.
>
> But that is only an issue for users, who have relied on the implicit conversion upfront. You can always use an implicit conversion explicitly as well, of course, and implicit conversion will be just one of multiple options, if we add it.

You don't know why the user added the overloads for foo; perhaps she suddenly had to adapt foo so that it also works with C code which uses a lot of const char*. As a designer, you have no control over other people's interfaces.

You argued about the principle of least surprise. Let's say the user started to use concat in her code with a function foo(std::string const & s). Then she decided to add the overload foo(const char* s). All of a sudden her code does not compile anymore. This should not happen. As a user, she will be very surprised at that moment. That's why it is a bad idea to have implicit conversions.

Why do you think the standards committee added explicit operator <type>() to the language? They don't do these language changes for fun.

Also, if you won't take it from us, please go and take the wisdom from Herb Sutter

http://www.gotw.ca/gotw/019.htm

"It's almost always a good idea to avoid writing automatic conversions, either as conversion operators or as single-argument non-explicit constructors."

>> The call is then
>> auto s = static_cast<std::string>(concat(…));
>
> That is the worst we had up to now. Even an implicit conversion is better, which is not at all my favourite as well.

As Olaf pointed out, it is sufficient to do

auto s = std::string(concat(…));

you don't need the static_cast. Nevertheless, casts are the official way of doing type conversions, whether you like the syntax or not. I think Stroustrup intentionally made them ugly, because he wanted type conversions to be the exception in his statically typed language.

Hans

Richard Hodges

Jan 26, 2017, 9:04:33 AM

to bo...@lists.boost.org

>
> Template specialisations of free functions are always a bad idea - they
> don't play nicely with ADL. So I would not recommend
> convert<std::string>(join(...)) etc.
>

> Please elaborate more on that. I can't see it.

imagine:

namespace boost
{
    // the concept
    template<class To>
    To convert(joiner const& j);

    // some specialisations
    template<> std::string convert<std::string>(joiner const& j) {... }

    template<> std::wstring convert<std::wstring>(joiner const& j) {... }
}

then in user code:

template<class T>
void do_something(boost::joiner const& j)
{
using boost::convert;
auto v = something(convert<T>(j));
something_else(v);
}

now someone wishes to provide their own converter, not in the boost
namespace:

namespace user {

    struct UserRepresentation;

    // This is not allowed. There is not already a general template called
    // user::convert<>. It's called boost::convert<>.
    template<>
    UserRepresentation convert<UserRepresentation>(boost::joiner const& j) {
        ... }
}

As mentioned in the comments, this is not allowed. So the user is now
forced to specialise inside the boost namespace. This is not the boost way (see
boost::hash), and for good reason. Namespaces are for separation. This
forces crowding of the boost namespace. Also, specialising templates in
foreign namespaces is a source of user confusion. Google mentioned this in
their proposal to make std::hash behave more like boost::hash (std::hash
being an example of the standards committee turning a fantastic tool into
an incomplete shambles, because they left out the bits that make it work
well).

If you want a template convert function (and I don't, but I can imagine
that some might), then the model to follow would be that of boost::hash<>

This uses a function object (boost::hash) which then calls out through a
namespace collector and finally to the ADL hash_value function. In our case
it would want to call out to a function like convert(boost::tag<T>,
boost::joiner const& j) -> T. The general form would be:

namespace boost {
    template<class T>
    struct join_converter {
        decltype(auto) operator()(joiner const& j) const {
            // unqualified call, so that ADL can find convert()
            return convert(tag<T>(), j);
        }
    };
}

The signature of the general converter would then be

auto convert(boost::tag<T>, joiner const& j) -> T

and the user's non-template overload would be:

namespace user {

auto convert(boost::tag<UserRepresentation>, boost::joiner const& j) ->
UserRepresentation;

}

boost would never define a convert function. Conversions for std::string,
std::wstring etc would be specialisations of boost::join_converter.

This allows ADL to find non-template converters in namespaces associated
with the thing that they are converting to.

The generic user calling function would then look more like this:

template<class T>
void do_something(boost::joiner const& j)
{
    auto convert = boost::join_converter<T>();
    auto v = something(convert(j));
    something_else(v);
}

I'm afraid that this "class template which calls free function" dance is
necessary to allow calls to ADL free functions to search beyond the boost
namespace.

Again, see boost::hash for the gory details.
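
For readers who want the whole dance in one place, here is a self-contained sketch. The namespace lib stands in for the library namespace, and tag, joiner, join_converter and the convert overload are hypothetical stand-ins following the description above, not existing Boost components. The library's own std::string conversion is a specialisation of the function object, while the user's conversion is a plain overload found through ADL.

```cpp
#include <string>

namespace lib {  // stands in for the library namespace

template <class T> struct tag {};

struct joiner {
    std::string text;  // simplification: already-joined text
};

// Function object: the single entry point that generic code uses.
template <class T>
struct join_converter {
    T operator()(const joiner& j) const {
        // Unqualified call. ADL searches lib (via tag and joiner) and
        // the namespace of T (via tag's template argument).
        return convert(tag<T>{}, j);
    }
};

// Library-provided targets are specialisations of the function object.
template <>
struct join_converter<std::string> {
    std::string operator()(const joiner& j) const { return j.text; }
};

}  // namespace lib

namespace user {

struct UserRepresentation {
    std::string payload;
};

// Plain overload in the user's own namespace. Nothing in lib is
// specialised or reopened.
inline UserRepresentation convert(lib::tag<UserRepresentation>,
                                  const lib::joiner& j) {
    return UserRepresentation{"<" + j.text + ">"};
}

}  // namespace user

// Generic calling code only needs the function object.
template <class T>
T do_something(const lib::joiner& j) {
    return lib::join_converter<T>()(j);
}
```

The tag parameter does double duty: it selects the target type without any template specialisation, and it drags the target's namespace into the ADL search.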

Christof Donat

Jan 26, 2017, 12:08:53 PM

to bo...@lists.boost.org

Hi,

> This is not the boost way (see boost::hash), and for good reason.

Well, the extension points of boost::spirit work by specializing e.g.
is_container<>, or is_string<>. See
http://www.boost.org/doc/libs/1_61_0/libs/spirit/doc/html/spirit/advanced/customize/is_container.html
That is the same for traits in the standard library, so it actually is a
common pattern. I understand, that e.g. std::swap does it the way, you
propose.

Am 26.01.2017 15:04, schrieb Richard Hodges:
>>> Template specialisations of free functions are always a bad idea -
>>> they
>>> don't play nicely with ADL. So I would not recommend
>>> convert<std::string>(join(...)) etc.
>
>> Please elaborate more on that. I can't see it.
>
> imagine:

> [...]

> now someone wishes to provide their own converter, not in the boost
> namespace:

Actually I prefer to define a concept for targets of something like
convert(). The generic convert() then just works with that concept. Then
only users who want to write to something that does not fulfill the
concept have to provide their own overload. We might also provide a few
specializations, like char*, for targets that do not conform to the
defined concept.

Actually I'd avoid the word "convert" here, because I'd use convert()
for a function, that returns a string factory to convert a single value.
Other than concat(), the "string factory" returned by convert() will be
able to convert in different ways, similar to boost::lexical_cast<>.
Currently my favorite API looks like this:

auto i = convert("42"s).to<int>();
// i = 42
// convert can convert from string to objects as well, like lexical_cast

auto s = convert(i).to<std::string>();
// s == "42"
hex(i).to(s);
// s == "2A"
convert(23).to(s);
// s = "23"
// convert(), hex(), and similar functions return string factories.

append(concat(" - the hex representation of ", i, " is ",
hex(i))).to(s);
// s = "23 - the hex representation of 42 is 2A"
concat("the hex representation of ", i, " is ", hex(i)).to(s);
// s = "the hex representation of 42 is 2A"
append(" Yeah!").to(s);
// s = "the hex representation of 42 is 2A Yeah!"

auto my_numbers = std::vector<int>{11, 12, 13, 14, 15};
join(separator(", "), my_numbers).to(s);
// s = "11, 12, 13, 14, 15"
join(separator(", "), std::begin(my_numbers),
std::end(my_numbers)).to(s);
// s = "11, 12, 13, 14, 15"
join(separator(", "), my_numbers | hex).to(s);
// s = "B, C, D, E, F"
// join() works with iterator pairs, ranges and range expressions

append(format(" - the hex representation of {2} is {1}", hex(i),
i)).to(s);
// s = "B, C, D, E, F - the hex representation of 42 is 2A"
format("the hex representation of {2} is {1}", hex(i), i).to(s);
// s = "the hex representation of 42 is 2A"

auto v = hex(i).to<std::vector<char>>();
// v == {'2', 'A'}
append(convert(i)).to(v);
// v == {'2', 'A', '4', '2'}

It almost reads like English cluttered with some weird punctuation. Of
course I have no issue, when .to() and .append_to() resort to a free
function, that can be overloaded using ADL, like you had proposed:

class string_factory {
public:
    // ...

    template <typename TargetT>
    auto to(TargetT& t) -> TargetT& {
        return stringify_to(*this, t);
    }
    template <typename TargetT>
    auto to() -> TargetT {
        TargetT r;
        to(r);
        return r;
    }
};
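To make the idea more concrete, here is a rough sketch of how a variadic concat() could return such a factory and only write when .to() is called. It assumes C++14, supports only std::string targets, and the names `sketch`, `concat_factory` and `append_one` are made up for this example; it is not the implementation discussed in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <type_traits>
#include <utility>

namespace sketch {

// One overload per supported piece type (hypothetical helper).
inline void append_one(std::string& out, const std::string& v) { out += v; }
inline void append_one(std::string& out, const char* v) { out += v; }
inline void append_one(std::string& out, int v) { out += std::to_string(v); }

template <typename... Args>
class concat_factory {
public:
    explicit concat_factory(Args... args) : args_(std::move(args)...) {}

    // Overwrites the target with the concatenated pieces.
    std::string& to(std::string& target) const {
        target.clear();
        write(target, std::index_sequence_for<Args...>{});
        return target;
    }

    template <typename TargetT>
    TargetT to() const {
        TargetT r;
        to(r);
        return r;
    }

private:
    template <std::size_t... I>
    void write(std::string& out, std::index_sequence<I...>) const {
        // Array initializer forces left-to-right evaluation of the pack.
        int expand[] = {0, (append_one(out, std::get<I>(args_)), 0)...};
        (void)expand;
    }

    std::tuple<Args...> args_;
};

// Stores decayed copies of the arguments; nothing is formatted yet.
template <typename... Args>
concat_factory<std::decay_t<Args>...> concat(Args&&... args) {
    return concat_factory<std::decay_t<Args>...>(std::forward<Args>(args)...);
}

inline void demo() {
    std::string s;
    concat("the answer is ", 42).to(s);
    assert(s == "the answer is 42");
}

}  // namespace sketch
```

The pieces are copied into a tuple here for simplicity; a real implementation would probably hold references or views to avoid exactly the temporaries this thread started with.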

Is there a problem for ADL, when stringify_to() is a template on the
string factory?

template <typename StringFactory>
myTargetType& stringify_to(StringFactory& f, myTargetType& t) {
// ...
}

Then we can avoid virtual function calls in the string factory.

Richard Hodges

Jan 26, 2017, 1:45:16 PM

to bo...@lists.boost.org

> Of course I have no issue, when .to() and .append_to() resort to a free
> function, that can be overloaded using ADL, like you had proposed:

If member functions are resolving to ADL free functions via a function
object, I'm fully ok with that.

> Is there a problem for ADL, when stringify_to() is a template on the
> string factory?

Yes unfortunately. The template has the name `boost::stringify_to`
regardless of the types. But if you go via a template functor object that
uses ADL to find the free function, then the free function does not have to
be a template function, and ADL will work as expected. In order to make the
free function non-template you need to pass some trivial identifier from
the function's namespace as an argument. So you'll need a tag type.

so:

namespace boost {

template<class Tag>
struct string_factory {
    using type = typename Tag::type;
    type operator()(joiner const& j) const {
        return stringify_to(Tag(), j);
    }
};

}

which gets syntactically nasty.

which is why we have names like to_string(), hash_code() and so on... :)
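A compilable sketch of the tag approach described above, under stated assumptions: `joiner` is reduced to a stand-in struct, and `user::shout` / `user::shout_tag` are hypothetical names invented for the example. The point is only that the user's stringify_to is a plain, non-template overload that ADL finds through the tag's namespace.

```cpp
#include <cassert>
#include <string>

namespace lib {

// Stand-in for whatever join(...) would return (assumption).
struct joiner { std::string text; };

// Functor that forwards to a free function found via ADL.
template <class Tag>
struct string_factory {
    using type = typename Tag::type;
    type operator()(joiner const& j) const {
        // Unqualified and dependent on Tag: resolved at instantiation,
        // searching the namespaces of Tag and joiner.
        return stringify_to(Tag(), j);
    }
};

}  // namespace lib

namespace user {

struct shout { std::string value; };
struct shout_tag { using type = shout; };

// Plain overload next to the tag; no template, no reopening of lib.
inline shout stringify_to(shout_tag, lib::joiner const& j) {
    return shout{j.text + "!"};
}

inline void demo() {
    lib::string_factory<shout_tag> f;
    assert(f(lib::joiner{"a, b"}).value == "a, b!");
}

}  // namespace user
```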

Christof Donat

Jan 26, 2017, 4:55:34 PM

to bo...@lists.boost.org

Hi,

On 26.01.2017 19:44, Richard Hodges wrote:
>> Is there a problem for ADL, when stringify_to() is a template on the
>> string factory?
>
> Yes unfortunately. The template has the name `boost::stringify_to`
> regardless of the types. But if you go via a template functor object that
> uses ADL to find the free function, then the free function does not have to
> be a template function, and ADL will work as expected. In order to make the
> free function non-template you need to pass some trivial identifier from
> the function's namespace as an argument. So you'll need a tag type.
>
> so:
>
> namespace boost {
>
> template<class Tag>
> struct string_factory {
> using type = typename Tag::type;
> type operator()(joiner const& j) const {
> return stringify_to(Tag(), j);
> }
> };
>
> }

I think we misunderstood each other. I want to have multiple string
factories, because the concat() string factory will work differently from
the format() string factory. I could do that with virtual functions, but
that adds computational overhead at runtime. If I could templatize on
the string factory, while I overload on the object to write to, I can
resolve the string factory behavior at compile time and therefore get
rid of the virtual function call. Virtual function calls are suboptimal
for the branch prediction of modern CPUs, and usually cannot be
inlined.
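A small sketch of what "templatize on the string factory, overload on the target" could look like. The two factory types share no base class and have no virtual functions; `decimal_factory`, `hex_factory` and their `write()` member are assumptions made for this example only.

```cpp
#include <cassert>
#include <string>

namespace sketch {

// Two unrelated factory types: no common base, no virtual functions.
struct decimal_factory {
    int value;
    void write(std::string& out) const { out += std::to_string(value); }
};

struct hex_factory {
    unsigned value;
    void write(std::string& out) const {
        static const char digits[] = "0123456789ABCDEF";
        std::string tmp;
        unsigned v = value;
        do {
            tmp.insert(tmp.begin(), digits[v % 16]);  // prepend next digit
            v /= 16;
        } while (v != 0);
        out += tmp;
    }
};

// Template on the factory, overload on the target type. The call to
// f.write() is resolved at compile time and is a candidate for inlining.
template <typename StringFactory>
std::string& stringify_to(const StringFactory& f, std::string& t) {
    t.clear();
    f.write(t);
    return t;
}

inline void demo() {
    std::string s;
    stringify_to(hex_factory{42}, s);
    assert(s == "2A");
    stringify_to(decimal_factory{42}, s);
    assert(s == "42");
}

}  // namespace sketch
```

Whether this is measurably faster than a virtual call depends on the compiler and the call site; the sketch only shows that the dispatch itself needs no runtime lookup.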

This explanation suggests, that my approach should work:
http://en.cppreference.com/w/cpp/language/adl If there is a template
function with that name available by ordinary lookup, then Koenig lookup
kicks in and finds the correct overloaded template function.

namespace boost {
template <typename StringFactory>
std::string& stringify_to(StringFactory& f, std::string& t) {
    // ...
}
template <typename StringFactory>
std::wstring& stringify_to(StringFactory& f, std::wstring& t) {
    // ...
}

class string_factory {
public:
    // ...

    template <typename TargetT>
    auto to(TargetT& t) -> TargetT& {
        return stringify_to(*this, t);
    }

    // ...
};
}

Here the ordinary lookup will find the standard overloads of the
stringify_to template function. Then there is no syntax error any more,
and Koenig lookup will find the overloaded template function in the
target object's namespace as well:

namespace my_target {

template <typename StringFactory>
myTargetType& stringify_to(StringFactory& f, myTargetType& t) {
// ...
}
}

Now concat(...).to<myTargetType>() should find this template.

Did I misinterpret the explanation on cppreference, or is it wrong?
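For what it is worth, a reduced version of this setup can be written out in full and tried with a compiler. The sketch below uses a hypothetical `literal_factory` in a namespace `lib` (standing in for boost) and a `writes` counter that exists only so the test can tell which overload ran. Because the call inside to() is unqualified and depends on TargetT, the overload in my_target is expected to be picked up by ADL at the point of instantiation.

```cpp
#include <cassert>
#include <string>
#include <utility>

namespace lib {

// Default overload for std::string targets, templated on the factory.
template <typename StringFactory>
std::string& stringify_to(const StringFactory& f, std::string& t) {
    t = f.text();
    return t;
}

class literal_factory {
public:
    explicit literal_factory(std::string s) : text_(std::move(s)) {}
    const std::string& text() const { return text_; }

    template <typename TargetT>
    TargetT& to(TargetT& t) const {
        return stringify_to(*this, t);  // unqualified, dependent call
    }
    template <typename TargetT>
    TargetT to() const {
        TargetT r;
        to(r);
        return r;
    }

private:
    std::string text_;
};

}  // namespace lib

namespace my_target {

struct myTargetType {
    std::string data;
    int writes = 0;  // only here to make the chosen overload observable
};

// User-provided overload, also a template on the string factory.
template <typename StringFactory>
myTargetType& stringify_to(const StringFactory& f, myTargetType& t) {
    t.data = f.text();
    ++t.writes;
    return t;
}

inline void demo() {
    lib::literal_factory f("hello");
    auto t = f.to<myTargetType>();
    assert(t.data == "hello");
    assert(t.writes == 1);
}

}  // namespace my_target
```

This covers overloads only. Explicit specializations of a function template in the library's namespace, or a qualified call such as lib::stringify_to(...), would not take part in ADL, which may be the case Richard had in mind.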

Christof Donat

Jan 26, 2017, 5:17:51 PM

to bo...@lists.boost.org

Hi,

But only people who have relied on implicit conversion might have an
issue. Richard proposed to have member functions like .str(), free
functions like to_string(), and implicit conversion. Use the explicit
member function or free function, and you'll be fine. Whoever, for
whichever reason, decides to rely on implicit conversion will probably
get easier to read code, but also pays for it.

Generally, please calm down. I am not attacking you personally. I am
just discussing whether implicit conversion might be a good idea as an
additional way to execute string factories. My current opinion is that
it is not my preferred interface, but as an additional option, why not?
Only those who use it pay for it.

> Why do you think the standards committee added explicit operator
> <type>() to the language? They don't do these language changes for
> fun.

They also added the possibility to create implicit type conversions. If
implicit conversions really are as bad as you make them out to be, why
did they do that? For fun? Were they drunk?

And I really dislike explicit type conversions. Every time I
considered writing an explicit type conversion, a member function or a
free function with a descriptive name was the better choice. So if we come
to the conclusion that implicit conversions harm anyone other than those
who use them, I'd vote for no conversion operators at all. Use the
explicit member functions then.

> Also, if you won't take it from us, please go and take the wisdom from
> Herb Sutter
>
> http://www.gotw.ca/gotw/019.htm
>
> "It's almost always a good idea to avoid writing automatic
> conversions, either as conversion operators or as single-argument
> non-explicit constructors."

It is almost always a good idea to listen to the wise people and then
think for yourself whether what they say really fits your situation.
[Christof Donat, just now]

>>> The call is then
>>> auto s = static_cast<std::string>(concat(…));
>>
>> That is the worst we had up to now. Even an implicit conversion is
>> better, which is not at all my favourite as well.
>
> As Olaf pointed out, it is sufficient to do
>
> auto s = std::string(concat(…));
>
> you don't need the static_cast. Nevertheless, casts are the official
> way of doing type conversions, whether you like the syntax or not. I
> think Stroustrup intentionally made them ugly, because he wanted type
> conversions to be the exception in his statically typed language.

I totally agree with him. Explicit type conversions are ugly as hell,
and it is good that way. Implicit type conversions might be dangerous,
so we should think twice before we add them. But if in this case they
don't hurt anyone, then I think there is no good reason not to add them.
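To illustrate the three options being argued about (implicit conversion, explicit conversion, named member function), here is a small sketch. `implicit_factory`, `explicit_factory` and `length_of` are hypothetical names; the real string factories would of course do more than hold a string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

namespace sketch {

struct implicit_factory {
    std::string text;
    operator std::string() const { return text; }  // converts silently
};

struct explicit_factory {
    std::string text;
    explicit operator std::string() const { return text; }  // needs a cast
    std::string str() const { return text; }                // named alternative
};

inline std::size_t length_of(const std::string& s) { return s.size(); }

inline void demo() {
    // Implicit: can be passed wherever a std::string is expected.
    assert(length_of(implicit_factory{"abc"}) == 3);

    // Explicit: no silent conversion, so length_of(explicit_factory{...})
    // would not compile; the caller has to spell the conversion out.
    static_assert(
        !std::is_convertible<explicit_factory, std::string>::value,
        "explicit operator does not allow implicit conversion");
    assert(std::string(explicit_factory{"ab"}) == "ab");
    assert(explicit_factory{"x"}.str() == "x");
}

}  // namespace sketch
```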

Christof