Re: Conversion from `string_view` to `string`


TONGARI J

Mar 22, 2017, 3:41:47 AM
to ISO C++ Standard - Discussion, joseph....@gmail.com
On Wednesday, March 22, 2017 at 3:17:43 PM UTC+8, joseph....@gmail.com wrote:
[...]
Why is `string_view` not convertible to `string`? I have found this discussion from 2014 where Matthew Fioravante gives this rationale:

Converting a string to a string_view is not really a conversion, it's more like constructing a reference to the data, which performance-wise is basically free (and possibly even more optimal than passing const string& because we don't have that extra indirection). The assignment is a shallow copy, similar to converting a child class pointer to a base class pointer. Converting from string_view to string, however, is a real data conversion. Memory has to be allocated and a copy of the data has to be made. In this case it's better to have an explicit conversion so that people are prevented from writing easy-to-miss hidden pessimizations. It marks in the code directly where copies of strings are being made.

I don't find this reasoning convincing. The same logic could be applied to conversion of `char const*` to `string`, yet this is permitted. Is it the opinion of the committee that conversion of `char const*` to `string` was a mistake? As an example, say I have a function:

void f(string);

I can call this function in three ways:

f(s1);  // `decltype(s1)` is `char const*`
f(s2);  // `decltype(s2)` is `string`
f(s3);  // `decltype(s3)` is `string_view`

In each of these cases, an allocation and a copy is made, but only in the case of `string_view` is the programmer made explicitly aware of this. Even if the function looks like this instead:

void f(string const&);

There is still an allocation and a copy made in the `char const*` case. And in this case, I would say that the function should be updated like so:

void f(string_view);

This way no allocations or copies are made.
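
To make the comparison above concrete, here is a minimal C++17 sketch. The names `by_value`, `by_const_ref`, `by_view` and `call_all` are invented for illustration (the post calls every variant `f`); the `static_assert`s only restate which conversions the standard library treats as implicit.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <type_traits>

// The three signatures from the post, renamed (hypothetical names) so they can coexist.
std::size_t by_value(std::string s)            { return s.size(); }
std::size_t by_const_ref(std::string const& s) { return s.size(); }
std::size_t by_view(std::string_view sv)       { return sv.size(); }

// `char const*` converts to `string` implicitly; `string_view` only explicitly.
static_assert(std::is_convertible_v<char const*, std::string>);
static_assert(!std::is_convertible_v<std::string_view, std::string>);
static_assert(std::is_constructible_v<std::string, std::string_view>);

// Both `string` and `char const*` convert to `string_view` implicitly.
static_assert(std::is_convertible_v<std::string, std::string_view>);
static_assert(std::is_convertible_v<char const*, std::string_view>);

std::size_t call_all() {
    char const*      s1 = "abc";
    std::string      s2 = "abc";
    std::string_view s3 = "abc";

    std::size_t total = 0;
    total += by_value(s1);               // implicit: a temporary string is built from s1
    total += by_value(s2);               // implicit: s2 is copied
    total += by_value(std::string(s3));  // the copy has to be spelled out
    total += by_const_ref(s1);           // still builds a temporary string
    total += by_const_ref(s2);           // binds directly, no copy
    total += by_view(s1);                // no string is constructed in these three calls
    total += by_view(s2);
    total += by_view(s3);
    return total;
}
```

Writing `by_value(s3)` without the `std::string(...)` wrapper would fail to compile, which is exactly the asymmetry being questioned here.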

Also, if `string_view` should be thought of as a reference-like type, does that mean that conversion from `T&` to `T` should be explicit?

In my opinion, it is the responsibility of the owner of an object to be aware of when potentially expensive copies are being made. The distinction between implicit and explicit conversion should be made on the basis of safety and correctness, not performance. Disallowing conversion from `string_view` to `string` makes `string_view` awkward to use, which may discourage people from using it -- a net loss in my view. The usability of a type should not be hamstrung because of a fear that users might not be aware of the performance costs of certain operations.

I agree with you, as I've started a similar thread recently.

joseph....@gmail.com

Mar 22, 2017, 3:46:20 AM
to ISO C++ Standard - Discussion, joseph....@gmail.com

Don't know how I missed that *facepalm*

Richard Hodges

Mar 22, 2017, 3:59:59 AM
to std-dis...@isocpp.org
My view is a little different.

In an ideal world, I would say that std::string::string(const char*) should at least be marked explicit, because there is a hidden conversion. Better in my view would be to require the pointer to be wrapped. Something like: auto s = std::string(const_string_observer_pointer("xyz")); or auto s = std::string(from_null_terminated, "xyz");
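
A rough sketch of what the tagged form might look like. Everything here is hypothetical: `from_null_terminated_t`, `from_null_terminated` and `make_string` do not exist in the standard library, and a free function stands in for the proposed constructor because `std::string` cannot be extended from outside.

```cpp
#include <cassert>
#include <string>

// Hypothetical tag type, modelled on how std::in_place_t is declared.
struct from_null_terminated_t { explicit from_null_terminated_t() = default; };
inline constexpr from_null_terminated_t from_null_terminated{};

// Hypothetical stand-in for the proposed std::string(from_null_terminated, p).
inline std::string make_string(from_null_terminated_t, char const* p) {
    assert(p != nullptr);   // the tag documents the contract; this checks it in debug builds
    return std::string(p);  // scans for the terminator, then copies
}

inline std::string demo_tagged() {
    return make_string(from_null_terminated, "xyz");
}
```

The tag adds nothing at run time; its only job is to make the caller write down that the pointer is expected to be null-terminated.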

In the light of std::string_view, all functions that take an immutable string as an argument should certainly be refactored to accept a string_view. This can only lead to performance improvements, and a clearer statement of intent.

Finally, on the subject of a conversion from string_view to a string, we already have auto s = std::string(sv.begin(), sv.end()); which has the advantage of being both explicit, clearly stating intent and giving a very strong indication that a copy is about to happen.
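
For reference, a small C++17 sketch of the explicit spellings under discussion: the iterator-pair form quoted above and the explicit constructor call. The function names are made up for the example.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Iterator-pair form from the post.
std::string copy_via_iterators(std::string_view sv) {
    return std::string(sv.begin(), sv.end());
}

// Explicit constructor form, available since C++17.
std::string copy_via_constructor(std::string_view sv) {
    return std::string(sv);
}

bool copies_are_independent() {
    std::string source = "hello world";
    std::string_view sv(source);
    sv.remove_suffix(6);   // the view now covers "hello"

    std::string a = copy_via_iterators(sv);
    std::string b = copy_via_constructor(sv);

    source[0] = 'J';       // visible through the view, not through the copies
    return a == "hello" && b == "hello" && sv == "Jello";
}
```

Both forms copy the characters out of the viewed buffer; they differ only in how loudly the call site says so.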

 


On 22 March 2017 at 08:17, <joseph....@gmail.com> wrote:
With regard to this post, I have noticed that the inability of `string_view` to convert to `string` causes a problem when considering interoperation with hypothetical heterogeneous associative container modifiers, such as:

template <typename K>
T& map::operator[](K const& k);

A natural requirement of this operator is that `is_convertible_v<K, key_type>` be `true`. However, to support use of `string_view` (my motivating use case) with this operator, the requirement must be weakened to `is_constructible_v<key_type, K>` being `true`. This seems wrong, yet it seems natural that `string_view` should be usable as a key.
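
The gap between the two traits can be checked directly. Below is a C++17 sketch in which a free function, `get_or_insert` (a hypothetical name, since `std::map` has no heterogeneous `operator[]`), plays the role of the proposed operator. It assumes the map uses a transparent comparator such as `std::less<>` so that `find` accepts a `string_view`.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <string_view>
#include <type_traits>

// The stricter requirement fails for string_view; the weaker one holds.
static_assert(!std::is_convertible_v<std::string_view, std::string>);
static_assert(std::is_constructible_v<std::string, std::string_view>);

// Hypothetical stand-in for a heterogeneous map::operator[].
template <typename Map, typename K>
typename Map::mapped_type& get_or_insert(Map& m, K const& k) {
    using key_type = typename Map::key_type;
    static_assert(std::is_constructible_v<key_type, K const&>,
                  "K must be usable to construct key_type");
    auto it = m.find(k);  // heterogeneous lookup; needs a transparent comparator
    if (it == m.end()) {
        // The key is built explicitly only when an insertion is actually needed.
        it = m.emplace(key_type(k), typename Map::mapped_type()).first;
    }
    return it->second;
}

int demo_lookup() {
    std::map<std::string, int, std::less<>> counts;
    std::string_view key = "apple";
    get_or_insert(counts, key) = 1;   // inserts "apple"
    get_or_insert(counts, key) += 1;  // finds the existing entry, no new string
    assert(counts.size() == 1);
    return counts.at("apple");
}
```

Swapping the `static_assert` inside `get_or_insert` for `is_convertible_v<K const&, key_type>` would reject the `string_view` call, which is the problem described above.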

--

---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussion+unsubscribe@isocpp.org.
To post to this group, send email to std-dis...@isocpp.org.
Visit this group at https://groups.google.com/a/isocpp.org/group/std-discussion/.

joseph....@gmail.com

Mar 22, 2017, 5:50:21 AM
to ISO C++ Standard - Discussion
On Wednesday, 22 March 2017 15:59:59 UTC+8, Richard Hodges wrote:
[...]

I have posted my thoughts over on this thread to avoid duplication. My thoughts on what you have just said are outlined over there, but just to summarize:
  • Implicit conversions are "hidden" by nature. They are hidden because they are safe. Conversion of `string_view` to `string` is safe, so it should be implicit.
  • Similarly, I think it would be a mistake to make conversion from `char const*` to `string` explicit, as it is also safe (with the exception of the possibility of a null pointer, which is due to C++ lacking native "not null" pointer types, and I think should be dealt with through other means, such as static analysis).
  • We have `std::string(sv)` so I doubt anyone will use `std::string(sv.begin(), sv.end())`. And I see no reason why we need to force users to explicitly invoke the `string` constructor to copy a `string_view` when implicit copies of objects (including `string`) happen all the time. An implicit conversion is no different from a plain copy from a correctness or safety standpoint. See the other thread for my other thoughts on explicit conversion.
 



joseph....@gmail.com

Mar 22, 2017, 5:50:52 AM
to ISO C++ Standard - Discussion, joseph....@gmail.com


On Wednesday, 22 March 2017 17:50:21 UTC+8, joseph....@gmail.com wrote:
[...]

I have posted my thoughts over on this thread to avoid duplication. My thoughts on what you have just said are outlined over there, but just to summarize:

Sorry, this thread.
 