
A comment about library issue 434


Gennaro Prota

Mar 14, 2004, 10:02:21 AM
There are many ways in which library issue 434

http://std.dkuug.dk/jtc1/sc22/wg21/docs/lwg-active.html#434

surprises me. First of all the issue is not a true defect:

"It has been pointed out a number of times that the bitset
to_string() member function template is tedious to use since
callers must explicitly specify the entire template argument
list (3 arguments)."


Since when on earth is a tedious syntax considered a defect? Secondly, the
proposed resolution is to add three (3) new to_string overloads! I
think this is enforcing a design error.

I'll copy here a couple of comments by James Kanze in a discussion
closely related to the issue at hand:

a) But having experienced the universal asString function in Java, I
can only conclude that it is a mistake. Conversion to string is, for
most types, formatting, and should be done by the standard formatting
conventions : operator<<. In this case, the presence of a to_string
function is, IMHO, a design error.

b) This is, in fact, one case which simply cries for a free function:
neither the class itself nor std::string should be encumbered with
knowledge of the other. For generic programming to work, it is
important that the name of this free function be standardized. As it
happens, it is standardized: the standard name for formatting to text
representation is operator<<, with the destination as the first
parameter, and what is to be formatted as the second.


I think James's comments immediately generalize to include the
constructor from basic_string and operator>> as well. I'll only add,
to highlight the superiority of the stream approach, that '1' and '0'
(or their widened versions) are just one possible way to represent
bits. What if I want '+' and '-', or the first letter of
numpunct<char_type>'s truename() and falsename()? It would be
perfectly reasonable to convert 11000 to ttfff, for instance, but the
string conversion functions can't do that.
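
To make the "ttfff" case concrete, here is a minimal sketch (not part of the original post). It renders a bitset with the first letters of numpunct's truename() and falsename(), taken from a caller-supplied locale. The name bits_as_truth_letters and the fallback to '1'/'0' for empty names are assumptions made only for this example.

```cpp
#include <bitset>
#include <cstddef>
#include <locale>
#include <string>

// Hypothetical helper: format a bitset using the first letter of the
// locale's truename()/falsename() instead of '1'/'0'.
template <std::size_t N>
std::string bits_as_truth_letters(const std::bitset<N>& b,
                                  const std::locale& loc = std::locale())
{
    const std::numpunct<char>& np = std::use_facet<std::numpunct<char> >(loc);
    const std::string t = np.truename();
    const std::string f = np.falsename();

    // Assumption: fall back to '1'/'0' if a locale supplies an empty name.
    const char one  = t.empty() ? '1' : t[0];
    const char zero = f.empty() ? '0' : f[0];

    std::string result(N, zero);
    for (std::size_t i = 0; i < N; ++i)
        if (b[i])
            result[N - 1 - i] = one;   // bit 0 is the rightmost character
    return result;
}
```

With the classic "C" locale, whose names are "true" and "false", a bitset holding 11000 comes out as "ttfff".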

Please, tell me that there will not be three new overloads! :)


--
Genny.

---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std...@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]

Gennaro Prota

Mar 15, 2004, 10:55:34 AM
On Sun, 14 Mar 2004 15:02:21 +0000 (UTC), gennar...@yahoo.com
(Gennaro Prota) wrote:

>Secondly, the proposed resolution is to add three (3) new to_string
>overloads! I think this is enforcing a design error.

I meant "reinforcing", sorry.

Carl Barron

Mar 15, 2004, 1:39:29 PM
Gennaro Prota <gennar...@yahoo.com> wrote:

> There are many ways in which library issue 434
>
> http://std.dkuug.dk/jtc1/sc22/wg21/docs/lwg-active.html#434
>
> surprises me. First of all the issue is not a true defect:
>
> "It has been pointed out a number of times that the bitset
> to_string() member function template is tedious to use since
> callers must explicitly specify the entire template argument
> list (3 arguments)."
>
>
> Since when on earth is a tedious syntax considered a defect? Secondly, the
> proposed resolution is to add three (3) new to_string overloads! I
> think this is enforcing a design error.
>

Pardon me if this has been discussed, but one needs only a free
function to reduce tedium :) Something like:

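// Needs <bitset>, <cstddef> and the header for the String type.
template <class String, std::size_t N>   // size_t, so that N deduces from bitset<N>
inline
String bits_to_string(std::bitset<N> const &b)
{
    typedef typename String::value_type C;   // basic_string has value_type, not char_type
    typedef typename String::traits_type T;
    typedef typename String::allocator_type A;
    return b.template to_string<C,T,A>();
}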

usage:
std::bitset<512> bits_to_save;
std::string p = bits_to_string<std::string>(bits_to_save);

This will work with string, wstring or another basic_string...

typedef std::basic_string<unsigned char> ustring;

ustring u = bits_to_string<ustring>(bits_to_save);

Martin Sebor

Mar 15, 2004, 2:13:27 PM
gennar...@yahoo.com (Gennaro Prota) wrote in message news:<9pk85055k0f84n9rs...@4ax.com>...

> There are many ways in which library issue 434
>
> http://std.dkuug.dk/jtc1/sc22/wg21/docs/lwg-active.html#434
>
> surprises me. First of all the issue is not a true defect:
>
> "It has been pointed out a number of times that the bitset
> to_string() member function template is tedious to use since
> callers must explicitly specify the entire template argument
> list (3 arguments)."
>
>
> Since when on earth is a tedious syntax considered a defect?

Who says it is? ;-) It was agreed (in c++std-lib-12229) that for
simple tweaks like this one it's fine to submit an issue rather
than write a whole proposal for an enhancement.

> Secondly, the
> proposed resolution is to add three (3) new to_string overloads! I
> think this is enforcing a design error.

Maybe, but the function is there, it is not going away, and it is
inconvenient to use. The overloads make using the function easier by
letting most programs call it without explicitly specifying all the
template parameters. Other than that, the presence of the overloads is
transparent.
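
For illustration (not from the original message), the difference in call syntax looks roughly like this. The function names tedious and convenient are made up for the example. The zero-argument call is assumed to compile only with a C++11 or later library; the sketch does not depend on whether that convenience comes from extra overloads or from some other mechanism.

```cpp
#include <bitset>
#include <memory>
#include <string>

// The call as the original specification requires it: all three
// template arguments spelled out.
inline std::string tedious(const std::bitset<8>& b)
{
    return b.to_string<char, std::char_traits<char>, std::allocator<char> >();
}

// The call the issue is after: no explicit template arguments.
// Assumption: a C++11 or later standard library.
inline std::string convenient(const std::bitset<8>& b)
{
    return b.to_string();
}
```

Both functions return the same string for the same bitset; only the amount of typing at the call site differs.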

>
..


> I think James's comments immediately generalize to include the
> constructor from basic_string and operator>> as well. I'll only add,
> to highlight the superiority of the stream approach, that '1' and '0'
> (or their widened versions) are just one possible way to represent
> bits. What if I want '+' and '-', or the first letter of
> numpunct<char_type>'s truename() and falsename()? It would be
> perfectly reasonable to convert 11000 to ttfff, for instance, but the
> string conversion functions can't do that.

No, not the way it's specified. I thought I had also proposed to add a
couple of arguments to to_string() to accommodate alternate
representations of '0' and '1', like this:

template <class charT, class traits>
basic_string<charT, traits, allocator<charT> >
to_string (charT = '0', charT = '1') const;

I don't see how these could even be passed to operator<<() or
operator>>().
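
For illustration (not from the original message): C++11 later gave to_string two character parameters for the zero and one digits. Assuming such a library, the alternate representations discussed in this thread can be requested as in the sketch below; the function names plus_minus and wide_digits are hypothetical.

```cpp
#include <bitset>
#include <string>

// Assumption: C++11 or later, where to_string accepts (zero, one).
inline std::string plus_minus(const std::bitset<5>& b)
{
    return b.to_string('-', '+');   // '-' for 0 bits, '+' for 1 bits
}

// The same call for wchar_t, passing wide literals explicitly.
inline std::wstring wide_digits(const std::bitset<5>& b)
{
    return b.to_string<wchar_t>(L'0', L'1');
}
```

For a bitset holding 11000, plus_minus returns "++---" and wide_digits returns L"11000".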

Martin

Gennaro Prota

Mar 17, 2004, 3:48:07 PM
On Mon, 15 Mar 2004 18:39:29 +0000 (UTC), cbar...@ix.netcom.com (Carl
Barron) wrote:

> Pardon me if this has been discussed, but one needs only a free
>function to reduce tedium :) Something like:
>
> template <class String, std::size_t N>
> inline
> String bits_to_string(std::bitset<N> const &b)
> {
>     typedef typename String::value_type C;
>     typedef typename String::traits_type T;
>     typedef typename String::allocator_type A;
>     return b.template to_string<C,T,A>();
> }

Yes. As I said, I think to_string should have never been provided. But
if it was, then yours is a much better form. It doesn't even require
the string header, and is easily implementable through the public
interface only (no friendship):

// Needs <bitset>, <cstddef> and <locale>.
template <typename stringT, std::size_t sz>
stringT to_string(const std::bitset<sz>& b)
{
    typedef typename stringT::traits_type Tr;
    typedef typename stringT::value_type Ch;

    // No locale parameter, so the global locale is the only choice.
    std::locale loc;
    const std::ctype<Ch> & ct = std::use_facet< std::ctype<Ch> > (loc);

    const Ch zero(ct.widen('0'));
    const Ch one (ct.widen('1'));

    stringT result(sz, zero);

    // Bit 0 is the rightmost character of the result.
    for (std::size_t i = 0; i < sz; ++i)
        if (b[i])
            Tr::assign(result[sz - 1 - i], one);

    return result;
}

However, what is this? It's just a shortcut to get a text
representation when you don't need locale genericity and formatting
options (a la boost::lexical_cast). So why not generalize it a bit?


// ---------------------------------
// Needs <bitset>, <cstddef> and <locale>.

template <typename Target, typename Source>
struct lexical_cast_impl; // fwd decl

template <typename Target, typename Source>
Target lexical_cast(Source arg)
{
    return lexical_cast_impl<Target, Source>::do_it(arg);
}

template <typename Target, typename Source>
struct lexical_cast_impl
{
    static Target do_it(Source src)
    {
        // <generic implementation (stream based)...>
    }
};


template <typename stringT, std::size_t sz>
struct lexical_cast_impl<stringT, std::bitset<sz> >
{
    static stringT do_it(const std::bitset<sz>& b)
    {
        typedef typename stringT::traits_type Tr;
        typedef typename stringT::value_type Ch;

        std::locale loc;
        const std::ctype<Ch> & ct = std::use_facet< std::ctype<Ch> > (loc);

        const Ch zero(ct.widen('0'));
        const Ch one (ct.widen('1'));

        stringT result(sz, zero);

        for (std::size_t i = 0; i < sz; ++i)
            if (b[i])
                Tr::assign(result[sz - 1 - i], one);

        return result;
    }
};

template <std::size_t sz>
struct lexical_cast_impl<unsigned long, std::bitset<sz> >
{
    static unsigned long do_it(const std::bitset<sz> & b)
    {
        // if implemented this way, we should probably
        // translate overflow_error into something else,
        // like bad_lexical_cast, but you got the idea :)
        //
        return b.to_ulong();
    }
};


Note how conversion to unsigned long easily integrates.
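
A compilable sketch of the same idea follows (not part of the original post). The stream-based generic case is filled in with one plausible implementation, the bitset-to-string case is limited to std::string for brevity, and the namespace sketch and the exception type bad_lexical_cast are names assumed for the example. This is not Boost's implementation.

```cpp
#include <bitset>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <typeinfo>

namespace sketch {

// Hypothetical error type for failed conversions.
struct bad_lexical_cast : std::bad_cast {};

// Generic case: write the source to a stream, read the target back.
template <typename Target, typename Source>
struct lexical_cast_impl
{
    static Target do_it(const Source& src)
    {
        std::stringstream ss;
        Target result;
        if (!(ss << src) || !(ss >> result))
            throw bad_lexical_cast();
        return result;
    }
};

// bitset -> std::string, through the public interface only.
template <std::size_t N>
struct lexical_cast_impl<std::string, std::bitset<N> >
{
    static std::string do_it(const std::bitset<N>& b)
    {
        std::string result(N, '0');
        for (std::size_t i = 0; i < N; ++i)
            if (b[i])
                result[N - 1 - i] = '1';
        return result;
    }
};

// bitset -> unsigned long, translating overflow into bad_lexical_cast.
template <std::size_t N>
struct lexical_cast_impl<unsigned long, std::bitset<N> >
{
    static unsigned long do_it(const std::bitset<N>& b)
    {
        try {
            return b.to_ulong();
        } catch (const std::overflow_error&) {
            throw bad_lexical_cast();
        }
    }
};

template <typename Target, typename Source>
Target lexical_cast(const Source& arg)
{
    return lexical_cast_impl<Target, Source>::do_it(arg);
}

} // namespace sketch
```

A call such as sketch::lexical_cast<std::string>(bits) picks the bitset specialization, while sketch::lexical_cast<int>(std::string("42")) goes through the stream-based generic case.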


--
Genny.

Gennaro Prota

Mar 17, 2004, 11:01:17 PM
On Mon, 15 Mar 2004 19:13:27 +0000 (UTC), sebor...@netscape.net
(Martin Sebor) wrote:

>gennar...@yahoo.com (Gennaro Prota) wrote in message news:<9pk85055k0f84n9rs...@4ax.com>...

>>[snip]


>> Since when on earth is a tedious syntax considered a defect?
>
>Who says it is? ;-) It was agreed (in c++std-lib-12229) that for
>simple tweaks like this one it's fine to submit an issue rather
>than write a whole proposal for an enhancement.

Hmm.

>> Secondly, the
>> proposed resolution is to add three (3) new to_string overloads! I
>> think this is enforcing a design error.
>
>Maybe, but the function is there, it is not going away, and it is
>inconvenient to use.

And that's a good thing. Discourage, people, discourage! :)

>[snip]


>No, not the way it's specified. I thought I had also proposed to add a
>couple of arguments to to_string() to accommodate alternate
>representations of '0' and '1', like this:
>
> template <class charT, class traits>
> basic_string<charT, traits, allocator<charT> >
> to_string (charT = '0', charT = '1') const;
>

That's horrible, sorry :( First of all, those default args only work
for charT=char. That's the same error that all the std::bitset
implementations I've seen make: treating '0' and '1' as a sort of
generic character literals. Admittedly, the specification in the
standard was also written without having in mind what non-char 0 and 1
are (see for instance 23.3.5.1/5). As you propose it, you make

    b.to_string<wchar_t, ...>();

ill-working (because it compiles, but...), and

    b.to_string<MyCharType, ...>();

ill-formed, or anyway illegal (note that MyCharType must be a POD
type).

That said, take a look at the implementation I give in reply to Carl
(or the analogous one at http://tinyurl.com/2rvvf - I'm going to fight
on boost to deprecate that! :)). The only way to get generic CharT
versions of '0' and '1' is to use ctype's widen (see also lib issue
303), but of what locale? In to_string, you don't have a locale
parameter, so you have no other choice than using the global one. But
why do I have to #include <locale> and cope with the ctype facet if the
function is so constrained? With streams I can imbue whatever I want.
With to_string I can only use the global locale.
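
To show what a locale parameter would change, here is a minimal sketch (not part of the original post) of a conversion function that takes the locale explicitly and defaults to the global one. The name bits_to_string_in is hypothetical, and the sketch assumes that a ctype<Ch> facet exists for the chosen character type.

```cpp
#include <bitset>
#include <cstddef>
#include <locale>
#include <string>

// Hypothetical variant: the caller chooses the locale whose ctype facet
// widens '0' and '1'; the global locale is only the default.
template <typename Ch, std::size_t N>
std::basic_string<Ch> bits_to_string_in(const std::bitset<N>& b,
                                        const std::locale& loc = std::locale())
{
    const std::ctype<Ch>& ct = std::use_facet<std::ctype<Ch> >(loc);
    const Ch zero = ct.widen('0');
    const Ch one  = ct.widen('1');

    std::basic_string<Ch> result(N, zero);
    for (std::size_t i = 0; i < N; ++i)
        if (b[i])
            result[N - 1 - i] = one;   // bit 0 is the rightmost character
    return result;
}
```

A caller writes bits_to_string_in<wchar_t>(bits, some_locale), which is essentially what imbuing a stream gives for free.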

>I don't see how these could even be passed to operator<<() or
>operator>>().

I've experimented a bit with this, for boost::dynamic_bitset. There
are at least three approaches: custom manipulators, a custom facet and
a special formatting class. The first approach is IMHO overkill for
such a simple task: it requires dealing with pword(), xalloc() and
streams callbacks... really too much code for such a trivial thing.
The second approach is quite simple; basically:

template<typename CharT>
class bitset_digits : public std::locale::facet
{
    CharT m_zero; // Just an example implementation
    CharT m_one;
public:
    static std::locale::id id;
    bitset_digits(CharT zero, CharT one)
        : std::locale::facet(0), m_zero(zero), m_one(one) {}
    CharT get_zero() const { return do_get_zero(); }
    CharT get_one() const { return do_get_one(); }

protected:
    virtual CharT do_get_zero() const { return m_zero; }
    virtual CharT do_get_one() const { return m_one; }
};

template <typename CharT>
std::locale::id bitset_digits<CharT>::id;

However, it complicates the stream operators a bit, as they have to
check for the existence of the bitset_digits<CharT> facet in the
stream locale. It's also a bit less convenient to use:

std::locale loc(std::locale(),
new bitset_digits<wchar_t>(L'a', L'b'));
std::wcout.imbue(loc);
std::wcout << my_bitset;

The third solution is to use a special class with its own operators <<
and >>. That is, you define special formatting in terms of a special
class:

bit_alpha<char> format ('*', 'x', my_bitset);
std::cout << format;

Such a class could be provided as an extension. In that case the
stream operators in std::bitset would simply construct a suitable
bit_alpha object and delegate the actual work to it. The annoyance of
specifying the character type (as done above) can be eliminated too.


--
Genny.

Martin Sebor

Apr 2, 2004, 6:16:42 PM
..

> > template <class charT, class traits>
> > basic_string<charT, traits, allocator<charT> >
> > to_string (charT = '0', charT = '1') const;
> >
>
> That's horrible, sorry :( First of all, those default args only work
> for charT=char.

It works for both char and wchar_t, the two types that the template is
most commonly going to be instantiated on. Programs that use character
types other than those two are, IMO, exceedingly rare, and can
specialize to_string on their own value of charT (although doing so
would be tedious for a generic bitset).

>
..


> As you propose it, you make
>
> b.to_string<wchar_t, ...>();
>
> ill-working (because it compiles, but...),

How so? It compiles and works just fine. Why is it ill-working (and
what do you mean by it)? (Note that the expression ('0' == L'0' && '1'
== L'1') must and does hold for all known locales.)

> and
>
> b.to_string<MyCharType, ...>();
>
> ill-formed, or anyway illegal (note that MyCharType must be a POD
> type)

Yes. Making this corner case well-formed requires IMO needlessly
complex machinery.

>
> That said, take a look at the implementation I give in reply to Carl
> (or the analogous one at http://tinyurl.com/2rvvf - I'm going to fight
> on boost to deprecate that! :)). The only way to get generic CharT
> versions of '0' and '1' is to use ctype's widen (see also lib issue
> 303), but of what locale? In to_string, you don't have a locale
> parameter, so you have no other choice than using the global one. But
> why do I have to #include <locale> and cope with the ctype facet if the
> function is so constrained? With streams I can imbue whatever I want.
> With to_string I can only use the global locale.

I don't think you want to drag in all of locale just to format a
string of zeros and ones. The proposed change is a tradeoff between
simplicity and robustness. The handful of programs that want to go
through locale can easily do that by bypassing bitset::to_string() and
implementing their own formatting.

Incidentally, as I mentioned above, ctype<wchar_t>::widen('0') is for
all intents and purposes required to be equal to L'0' (i.e., the same
as (wchar_t)'0'), so all this complexity won't really buy you anything
but slow performance in the common case. And since ctype<charT> is not
required to be provided for any charT other than char and wchar_t,
your code won't compile either (unless the user writes their own
specialization of ctype on their charT; that's a lot more work than
specializing to_string on their own charT).

>
..


>
> However, it complicates the stream operators a bit, as they have to
> check for the existence of the bitset_digits<CharT> facet in the
> stream locale. It's also a bit less convenient to use:

Right. Too much complexity for something so simple.

>
..


> The third solution is to use a special class with its own operators <<
> and >>. That is, you define special formatting in terms of a special
> class:
>
> bit_alpha<char> format ('*', 'x', my_bitset);
> std::cout << format;
>
> Such a class could be provided as an extension. In that case the
> stream operators in std::bitset would simply construct a suitable
> bit_alpha object and delegate the actual work to it. The annoyance of
> specifying the character type (as done above) can be eliminated too.

I'm not sure I quite follow this but again, it seems a lot more
invasive (in terms of changes to the standard text) than the
admittedly limited extension I propose. I think Dietmar Kuhl said he'd
try to propose something along these lines but I haven't seen it yet.

Martin
