Why no regex typedefs for 16- and 32-bit char types?

Scott Meyers

Oct 20, 2009, 5:24:18 PM

C++0x regex support includes typedefs for match and iterator
instantiations for char*, wchar_t*, and narrow and wide strings (i.e.,
basic_string<char> and basic_string<wchar_t>). For example:

typedef regex_iterator<const char*> cregex_iterator;
typedef regex_iterator<const wchar_t*> wcregex_iterator;
typedef regex_iterator<string::const_iterator> sregex_iterator;
typedef regex_iterator<wstring::const_iterator> wsregex_iterator;

There are no corresponding typedefs for char16_t* or char32_t*
pointers, nor for basic_string instantiations with these character
types. Is there a (good) reason for this apparent bias in favor of
char and wchar_t over char16_t and char32_t?
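
To make the gap concrete, here is a rough sketch of what the missing typedefs might look like next to an existing one in use. The names u16cregex_iterator, u32cregex_iterator, u16sregex_iterator and u32sregex_iterator are hypothetical, chosen only by analogy with the existing ones; they appear in no draft. The helper count_matches is likewise made up for illustration. Note that the sketch only names the 16- and 32-bit types; actually using them would need regex_traits support for char16_t and char32_t, which the library is not required to provide.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>
#include <type_traits>

// Hypothetical typedefs by analogy with the standard ones. These names are
// assumptions for illustration and are NOT in the standard library.
// Only the types are named here; nothing is instantiated.
typedef std::regex_iterator<const char16_t*> u16cregex_iterator;
typedef std::regex_iterator<const char32_t*> u32cregex_iterator;
typedef std::regex_iterator<std::u16string::const_iterator> u16sregex_iterator;
typedef std::regex_iterator<std::u32string::const_iterator> u32sregex_iterator;

// The standard typedef is just shorthand for the full instantiation.
static_assert(
    std::is_same<std::sregex_iterator,
                 std::regex_iterator<std::string::const_iterator>>::value,
    "sregex_iterator names regex_iterator<string::const_iterator>");

// Counts non-overlapping matches using the existing narrow-string typedef.
std::size_t count_matches(const std::string& text, const std::string& pattern) {
    const std::regex re(pattern);
    const std::sregex_iterator first(text.begin(), text.end(), re);
    const std::sregex_iterator last;  // default-constructed: end of sequence
    return static_cast<std::size_t>(std::distance(first, last));
}

// Small self-check: the digit runs are "1", "22" and "333".
void check_count_matches() {
    assert(count_matches("a1 b22 c333", "[0-9]+") == 3);
}
```

Without such typedefs, anyone who has a suitable traits class must spell out the full regex_iterator instantiation by hand.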

Thanks,

Scott

--
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std...@netlab.cs.rpi.edu]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ]

Joe Smith

Oct 30, 2009, 2:31:54 PM

"Scott Meyers" <use...@aristeia.com> wrote in message
news:hbl0ud$91u$1...@news.albasani.net...

> C++0x regex support includes typedefs for match and iterator
> instantiations for char*, wchar_t*, and narrow and wide strings (i.e.,
> basic_string<char> and basic_string<wchar_t>). For example:
>
> typedef regex_iterator<const char*> cregex_iterator;
> typedef regex_iterator<const wchar_t*> wcregex_iterator;
> typedef regex_iterator<string::const_iterator> sregex_iterator;
> typedef regex_iterator<wstring::const_iterator> wsregex_iterator;
>
> There are no corresponding typedefs for char16_t* or char32_t*
> pointers, nor for basic_string instantiations with these character
> types. Is there a (good) reason for this apparent bias in favor of
> char and wchar_t over char16_t and char32_t?

Since there has been no reply, and I have trouble seeing any valid
reason to exclude those other types, seeing as wchar_t could already be
encoding UCS-4, if not full-blown UTF-16, I'd recommend resending this
as a defect report. In my eyes, unless a good reason is provided, any
such asymmetry is a defect.

By the way, are defect reports that are properly formatted and not
completely inane actually getting forwarded to the committee reliably?
If so, is this through the moderators forwarding the messages, the
defect report list maintainers just noticing them and adding them, or
some other mechanism?

I notice that messages reporting defects often get no replies.
Sometimes I later notice them on the DR lists, and other times I
don't (although they may still be there; I'm not explicitly looking
for them). The lack of a reply, even one saying "forwarded to
committee" or "added as defect number #xxx", can be a little
disconcerting, making me wonder if some of the issues noticed on this
list get forgotten and will end up in the final standard despite
plenty of advance notice. So some assurance that the defect reports
are still getting to the committee, despite the lack of any apparent
activity on the list, would be welcome.

Sean Hunt

Oct 30, 2009, 9:23:32 PM

On Oct 30, 11:31 am, "Joe Smith" <unknown_kev_...@hotmail.com> wrote:
> By the way, are defect reports that are properly formatted and not
> completely inane actually getting forwarded to the committee reliably?
> If so, is this through the moderators forwarding the messages, the
> defect report list maintainers just noticing them and adding them, or
> some other mechanism?
>
> I notice that messages reporting defects often get no replies.
> Sometimes I later notice them on the DR lists, and other times I
> don't (although they may still be there; I'm not explicitly looking
> for them). The lack of a reply, even one saying "forwarded to
> committee" or "added as defect number #xxx", can be a little
> disconcerting, making me wonder if some of the issues noticed on this
> list get forgotten and will end up in the final standard despite
> plenty of advance notice. So some assurance that the defect reports
> are still getting to the committee, despite the lack of any apparent
> activity on the list, would be welcome.

The best way to submit a defect is to write to the maintainer of the
appropriate issues list - either William Miller or Howard Hinnant.
Their email addresses (which I will not repeat here to save them from
the ravages of spam) are on the appropriate issues lists. You should
probably submit this as a vague defect, as I don't believe regex is
the only place with this asymmetry towards the new string types, and I
completely agree that the standards committee shouldn't half-bake the
Unicode support.

Sean Hunt

Howard Hinnant

Nov 1, 2009, 12:09:08 AM

On Oct 30, 9:23 pm, Sean Hunt <ride...@gmail.com> wrote:

> The best way to submit a defect is to write to the maintainer of the
> appropriate issues list - either William Miller or Howard Hinnant.
> Their email addresses (which I will not repeat here to save them from
> the ravages of spam) are on the appropriate issues lists.

Thanks Sean! :-)

-Howard

Bo Persson

Nov 1, 2009, 12:06:12 AM

But it is kind of half-baked already. :-)

The regex library contains I/O which requires that streams support the
extended char types. This, in turn, would require mandatory locale
support - which isn't there.

We do get Unicode strings, but not much else.

Bo Persson
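
As a rough illustration of the dependency described above, the sketch below shows the two places where narrow-character regex already leans on streams and locales: sub_match has a stream insertion operator, and a basic_regex carries a locale whose ctype facet the traits class can consult. The helper names first_match_via_stream and narrow_regex_has_ctype are made up for this example. The standard requires ctype and collate facets only for char and wchar_t, so an equivalent for char16_t or char32_t would have nothing comparable to rely on.

```cpp
#include <cassert>
#include <locale>
#include <regex>
#include <sstream>
#include <string>

// Streams the first match through operator<< for sub_match, the stream
// insertion that the regex library declares. Returns an empty string when
// nothing matches.
std::string first_match_via_stream(const std::string& text,
                                   const std::string& pattern) {
    const std::regex re(pattern);
    std::smatch m;
    std::ostringstream out;
    if (std::regex_search(text, m, re)) {
        out << m[0];  // needs a basic_ostream over the same character type
    }
    return out.str();
}

// Reports whether the locale held by a narrow regex provides ctype<char>,
// one of the facets that character classification can draw on.
bool narrow_regex_has_ctype() {
    const std::regex re("x");
    return std::has_facet<std::ctype<char>>(re.getloc());
}
```

For char, both pieces are guaranteed to be present; the point is that neither a stream type nor these facets are mandated for the 16- and 32-bit character types.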