Guru of the Week #84: Solution

Pavel Vozenilek

Aug 10, 2002, 6:38:30 AM
People who use an editor with tooltip-based help may prefer member
functions: as one types, these are displayed in a pop-up list. The
substr()/copy() parameter names can also be shown, slightly reducing the
possibility of error.

/Pavel

[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]
[ about comp.lang.c++.moderated. First time posters: do this! ]

Brian McKeever

Aug 10, 2002, 6:40:35 AM
Attila Feher <Attila...@lmf.ericsson.se> wrote in message news:<3D53DA6D...@lmf.ericsson.se>...

> Another comment: there may be containers/implementations where size
> cannot always be calculated in constant time whereas "emptiness" can
> be determined in constant time, but only by knowing the internal
> architecture of the container. I cannot recall which containers these
> are, but someone convinced me by this argument once... Of course, a
> non-member overloaded-friend empty() (or inline calling a member) can
> just do that as well.

Was that someone Scott Meyers? This is Item 4 in his "Effective STL".
He argues that constant-time splice() requires non-constant-time
size().
But can't you write "empty()" as "return begin() == end();" where
begin() and end() are always constant-time calls?
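
A minimal sketch of the idea in the question above: an emptiness test written as a nonmember nonfriend, using only begin() and end(). The names sketch::is_empty and example_usage are made up for illustration and are not part of any library. Note that the comparison has to be ==, not !=.

```cpp
#include <list>
#include <string>

namespace sketch {

// Nonmember, nonfriend emptiness test. It needs only the container's
// public begin()/end(), so it never touches the internal representation.
template <class Container>
bool is_empty(const Container& c)
{
    return c.begin() == c.end();
}

} // namespace sketch

// Illustrative usage: the same function serves a list and a string.
inline bool example_usage()
{
    std::list<int> numbers;          // no elements
    std::string text("x");           // one character
    return sketch::is_empty(numbers) && !sketch::is_empty(text);
}
```

Whether this is constant-time depends on begin() and end() being constant-time for the container in question.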

Brian

Herb Sutter

Aug 10, 2002, 6:42:27 AM
On 9 Aug 2002 14:49:43 -0400, "Anthony Williams"
<ant...@nortelnetworks.com> wrote:
>What a comprehensive beating of std::basic_string<> !

Even so, I think basic_string won by virtue of the "you shoulda seen the
other guy" principle -- I think I felt more beaten by the time I made it to
the end. :-)

>Anyway, you've forgotten that operator+= is an assignment operator, and
>thus must be a member.

I'm sure there may be typos or minor errors in the article (I wrote most of
it while under deadline pressure and more than normally sleep-deprived, and
when I just couldn't look at it any more I declared it done), but this is
not one of them... I read that part of the standard carefully several times
before declaring that operator+= could be a nonmember.

Although in the grammar the *= forms are lexically called
"assignment-operator"s (5.17), the governing subclause is actually in
clause 13 (Overloading) which lays out the rules about which operators can
be overloaded and how. There, in subclause 13.5.3 (Assignment), the text
only mentions operator=(), and it does say that it must be a member. Because
the operator*=() forms are not mentioned in 13.5.3, they fall into the
general 13.5.2 (Binary operators) bucket which permits their implementation
as members or nonmembers.

(Yeah, personally, I think it's confusing that 13.5.3 talks about an
"assignment operator" in normal font and means only operator=(), whereas
5.17 uses the grammar tag "assignment-operator" in code font and means not
just operator=() but also the operator*=() versions -- the only difference I
can see is a dash and a font. Oh well. That's what language lawyers get paid
to read.)

Anyway, I thought that the above reasoning was too long, and too
uninteresting to anyone but standards geeks, to put into the article.

To summarize with an example, the following nonmember operator+=() is
perfectly legal:

template<class charT, class traits, class Allocator>
basic_string<charT, traits, Allocator>&
operator+=( basic_string<charT, traits, Allocator>& s,
            basic_string<charT, traits, Allocator>& t )
{
    s.append( t );
    return s;
}

Not only should it work, but it does work fine under BCC 5.5, Comeau 4.3,
Metrowerks CW 8, and MSVC 7 and "7.1" beta, even using each compiler's
strict (conforming) mode switch where I happened to remember it off the top
of my head. I didn't try any other compilers, but that's a good sample to
give me comfort that I'm not the only one who reads 13.5.3 that way.
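
The same point can be sketched on a small user-defined type, under the reading of 13.5.3 given above. The type Label is hypothetical and exists only for this illustration. The compound assignment is a nonmember nonfriend; the commented-out declaration shows the contrast with simple assignment, which must be a member.

```cpp
#include <string>

// Hypothetical user-defined type, for illustration only.
struct Label {
    std::string text;
};

// Compound assignment written as a nonmember, nonfriend function.
// It uses only Label's public interface.
inline Label& operator+=(Label& lhs, const Label& rhs)
{
    lhs.text.append(rhs.text);
    return lhs;
}

// By contrast, simple assignment cannot be written this way.
// Uncommenting the next line is a compile error, because operator=
// must be a non-static member function:
// Label& operator=(Label& lhs, const Label& rhs);
```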

Herb

---
Herb Sutter (www.gotw.ca)

Secretary, ISO WG21/ANSI J16 (C++) standards committee (www.gotw.ca/iso)
Contributing editor, C/C++ Users Journal (www.gotw.ca/cuj)
C++ community program manager, Microsoft (www.gotw.ca/microsoft)

Check out "THE C++ Seminar" - Oct 28-30, 2002 (www.thecppseminar.com)

Herb Sutter

Aug 10, 2002, 6:46:46 AM
On 9 Aug 2002 14:54:35 -0400, Markus Werle <numerical....@web.de>
wrote:
>> [...] immediately notice that [...] can be implemented efficiently
>
>Herb Sutter once wrote in http://www.gotw.ca/gotw/033.htm: "Programmers
>are notoriously poor guessers about where their
>code's true bottlenecks lie."
>
>How come You make an assumption about the efficiency of the code during
>Your design phase that massively affects Your design here? (You are a
>_guru_, but I am not, so convince me I should follow
>this way ...)

It's a different issue. The main reason I mention efficiency here is to
underline (for those who care about efficiency) that there is no downside to
implementing these functions as nonmember nonfriends. That just erases the
efficiency counterargument before it arises.

Carl Daniel

Aug 10, 2002, 7:00:51 AM
"Markus Werle" <numerical....@web.de> wrote in message
news:aj0nis$182fnd$1...@ID-153032.news.dfncis.de...
> Your approach does not go well with Delayed Optimization:
> If optimization issues force me to redesign, say switch back to special
> memfun, well, what do You say?

I'd say "then add a private member function, and make the (inline)
non-member a friend which forwards to it."
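
A rough sketch of that suggestion, assuming a made-up class IntBag whose clear() started life as a nonmember nonfriend. After profiling, the work moves into a private member and the nonmember keeps its signature, so callers are unaffected.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container, for illustration only.
class IntBag {
public:
    void add(int v) { items_.push_back(v); }
    std::size_t size() const { return items_.size(); }

    // The nonmember is now a friend so that it can reach do_clear().
    friend void clear(IntBag& bag);

private:
    // Optimized path with direct access to the representation.
    void do_clear() { items_.clear(); }

    std::vector<int> items_;
};

// Same name and signature as before the optimization; it only forwards.
inline void clear(IntBag& bag)
{
    bag.do_clear();
}
```

Client code keeps calling clear(bag) before and after the change.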

-cd

Bob Bell

Aug 11, 2002, 6:40:41 AM
pavel_v...@yahoo.co.uk (Pavel Vozenilek) wrote in message news:<731020ca.02081...@posting.google.com>...

> People who use an editor with tooltip-based help may prefer member
> functions: as one types, these are displayed in a pop-up list. The
> substr()/copy() parameter names can also be shown, slightly reducing the
> possibility of error.

However, it's also true that this editor feature is there because of
member-function-centric designs. Because programmers rely (too much)
on member functions, editors support help for that style. If
programmers began to prefer nonmember functions more, editors would
change.

Arguing that we should use member functions because editors make it
easier is putting the cart before the horse, I think.

Bob

Sergey P. Derevyago

Aug 11, 2002, 7:24:38 AM
Herb Sutter wrote:
> I'll start with the punchline: If you're writing a function that
> can be implemented as either a member or as a non-friend non-member,
> you should prefer to implement it as a non-member function. That
> decision increases class encapsulation. When you think
> encapsulation, you should think non-member functions. [4]
It seems like the goal is a microkernel class architecture (MCA).

> What about clear()? Easy, that's the same as erase(begin(),end()).
> No fuss, no muss. Exercise for the reader, and all that.
But clear() is one of those operations that can be significantly optimized
by having direct access to the internals. For example, most SQL servers have a
TRUNCATE command even though you can use an unconditional DELETE FROM.
Performance is the key.
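
For reference, the generic form under discussion can be sketched as follows. The namespace sketch and the function example_usage are illustrative names only. This shows the nonmember version written purely in terms of erase(); it makes no claim about how it compares in speed with a version that has access to the internals.

```cpp
#include <string>
#include <vector>

namespace sketch {

// Nonmember, nonfriend clear(): works for any container that offers
// erase(first, last) together with begin() and end().
template <class Container>
void clear(Container& c)
{
    c.erase(c.begin(), c.end());
}

} // namespace sketch

// Illustrative usage with two different standard containers.
inline bool example_usage()
{
    std::string text("abc");
    std::vector<int> numbers(3, 7);
    sketch::clear(text);
    sketch::clear(numbers);
    return text.empty() && numbers.empty();
}
```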

> It's widely accepted that basic_string has way too many member
> functions. Of the 103 functions in basic_string, only 32 really need
> to be members, and 71 could be written as nonmember nonfriends without
> loss of efficiency.
IMHO it's not a good idea to split string's interface into two parts rather
arbitrarily (i.e. s.size() but length(s), or size(s) but s.length()?!).
Currently I know that (almost) all of the interface functions are members, and
that's OK: I don't want to learn those 32 "really member" names. The STL is
already too big, and any inconsistent changes will be really annoying.

And we can easily use MCA for the sake of a clear and terse implementation:

class string {
    class kernel {
        // internals
    public:
        // 32 "real member"s
    };

    kernel ker;
public:
    // interface functions

    // in particular:
    size_type size() const { return ker.size(); }  // use kernel
    size_type length() const { return size(); }    // redirect
};
--
With all respect, Sergey. http://cpp3.virtualave.net/
mailto : ders at skeptik.net

Dave Harris

Aug 11, 2002, 7:53:54 PM
hsu...@gotw.ca (Herb Sutter) wrote (abridged):
> -- Prefer to make it a member if it needs access to internals:
>    Which operations need access to internal data we would otherwise
>    have to grant via friendship? These should normally be
>    members.

Whether it needs access to internals will depend on the implementation.
Change the implementation and we might need to change the set of functions
that are members. In any case, it's a pain for clients to have to remember
which operations need access.

Wouldn't it be better to provide /all/ functions as non-members if C++
allows them to be? If we don't want to use "friend", they can forward to
member functions in the same way that other non-members do. We'll still
have some public member functions, but client code would be discouraged
from using them directly.

Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
bran...@cix.co.uk | And close your eyes with holy dread,
| For he on honey dew hath fed
http://www.bhresearch.co.uk/ | And drunk the milk of Paradise."

Luke Burton

Aug 11, 2002, 7:55:28 PM

Herb,

I am having trouble with the entire concept of scattering class related
operations into the current namespace as nonmembers. Your principal
arguments for this seem to be:

1) better encapsulation
2) improved genericity for the functions

I have not yet read any of the reference material supporting 1), so I
won't delve into that.

But for the case of 2) the drawbacks seem to be:

- Counter-intuitive. Markus Werle elaborated on this at length - it just
_feels_ better to invoke an operation on something. Laptop.turnOn()
rather than TurnOn(laptop). It feels like we're devolving to C!

- Scattered functionality. Why should users have to include the
potentially non-related string.h to obtain sequence operations?
Shouldn't these functions be moved into their own header files, and
hence no longer be part of the class's implementation at all?

- Inconsistent. Now you have, in the case of string, some instances
where you call string.insert and others where you call insert(string). I
would prefer one or the other being settled on to simplify matters.
Maybe that points to _everything_ being nonmember _friends_ ?

- Avoiding utilisation of classes. If you want to redelegate some of
string's functionality, why not give it up to another class? Instead of
string.replace(...), do SequenceOperations.replace(string), where string
now has to implement an abstract interface called Sequence before it can
be used with this class.

- Namespaces usage. Unfortunately, my current contract sees me working
in an environment with a Sun compiler that doesn't support namespaces.
So for these projects, global namespace pollution becomes a real issue.
Static members of an abstract class could be a work around ...

Disclaimer: I am an intermediate level C++ programmer. I submit these
thoughts with the intention of being corrected and hence becoming a
better programmer! I'm genuinely interested in being re-educated, if I'm
barking up the wrong tree.

Regards,

Luke.

--

Yorn desh born, der ritt de gitt der gue
Orn desh, dee born desh, de umn
bork! bork! bork!

Herb Sutter

Aug 11, 2002, 8:57:07 PM
On 11 Aug 2002 07:24:38 -0400, "Sergey P. Derevyago"
<non-ex...@iobox.com> wrote:
> IMHO it's not a good idea just to (rather arbitrarily, i.e. s.size() but
>length(s) or size(s) but s.length()?!) split string's interface into the two
>parts. Currently I know that (almost) all of the interface functions are
>members and it's OK: I don't want to learn those 32 "really member" names. STL
>is already too big and any inconsistent changes will be really annoying.

Agreed. In the section where I talk about size() and length() I specifically
listed that as one of two possible counterarguments:

"- Consistency. You could argue that keeping things like empty() as
members follows the principle of least surprise -- similar functions
are members, and in other class libraries things like IsEmpty()
functions are commonly members, after all. I think this argument is
valid, but weakened when we notice that this wouldn't be surprising
at all if people were in the habit of following Meyers' advice in the
first place, routinely writing functions as nonmembers whenever
reasonably possible. ..."

Pursuing that one step further than the (already-long) GotW issue did, we
could go beyond categorizing which existing members could be nonmember
nonfriends, and talk about adding new nonmember nonfriends which could be
simple passthroughs or synonyms for member functions, specifically to
address this concern and unify the calling syntax. That is, we could provide
a nonmember size() so that both size(s) and s.size() work, getting rid of
the need to remember length(s) vs. s.size(). Then people would get used to
the nonmember syntax.
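
As a rough sketch of such passthroughs (the namespace sketch and the function example_usage are illustrative names, not a proposal's actual wording): size(c) forwards to the member, and length(c) is written once in terms of size(c) for every container. The calls are qualified with sketch:: so they cannot collide with same-named functions that a standard library may also provide.

```cpp
#include <string>
#include <vector>

namespace sketch {

// Pass-through: size(c) is a synonym for c.size().
template <class Container>
typename Container::size_type size(const Container& c)
{
    return c.size();
}

// length(c) is defined once, in terms of size(c), for all containers.
template <class Container>
typename Container::size_type length(const Container& c)
{
    return sketch::size(c);
}

} // namespace sketch

// Illustrative usage: both spellings agree with the member call.
inline bool example_usage()
{
    std::string s("abc");
    std::vector<int> v(4);
    return sketch::size(s) == s.size()
        && sketch::length(s) == s.size()
        && sketch::length(v) == v.size();
}
```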

BTW, there are other reasons to prefer nonmember syntax. For one thing, it
turns out it makes writing templates significantly easier in some cases if
you know that the functions are nonmembers for all types, instead of members
for some types and nonmembers for others. For another, it would make
(erstwhile-)members and nonmembers overload -- and if you're jumping back in
horror and about to strenuously resist that as undesirable, well, experience
has shown that it could often be a Good Thing and some committee members
feel that in general the overloading of members and nonmembers might ought
to be added to C++0x (this is far from complete though, and may never happen
-- just pointing out that even experts feel it's potentially a good idea).
Just ask Andrei how having everything as nonmembers would simplify generic
programming, but be ready for an earful!
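
One way to see the template argument is the following sketch. It assumes a C++11 or later compiler, because it relies on the nonmember std::begin and std::end, which were added to the library after this thread was written. The function sketch::sum is a made-up example.

```cpp
#include <iterator>
#include <vector>

namespace sketch {

// Sums the ints in any range for which std::begin/std::end are defined.
// Because it uses nonmember calls, it accepts built-in arrays as well
// as containers. Written with r.begin() and r.end() instead, it could
// not accept arrays, which have no member functions.
template <class Range>
int sum(const Range& r)
{
    int total = 0;
    for (auto it = std::begin(r); it != std::end(r); ++it)
        total += *it;
    return total;
}

} // namespace sketch

// Illustrative usage with a built-in array and a vector.
inline bool example_usage()
{
    int raw[] = {1, 2, 3};
    std::vector<int> vec(2, 5);
    return sketch::sum(raw) == 6 && sketch::sum(vec) == 10;
}
```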

For balance, as noted, some people worry about namespace pollution. In
practice I believe that making the names nonmembers is actually a powerful
force for making them consistent in syntax and semantics, within std:: in
particular, and that's a Good Thing too as it would for example make
container::empty() easier to implement (once for all time instead of once
per container) and more consistent (avoiding some existing semantics
differences between member and nonmember versions of algorithms and
container operations, or more so differences in the same member from one
container to another).

But in general, the point was to demonstrate just how many did not actually
need to be members or friends, and in most cases I think it's
noncontroversial that they would be better off as nonmember nonfriends --
certainly Meyers espouses that principle, and I think I provided sufficient
motivation and justification to make most of the specific basic_string
examples noncontroversial. Those that would be more controversial, like the
ones you note, I specifically put in their own section with discussion and
arguments. I may take some of the above and roll it into the future book
version of this GotW to amplify the material.

Herb


Louis Lavery

Aug 12, 2002, 11:18:22 AM
Mods - this is a repost as my original post (fri 12th) has not been
acknowledged.

----- Original Message -----
From: Herb Sutter <hsu...@gotw.ca>
Newsgroups: comp.lang.c++.moderated
Sent: Friday, 09 August, 2002 12:19 PM
Subject: Guru of the Week #84: Solution

[snip - quick rub down with towel]

> Operations That Should Be Members
> ---------------------------------
>
> We asked: Which operations need access to internal data we would
> otherwise have to grant via friendship? Clearly these have a reason
> to be members, and normally ought to be. This list includes the
> following, some of which provide indirect access to (e.g., begin()) or
> change (e.g., reserve()) the internal state of the string:
>
> begin (2)
> end (2)
> rbegin (2)
> rend (2)
> size
> max_size
> capacity
> reserve
> swap
> c_str
> data
> get_allocator
>
> The above ought to be members not only because they're tightly bound
> to basic_string, but they also happen to form the public interface
> that nonmember nonfriend functions will need to use. Sure, you could
> implement these as nonmember friends, but why?
>

I'm glad you asked that.

Because you can't (easily) get rid of members, you're stuck with them
for life, whereas friends can come and go.

But, as I've only recently become a heretic, let me bore you from my
soap box...

(1) With friends you get a uniform interface and so don't have to
remember which functions are members or non-members. I mean
who cares? Friends hide this implementation detail giving
increased encapsulation over member functions.
And, you have to admit, you'd prefer to write swap(s1,s2)
than the gauche s1.swap(s2), wouldn't you?

(2) Should it become possible to implement a member function in
terms of the other members/friends (maybe because new
functions have been added) you cannot easily remove it.
If it had been a friend there'd be no problem.
This ability to swap functions in and out has got to be a
big plus for friends, so why avoid them?

(3) A monolithic class will be just as cluttered with function
declarations whether you use members or friends.
But with friends you may be able to group the out of class
declarations of related functions in the same header files,
thus clients can just see only what they want.
Okay, it's a bit weak but I'm trying to drum up support :-)

So, it seems to me, you should favour friends over members unless
there's a compelling reason not to.

Louis.

Attila Feher

Aug 12, 2002, 11:19:21 AM
Brian McKeever wrote:
>
> Attila Feher <Attila...@lmf.ericsson.se> wrote in message news:<3D53DA6D...@lmf.ericsson.se>...
>
> > Another comment: there may be containers/implementations where size
> > cannot always be calculated in constant time whereas "emptiness" can
> > be determined in constant time, but only by knowing the internal
> > architecture of the container. I cannot recall which containers these
> > are, but someone convinced me by this argument once... Of course, a
> > non-member overloaded-friend empty() (or inline calling a member) can
> > just do that as well.
>
> Was that someone Scott Meyers? This is Item 4 in his "Effective STL".
> He argues that constant-time splice() requires non-constant-time
> size().
> But can't you write "empty()" as "return begin() == end();" where
> begin() and end() are always constant-time calls?

Might be. However, I was thinking about _any_ container. Finding out whether
it is empty is usually pretty simple, while the number of elements _may_
not be. I am sure you are right about Scott; however, I am not sure if I
got this from his book... Finding out whether something is empty might
require comparing two pointers (for example), which aren't available to
the general public (like the latest MSVC ;-).
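
To make the distinction concrete, here is a minimal sketch, assuming a made-up singly linked list IntSList that stores no element count. Emptiness is one comparison on private state, while the size has to be computed by walking the nodes.

```cpp
#include <cstddef>

// Hypothetical minimal singly linked list of ints, for illustration only.
class IntSList {
public:
    IntSList() : head_(nullptr) {}
    IntSList(const IntSList&) = delete;             // copying not needed here
    IntSList& operator=(const IntSList&) = delete;

    ~IntSList()
    {
        while (head_) {
            Node* next = head_->next;
            delete head_;
            head_ = next;
        }
    }

    void push_front(int v)
    {
        Node* n = new Node;
        n->value = v;
        n->next = head_;
        head_ = n;
    }

    // Constant time: a single pointer comparison on private state.
    bool empty() const { return head_ == nullptr; }

    // Linear time: no count is stored, so every node is visited.
    std::size_t size() const
    {
        std::size_t n = 0;
        for (const Node* p = head_; p != nullptr; p = p->next)
            ++n;
        return n;
    }

private:
    struct Node {
        int value;
        Node* next;
    };
    Node* head_;
};
```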

Attila

James Kanze

Aug 12, 2002, 11:20:01 AM
Herb Sutter <hsu...@gotw.ca> wrote in message
news:<r3r5lucag9p6rduts...@4ax.com>...

[...]

> The Basics of Strings
> ---------------------

[...]

> What about length()? Easy, again -- it's defined to give the same
> result as size(). What's more, note that the other containers don't
> have length(), and it's there in the basic_string interface as a sort
> of "string thing", but by making it a nonmember suddenly we can
> consistently say "length()" about any container. Not too useful in
> this case because it's just a synonym for size(), I grant you, but a
> noteworthy point in the principle it illustrates -- making algorithms
> nonmembers immediately also makes them more widely useful and usable.

The most obvious question here is: why have a length function at all,
given that we have size, and that the other containers only have size?
As you say, it is not very useful.

It does, however, point up another constraint to keep in mind when
designing an interface: backwards compatibility. There was a proposed
string class in the draft standard long before anyone had heard of the
STL, and it had a member function length (and no member size). By the
time the STL was introduced (with a size function everywhere), some
vendors had already delivered compilers with the earlier versions of
string. So backwards compatibility suggested length, STL compatibility
required size, and a compromise ends up with both.

On the whole, I think it was the correct decision, although without the
constraints of backward compatibility, I'm sure we'd never have seen
length.

--
James Kanze mailto:jka...@caicheuvreux.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung

Chris Dearlove

Aug 12, 2002, 11:20:21 AM
Herb Sutter (hsu...@gotw.ca) wrote:
: in most cases I think it's
: noncontroversial that they would be better off as nonmember nonfriends

By coincidence I've just finished reading Bertrand Meyer's Object
Oriented Software Construction (the big orange book). Now of
course Meyer (not to be confused with Scott Meyers) is not a
friend of C++, and he's talking about Eiffel, where everything
is a member function. I note however that he then takes this
a stage further, where he notes (page 772)

"it does not hurt to add features to a class if they are conceptually
relevant to it. If you hesitate to include an exported feature because
you are not sure it is absolutely necessary, you should not worry about
its effect on class size. The only criteria that matter involve whether
the class fits in with the rest."

A positive defence of monolithic classes! (To be fair he does provide
a quotation to the contrary - but then dismisses it.)

(I'm programming in C++, not Eiffel, so I'm not supporting Meyer. I
just thought I'd throw this in for comment.)

Anthony Williams

Aug 12, 2002, 11:22:10 AM
Herb Sutter <hsu...@gotw.ca> writes:

> On 9 Aug 2002 14:49:43 -0400, "Anthony Williams"
> <ant...@nortelnetworks.com> wrote:
> >Anyway, you've forgotten that operator+= is an assignment operator, and
> >thus must be a member.

> I read that part of the standard carefully several times
> before declaring that operator+= could be a nonmember.

And I've read it several times and come to the opposite conclusion. :-(


> Although in the grammar the *= forms are lexically called
> "assignment-operator"s (5.17), the governing subclause is actually in
> clause 13 (Overloading) which lays out the rules about which operators can
> be overloaded and how. There, in subclause 13.5.3 (Assignment), the text
> only mentions operator=(), and it does say that it must be a member. Because
> the operator*=() forms are not mentioned in 13.5.3, they fall into the
> general 13.5.2 (Binary operators) bucket which permits their implementation
> as members or nonmembers.

For reference, 13.5.3p1 starts by saying

"An assignment operator shall be implemented by a non-static member function
with exactly one parameter"

However 13.5.3p2 says

"Any assignment operator, even the copy assignment operator, can be virtual"

This implies to me that assignment operators other than copy-assignment are
being considered, and that this includes *all* the assignment operators from
5.17 "Assignment Operators".



> (Yeah, personally, I think it's confusing that 13.5.3 talks about an
> "assignment operator" in normal font and means only operator=(), whereas
> 5.17 uses the grammar tag "assignment-operator" in code font and means not
> just operator=() but also the operator*=() versions -- the only difference I
> can see is a dash and a font. Oh well. That's what language lawyers get paid
> to read.)

Well, 5.17p1 starts "There are several assignment operators"

> Anyway, I thought that the above reasoning was too long, and too
> uninteresting to anyone but standards geeks, to put into the article.

That would be fair enough, if I agreed with your conclusion :-)



> To summarize with an example, the following nonmember operator+=() is
> perfectly legal:
>
> template<class charT, class traits, class Allocator>
> basic_string<charT, traits, Allocator>&
> operator+=( basic_string<charT, traits, Allocator>& s,
>             basic_string<charT, traits, Allocator>& t )
> {
>     s.append( t );
>     return s;
> }

Or not ;-)

> Not only should it work, but it does work fine under BCC 5.5, Comeau 4.3,
> Metrowerks CW 8, and MSVC 7 and "7.1" beta, even using each compiler's
> strict (conforming) mode switch where I happened to remember it off the top
> of my head. I didn't try any other compilers, but that's a good sample to
> give me comfort that I'm not the only one who reads 13.5.3 that way.

Unfortunately, we all know that just because a compiler (or even many/all
compilers) allows a construct doesn't mean that it is legal.

Anthony

Markus Werle

Aug 12, 2002, 11:32:30 AM
Herb Sutter wrote:

> On 9 Aug 2002 14:54:35 -0400, Markus Werle <numerical....@web.de>
> wrote:
> >> [...] immediately notice that [...] can be implemented efficiently
> >
> >Herb Sutter once wrote in http://www.gotw.ca/gotw/033.htm: "Programmers
> >are notoriously poor guessers about where their
> >code's true bottlenecks lie."
> >
> >How come You make an assumption about the efficiency of the code during
> >Your design phase that massively affects Your design here? (You are a
> >_guru_, but I am not, so convince me I should follow
> >this way ...)
>
> It's a different issue. The main reason I mention efficiency here is to
> underline (for those who care about efficiency) that there is no downside
> to implementing these functions as nonmember nonfriends. That just erases
> the efficiency counterargument before it arises.

But this counterargument is there for the general case.
The best way to evolve a class is to first concentrate on
correct behaviour. Afterwards we look at the profiler's output
and see the hot spots. Then we rethink those parts of the
class that are responsible for the bottlenecks.

If these were mixed member-nonmember functions,
the member functions would have to be handled in a different
way compared to the nonmember ones:

Carl Daniel wrote in an answer to my msg:

> I'd say "then add a private member function,
> and make the (inline) non-member a friend
> which forwards to it."

I feel somewhat uncomfortable about this.
It's asymmetric. It hinders reusability,
because it assumes another private member function.
The gain achieved is lost immediately.

As other answers to Your OP show, it is not always
clear which functions should be written in terms of
the others. So this looks like another way the
interface could change a lot during development.


I especially dislike declaring functions as friends
just in order to get more efficiency.
A friend declaration stands at the _beginning_
of my design, not at the end. (As an aside:
A friend declaration normally has vanished at the
end of the design phase)


Your aim was at reusability through modularity.
In this point we do agree:
Reusability comes through modularity.

The question is: what is the best way to achieve
modularity? Your key to reusability was to use
free functions (or function templates) not bound
to any class (where ADL issues and order matter).

For me the free function never was a first option.
Rather gather similar things into a (maybe templated)
class and provide a public interface to these.

What is wrong with the class-uses-classes-tree and
the pluggable aspect approach I sketched in the
message on 9 Aug 2002?
I really want to know why You dislike it.

Luke Burton wrote:
> I'm genuinely interested in being re-educated,
> if I'm barking up the wrong tree.

Me, too.

Markus

James Kanze

Aug 12, 2002, 11:58:23 AM
Herb Sutter <hsu...@gotw.ca> wrote in message
news:<r3r5lucag9p6rduts...@4ax.com>...

[...]

> The Basics of Strings
> ---------------------

[...]

> Membership Has Its Rewards -- and Its Costs
> -------------------------------------------

> >3. Which ones should be members, and which should not? Why?

> Recall the advice from #1: Where it's practical, break generic
> components down into pieces.

> Guideline: Prefer "one class (or function), one responsibility."

> Guideline: Where possible, prefer writing functions as nonmember
> nonfriends.

> For example, if you write a string class and make searching,
> pattern-matching, and tokenizing available as member functions, you've
> hardwired those facilities so that they can't be used with any other
> kind of sequence.

Before answering any questions of what should or should not be a member,
I think you have to decide what the class is supposed to represent. If
the class is supposed to be an STL container, then obviously, algorithms
don't belong in it. If the class is supposed to represent a piece of
(possibly multi-byte character) text, then the question is more
difficult, since algorithms for containers don't work with text. Of
course, if std::string is supposed to represent a piece of text, then
things like operator[] or the iterators pose a problem, because an
individual byte doesn't necessarily have a meaning outside of its
context.
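
A small sketch of that problem, assuming the text is encoded as UTF-8 (the post does not name an encoding, so that is an assumption made here). The helper e_acute_utf8 is a made-up name. One visible character occupies two elements of the string, and neither element means anything on its own.

```cpp
#include <string>

// The single character "e with acute accent" (U+00E9) encoded as
// UTF-8 takes two bytes: 0xC3 followed by 0xA9.
inline std::string e_acute_utf8()
{
    return std::string("\xC3\xA9");
}

// size() counts bytes, not characters, so this reports 2 for what a
// reader sees as one character; operator[] yields one of those bytes.
inline std::string::size_type byte_count()
{
    return e_acute_utf8().size();
}
```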

I suspect that part of the reason std::string is such a monolith is that
no one really knows what it is supposed to be. And I'm sure you realize
that an analysis function by function, as you have done, can't really
make up for missing design up front.

I think that this is an important point, and it weakens much of what you
say in the following. If std::string just did one thing, and one thing
well, it wouldn't have so many member functions that you would feel
bothered, and feel that you had to find those which shouldn't be
members.

> (If this frank preamble is giving you an uncomfortable feeling about
> basic_string, well and good.) On the other hand, a facility that
> accomplishes the same goal but is composed of several parts that are
> also independently usable is often a better design. In this example,
> it's often best to separate the algorithm from the container, which is
> what the STL does most of the time.

> I [2,3] and Scott Meyers [4] have written before on why some nonmember
> functions are a legitimate part of a type's interface, and why
> nonmember nonfriends should be preferred (among other reasons, to
> improve encapsulation). For example, as Scott wrote in his opening
> words of the cited article:

> I'll start with the punchline: If you're writing a function that
> can be implemented as either a member or as a non-friend
> non-member, you should prefer to implement it as a non-member
> function. That decision increases class encapsulation. When you
> think encapsulation, you should think non-member functions. [4]

> So when we consider the functions that will operate on a basic_string
> (or any other class type), we want to make them nonmember nonfriends
> if reasonably possible. Hence, here are some questions to ask about
> the members of basic_string in particular:

> -- Always make it a member if it has to be one: Which operations
> must be members, because the C++ language just says so (e.g.,
> constructors) or because of functional reasons (e.g., they must
> be virtual)? If they have to be, then oh well, they just have
> to be; case closed.

> -- Prefer to make it a member if it needs access to internals:
> Which operations need access to internal data we would
> otherwise have to grant via friendship? These should normally
> be members. (There are some rare exceptions such as operations
> needing conversions on their left-hand arguments and some like
> operator<<() whose signatures don't allow the *this reference
> to be their first parameters; even these can normally be
> nonfriends implemented in terms of (possibly virtual) members,
> but sometimes doing that is merely an exercise in contortionism
> and they're best and naturally expressed as friends.)

> -- In all other cases, prefer to make it a nonmember nonfriend:
> Which operations can work well as nonmember nonfriends? These
> can, and therefore normally should, be nonmembers. This should
> be the default case to strive for.

I don't like this logic at all. The choice of member/non-member should
NOT be based on implementation details, but on class logic. You
certainly don't want that choice to change when you change the
implementation.

If the function should logically be a member, make it a member, even if
it doesn't need access to the class internals. And if it shouldn't
logically be a member, don't make it a member; if it needs access to the
internals, that's what friend is for. (The case is probably rare for
named functions, but not necessarily for operators.)

> It's important to ask these and related questions because we want to
> put as many functions into the last bucket as possible.

Let's not confuse the means and the end. Putting functions into the
last bucket is not a goal in itself. (Seeing what functions could go
into the last bucket may help us better understand what the class is
really about, and thus help understanding what does and what doesn't
belong.)

> Operations That Must Be Members
> -------------------------------

> As a first pass, let's sift out those operations that just have to be
> members. There are some obvious ones at the beginning of the list.

> constructors (6)
> destructor
> assignment operators (3)
> [] operators (2)

> Clearly the above functions must be members. It's impossible to write
> a constructor, destructor, assignment operator, or [] operator in any
> other way!

> 12 functions down, 91 to go...

> Operations That Should Be Members
> ---------------------------------

> We asked: Which operations need access to internal data we would
> otherwise have to grant via friendship? Clearly these have a reason
> to be members, and normally ought to be. This list includes the
> following, some of which provide indirect access to (e.g., begin()) or
> change (e.g., reserve()) the internal state of the string:

> begin (2)
> end (2)
> rbegin (2)
> rend (2)
> size
> max_size
> capacity
> reserve
> swap
> c_str
> data
> get_allocator

> The above ought to be members not only because they're tightly bound
> to basic_string, but they also happen to form the public interface
> that nonmember nonfriend functions will need to use. Sure, you could
> implement these as nonmember friends, but why?

Your choices are arbitrary. For example, in my pre-standard String
class, my iterators used the public function operator[], rather than the
reverse.

> There are a few more that I'm going to add to this list as fundamental
> string operations:

> insert (three-parameter version)
> erase (1 -- the "iter, iter" version)
> replace (2 -- the "iter, iter, num, char" and templated versions)

> We'll return to the question of insert(), erase(), and replace() a
> little later. For replace() in particular, it's important to be able
> to choose well and make the most flexible and fundamental version(s)
> into member(s).

This is really the point where I started to diverge. In my pre-standard
String class, I supported four basic functions:

String extract( int from, int to ) const
String replace( String const& newText, int from, int to ) const
String insert( String const& newText, int position ) const
String remove( int from, int to ) const

Both insert and remove were implemented as one-liners in terms of
replace, and could have been non-members. I still think that it was the
correct decision to make them members (along with append).

> Into the Fray: Possibly-Contentious Operations
> That Can Be Nonmember Nonfriends
> -----------------------------------------------

> First in this section, allow me to perform a stunning impersonation of
> a lightning rod by pointing out that all of the following functions
> have something fundamental in common, to wit: Each one could obviously
> as easily -- and as efficiently -- be a nonmember nonfriend.

> at (2)
> clear
> empty
> length

[...]

> Take empty() as a case in point. Can we implement it as a nonmember
> nonfriend? Sure... the standard itself requires the following
> behavior of basic_string::empty(), in the C++ Standard, subclause
> 21.3.3, paragraph 14:

> Returns: size() == 0.

> Well, now, that's pretty easy to write as a nonmember without loss of
> efficiency:

> template<class charT, class traits, class Allocator>
> bool empty( const basic_string<charT, traits, Allocator>& s )
> {
> return s.size() == 0;
> }

> Notice that, while we can make size() a member and implement a
> nonmember empty() in terms of it, we could not do the reverse.

This is actually closely linked to the STL interface requirements.
While a size() function makes sense for std::vector, neither my
pre-standard map nor my pre-standard list classes had anything
equivalent. They all had isEmpty, however. Unless you decide that all
containers must implement size (as the STL did), then you risk having
empty a member for some containers, and a non-member in others. Not a
very pleasant situation, I think.

> In several cases here there's a group of related functions, and
> perhaps more than one could be nominated as a member and the others
> implemented in terms of that one as nonmembers. Which function should
> we nominate to be the member? My advice is to choose the most
> flexible one that doesn't force a loss of efficiency -- that will be
> the enabling flexible foundation on which the others can be built. In
> this case, we choose size() as the member because its result can
> always be cached (indeed, the standard encourages that it be cached
> because size() "should" run in constant time), in which case an
> empty() implemented only in terms of size() is no less efficient than
> anything we could do with full direct access to the string's internal
> data structures.

And you don't find it confusing that some are members, some not? I
could almost buy Scott Meyers' point about making all functions
non-members, but arbitrary mixing according to implementation details?

Again, the criteria for choosing should be logical, and not based on the
implementation. (If I remember Scott's article correctly, his arguments
revolved around making life easy for the user, and went so far as to make
the function a friend if necessary, in order to make it non-member, and
the interface consistent.)

[...]

> namespace pollution has never
> been as big a problem as some have made it out to be in the
> past.

This has nothing to do with your article, but I can't resist: have you
mentioned this to the people who promoted two phase look-up in
templates:-)? (Or for that matter, those who promoted namespaces?)

[...]

> More Operations That Can Be Nonmember Nonfriends
> ------------------------------------------------

> Further, all of the remaining functions can be implemented as
> nonmember nonfriends:

> resize (2)
> assign (6)
> += (3)
> append (6)
> push_back

Note that in the case of vector, push_back is specified in terms of
insert( v.end() ...), but that the typical implementation will be
significantly faster because of direct access to the vector internals.
There is thus some justification for making it a member in std::vector.
Are you suggesting that it should be a member in std::vector, and not in
std::string?
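
To make the comparison concrete, here is a minimal sketch of push_back
written as a nonmember nonfriend against the public interface only. The
name append_char is hypothetical (chosen to avoid any clash), and
nothing here says how a real library implements it; a member version
with direct access to the internals may well be faster.

```cpp
#include <cassert>
#include <string>

// Hypothetical nonmember, nonfriend equivalent of push_back: it only uses
// the public insert(iterator, charT) member, so it needs no friendship.
template <class CharT, class Traits, class Alloc>
void append_char(std::basic_string<CharT, Traits, Alloc>& s, CharT c)
{
    s.insert(s.end(), c);
}
```

Calling append_char(s, 'c') on a string holding "ab" leaves it holding
"abc", the same visible effect as s.push_back('c').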

> insert (7 -- all but the three-parameter version)
> erase (2 -- all but the "iter, iter" version)
> replace (8 -- all but the "iter, iter, num, char" and templated versions)

Do you really want one version of insert, erase and replace to be a
member, and the others not?

For that matter, both insert and erase are trivially implemented in
terms of replace -- in the first case, the length of the replaced
segment is 0, and in the second, the replacement text is "".
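
As a rough sketch of that claim, both operations reduce to one call of
the offset-based std::string::replace. The names insert_at and
erase_range are hypothetical, picked only for this illustration.

```cpp
#include <cassert>
#include <string>

// insert == replace of a zero-length segment with the new text.
inline std::string& insert_at(std::string& s,
                              std::string::size_type pos,
                              const std::string& text)
{
    return s.replace(pos, 0, text);
}

// erase == replace of the segment [pos, pos + n) with the empty string.
// n defaults to npos, meaning "to the end", as with the member erase().
inline std::string& erase_range(std::string& s,
                                std::string::size_type pos,
                                std::string::size_type n = std::string::npos)
{
    return s.replace(pos, n, "");
}
```

Both helpers inherit replace()'s range check: a pos beyond size()
throws std::out_of_range.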

> copy
> substr
> compare (5)
> find (4)
> rfind (4)
> find_first_of (4)
> find_last_of (4)
> find_first_not_of (4)
> find_last_not_of (4)

[...]
> erase( pos, length )
> basic_string& erase(size_type pos = 0, size_type n = npos);

> erase( iter, ... )
> iterator erase(iterator position);
> iterator erase(iterator first, iterator last);

> First, notice that the two families' return types are not consistent:
> the first version returns a reference to the string, whereas the other
> two return iterators pointing immediately after the erased character
> or range. Second, notice that the two families' argument types are
> not consistent: the first version takes an offset and length, whereas
> the other two take an iterator or iterator range; fortunately, we can
> convert from iterators to offsets via pos = iter - begin() and from
> offsets to iterators via iter = begin() + pos.

History, my friend. The std::string class predated the STL by quite
some time. Implementations existed before the STL was proposed, based
on draft versions of the standard. Whence the first version of erase.

[...]

> Back to Work: Replacing replace()
> ---------------------------------

Just a reminder: insert and erase can be trivially implemented in terms
of replace, and so by your logic don't need to be member functions
either.

> Next, replace(): Truth be told, the ten-count'em-ten replace() members
> are less interesting than they are tedious and exhausting.

[...]

> This time, the two families' return types are consistent; that's a
> small pleasure. But, as with erase(), the two families' argument
> types are not consistent: one family is based on an offset and length,
> whereas the other is based on an iterator range.

History. The original string class did everything with offset and
length. STL compatibility required the iterator versions.

[Analysis elided, but based on avoiding extraneous string objects...]

In the end, the main reason why there are so many variants of all of
these functions (assign, append, insert, replace) is to avoid extra
conversions. The only function really needed is to assign, append,
insert or replace an std::string. Except that in the case of something
like:

s.append( "abc" ) ;

this requires the conversion of the string literal "abc" to an
std::string -- an extra object, often with dynamic allocation, etc., and
so expensive.

In the case of std::string, of course, there is a simple solution: the
compiler knows about the layout and the semantics of all of the
functions, and uses internal magic that isn't available to other
classes:-). I would argue that the case of string probably justifies
such means, but of course, this is irrelevant to your discussion (where
string is just an example).

In the end, we are faced with the problem: render the interface more
complicated, in order to optimize frequently occurring special cases, or
keep it simple. (Note that whether the functions are members or
non-members is irrelevant here -- non-members are also part of the
contractual interface of the class if they are documented and delivered
with the class.) My first reaction would be to wait until you have
profiling output, but this isn't always an available option for authors
of general purpose libraries.

In the case of the standard, I would have just specified the basic
versions, using std::string, and said that the implementation is free to
provide others having the same effect, for optimization.

[...]

> Coffee Break #2: Copying copy() and substr()
> --------------------------------------------

> Oh, copy(), schmopy(). Note that copy() is a somewhat unusual beast,
> and that its interface is inconsistent with the std::copy() algorithm.
> Note again the signature:

> size_type copy(charT* s, size_type n, size_type pos = 0) const;

> The function is const; it does not modify the string. Rather, what
> the string object has to do is copy part of itself (up to n
> characters, starting at position pos) and dump it into the target
> buffer (note, I deliberately did not say "C-style string"), s, which
> is required to be big enough -- if it's not, oh well, then the program
> will scribble onward into whatever memory happens to follow the string
> and get stuck somewhere in the Undefined Behavior swamp. And, better
> still, basic_string::copy() does not, repeat not, append a null object
> to the target buffer, which is what makes s not a C-style string
> (besides, charT doesn't need to be char; this function will copy into
> a buffer of whatever kind of characters the string itself is made of).
> It is also what makes copy() a dangerous function.

> Guideline: Never use functions that write to range-unchecked
> buffers (e.g., strcpy(), sprintf(), basic_string::copy()). They
> are not only crashes waiting to happen, but a clear and present
> security hazard -- buffer overrun attacks continue to be a
> perennially popular pastime for hackers and malware writers.

But in this case, the buffer is range checked. Because it is a C style
array, you have to pass the range as a separate parameter (n), but you
have to pass it, and the function will never copy more than n
characters.

The function is ugly, but almost every string class I've seen has had
some form of it. Because like it or not, you almost always have to
interface to C somewhere.
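
A small sketch of that point, assuming a caller-supplied char buffer:
the second argument of copy() is the limit, the return value is the
number of characters actually written, and the terminator is the
caller's job. The helper name copy_to_c_buffer is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Copies at most buf_len - 1 characters of s, starting at pos, into buf and
// then adds the '\0' that basic_string::copy() itself never writes.
// Preconditions: buf_len > 0; copy() throws std::out_of_range if
// pos > s.size().
inline std::size_t copy_to_c_buffer(const std::string& s,
                                    char* buf,
                                    std::size_t buf_len,
                                    std::size_t pos = 0)
{
    const std::size_t copied = s.copy(buf, buf_len - 1, pos);
    buf[copied] = '\0';
    return copied;
}
```

With a six-character buffer, "0123456789" yields five characters and
"123" yields three; in neither case is anything written past the
buffer.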

> All of the required work could be done pretty much as simply, and a
> lot more flexibly, just by using plain old std::copy():

> string s = "0123456789";
>
> char* buf1 = new char[5];
> s.copy( buf1, 5, 0 ); // ok: buf1 will contain the chars
> // '0', '1', '2', '3', '4'
> copy( s.begin(), s.begin()+5, buf1 ); // ok: buf1 will contain the chars
> // '0', '1', '2', '3', '4'
>
> int* buf2 = new int[5];
> s.copy( buf2, 5, 0 ); // error: first parameter is not char*
> copy( s.begin(), s.begin()+5, buf2 ); // ok: buf2 will contain the values
> // corresponding to '0', '1', '2',
> // '3', '4' (e.g., ASCII values)

How much do you want to bet that copy was there before the STL?

Given the STL and the guarantee that std::vector is contiguous, most of
the uses of copy should be replaced by:

std::vector< char > tmp( s.begin(), s.begin() + 5 ) ;
tmp.push_back( '\0' ) ;
// ...

> Incidentally, the above code also shows how basic_string::copy() can
> trivially be written as a nonmember nonfriend, most trivially in terms
> of the copy() algorithm -- another exercise for the reader, but do
> remember to handle the n == npos special case correctly.

Not to mention bounds checking. Your example code will produce
interesting results if s had been initialized with "123", instead of the
value you used.
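
For what it is worth, one possible shape of that exercise, with the
bounds handling included, is sketched below. The name copy_from is
hypothetical, and the clamping and the exception are intended to mirror
what the member copy() is specified to do, not any particular
implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <stdexcept>
#include <string>

// Nonmember, nonfriend version of basic_string::copy(), built on std::copy.
// The length is clamped to what the string actually holds, which also
// covers the n == npos case.
template <class CharT, class Traits, class Alloc>
typename std::basic_string<CharT, Traits, Alloc>::size_type
copy_from(const std::basic_string<CharT, Traits, Alloc>& s,
          CharT* dest,
          typename std::basic_string<CharT, Traits, Alloc>::size_type n,
          typename std::basic_string<CharT, Traits, Alloc>::size_type pos = 0)
{
    typedef typename std::basic_string<CharT, Traits, Alloc>::size_type size_type;
    if (pos > s.size())
        throw std::out_of_range("copy_from: pos > size()");
    const size_type rlen = std::min(n, s.size() - pos);
    std::copy(s.begin() + pos, s.begin() + pos + rlen, dest);
    return rlen;
}
```

With s initialized to "123", copy_from(s, buf, 5) copies three
characters and returns 3, rather than reading past s.end().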

> While we're taking a breather, let's knock off another simple one at
> the same time: substr(). Recall its signature:

> basic_string substr(size_type pos = 0, size_type n = npos) const;

> We'll neglect for the moment to comment on the irksome fact that this
> function chooses to take its parameters in the order "position,
> length", whereas the copy() we just considered takes the very same
> parameters in the order "length, position".

The standard order for the pre-STL version of string was position,
length. In the case of copy, the second parameter gives the length of
the target buffer, and the only parameter given for the string is
position.

It makes sense to me.

> Nor will we mention that, besides being aesthetically inconsistent,
> trying to remember which function takes the parameters in which order
> makes for an easy trap for users of basic_string to stumble into
> because both parameters also happen to be of the same type
> (size_type), worse luck, so that if the users get it wrong their code
> will continue to happily compile without any errors or warnings and
> continue to happily run...

There are three ways to specify a substring: position, length; from, to
(as positions in the string, e.g. size_t); or begin, end (iterators).
In all three, the types are the same. And in no implementation I've
ever seen or heard of was the order other than that just given. If
position, length is used, it is ALWAYS the position followed by the
length. (In general, it is always how to find the starting point, then
how to find the end point.)

There's a lot to criticize in std::string. No need to make things worse
than they are. Let's face it, the compiler won't complain if I do
something like:
std::vector< char > v( s.end(), s.begin() ) ;
either.

[...]

> All right, break's over. Back to work again... fortunately we have
> only two families left to consider, compare() and the *find*()s.

And the real question: do we even want them?

> Almost There: Comparing compare()s
> ----------------------------------
>
> The penultimate family is compare(). It has five members, all of
> which can be trivially shown to be implementable efficiently as
> nonmember nonfriends. How? Because in the standard they're specified
> in terms of basic_string's size() and data(), which we already decided
> to make members, and traits::compare(), which does the real work.
>
> Wow, wait a minute. That was almost easy! Let's not question it, but
> move right along...
>
>
> The Home Stretch: Finding the find()s
> -------------------------------------
[...]
> There are six families of find functions, each with exactly four
> members:
[...]

> All of these can be written efficiently as nonmember nonfriends; the
> implementations are left as exercises for the reader. Having said
> that, we're done!

The real question remains: which of these functions are really useful,
and what should their actual semantics be? If std::string is nothing
but an STL container, we already have the free functions std::find which
does all that is needed. If std::string is supposed to represent text,
on the other hand, the question becomes much more complicated. And the
answers are different for std::wstring and for std::string. And
probably belong in locale somewhere. Another alternative is to provide
a decent regular expression class; that will handle all of the existing
cases, and more.

But to be frank, in fact, I'm not sure we know enough about the problem
yet to standardize the answers.

> But let's add one final note about string finding. In fact, you might
> have noticed that, in addition to the extensive bevy of
> basic_string::*find*() algorithms, the C++ Standard also provides a
> not-quite-as-extensive-but-still-plentiful bevy of std::*find*()
> algorithms. In particular:

> -- std::find() can do the work of basic_string::find()

> -- std::find() using reverse_iterators, or std::find_end(), can do
> the work of basic_string::rfind()

> -- std::find_first_of(), or std::find() with an appropriate
> predicate, can do the work of basic_string::find_first_of()

> -- std::find_first_of(), or std::find() with an appropriate
> predicate, using reverse_iterators can do the work of
> basic_string::find_last_of()

> -- std::find() with an appropriate predicate can do the work of
> basic_string::find_first_not_of()

> -- std::find() with an appropriate predicate and using
> reverse_iterators can do the work of
> basic_string::find_last_not_of()

> What's more, the nonmember algorithms are more flexible, because they
> work on more than just strings. Indeed, all of the
> basic_string::*find*() algorithms could be implemented using the
> std::find and std::find_end(), tossing in appropriate predicates
> and/or reverse_iterators as necessary. So what about just ditching
> the basic_string::*find*() families altogether and just telling users
> to use the existing std::find*() algorithms? One caution here is
> that, even though the basic_string::*find*() work can be emulated,
> doing it with the default implementations of std::find*() would incur
> significant loss of performance in some cases, and there's the rub.
> The three forms each of find() and rfind() that search for substrings
> (not just individual characters) can be made much more efficient than
> a brute-force search that tries each position and compares the
> substrings starting at those positions. There are well-known
> algorithms that construct finite state machines on the fly to run
> through a string and find a substring (or prove its absence) in linear
> time, and it might be desirable to take advantage of such techniques.

BM is not only linear, but the constant multiplier can be significantly
less than one. KMP is linear, even when searching for multiple strings
in parallel (but there is no interface which allows this).

In all cases, set up time is non-negligible. So we have to know what
std::string is designed for. If it typically holds words, lines or even
sentences, then such algorithms are uninteresting. If it typically
holds full files, or editing buffers, they become almost essential.

> To take advantage of such optimizations, could we provide overloads
> (not specializations, see [6]) of std::find*() that work on
> basic_string iterators? Yes, but only if basic_string::iterator is a
> class type, not a plain charT*. The reason we wouldn't want
> to specialize std::find() for all pointer types is that not all
> character pointers necessarily point into strings; so we would need
> basic_string::iterator to be a distinct type that we can detect and
> partially specialize on. Then those specializations could perform the
> optimizations and work at full efficiency for matching substrings.

A possibly better solution would be to detect that we have a random
access iterator (compile-time), take the difference, and switch to BM if
the distance is sufficiently large. (And the algorithm in question is
std::search, not std::find.)
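
A rough sketch of that dispatch follows. Several things in it are
assumptions made for illustration only: the names find_sub and
horspool, the threshold of 256, and the use of Boyer-Moore-Horspool (a
simplified BM variant, swapped in here to keep the code short) on
ranges of plain 8-bit char.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <string>
#include <vector>

namespace sketch {

// Arbitrary cut-over point; a real library would have to measure this.
const std::ptrdiff_t kTableThreshold = 256;

// Boyer-Moore-Horspool over a random access range of char (8-bit assumed).
template <class RandIt, class PatIt>
RandIt horspool(RandIt first, RandIt last, PatIt pfirst, PatIt plast)
{
    typedef typename std::iterator_traits<RandIt>::difference_type diff_t;
    const std::string pat(pfirst, plast);
    const diff_t n = last - first;
    const diff_t m = static_cast<diff_t>(pat.size());
    if (m == 0) return first;
    if (m > n) return last;

    // Set-up cost: one shift entry per possible character value.
    std::vector<diff_t> shift(256, m);
    for (diff_t i = 0; i + 1 < m; ++i)
        shift[static_cast<unsigned char>(pat[i])] = m - 1 - i;

    diff_t pos = 0;
    while (pos + m <= n) {
        if (std::equal(pat.begin(), pat.end(), first + pos))
            return first + pos;
        // Skip ahead based on the text character under the pattern's end.
        pos += shift[static_cast<unsigned char>(first[pos + m - 1])];
    }
    return last;
}

// Forward and bidirectional iterators: plain std::search.
template <class It, class PatIt>
It find_sub(It first, It last, PatIt pfirst, PatIt plast,
            std::forward_iterator_tag)
{
    return std::search(first, last, pfirst, plast);
}

// Random access iterators: the distance is cheap, so decide at run time.
template <class It, class PatIt>
It find_sub(It first, It last, PatIt pfirst, PatIt plast,
            std::random_access_iterator_tag)
{
    if (last - first < kTableThreshold)
        return std::search(first, last, pfirst, plast);
    return horspool(first, last, pfirst, plast);
}

// Entry point: choose the overload from the iterator category.
template <class It, class PatIt>
It find_sub(It first, It last, PatIt pfirst, PatIt plast)
{
    typedef typename std::iterator_traits<It>::iterator_category category;
    return find_sub(first, last, pfirst, plast, category());
}

} // namespace sketch
```

Short haystacks and non-random-access containers never pay the table
set-up cost; only large random access ranges do, which is where the
skipping is expected to repay it.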

--
James Kanze mailto:jka...@caicheuvreux.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung


Markus Werle

Aug 12, 2002, 5:18:57 PM

> [4] S. Meyers. "How Non-Member Functions Improve Encapsulation" (C/C++
> Users Journal, 18(2), February 2000). Available online at
> http://www.cuj.com/articles/2000/0002/0002c/0002c.htm.

I think Scott's chain of arguments is easy to break
and I take this chance for completeness here.
You are invited to bring me back onto the right track
via some really convincing arguments.

Scott Meyers writes (a strong and tricky argument
for free functions)

> void nap(Wombat& w) { w.sleep(.5); }
>
> Wombat w;
> ...
> nap(w);
>
>
> And there you have it, your dreaded syntactic inconsistency. When you
> want to feed your wombats, you make member function calls, but when
> you want them to nap, you make non-member calls.

No, because void nap(Wombat& w) { w.sleep(.5); }
is for sure a member of class WombatComplicatedActions.

Btw. I prefer to go into another direction here:
If I want my Wombats to sleep for precisely half an hour
I do it this way:

w.sleep(DefaultWombatValues::WombatNapTime());

Separation of concepts that is, no fattening
of interfaces, just YACAMF (yet another class
and member function)
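
A minimal sketch of what that could look like. Only sleep() and
DefaultWombatValues::WombatNapTime() come from the text above; the rest
of Wombat (the hours counter and its accessor) is hypothetical and is
there just to make the example self-contained.

```cpp
#include <cassert>

// Hypothetical stand-in for the article's Wombat: the interface stays small.
class Wombat {
public:
    Wombat() : hours_slept_(0.0) {}
    void sleep(double hours) { hours_slept_ += hours; }
    double hoursSlept() const { return hours_slept_; }
private:
    double hours_slept_;
};

// The "nap" policy lives in a separate class instead of a free nap()
// function or an extra Wombat member.
struct DefaultWombatValues {
    static double WombatNapTime() { return 0.5; }  // half an hour
};
```

The call site is then w.sleep(DefaultWombatValues::WombatNapTime()),
which keeps member-call syntax while leaving Wombat's own interface
untouched.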

> If you reflect a bit and are honest with yourself, you'll admit that
> you have this alleged inconsistency with all the nontrivial classes
> you use, because no class has every function desired by every client.
> Every client adds at least a few convenience functions of their own,
> and these functions are always non-members.

Alleged inconsistency is a strong word for a common
code reuse pattern.
Don't call interface translation an inconsistency.

> C++ programmers are used to this, and they think nothing of it.

Yes, because it's natural. It is what Geoffrey Furnish used
to write traits-based container-independent algorithms.
This is not an argument for putting functions outside the class.

Later in his article:

> Herb Sutter has explained that the "interface" to a class
> (roughly speaking, the functionality provided by the class)
> includes the non-member functions
> related to the class, and he's shown that the name lookup rules of C++
> support this meaning of "interface" [7,8].

And a long thread this month has shown that the name lookup rules of
C++ make you hit your head against the wall when it comes to ADL, where
the order of included headers starts to matter with free functions vs.
function templates. (Only a few compilers get it right at the moment,
which makes it likely that we will find some more problems here in a
while.)


Now we come to Scott's namespace based solution:

> // the more encapsulated design
> namespace WidgetStuff {
> class Widget { ... };
> Widget* make( /* params */ );
> };

So if it finally belongs to the interface of the class and
goes into a SomeStuff namespace, this portion of code is lost for
reusability, which was your (Herb Sutter's) strongest argument
for this advice.


In the side bar Scott writes:

> For factory functions and similar functions which can be
> given uniform names,

.. like in STL ...

> this means that maximal class
> encapsulation and maximal template utility are at odds.

This is why I think the whole advice is not as good as
I first thought.
If we finally end up with a programming language where
templates and non-template stuff have to be handled in
different ways all the time, even in design, this has to be
reiterated.


I agree that std::string and some other portions of
STL need a rethink. I agree it is stupid to have
std::sort(std::some) _and_ std::some::sort().

For the generic algorithms we could as an alternative
require containers to have a distinct public interface that makes it
easy to adopt std::sort for them.

But the general advice with the non-member functions
still makes me feel unhappy.

Markus

Herb Sutter

Aug 12, 2002, 11:28:54 PM
On 12 Aug 2002 11:22:10 -0400, "Anthony Williams"
<ant...@nortelnetworks.com> wrote:
>For reference, 13.5.3p1 starts by saying
>
>"An assignment operator shall be implemented by a non-static member function
>with exactly one parameter"
>
>However 13.5.3p2 says
>
>"Any assignment operator, even the copy assignment operator, can be virtual"
>
>This implies to me that assignment operators other than copy-assignment are
>being considered,

Right, because a "copy" assignment operator is only one (or a few) possible
flavor(s) of operator=() that happens to take the same type as its argument.
All other versions of operator=() are not "copy" assignment operators. That
is why the next part doesn't follow:

>and that this includes *all* the assignment operators from
>5.17 "Assignment Operators".

Saying "any assignment operator, even the copy assignment operator" doesn't
necessarily include the op= forms.

>Well, 5.17p1 starts "There are several assignment operators"

As noted, there are many possible operator=()s without considering the op=
forms.

>> Not only should it work, but it does work fine under BCC 5.5, Comeau 4.3,
>> Metrowerks CW 8, and MSVC 7 and "7.1" beta, even using each compiler's
>> strict (conforming) mode switch where I happened to remember it off the top
>> of my head. I didn't try any other compilers, but that's a good sample to
>> give me comfort that I'm not the only one who reads 13.5.3 that way.
>
>Unfortunately, we all know that just because a compiler (or even many/all
>compilers) allow a construct doesn't mean that it is legal.

Of course. But that all compilers I could get my hands on agreed is
significant, and especially that Comeau (built on EDG's front-end) agreed is
the more significant to me because when my reading of the standard disagrees
with EDG it's almost always my reading of the standard that's in error. :-)

Even so, my reading of the standard could well be wrong. If so, all
compilers I can get my hands on are wrong in the same way. If nothing else,
it might be good to submit this as a potential defect report.
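
For readers following along, the construct in question looks roughly
like this. Tally is a hypothetical type invented for the illustration.
That a given compiler accepts it does not by itself settle what 13.5.3
means, as noted above, though my understanding is that current
compilers accept this form.

```cpp
#include <cassert>

// Hypothetical value type with no compound assignment member.
struct Tally {
    explicit Tally(int c) : count(c) {}
    int count;
};

// operator+= written as a nonmember; only operator= itself is a member
// (here, the implicitly declared copy assignment operator).
inline Tally& operator+=(Tally& lhs, const Tally& rhs)
{
    lhs.count += rhs.count;
    return lhs;
}
```

Returning a reference to the left operand keeps the usual chaining
behavior, so (t += a) += b still works as it would for a member.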

Herb

---
Herb Sutter (www.gotw.ca)

Secretary, ISO WG21/ANSI J16 (C++) standards committee (www.gotw.ca/iso)
Contributing editor, C/C++ Users Journal (www.gotw.ca/cuj)
C++ community program manager, Microsoft (www.gotw.ca/microsoft)

Check out "THE C++ Seminar" - Oct 28-30, 2002 (www.thecppseminar.com)


Herb Sutter

Aug 12, 2002, 11:33:47 PM
On 11 Aug 2002 19:53:54 -0400, bran...@cix.co.uk (Dave Harris) wrote:
>Wouldn't it be better to provide /all/ functions as non-members if C++
>allows them to be? If we don't want to use "friend", they can forward to
>member functions in the same way that other non-members do. We'll still
>have some public member functions, but client code would be discouraged
>from using them directly.

While putting the draft of GotW #84 into my in-progress manuscript for
Exceptional C++ Style a few days ago I did add such a note. In particular,
consistently providing functions as nonmembers can make template programming
easier.
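
As a small illustration of that last point (a sketch only; elem_count
and has_at_least are hypothetical names), a uniform nonmember spelling
lets one template serve both class types and built-in arrays, which
have no members at all:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Nonmember passthrough for anything with a size() member.
template <class C>
typename C::size_type elem_count(const C& c)
{
    return c.size();
}

// Overload for built-in arrays, which cannot have member functions.
template <class T, std::size_t N>
std::size_t elem_count(const T (&)[N])
{
    return N;
}

// Generic code written once against the nonmember spelling.
template <class Seq>
bool has_at_least(const Seq& s, std::size_t n)
{
    return elem_count(s) >= n;
}
```

Had has_at_least been written as s.size() >= n, it could not have been
instantiated for an int[3].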

Herb


Herb Sutter

Aug 12, 2002, 11:36:49 PM
On 11 Aug 2002 19:55:28 -0400, Luke Burton
<lu...@unsolicited.burton.echidna.id.au> wrote:
>- Counter-intuitive. Markus Werle elaborated on this at length - it just
>_feels_ better to invoke an operation on something. Laptop.turnOn()
>rather than TurnOn(laptop). It feels like we're devolving to C !

>- Inconsistent. Now you have, in the case of string, some instances
>where you call string.insert and others where you call insert(string). I
>would prefer one or the other being settled on to simplify matters.
>Maybe that points to _everything_ being nonmember _friends_ ?

See my response to Dave Harris posted just now. People like Andrei would
love to see all functions be nonmembers (or members callable using nonmember
syntax) -- a unified syntax can make template programming easier, as another
advantage.

>- Scattered functionality. Why should users have to include the
>potentially non-related string.h to obtain sequence operations?
>Shouldn't these functions be moved into their own header files, and
>hence not become part of the class's implementation at all?

I think this modularization is a good thing. Two quick counters:

- What about operators, like << and >>? They have to be nonmembers anyway.
Does that confuse anyone? Not that I've seen, but then again the notation
there is usually infix rather than function-call.

- I'll take the easy route and argue with a more clearly algorithm-like
example: If you're not going to be doing find_first'ing, why pull it in? And
if you are going to be doing it, why not have the algorithm in one place
where it can be pulled in by anyone that needs it, and works with various
containers? That's the design of the STL after all.

>- Avoiding utilisation of classes. If you want to redelegate some of
>string's functionality, why not give it up to another class? Instead of
>string.replace(...), do SequenceOperations.replace(string), where string
>now has to implement an abstract interface called Sequence before it can
>be used with this class.

Grouping nonmember functions is the kind of thing namespaces were invented
for. Why make it a class when it has no state? That's what we used to do
before namespaces.

>- Namespaces usage. Unfortunately, my current contract sees me working
>in an environment with a sun compiler that doesn't support namespaces.
>So for these projects, global namespace pollution becomes a real issue.
>Static members of an abstract class could be a work around ...

Fine, but working around nonstandard compiler limitations via alternative
packaging is a separate issue. :-)

Best wishes,

Herb

---
Herb Sutter (www.gotw.ca)

Secretary, ISO WG21/ANSI J16 (C++) standards committee (www.gotw.ca/iso)
Contributing editor, C/C++ Users Journal (www.gotw.ca/cuj)
C++ community program manager, Microsoft (www.gotw.ca/microsoft)

Check out "THE C++ Seminar" - Oct 28-30, 2002 (www.thecppseminar.com)

Herb Sutter

unread,
Aug 12, 2002, 11:41:02 PM8/12/02
to
Thanks, James. Selected comments follow:

On 12 Aug 2002 11:58:23 -0400, ka...@gabi-soft.de (James Kanze) wrote:
>I suspect that part of the reason std::string is such a monolith is that
>no one really knows what it is supposed to be. And I'm sure you realize
>that an analysis function by function, as you have done, can't really
>make up for missing design up front.

Of course. We could add "muddled design" to the list, but I thought the
beating was bad enough and let it be (except for one dig at design by
committee).

I agree with the last paragraph, except that "logical" is too fuzzy. What's
logical to me may not be logical to you, and Abe and Betty down the hall may
well have yet different ideas.

I think the general answer may well be in the direction of not just writing
nonmember nonfriends where at all possible (which is what I say above), but
writing them as nonmembers period where at all possible (either additionally
writing nonmember nonfriend passthroughs for functions which "in this
implementation" did need to be members, or else make them nonmember friends,
but I was deliberately avoiding the friendship issue).

>Your choices [of members vs. nonmembers] are arbitrary.

Not that arbitrary, and if you take the next step noted above in favor of
nonmembers whenever possible, many of the members I cited become nonmembers
and there's no longer any arbitrariness.

>History, my friend.

Mais oui, cher ami. :-)

>> Guideline: Never use functions that write to range-unchecked
>> buffers (e.g., strcpy(), sprintf(), basic_string::copy()). They
>> are not only crashes waiting to happen, but a clear and present
>> security hazard -- buffer overrun attacks continue to be a
>> perennially popular pastime for hackers and malware writers.
>
>But in this case, the buffer is range checked. Because it is a C style
>array, you have to pass the range as a separate parameter (n), but you
>have to pass it, and the function will never copy more than n
>characters.

The point I made is valid (there are two traditional length problems, not
being told a buffer length and not terminating with null which other code
shouldn't but often does assume), but I see I could have phrased it better
to distinguish the two pitfalls. How about this:

Guideline: Never use functions that write to range-unchecked buffers
(e.g., strcpy(), sprintf()) or could fail to null-terminate C-style
strings (e.g., strncpy(), basic_string::copy()). They are not only
crashes waiting to happen, but a clear and present security hazard
-- buffer overrun attacks continue to be a perennially popular
pastime for hackers and malware writers.

(Interesting how both strcpy and strncpy have potential overrun pitfalls,
only different ones, isn't it?)
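
A short sketch of the second pitfall, assuming a hypothetical helper named copy_terminated (not a standard function). It shows that basic_string::copy() respects the length it is given but writes no terminator, so the caller has to reserve room for one and add it. strncpy() behaves the same way whenever the source is at least as long as the count.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: copies `src` into `buf` (capacity `cap`, which must be
// at least 1), truncating if necessary and always null-terminating.
inline std::size_t copy_terminated(const std::string& src, char* buf,
                                   std::size_t cap) {
    assert(cap > 0);
    // copy() writes at most cap - 1 characters and never appends '\0'.
    const std::size_t n = src.copy(buf, cap - 1);
    buf[n] = '\0';  // the terminator is the caller's responsibility
    return n;
}

inline bool demo_copy_needs_terminator() {
    const std::string source = "overflow";
    char buf[5] = {'x', 'x', 'x', 'x', 'x'};

    // copy() honours the length limit but leaves buf[4] untouched, so the
    // buffer holds "over" followed by 'x' and contains no '\0' at all.
    const std::size_t raw = source.copy(buf, 4);
    const bool unterminated = (raw == 4 && buf[4] == 'x');

    // The helper produces a proper C-style string of at most cap - 1 chars.
    const std::size_t n = copy_terminated(source, buf, sizeof buf);
    return unterminated && n == 4 && std::strcmp(buf, "over") == 0;
}
```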

>std::search, not std::find.)

:-) Now there's a whole 'nuther naming issue...

Herb

---
Herb Sutter (www.gotw.ca)

Secretary, ISO WG21/ANSI J16 (C++) standards committee (www.gotw.ca/iso)
Contributing editor, C/C++ Users Journal (www.gotw.ca/cuj)
C++ community program manager, Microsoft (www.gotw.ca/microsoft)

Check out "THE C++ Seminar" - Oct 28-30, 2002 (www.thecppseminar.com)

Cedric

unread,
Aug 12, 2002, 11:43:56 PM8/12/02
to

"Attila Feher" wrote

> Another comment: there may be containers/implementations where size
> cannot always be calculated in constant time whereas the "emptyness" can
> be determined in constant time, but only by knowing the internal
> architecture of the container. I cannot recall what things are these,
> but someone convinced me by this argument once... Of course, a
> non-member overloaded-friend empty() (or inline calling a member) can
> just do that as well.

I think that's lists. Unless they store the size of the list somewhere,
size() will be a linear time operation, but empty will be constant-time.

The same is true for std::remove. From the VC++ documentation of
std::remove:

"The list Class class has a more efficient member function version of
remove, which also relinks pointers."

In this case, clearly, a non-member function cannot be as efficient as a
member one.
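
A small sketch of the difference, for illustration; the helper names generic_remove and member_remove are hypothetical. The generic algorithm can only shuffle values through iterators and cannot unlink nodes, whereas the member function can.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>

struct RemoveResult {
    std::size_t kept;  // elements in front of the new logical end
    std::size_t size;  // container size afterwards
};

// Generic algorithm: moves the kept values forward by assignment and returns
// the new logical end. It cannot erase nodes, so size() does not change.
inline RemoveResult generic_remove(std::list<int>& l, int value) {
    std::list<int>::iterator new_end = std::remove(l.begin(), l.end(), value);
    RemoveResult r;
    r.kept = static_cast<std::size_t>(std::distance(l.begin(), new_end));
    r.size = l.size();
    return r;
}

// Member function: unlinks and destroys the matching nodes directly.
inline std::size_t member_remove(std::list<int>& l, int value) {
    l.remove(value);
    return l.size();
}

inline bool demo_list_remove() {
    std::list<int> a = {1, 2, 1, 3};
    std::list<int> b = a;
    const RemoveResult r = generic_remove(a, 1);
    return r.kept == 2 && r.size == 4 && member_remove(b, 1) == 2;
}
```

With the generic version the caller still has to erase the tail (the erase-remove idiom), and every kept element behind a removed one has been assigned over rather than relinked.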

Cedric

John G Harris

unread,
Aug 12, 2002, 11:46:01 PM8/12/02
to
In article <7i7dlu4pvmqfogbmk...@4ax.com>, Herb Sutter
<hsu...@gotw.ca> writes
<snip>

>But in general, the point was to demonstrate just how many did not actually
>need to be members or friends, and in most cases I think it's
>noncontroversial that they would be better off as nonmember nonfriends --
<snip>

Or better off not there at all ?

John
--
John Harris
mailto:jo...@jgharris.demon.co.uk

John Potter

unread,
Aug 13, 2002, 11:25:59 AM8/13/02
to
On 12 Aug 2002 11:22:10 -0400, "Anthony Williams"
<ant...@nortelnetworks.com> wrote:

> Herb Sutter <hsu...@gotw.ca> writes:
>
> > On 9 Aug 2002 14:49:43 -0400, "Anthony Williams"
> > <ant...@nortelnetworks.com> wrote:
> > >Anyway, you've forgotten that operator+= is an assignment operator,
> > >and thus must be a member.
>
> > I read that part of the standard carefully several times before
> > declaring that operator+= could be a nonmember.
>
> And I've read it several times and come to the opposite conclusion.
> :-(

Obviously cause for a DR.

There is no way to appeal to common sense when reading 13.5 since much
of it is non-sense rules.

Since there was no reason to talk about the unary ampersand and comma,
they may be non-members giving different meaning to expressions in
different parts of a program. This is the rationale for making
operator= a member only.

Since there was reason to talk about () [] ->, each of these was
arbitrarily declared to be member only. There is no justification.

I wonder how increment/decrement escaped that?

What is the problem with T& operator= (T&, int)? It is not compiler
generated. Am I missing something which requires it to be member only
if copy assignment is member only?

> Unfortunately, we all know that just because a compiler (or even
> many/all compilers) allow a construct doesn't mean that it is legal.

Unfortunately, the standards committee is not perfect when it comes to
changing wording. C++PL/2e base document:

| r.13.4.3 Assignment

| The assignment function operator=() ...

There is no doubt in the base document and, of course, the compilers all
had it right before the standard obfuscated the intent.

John

Wil Evers

unread,
Aug 13, 2002, 11:26:54 AM8/13/02
to
Herb Sutter wrote:

> template<class charT, class traits, class Allocator>
> typename basic_string<charT, traits, Allocator>::iterator
> erase( basic_string<charT, traits, Allocator>& s,
>        typename basic_string<charT, traits, Allocator>::iterator
>          position )
> {
>   return s.erase( position, s.end() );
> }
>
> OK, coffee break's over...

Looks like you should have had that coffee five minutes before you wrote
this. The correct implementation is:

return s.erase(position, position + 1);

Regards,

- Wil

Wil Evers, DOOSYS R&D, Utrecht, Holland
[wil underscore evers at doosys dot com]

Anthony Williams

unread,
Aug 13, 2002, 11:57:27 AM8/13/02
to
Herb Sutter <hsu...@gotw.ca> writes:

> On 12 Aug 2002 11:22:10 -0400, "Anthony Williams"
> <ant...@nortelnetworks.com> wrote:
> >For reference, 13.5.3p1 starts by saying

> >"An assignment operator shall be implemented by a non-static member function
> >with exactly one parameter"

> >However 13.5.3p2 says

> >"Any assignment operator, even the copy assignment operator, can be virtual"

> >This implies to me that assignment operators other than copy-assignment are
> >being considered,

> Right, because a "copy" assignment operator is only one (or a few) possible
> flavor(s) of operator=() that happens to take the same type as its argument.
> All other versions of operator=() are not "copy" assignment operators. That
> is why the next part doesn't follow:

Well, I think it does :-). In particular, 5.17p2 refers to "simple assignment
(=)" whereas the rest of 5.17 refers to "assignment operators", and includes
the op= form. 13.5.3 refers to just "assignment operators", so I read it as
including all the "assignment operators" defined in 5.17, the purpose of which
is, after all, to define the operators, and their semantics.

> >and that this includes *all* the assignment operators from
> >5.17 "Assignment Operators".

> Saying "any assignment operator, even the copy assignment operator" doesn't
> necessarily include the op= forms.

It does if "assignment operator" includes the op= forms.

> >Well, 5.17p1 starts "There are several assignment operators"
>
> As noted, there are many possible operator=()s without considering the op=
> forms.

Yes, but 5.17 doesn't just cover operator=(), it covers all the
assignment-operators = *= /= %= += -= >>= <<= &= ^= |=, as given in the
grammar, which forms part of 5.17p1. In particular, I think that in the
context of 5.17 there is only one =; the fact that it may be overloaded is
irrelevant. One could say that there are many possible copy-assignment
operators on that premise, since it is overloaded for each UDT, as well as for
each built-in type.


> >Unfortunately, we all know that just because a compiler (or even many/all
> >compilers) allow a construct doesn't mean that it is legal.
>
> Of course. But that all compilers I could get my hands on agreed is
> significant, and especially that Comeau (built on EDG's front-end) agreed is
> the more significant to me because when my reading of the standard disagrees
> with EDG it's almost always my reading of the standard that's in error. :-)

Yes, in most cases I am inclined to agree with those clever folks at EDG;
however, they're not always right.

> Even so, my reading of the standard could well be wrong. If so, all
> compilers I can get my hands on are wrong in the same way. If nothing else,
> it might be good to submit this as a potential defect report.

I will.

Anthony

Nikolai Pretzell

unread,
Aug 13, 2002, 1:52:48 PM8/13/02
to
Hi,


> > -- In all other cases, prefer to make it a nonmember nonfriend:
> > Which operations can work well as nonmember nonfriends? These
> > can, and therefore normally should, be nonmembers. This should
> > be the default case to strive for.

> I don't like this logic at all. The choice of member/non-member
> should NOT be based on implementation details, but on class logic.
> You certainly don't want to change if you change the implementation.

I'm quite happy that I'm not the only one thinking that way.


> If the function should logically be a member, make it a member, even
> if it doesn't need access to the class internals. And if it shouldn't
> logically be a member, don't make it a member;

Yes, that's exactly what I would try to achieve in every class design.


> > push_back

> Note that in the case of vector, push_back is specified in terms of
> insert( v.end() ...), but that the typical implementation will be
> significantly faster because of direct access to the vector internals.
> There is thus some justification for making it a member in std::vector

Markus Werle

unread,
Aug 13, 2002, 5:57:55 PM8/13/02
to
Herb Sutter wrote:

> BTW, there are other reasons to prefer nonmember syntax. For one thing, it
> turns out it makes writing templates significantly easier in some cases if
> you know that the functions are nonmembers for all types, instead of
> members for some types and nonmembers for others.

I thought I had seen a thread in this NG how to check if a class had a
memfun or not. I cannot remember well if it really worked.
If yes, with Select<etc> we get the right one pretty fast.

But even without this: It is RealWorld (TM). This means that
my algorithms - if they claim to be generic - already have to
go the traits way, so that everybody can plug in their own classes
after supplying the required Interface Translation:


template <class T>
struct HowToCallDoSomething;

class Some
{
public:
    void DoSomething() {}
};

// users adapt their class for my algos
template <>
struct HowToCallDoSomething<Some>
{
    static void Execute(Some& some)
    {
        return some.DoSomething();
    }
};


// Herb and Scott may also use my lib
class HerbsSolution
{
    friend void DoSomethingSpecial(HerbsSolution& hs);
};

void DoSomethingSpecial(HerbsSolution& hs) {}

template <>
struct HowToCallDoSomething<HerbsSolution>
{
    static void Execute(HerbsSolution& hs)
    {
        return DoSomethingSpecial(hs);
    }
};


class GenericAlgorithm
{
public:
    template <class T>
    void Go(T& t)
    {
        HowToCallDoSomething<T>::Execute(t);
    }
};


int main()
{
    Some s;
    HerbsSolution h;

    GenericAlgorithm GA;

    GA.Go(s);
    GA.Go(h);
}

> For another, it would
> make (erstwhile-)members and nonmembers overload -- and if you're jumping
> back in horror and about to strenuously resist that as undesirable, well,
> experience has shown that it could often be a Good Thing and some
> committee members feel that in general the overloading of members and
> nonmembers might ought to be added to C++0x

I see no gain from such a feature, because semantics are not captured by it
and may still be different, like for MyVector::operator()(size_t).
Is it 0-based? Is it RangeChecked? Is it thread safe?
Is it exception safe?

Can I easily, automagically plug MyVector in YourAlgorithm?
I think the handcrafted entry is required anyway, either
by creating a wrapper/forwarder class or by some library
mechanism like the one sketched above.


> (this is far from complete
> though, and may never happen -- just pointing out that even experts feel
> it's potentially a good idea). Just ask Andrei how having everything as
> nonmembers would simplify generic programming, but be ready for an earful!

Any link yet?
Examples?
Cards on the table ...
Andrei, we listen.

> But in general, the point was to demonstrate just how many did not
> actually need to be members or friends, and in most cases I think it's
> noncontroversial that they would be better off as nonmember nonfriends --

James Kanze correctly pointed out that historical evolution
with backward compatibility issues yielded that mess we have today.
A post-STL std::string may not have been such an easy-to-hit target.

I believe by now that
A) the much too fat, not-so-clean interface of std::string
which does not fit into STL concepts
and
B) the question where functions have to go in general
are two different things to talk about.

Markus

Sergey P. Derevyago

unread,
Aug 13, 2002, 10:45:50 PM8/13/02
to
Herb Sutter wrote:
> Pursuing that one step further than the (already-long) GotW issue did, we
> could go beyond categorizing which existing members could be nonmember
> nonfriends,
But what do you think about a microkernel class architecture? IMHO it allows
members to be used without the mess.

> and talk about adding new nonmember nonfriends which could be
> simple passthroughs or synonyms for members functions, specifically to
> address this concern and unify the calling syntax.

But it's too tedious to do this by hand.

> BTW, there are other reasons to prefer nonmember syntax. For one thing, it
> turns out it makes writing templates significantly easier in some cases if
> you know that the functions are nonmembers for all types, instead of members
> for some types and nonmembers for others.

But member calls can easily kick out that terrible ADL :)

> For another, it would make
> (erstwhile-)members and nonmembers overload -- and if you're jumping back in
> horror and about to strenuously resist that as undesirable, well, experience
> has shown that it could often be a Good Thing and some committee members
> feel that in general the overloading of members and nonmembers might ought
> to be added to C++0x (this is far from complete though, and may never happen
> -- just pointing out that even experts feel it's potentially a good idea).

Could you give us appropriate URLs? It seems to be rather interesting.

> Just ask Andrei how having everything as nonmembers would simplify generic
> programming, but be ready for an earful!

Anyway, I'll take the risk :) Andrei! Could you give us some examples to get
the point?


--
With all respect, Sergey. http://cpp3.virtualave.net/
mailto : ders at skeptik.net

Herb Sutter

unread,
Aug 13, 2002, 10:47:22 PM8/13/02
to
On 13 Aug 2002 11:26:54 -0400, Wil Evers <bou...@dev.null> wrote:
>Herb Sutter wrote:
>> template<class charT, class traits, class Allocator>
>> typename basic_string<charT, traits, Allocator>::iterator
>> erase( basic_string<charT, traits, Allocator>& s,
>> typename basic_string<charT, traits, Allocator>::iterator
>> position )
>> {
>> return s.erase( position, s.end() );
>> }
>>
>> OK, coffee break's over...
>
>Looks like you should have had that coffee five minutes before you wrote
>this. The correct implementation is:
>
> return s.erase(position, position + 1);

Good catch! (It was a long night.)

While you're at it, did you catch the other subtle bug in the other erase()
function immediately preceding that one? Those are the only two corrections
I currently know about in the article. They'll both be updated when I next
refresh the page (can't do it from this location).

Herb

---
Herb Sutter (www.gotw.ca)

Secretary, ISO WG21/ANSI J16 (C++) standards committee (www.gotw.ca/iso)
Contributing editor, C/C++ Users Journal (www.gotw.ca/cuj)
C++ community program manager, Microsoft (www.gotw.ca/microsoft)

Check out "THE C++ Seminar" - Oct 28-30, 2002 (www.thecppseminar.com)

Anthony Williams

unread,
Aug 14, 2002, 8:23:28 AM8/14/02
to
"Anthony Williams" <ant...@nortelnetworks.com> writes:

> Herb Sutter <hsu...@gotw.ca> writes:
> > Of course. But that all compilers I could get my hands on agreed is
> > significant, and especially that Comeau (built on EDG's front-end) agreed
> > is the more significant to me because when my reading of the standard
> > disagrees with EDG it's almost always my reading of the standard that's
> > in error. :-)
>
> Yes, in most cases I am inclined to agree with those clever folks at EDG;
> however, they're not always right.

In this case, they are --- as Steve Adamczyk pointed out in comp.std.c++, this
was resolved by core issue 221, to indicate that only operator= needs to be a
member; the compound assignment operators can be non-members.
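
A minimal sketch of what that resolution permits. The type Meters is hypothetical and exists only for this illustration.

```cpp
#include <cassert>

struct Meters {  // hypothetical value type
    double value;
};

// Compound assignment written as a nonmember function: this is allowed.
inline Meters& operator+=(Meters& lhs, const Meters& rhs) {
    lhs.value += rhs.value;
    return lhs;
}

// By contrast, plain assignment must be a non-static member function, so a
// declaration like the following would be rejected by the compiler:
//   Meters& operator=(Meters& lhs, double v);

inline double demo_compound_assignment() {
    Meters a = {1.5};
    Meters b = {2.0};
    a += b;  // calls the nonmember operator+=
    assert(a.value == 3.5);
    return a.value;
}
```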

Simon Turner

unread,
Aug 14, 2002, 8:30:17 AM8/14/02
to
Herb Sutter <hsu...@gotw.ca> wrote in message news:<2oqflukg1rb5h488k...@4ax.com>...

> Thanks, James. Selected comments follow:
>
> On 12 Aug 2002 11:58:23 -0400, ka...@gabi-soft.de (James Kanze) wrote:
<snip>
<snip>

If we *do* turn (almost) all member functions into nonmembers,
it leaves us writing code of the form
verb( subject, object )
which is (at least sometimes) less clear to read than
subject.verb( object )
since there is less differentiation between the verb's subject and object.
(I think I have subject and object the right way round in terms of English
grammar - hope the meaning is clear).

One of the readability advantages of member functions is this
differentiation, and std::string::append is more readable than std::strcat
because it is more obvious which party is being affected in:
std::string s1;
...
s1.append( s2 );

than in:
char* s1;
...
strcat( s1, s2 );

Right, now to the meat. Since:

1. the interface principle suggests that nonmember functions (in the same
namespace as a class, and acting on it) are part of that class's interface,

2. the problem with having some parts of the interface implemented as
member and some parts as nonmember functions is the _syntactic_ (rather
than functional) difference in the function calls,

3. this leads to decisions about whether or not a function forming part of
a class's interface should be a member or nonmember, being made for reasons
of personal aesthetic preference or the readability-as-English of the call
(the subject-verb-object discussion above),

why not remove this syntactic distinction?
What would happen (apart from probably a change to name lookup, *sigh*)
if all functions in a class's interface, whether members or nonmembers,
could be called using either syntax?

Eg. in this situation:

namespace X
{
    class C
    {
    public:
        void f();
    };

    void g( C& );
}

X::C c;

you would be able to use these interchangeably:
c.f();
f( c );

and these interchangeably:
g( c );
c.g();

Just a thought. Any takers?
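
For comparison, a sketch of how far the current language goes in one direction only. A hand-written nonmember passthrough makes f( c ) work today; nothing comparable makes c.g() work. The function bodies and the call counter are assumptions added purely so the sketch can be checked.

```cpp
#include <cassert>

namespace X {

class C {
public:
    void f() { ++calls_; }               // member in the class's interface
    int calls() const { return calls_; }

private:
    int calls_ = 0;
};

// Nonmember in the class's interface, written against the public members.
inline void g(C& c) {
    c.f();
    c.f();
}

// Hand-written passthrough: lets callers use nonmember syntax for f as well.
inline void f(C& c) { c.f(); }

}  // namespace X

inline int demo_unified_calls() {
    X::C c;
    c.f();  // member syntax
    f(c);   // nonmember syntax, found through argument-dependent lookup
    g(c);   // nonmember syntax; c.g() would not compile
    assert(c.calls() == 4);
    return c.calls();
}
```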

John Potter

unread,
Aug 14, 2002, 8:35:29 AM8/14/02
to
On 13 Aug 2002 11:57:27 -0400, "Anthony Williams"
<ant...@nortelnetworks.com> wrote:

> > As noted, there are many possible operator=()s without considering the op=
> > forms.

> In particular, I think that in the
> context of 5.17 there is only one =; the fact that it may be overloaded is
> irrelevant. One could say that there are many possible copy-assignment
> operators on that premise, since it is overloaded for each UDT, as well as for
> each built-in type.

Just to be clear. There are exactly 20 copy assignment operator
signatures for any UDT.

R operator= (T cv1&) cv2;
R operator= (T) cv2;

where cv1 and cv2 are empty, const, volatile, const volatile.

There are infinitely many operator= signatures which are not copy
assignment operators for any UDT.

template <class U>
R operator= (U) cv2;
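
A sketch of the practical consequence, using a hypothetical type Tracker. Because a member template is never a copy assignment operator, the compiler still declares the implicit one, and for an argument of the class's own type the non-template function wins overload resolution.

```cpp
#include <cassert>

struct Tracker {  // hypothetical type for illustration
    int template_assignments;
    int value;

    Tracker() : template_assignments(0), value(0) {}

    // Not a copy assignment operator, whatever U turns out to be, so the
    // implicit Tracker& operator=(const Tracker&) is still declared.
    template <class U>
    Tracker& operator=(U) {
        ++template_assignments;
        return *this;
    }
};

inline bool demo_assignment_kinds() {
    Tracker a, b;
    b.value = 7;

    a = b;  // implicit copy assignment: memberwise copy, template not used
    const bool copied = (a.value == 7 && a.template_assignments == 0);

    a = 42;  // operator=<int>: only the counter changes
    return copied && a.template_assignments == 1 && a.value == 7;
}
```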

John

Francis Glassborow

unread,
Aug 14, 2002, 3:55:05 PM8/14/02
to
In article <ea3f115.02081...@posting.google.com>, Simon
Turner <s_j_t...@yahoo.co.uk> writes

>3. this leads to decisions about whether or not a function forming part of
>a class's interface should be a member or nonmember, being made for reasons
>of personal aesthetic preference or the readability-as-English of the call
>(the subject-verb-object discussion above),
>
>why not remove this syntactic distinction?
>What would happen (apart from probably a change to name lookup, *sigh*)
>if all functions in a class's interface, whether members or nonmembers,
>could be called using either syntax?
>
>Eg. in this situation:
>
> namespace X
> {
> class C
> {
> public:
> void f();
> };
>
> g( C& );
> }
>
> X::C c;
>
>you would be able to use these interchangeably:
> c.f();
> f( c );
>
>and these interchangeably:
> g( c );
> c.g();
>
>Just a thought. Any takers?

It is certainly one of the things that some of us are looking at but it
is a name look-up issue and those who have been around language design
for any length of time will know that such issues are non-trivial :-)


--
Francis Glassborow ACCU
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation

Tom Plunket

unread,
Aug 14, 2002, 9:16:26 PM8/14/02
to
Markus Werle wrote:

> > I'd say "then add a private member function,
> > and make the (inline) non-member a friend
> > which forwards to it.
>
> I feel somewhat uncomfortable about this.
> It's asymmetric. It hinders reusability,
> because it assumes another private member function.
> The gain achieved is lost immediately.

I figured, based on everything that I've read, that the technique was to
AVOID friendship, not to create private functions and have friend
functions forward to them...

I'm of the feeling that if some function needs access to the internals
that it should be a member function. If that function needs to be
accessible to the outside world(*) then it needs to be in the public
interface.

* - ...and we're *really* sure that it needs to be accessed from
the outside world...

-tom!

Carl Daniel

unread,
Aug 15, 2002, 7:02:29 PM8/15/02
to
"Markus Werle" <numerical....@web.de> wrote in message
news:aj8583$19itm4$1...@ID-153032.news.dfncis.de...

> Carl Daniel wrote in an answer to my msg:
>
> > I'd say "then add a private member function,
> > and make the (inline) non-member a friend
> > which forwards to it.
>
> I feel somewhat uncomfortable about this.
> It's asymmetric. It hinders reusability,
> because it assumes another private member function.
> The gain achieved is lost immediately.

My point was: if your original design criteria prove to be incorrect
(i.e. you made a function a nonmember nonfriend and it turns out that
you can implement it much more efficiently as a friend), you don't have
a disaster - you simply have a non-optimal design, but one which can
always be every bit as efficient as the optimal design.

If you've already written 1,000,000 lines of code which use your
non-optimal design and published lavish documentation in hardcover,
it's probably a better choice to add the member version & have the
non-member forward to it than it is to track down & change every use of
the function.
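
A sketch of that migration path, with hypothetical names (Polyline, total, total_impl). The published nonmember keeps its name and signature, so no caller changes; only its body now forwards to a private member that can use a cached value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Polyline {  // hypothetical class
public:
    void add(double x) {
        xs_.push_back(x);
        sum_ += x;
    }
    std::size_t size() const { return xs_.size(); }
    double at(std::size_t i) const { return xs_[i]; }

private:
    // Added later: relies on a cached sum that only the class can see.
    double total_impl() const { return sum_; }

    // Friendship granted to the already-published nonmember.
    friend double total(const Polyline& p);

    std::vector<double> xs_;
    double sum_ = 0.0;
};

// The original version would have looped over size() and at(); the signature
// is unchanged, only the body now forwards to the private member.
inline double total(const Polyline& p) { return p.total_impl(); }

inline double demo_total() {
    Polyline p;
    p.add(1.0);
    p.add(2.5);
    assert(total(p) == 3.5);
    return total(p);
}
```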

-cd

John Potter

unread,
Aug 15, 2002, 9:17:32 PM8/15/02
to
On 13 Aug 2002 11:57:27 -0400, "Anthony Williams"
<ant...@nortelnetworks.com> wrote:

> > As noted, there are many possible operator=()s without considering
> > the op= forms.

> In particular, I think that in the
> context of 5.17 there is only one =; the fact that it may be
> overloaded is irrelevant. One could say that there are many possible
> copy-assignment operators on that premise, since it is overloaded for
> each UDT, as well as for each built-in type.

Just to be clear. There are exactly 20 copy assignment operator
signatures for any UDT.

R operator= (T cv1&) cv2;
R operator= (T) cv2;

where cv1 and cv2 are empty, const, volatile, const volatile.

There are infinitely many operator= signatures which are not copy
assignment operators for any UDT.

template <class U>
R operator= (U) cv2;

John

Russell Cagle

unread,
Aug 15, 2002, 9:45:46 PM8/15/02
to
Herb Sutter <hsu...@gotw.ca> wrote in message news:<r3r5lucag9p6rduts...@4ax.com>...
> ______________________________________________________________________
>
> GotW #84: Monoliths "Unstrung"
>
> Difficulty: 3 / 10
> ______________________________________________________________________
>
>
> [stuff...]

>
> Guideline: Where possible, prefer writing functions as nonmember
> nonfriends.
>
>

There is one case where you can't use the nonmember guideline: virtual
functions have to be member functions, even if the overriding version
could be implemented as a nonmember function.

struct A { virtual void foo(int); };
struct B : A { void foo(int) { /* only uses B's public interface */ } };

I think this is a good argument for finer access control notation. The
problem could also be solved with polymorphic dispatch for non-member
functions:


struct A {};
struct B : public A {};

virtual int myfunc(A &, A &) = 0;
int myfunc(B &x, A &y) {}
int myfunc(A &x, B &y) {}
int myfunc(B &x, B &y) {}

The compiler could transform code like the above into virtual member
functions, and further pick a function based on an index in the 2nd
arg's vtable.

In either case, it looks like it would be a good idea to generalize
access control and polymorphic dispatching.

--Russell

Herb Sutter

unread,
Aug 16, 2002, 8:45:12 AM8/16/02
to
On 14 Aug 2002 08:30:17 -0400, s_j_t...@yahoo.co.uk (Simon Turner) wrote:
>why not remove this syntactic distinction?
>What would happen (apart from probably a change to name lookup, *sigh*)
>if all functions in a class's interface, whether members or nonmembers,
>could be called using either syntax?
[...]

>you would be able to use these interchangeably:
> c.f();
> f( c );
>
>and these interchangeably:
> g( c );
> c.g();
>
>Just a thought. Any takers?

Indeed, there's been some early talk about possibly doing this. Nothing even
half-baked yet, though. And the name lookup rules are already complex -- any
paper that proposes doing this in the name of simplification would have to
include extensive analysis of the effect on the existing rules in current
programs. I'd be happy to read such a paper, though.

Herb

---
Herb Sutter (www.gotw.ca)

Secretary, ISO WG21/ANSI J16 (C++) standards committee (www.gotw.ca/iso)
Contributing editor, C/C++ Users Journal (www.gotw.ca/cuj)
C++ community program manager, Microsoft (www.gotw.ca/microsoft)

Check out "THE C++ Seminar" - Oct 28-30, 2002 (www.thecppseminar.com)

Herb Sutter

unread,
Aug 16, 2002, 8:45:38 AM8/16/02
to
On 14 Aug 2002 08:23:28 -0400, "Anthony Williams"
<ant...@nortelnetworks.com> wrote:
>"Anthony Williams" <ant...@nortelnetworks.com> writes:
> > Herb Sutter <hsu...@gotw.ca> writes:
> > > Of course. But that all compilers I could get my hands on agreed is
> > > significant, and especially that Comeau (built on EDG's front-end) agreed
> > > is the more significant to me because when my reading of the standard
> > > disagrees with EDG it's almost always my reading of the standard that's
> > > in error. :-)
> >
> > Yes, in most cases I am inclined to agree with those clever folks at EDG;
> > however, they're not always right.
>
>In this case, they are --- as Steve Adamczyk pointed out in comp.std.c++, this
>was resolved by core issue 221, to indicate that only operator= needs to be a
>member; the compound assignment operators can be non-members.

Thanks. While I'm glad that my reading of the standard was indeed what the
committee intended, when I write postings that quote from the standard (as I
did here) I usually also check the issues lists to see if there were any
existing issues on those clauses (I didn't do that this time).

Herb

---
Herb Sutter (www.gotw.ca)

Secretary, ISO WG21/ANSI J16 (C++) standards committee (www.gotw.ca/iso)
Contributing editor, C/C++ Users Journal (www.gotw.ca/cuj)
C++ community program manager, Microsoft (www.gotw.ca/microsoft)

Check out "THE C++ Seminar" - Oct 28-30, 2002 (www.thecppseminar.com)

Herb Sutter

unread,
Aug 16, 2002, 5:47:57 PM8/16/02
to
On 15 Aug 2002 21:45:46 -0400, russell...@yahoo.com (Russell Cagle)
wrote:

> Herb Sutter <hsu...@gotw.ca> wrote in message
> news:<r3r5lucag9p6rduts...@4ax.com>...

>> Guideline: Where possible, prefer writing functions as nonmember
>> nonfriends.

> There is one case where you can't use the nonmember guideline: virtual
> functions have to be member functions, even if the overriding version
> could be implemented as a nonmember function.

Right, I specifically mentioned that case. Here's the quote:

>> -- Always make it a member if it has to be one: Which operations
>> must be members, because the C++ language just says so (e.g.,
>> constructors) or because of functional reasons (e.g., they must
>> be virtual)? If they have to be, then oh well, they just have
>> to be; case closed.

That's part of the "where possible"... Thanks for highlighting it,

Herb

---
Herb Sutter (www.gotw.ca)

Secretary, ISO WG21/ANSI J16 (C++) standards committee (www.gotw.ca/iso)
Contributing editor, C/C++ Users Journal (www.gotw.ca/cuj)
C++ community program manager, Microsoft (www.gotw.ca/microsoft)

Check out "THE C++ Seminar" - Oct 28-30, 2002 (www.thecppseminar.com)

Johan Ericsson

unread,
Aug 16, 2002, 6:49:08 PM8/16/02
to
> struct A { virtual void foo(int); };
> struct B : A { void foo(int) { /* only uses B's public interface */ } };
>
> I think this is a good argument for finer access control notation. The
> problem could also be solved with polymorphic dispatch for non-member
> functions:
>
>
> struct A {};
> struct B : public A {};
>
> virtual int myfunc(A &, A &) = 0;
> int myfunc(B &x, A &y) {}
> int myfunc(A &x, B &y) {}
> int myfunc(B &x, B &y) {}
>
> The compiler could transform code like the above into virtual member
> functions, and further pick a function based on an index in the 2nd
> arg's vtable.

I don't think it is possible to get away from the use of a vtable to
dispatch virtual methods. In the example above, how would the compiler
know that 'A' needs a vtable? It can only figure that out when the
compiler hits the declaration of "myfunc". Currently "myfunc" can be
declared in a different translation unit from 'A'.

So, the compiler would generate 'A' without a vtable (sizeof(A) == 1),
when it really needs one.

This seems unworkable.
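
To make the layout point concrete, here is a small sketch. The names
Plain and Poly are invented for illustration and are not from the thread.
On mainstream implementations a class gains a hidden vtable pointer as
soon as its definition declares a virtual function, so whether it is
polymorphic is settled when the class body is seen, not later:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for 'A' as written above: no virtual functions,
// so the compiler emits no vtable pointer for it.
struct Plain {};

// Hypothetical variant that declares a virtual function in the class body.
struct Poly {
    virtual ~Poly() {}
};

// True when the virtual function makes the object larger than the empty
// class. This holds on common ABIs but is not guaranteed by the standard.
inline bool poly_is_larger() { return sizeof(Poly) > sizeof(Plain); }
```

A later declaration of a free "virtual" function in another translation
unit would come too late to change the layout already chosen for Plain.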

Johan

Juan Valio

Aug 17, 2002, 6:03:40 AM

"Herb Sutter" <hsu...@gotw.ca> wrote in message
news:5sbolu4iif5rfnh6k...@4ax.com...

> On 14 Aug 2002 08:30:17 -0400, s_j_t...@yahoo.co.uk (Simon Turner) wrote:
> >why not remove this syntactic distinction?
> >What would happen (apart from probably a change to name lookup, *sigh*)
> >if all functions in a class's interface, whether members or nonmembers,
> >could be called using either syntax?
> [...]
> >you would be able to use these interchangeably:
> > c.f();
> > f( c );
> >
> >and these interchangeably:
> > g( c );
> > c.g();
> >
> >Just a thought. Any takers?
>
> Indeed, there's been some early talk about possibly doing this. Nothing
> even half-baked yet, though. And the name lookup rules are already
> complex -- any paper that proposes doing this in the name of
> simplification would have to include extensive analysis of the effect
> on the existing rules in current programs. I'd be happy to read such a
> paper, though.
>

Maybe...
c.f(....) is
c.f(....) if this is not an error
and
f(c,....) otherwise

But I suppose things are not so easy...
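
The fallback rule above can be approximated in a library for one fixed
name, without any language change. The following is only a sketch: the
namespace sketch, the types HasMember and NoMember, and the helper call_f
are all hypothetical, and it handles only the no-extra-arguments case. It
relies on SFINAE to drop the member-call overload when c.f() is ill-formed:

```cpp
#include <cassert>
#include <string>

namespace sketch {

struct HasMember {
    std::string f() const { return "member"; }
};

struct NoMember {};
inline std::string f(const NoMember&) { return "free"; }

// Preferred overload (exact match on the int tag): viable only when
// c.f() is a well-formed expression.
template <class T>
auto call_f_impl(T& c, int) -> decltype(c.f()) { return c.f(); }

// Fallback (needs an int -> long conversion): used when the overload
// above is removed by SFINAE and f(c) is well-formed.
template <class T>
auto call_f_impl(T& c, long) -> decltype(f(c)) { return f(c); }

// "c.f() if this is not an error, and f(c) otherwise."
template <class T>
auto call_f(T& c) -> decltype(call_f_impl(c, 0)) { return call_f_impl(c, 0); }

}  // namespace sketch
```

This says nothing about how such a rule would interact with overload
resolution across many names, which is where the real difficulty lies.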

Dave Harris

Aug 17, 2002, 7:03:09 AM
live...@hotmail.com (Johan Ericsson) wrote (abridged):

> > struct A {};
> > struct B : public A {};
> >
> > virtual int myfunc(A &, A &) = 0;
> > int myfunc(B &x, A &y) {}
> > int myfunc(A &x, B &y) {}
> > int myfunc(B &x, B &y) {}
> >
> > The compiler could transform code like the above into virtual member
> > functions, and further pick a function based on an index in the 2nd
> > arg's vtable.
>
> I don't think it is possible to get away from the use of a vtable to
> dispatch virtual methods. In the example above, how would the
> compiler know that 'A' needs a vtable? It can only figure that out
> when the compiler hits the declaration of "myfunc". Currently
> "myfunc" can be declared in a different translation unit from 'A'.

It may need more work at link-time.

Some of the issues are discussed under "multi-methods" in Stroustrup's
book "Design and Evolution".
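
For comparison, the usual way to emulate a two-argument multi-method in
today's C++ is double dispatch through ordinary virtual member functions.
This is a sketch only: dispatch_first and second_for are invented names,
the returned strings are just labels for the overload reached, and the
(A, A) case is given a body here although the quoted example made it pure.

```cpp
#include <cassert>
#include <string>

struct B;

struct A {
    virtual ~A() = default;
    // Step 1: a virtual call resolves the dynamic type of the first argument.
    virtual std::string dispatch_first(A& y);
    // Step 2: called on the second argument; the first argument's type is
    // now known statically and selects the overload.
    virtual std::string second_for(A& x);
    virtual std::string second_for(B& x);
};

struct B : A {
    std::string dispatch_first(A& y) override;
    std::string second_for(A& x) override;
    std::string second_for(B& x) override;
};

inline std::string A::dispatch_first(A& y) { return y.second_for(*this); }
inline std::string A::second_for(A&) { return "A,A"; }
inline std::string A::second_for(B&) { return "B,A"; }
inline std::string B::dispatch_first(A& y) { return y.second_for(*this); }
inline std::string B::second_for(A&) { return "A,B"; }
inline std::string B::second_for(B&) { return "B,B"; }

// Free-function front end with the signature from the quoted example.
inline std::string myfunc(A& x, A& y) { return x.dispatch_first(y); }
```

Note that both classes must declare the virtual functions up front, which
is exactly the intrusion that free-function multi-methods would avoid.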

Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
bran...@cix.co.uk | And close your eyes with holy dread,
| For he on honey dew hath fed
http://www.bhresearch.co.uk/ | And drunk the milk of Paradise."

Francis Glassborow

Aug 17, 2002, 6:48:54 PM
In article <ajk3l7$1cbe67$1...@ID-102451.news.dfncis.de>, Juan Valio
<juan...@terra.es> writes

> > >you would be able to use these interchangeably:
> > > c.f();
> > > f( c );
> > >
> > >and these interchangeably:
> > > g( c );
> > > c.g();
> > >
> > >Just a thought. Any takers?
> >
> > Indeed, there's been some early talk about possibly doing this.
> > Nothing even half-baked yet, though. And the name lookup rules are
> > already complex -- any paper that proposes doing this in the name of
> > simplification would have to include extensive analysis of the effect
> > on the existing rules in current programs. I'd be happy to read such
> > a paper, though.
> >
>
>Maybe...
>c.f(....) is
>c.f(....) if this is not an error
>and
>f(c,....) otherwise
>
>But I suppose things are not so easy...

I think I would prefer it the other way round, so that given a call to
f(c,...) that could not be satisfied by a free function, c.f(...) was
checked. Unfortunately neither works because of issues of overload
resolution.

Currently I am leaning towards the idea of being able to mark member
functions so that the compiler will treat them as if there were also an
inline free function forwarding to the member function. So

struct X {
export int foo();
export int foo() const;
....
};

will place the following signatures in the enclosing namespace:

int foo(X&);
int foo(X const &);

and use the member function implementations for them.

Where member function syntax is used, only member functions are
considered for overload resolution. Where free function syntax is used,
normal overload generation for free function calls is used. If the
programmer defines a free function with the same signature then it is a
redefinition error.
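
Since `export` on a member function is only a proposed marking and will
not compile in that position, the effect can be sketched by writing the
forwarding functions by hand. The return values 1 and 2 are arbitrary
labels added here to show which member each forwarder reaches:

```cpp
#include <cassert>

struct X {
    int foo() { return 1; }        // reached through a non-const X
    int foo() const { return 2; }  // reached through a const X
};

// Hand-written equivalents of the signatures the proposal would place in
// the enclosing namespace; each simply forwards to the member.
inline int foo(X& x) { return x.foo(); }
inline int foo(const X& x) { return x.foo(); }
```

With these in place, x.foo() considers only the members, while foo(x)
goes through ordinary free-function overload resolution, as described
above.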


--
Francis Glassborow ACCU
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation

John Potter

Aug 18, 2002, 7:21:25 AM
On 17 Aug 2002 18:48:54 -0400, Francis Glassborow
<francis.g...@ntlworld.com> wrote:

> I think I would prefer it the other way round, so that given a call to
> f(c,...) that could not be satisfied by a free function, c.f(...) was
> checked. Unfortunately neither works because of issues of overload
> resolution.

> Currently I am leaning towards the idea of being able to mark member
> functions so that the compiler will treat them as if there were also an
> inline free function forwarding to the member function. So

> struct X {
> export int foo();
> export int foo() const;
> ....
> };

> will place the following signatures in the enclosing namespace:

> int foo(X&);
> int foo(X const &);

> and use the member function implementations for them.

> Where member function syntax is used, only member functions are
> considered for overload resolution. Where free function syntax is used,
> normal overload generation for free function calls is used. If the
> programmer defines a free function with the same signature then it is a
> redefinition error.

Consider:

cout << 4 << 'F';

Which is it?

cout.operator<<(4).operator<<('F');
operator<<(operator<<(cout, 4), 'F');
operator<<(cout.operator<<(4), 'F');
operator<<(cout, 4).operator<<('F');

Yes, I know which one it is. Which other one would your rule allow
and which other one would most programmers prefer? I like yours,
but suspect that I am in a minority.
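
For readers who want to check: in the standard library the int inserter
is a member of std::basic_ostream and the char inserter is a non-member,
so the expression as written is the third form. The sketch below uses
std::ostringstream instead of cout so the result can be inspected; the
function names are invented for illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// The expression exactly as written in the question.
inline std::string as_written() {
    std::ostringstream os;
    os << 4 << 'F';
    return os.str();
}

// Third form: member operator<<(int), then non-member operator<<(ostream&, char).
inline std::string member_then_nonmember() {
    std::ostringstream os;
    std::operator<<(os.operator<<(4), 'F');
    return os.str();
}

// First form: members only. There is no member taking char, so 'F' is
// promoted to int and its character code is printed instead.
inline std::string member_then_member() {
    std::ostringstream os;
    os.operator<<(4).operator<<('F');
    return os.str();
}
```

The members-only form compiles but prints the numeric code of 'F' after
the 4, which is the kind of surprise a "members only for member syntax"
rule would keep.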

John

Sungbom Kim

Aug 18, 2002, 10:49:35 AM
Herb Sutter wrote:
>
> On 11 Aug 2002 19:53:54 -0400, bran...@cix.co.uk (Dave Harris) wrote:
> >Wouldn't it be better to provide /all/ functions as non-members if C++
> >allows them to be? If we don't want to use "friend", they can forward to
> >member functions in the same way that other non-members do. We'll still
> >have some public member functions, but client code would be discouraged
> >from using them directly.
>
> While putting the draft of GotW #84 into my in-progress manuscript for
> Exceptional C++ Style a few days ago I did add such a note. In particular,
> consistently providing functions as nonmembers can make template programming
> easier.

However it is widely agreed that some should really be members.
Then how can you consistently provide functions as nonmembers,
unless you let friend functions delegate to private members
(which I think is not quite meaningful)?
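
As an illustration of the template-programming point quoted above, here
is a sketch in which a nonmember forwards to a public member where one
exists and is overloaded for types that cannot have members. The names
sketch::is_empty and sketch::describe are hypothetical; no friend
declaration or private member is involved.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {

// Forwards to the container's public member function.
template <class C>
bool is_empty(const C& c) { return c.empty(); }

// Built-in arrays have no members; a nonmember overload covers them.
// Array types always have at least one element.
template <class T, std::size_t N>
bool is_empty(const T (&)[N]) { return false; }

// Generic code can now use one spelling for both kinds of argument.
template <class T>
std::string describe(const T& x) { return is_empty(x) ? "empty" : "non-empty"; }

}  // namespace sketch
```

The member stays public and does the real work; the nonmember only gives
generic code a uniform way to call it.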

--
Sungbom Kim <musi...@bawi.org>

Simon Turner

Aug 18, 2002, 10:53:14 AM
Francis Glassborow <francis.g...@ntlworld.com> wrote in message news:<N2iW5SC2...@robinton.demon.co.uk>...

I had briefly considered exactly the opposite option,
of being able to mark (somehow) a free function of the form:
int foo( X&, ... );

as being eligible for consideration in lookup and overload
resolution in an expression such as:
x.foo(...);   // where x is an object of type X

> will place the following signatures in the enclosing namespace:

> int foo(X&);
> int foo(X const &);

> and use the member function implementations for them.

> Where member function syntax is used, only member functions are
> considered for overload resolution. Where free function syntax is used,
> normal overload generation for free function calls is used. If the
> programmer defines a free function with the same signature then it is a
> redefinition error.

I originally thought of just making the member/nonmember call syntax
interchangeable (my original post),
but this would clearly change the behaviour of existing code.

Is it necessary both to allow nonmember interface functions to use member
function call syntax, and to allow member interface functions to use
free function call syntax?
