Non-member function call, subscripting and dereferencing operators


I. Introduction

This is a proposal to allow operator(), operator[] and operator-> to be defined as non-member functions.


II. Motivation And Scope

The main motivation of this proposal is to improve the consistency of the language by removing design exceptions. It also aims to facilitate generic programming by making it easier to adapt existing types to the concepts required by generic algorithms and data structures.

a. Consistency

Most operators can be defined as member functions or as non-member functions taking the instance as a first parameter.

The only exceptions, that is, the only operators that can be defined only as member functions, are:

- operator= (assignment)

- operator() (function call)

- operator[] (subscripting)

- operator-> (class member access)

From this group, operator= stands apart, as under most conditions it is automatically generated by the compiler when not explicitly defined by the user. Allowing operator= to be defined as a non-member function could change the meaning of assignment for a given type in the "middle" of a translation unit. This would arguably result in non-obvious code.

The three other operators ((), [] and ->) do not suffer from this problem. They do not seem to differ in any way from the other operators that can already be defined as non-member functions.

For example, operator() and operator[] may modify the class instance, but so may all @= operators (@ standing for +, -, *, etc.).

operator* and operator& handle dereferencing and memory aspects, so operator-> should be allowed to be defined under the same rules.
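
To make the comparison concrete, here is a minimal sketch of what is already legal today: a compound assignment and a dereferencing operator, both defined as non-member functions. The types `counter` and `handle` and the helper `demo_counter` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// Hypothetical types, standing in for classes defined elsewhere.
struct counter { int n; };
struct handle  { int* p; };

// Already allowed: a non-member compound assignment that modifies
// its left-hand operand.
inline counter& operator+=(counter& c, int delta) {
    c.n += delta;
    return c;
}

// Already allowed: a non-member dereferencing operator.
inline int& operator*(handle const& h) {
    return *h.p;
}

// Small usage helper: returns the counter value after both updates.
inline int demo_counter() {
    counter c{0};
    c += 40;            // calls operator+=(c, 40)
    assert(c.n == 40);
    handle h{&c.n};
    *h += 2;            // calls operator*(h), then the built-in +=
    return c.n;
}
```

Both operators depend on, and may modify, the state reachable from their first operand, which is exactly the property once used to justify restricting (), [] and -> to members.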

It is important to favor consistency and to limit exceptions in the design of a language. Every design exception will lead to unnecessary complications (see examples below). Powerful designs are those that are the most consistent, as can be seen in mathematics for example.


b. Generic programming

() and [] provide a very concise and clear syntax for function call and subscripting. These operators are also already used by built-in types and standard components (native functions and closures for (), pointers, arrays and most standard containers for [], etc.).

It is thus highly desirable to rely on these operators when writing user-defined types, so as to avoid an unnecessary syntactic diversity for semantically equivalent operations (here, function call and element access).

This is especially important in generic programming, whose goal is to provide algorithms and data structures easily usable with the maximum number of types. As programmers, in our everyday work we often use types defined in third-party libraries for which we don't have access to the source code. It is often the case that one of these types almost models the required concept for a generic algorithm we want to use, except for one operation that is named in a different way.

For instance, it is common for a type to have a predominant action whose name can vary: run, trigger, call, apply, launch, invoke, etc. According to Bjarne Stroustrup, this is the perfect case for the function call operator ("The C++ Programming Language", 4th ed., 19.2.2).

Because operator() cannot be defined as a non-member function, our only possibility to make our third-party type model the required concept is to write a wrapper type that duplicates its interface only to forward calls, and to finally add a non-static member operator(). This may be a lot of work, especially if the interface of the third-party type is rich.
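
The wrapper approach available today can be sketched as follows. The third-party class `job`, the wrapper `callable_job`, the algorithm `call_twice` and the helper `demo_wrapper` are all hypothetical names chosen for this illustration; a real third-party interface would typically be much larger.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical third-party type: its predominant action is run().
class job {
    std::string name_;
    int priority_ = 0;
    int runs_ = 0;
public:
    explicit job(std::string name) : name_(std::move(name)) {}
    std::string const& name() const { return name_; }
    int priority() const { return priority_; }
    void set_priority(int p) { priority_ = p; }
    int run() { return ++runs_; }   // returns the number of runs so far
};

// Today's workaround: a wrapper that re-exposes every member we need,
// only to add the one operator the generic code expects.
class callable_job {
    job impl_;
public:
    explicit callable_job(job j) : impl_(std::move(j)) {}
    // Pure forwarding boilerplate:
    std::string const& name() const { return impl_.name(); }
    int priority() const { return impl_.priority(); }
    void set_priority(int p) { impl_.set_priority(p); }
    // The only member that adds anything:
    int operator()() { return impl_.run(); }
};

// A generic algorithm that requires a function object.
template<typename F>
int call_twice(F& f) {
    f();
    return f();
}

inline int demo_wrapper() {
    callable_job j{job{"backup"}};
    assert(j.name() == "backup");
    return call_twice(j);   // runs the wrapped job twice
}
```

Every forwarding member has to be written and maintained by hand, and code that expects a `job` can no longer be handed the adapted object directly.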

Note: lambdas are often not the right tool for this job, as their interface may be too restrictive (they are strictly function objects), and their behavior can be too limited (no default construction in the general case, no ordering, etc.; see section VI. c. for an example).
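
These limitations can be checked at compile time. The sketch below assumes C++17; the detection trait `has_less`, the function object `scale_by` and the factory `make_scale_lambda` are hypothetical names introduced only for this illustration.

```cpp
#include <type_traits>
#include <utility>

// Detects whether `a < b` is well-formed for two values of type T.
template<typename T, typename = void>
struct has_less : std::false_type {};

template<typename T>
struct has_less<T, std::void_t<
    decltype(std::declval<T const&>() < std::declval<T const&>())>>
    : std::true_type {};

// A hand-written function object can be default constructible and ordered...
struct scale_by {
    double factor = 1.0;
    double operator()(double x) const { return factor * x; }
    bool operator<(scale_by const& o) const { return factor < o.factor; }
};

// ... whereas the closure type of a capturing lambda is neither.
inline auto make_scale_lambda(double factor) {
    return [factor](double x) { return factor * x; };
}
using scale_lambda = decltype(make_scale_lambda(1.0));

static_assert(std::is_default_constructible_v<scale_by>);
static_assert(has_less<scale_by>::value);
static_assert(!std::is_default_constructible_v<scale_lambda>);
static_assert(!has_less<scale_lambda>::value);
```

A capturing lambda is used on purpose: a captureless closure converts to a function pointer, which can make some of these checks pass by accident.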

Indeed, it would be much easier to externally define a non-member operator().

operator[] and operator-> are in the same situation. For instance, element access is often given various names (get, element, value, etc.) that hinder generic programming.

With non-member operators, it is easy to adapt existing types whose source code we cannot access, in order to homogenize syntactic differences and make them model the concepts required by our generic algorithms and data structures (see section III for examples).


c. Unclear initial motivation

Finally, it seems the initial motivation for the restrictions on operator(), operator[] and operator-> is not clear. As Bjarne Stroustrup declared in "The Design and Evolution of C++":

> "in the original design of C++ [...], I restricted
> operators [], () and -> to be members. It seemed a harmless
> restriction that eliminated the possibility of some obscure errors
> because these operators invariably depend on and typically modify the
> state of their left-hand operand. However, it is probably a case of
> unnecessary nannyism."

> ("The Design and Evolution of C++", Section 3.6.2 "Members and Friends", p. 83)

As shown above, many operators already modify their left-hand operand (@=, etc.), so operator[], operator() and operator-> are no different in this respect.


III. Examples

a. Uniform access to elements

    // A third-party type whose definition we cannot modify.
    class url {
        // ...
    public:
        std::string_view scheme() const;
        std::string_view authority() const;
        std::string_view path() const;
        // ...
    };

    // In another third-party library:
    class kv_doc {
        // ...
    public:
        value const& get(key k) const;
        // ...
    };

    // We want to access elements in url and kv_doc in a
    // uniform way by subscripting.

    enum class url_part {
        scheme,
        authority,
        path,
        // ...
    };

    // _Not_ currently allowed:
    std::string_view operator[](url const& u, url_part part) {
        // Forward to the right method.
        switch (part) {
        case url_part::scheme: return u.scheme();
        // ...
        }
    }

    // _Not_ currently allowed:
    kv_doc::value const& operator[](kv_doc const& doc, kv_doc::key k) {
        // Simply forward to get().
        return doc.get(k);
    }

    // Now we can reuse some generic algorithm relying on subscripting.
    // Compare the amount of work we've done here to writing wrapper
    // types to add non-static member operator[].

    template<typename Range, typename Index, typename Value>
    auto some_algo(Range const& range, Index idx, Value const& value) {
        // ...
        for (auto const& elem: range) {
            if (elem[idx] == value) { /* ... */ }
        }
        // ...
    }

    // urls is a range of url.
    auto result_https = some_algo(urls, url_part::scheme, "https");

    // vec3s is a range of 3-dimensional float vectors.
    auto result_abscissa = some_algo(vec3s, 0, 0.f); // Will test: elem[0] == 0.f

    // docs is a range of kv_doc.
    auto result_n32 = some_algo(docs, "type", "N32"); // Will test: elem["type"] == "N32"

See also the definition of bounded_range in "Elements of Programming" (Section 12.1 "Simple Composite Objects"), where operator[] is forced to be treated as an exception.
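
For comparison, here is a minimal sketch of one workaround available today: a named customization function replaces operator[] inside the algorithm. The simplified `url` stand-in, the function `element`, the algorithm `count_matching` and the helper `demo_count` are assumptions made for this illustration, not part of any library.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the third-party url type.
struct url {
    std::string scheme_, path_;
    std::string const& scheme() const { return scheme_; }
    std::string const& path() const { return path_; }
};
enum class url_part { scheme, path };

// Customization point: a named function instead of operator[].
inline std::string const& element(url const& u, url_part part) {
    return part == url_part::scheme ? u.scheme() : u.path();
}

// Types that already provide operator[] need a forwarding overload too.
template<typename T, std::size_t N>
T const& element(std::array<T, N> const& a, std::size_t i) {
    return a[i];
}

// The algorithm can no longer be written with elem[idx].
template<typename Range, typename Index, typename Value>
std::size_t count_matching(Range const& range, Index idx, Value const& value) {
    std::size_t n = 0;
    for (auto const& elem : range) {
        if (element(elem, idx) == value) ++n;
    }
    return n;
}

inline std::size_t demo_count() {
    std::vector<std::array<float, 3>> vec3s{
        std::array<float, 3>{0.f, 1.f, 2.f},
        std::array<float, 3>{3.f, 0.f, 0.f}};
    assert(count_matching(vec3s, 0, 0.f) == 1);  // tests element(elem, 0)

    std::vector<url> urls{url{"https", "/a"}, url{"http", "/b"},
                          url{"https", "/c"}};
    return count_matching(urls, url_part::scheme, "https");
}
```

The cost is that every algorithm must be written against `element` rather than the natural subscripting syntax, and every type that already supports [] needs an extra forwarding overload.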


b. Making functional patterns easier to use

  1. Function composition

We want to adapt third-party types to make them function objects,
so that we can use them with our (function) composition tools.

    // In a third-party library
    // (we don't have access to the source code):
    class parser {
        // ...
    public:
        document parse(std::istream*) const;
        // ...
    };

    safe_document sanitize(document const&);

    class database {
        // ...
    public:
        transaction_id store(safe_document const&);
        // ...
    };


    // In our code, we adapt these types to make them function objects.
    // We just forward to the right method, so non-member overloading is
    // much easier than wrapping.
    // _Not_ currently allowed:
    auto operator()(parser const& p, std::istream* i) {
        // Simply forward to parse().
        return p.parse(i);
    }

    // _Not_ currently allowed:
    auto operator()(database& db, safe_document const& doc) {
        // Simply forward to store().
        return db.store(doc);
    }

    // ... and be able to use our composition tools:
    // parse is a parser instance,
    // store is a database instance
    // files is a container of std::istream*

    // Will perform store(sanitize(parse(file))) on each file.
    // See section VI. a. for a simple implementation of |.
    std::for_each(std::execution::par, 
        begin(files), end(files),
        parse | sanitize | store);


  2. Monadic composition

Non-member operator() makes functional patterns easier to use. Basic function composition is one example, but richer compositions are also possible.

    // In our code, we replace the previous operator() definitions
    // by these ones.
    // We now return std::optionals that are empty in case of error.

    // _Not_ currently allowed:
    std::optional<document> operator()(parser const& p, std::istream* i) {
        try {
            return {p.parse(i)};
        } catch (std::exception const& e) {
            // log error
        }
        return {};
    }

    std::optional<safe_document> sanitize_total(document const& doc) {
        try {
            return {sanitize(doc)};
        } catch (std::exception const& e) {
            // log error
        }
        return {};
    }

    // _Not_ currently allowed:
    std::optional<transaction_id> operator()(database& db, safe_document const& doc) {
        try {
            return {db.store(doc)};
        } catch (std::exception const& e) {
            // log error
        }
        return {};
    }

    // Will perform the equivalent of store(sanitize(parse(file))),
    // but will propagate any empty optional.
    // See section VI. b. for a naive implementation of >>.
    std::vector<std::optional<transaction_id>> transactions;
    boost::range::transform(files, std::back_inserter(transactions),
        parse >> sanitize_total >> store);


See also section VI. c. for an example of lambdas' limitations explained above.


c. Duality between pure functions and collections of values

  1. Pure function as a collection of values

    // Third-party code:

    // A linear recurrence computes a value based on the n previous values.
    // Fibonacci is such an example, based on the 2 previous values.
    // The designer decided to implement it as a function object...
    class some_linear_recurrence {
        // Typically, a mutable state can memoize the results if necessary.
    public:
        value_type operator()(value_type const& x) const;
        // ...
    };

    // Our code:

    // ... but a pure function can also be viewed as a collection of values.
    // In our context, it is easier to manipulate this linear recurrence
    // as a collection of values, so we define the subscripting operator:
    template<typename T>
    T operator[](some_linear_recurrence const& f, T const& x) {
        return f(x);
    }

  2. Collection of values as a pure function

    // Third-party code:

    // The designer decided to implement it as a collection of values...
    class some_linear_recurrence {
        // Typically, a mutable state can memoize the results if necessary.
    public:
        value_type operator[](value_type x) const;
        // ...
    };

    // Our code:

    // ... but a collection of values can also be viewed as a pure function.
    // In our context, it is easier to manipulate this linear recurrence
    // as a pure function, so we define the function call operator:
    template<typename T>
    T operator()(some_linear_recurrence const& f, T const& x) {
        return f[x];
    }
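
Without non-member operators, the same duality has to be expressed today with adapter types. The following is a minimal sketch under that assumption; `as_function`, `as_collection`, `square` and `demo_duality` are hypothetical names used only here.

```cpp
#include <cassert>
#include <vector>

// Wraps any subscriptable object so that it can be called.
template<typename Collection>
struct as_function {
    Collection c;
    template<typename Index>
    decltype(auto) operator()(Index const& i) const { return c[i]; }
};

// Wraps any callable object so that it can be subscripted.
template<typename Function>
struct as_collection {
    Function f;
    template<typename Index>
    decltype(auto) operator[](Index const& i) const { return f(i); }
};

// Hypothetical pure function, used only for this illustration.
struct square {
    int operator()(int x) const { return x * x; }
};

inline int demo_duality() {
    as_function<std::vector<int>> table{{10, 20, 30}};
    as_collection<square> squares{};
    assert(table(1) == 20);   // forwards to table.c[1]
    return squares[7];        // forwards to squares.f(7)
}
```

The adapters work, but they change the type of the object: the rest of the original interface is hidden behind the wrapper unless it is forwarded by hand.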


IV. Impact On The Standard

Being a pure extension, this proposal shouldn't break any existing code.

The behavior of non-member operators is well-known, so no surprise is to be expected.


V. Standardese

The following changes are based on the document n4659.pdf.

1. Replace 16.5.4 [over.call] by (modifications are between `*`):

operator() shall be *implemented either by* a non-static member function (12.2.1) with an arbitrary number of parameters *or by a non-member function with at least one parameter, the first one evaluating to a class object*.
It can have default arguments*, except for the first argument of the non-member form*.
It implements the function call syntax

    postfix-expression ( expression-list-opt )

where the postfix-expression evaluates to a class object and the possibly empty expression-list matches the parameter list of an operator() member function of the class *, or postfix-expression followed by the expression-list matches the parameter list of the non-member form*.
Thus, a call x(arg1,...) is *either* interpreted as x.operator()(arg1, ...) for a class object x of type T if *T::operator()(T1, ...)* exists, *or is interpreted as operator()(x, arg1, ...), depending on whether* the operator is selected as the best match function by the overload resolution mechanism (16.3.3).
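
The selection rule is the one that already applies to operators that may be either members or non-members. As an illustration with an operator that is legal today, the sketch below declares a member and a non-member operator+ and lets overload resolution pick the best match; the type `widget` and the helper `demo_overload_resolution` are hypothetical.

```cpp
#include <cassert>

struct widget {
    // Member candidate: takes a long. Returns 1 to identify itself.
    int operator+(long) const { return 1; }
};

// Non-member candidate: takes an int. Returns 2 to identify itself.
inline int operator+(widget const&, int) { return 2; }

inline void demo_overload_resolution() {
    widget w{};
    assert(w + 1L == 1);  // long argument: the member form is the best match
    assert(w + 1  == 2);  // int argument: the non-member form is the best match
}
```

Under the proposed wording, x(arg1, ...) would be resolved between x.operator()(arg1, ...) and operator()(x, arg1, ...) in the same way.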

2. Replace 16.5.5 [over.sub] by (modifications are between `*`):

operator[] shall be *implemented either by* a non-static member function with exactly one parameter *or by a non-member function with two parameters, the first one evaluating to a class object*. It implements the subscripting syntax

    postfix-expression [ expr-or-braced-init-list ]

Thus, a subscripting expression x[y] is *either* interpreted as x.operator[](y) for a class object x of type T if T::operator[](T1) exists, *or is interpreted as operator[](x, y), depending on whether* the operator is selected as the best match function by the overload resolution mechanism (16.3.3).

3. Replace 16.5.6 [over.ref] by (modifications are between `*`):

operator-> shall be *implemented either by* a non-static member function taking no parameters *or by a non-member function taking one parameter evaluating to a class object*. It implements the class member access syntax that uses ->.

    postfix-expression -> template-opt id-expression
    postfix-expression -> pseudo-destructor-name

An expression x->m is interpreted as (x.operator->())->m for a class object x of type T if T::operator->() exists*, or as (operator->(x))->m, depending on whether* the operator is selected as the best match function by the overload resolution mechanism (16.3).


VI. Annexes

a. Simple function composition

    #define MEMBER_OP_EQUAL_AND_LESS_THAN_1(type, member0)          \
        bool operator==(type const& x) const {                      \
            return member0 == x.member0;                            \
        }                                                           \
        bool operator<(type const& x) const {                       \
            return member0 < x.member0;                             \
        }

    #define MEMBER_OP_EQUAL_AND_LESS_THAN_2(type, member0, member1) \
        bool operator==(type const& x) const {                      \
            return member0 == x.member0 && member1 == x.member1;    \
        }                                                           \
        bool operator<(type const& x) const {                       \
            return member0 < x.member0 ||                           \
                (!(x.member0 < member0) && member1 < x.member1);    \
        }

    // Function<U (T)> F,
    // Function<V (U)> G
    template<typename F, typename G>
    struct compose {
        F f;
        G g;
    // Regular:
        MEMBER_OP_EQUAL_AND_LESS_THAN_2(compose, f, g)
    // Function<V (T)>:
        template<typename T>
        auto operator()(T const& t) { // should perfect-forward
            // Doesn't handle the void case.
            // Should void be regular, this code would be sufficient...
            return g(f(t)); 
        }
    };

    template<typename F, typename G>
    compose<F, G> operator|(F f, G g) {
        return {f, g};
    }
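
The comment "should perfect-forward" can be addressed along the following lines. This is only a sketch: `compose_fwd`, the named factory `pipe` (used here instead of an unconstrained operator|), and the function objects `take_length` and `twice` are hypothetical names introduced for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

template<typename F, typename G>
struct compose_fwd {
    F f;
    G g;
    // Forwards the argument to f, then passes f's result to g.
    // Like the version above, this does not handle a void result from f.
    template<typename T>
    decltype(auto) operator()(T&& t) {
        return g(f(std::forward<T>(t)));
    }
};

template<typename F, typename G>
compose_fwd<F, G> pipe(F f, G g) {
    return {std::move(f), std::move(g)};
}

// Hypothetical function objects for the illustration.
struct take_length {
    std::size_t operator()(std::string s) const { return s.size(); }
};
struct twice {
    std::size_t operator()(std::size_t n) const { return 2 * n; }
};

inline std::size_t demo_compose() {
    auto h = pipe(take_length{}, twice{});
    std::string s = "xy";
    assert(h(s) == 4);                 // lvalue: s is copied into take_length
    return h(std::string("abcd"));     // rvalue: the temporary is moved
}
```

Using `decltype(auto)` preserves reference results from g, and `std::forward` avoids a copy when the argument is a temporary.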


b. Simple monad composition

    // Calls g if the optional is not empty, otherwise returns an empty optional.
    //
    // Function<std::optional<V> (U)> G
    template<typename U, typename G>
    auto monad_bind(std::optional<U> const& opt, G g) {
        using OptV = std::decay_t<decltype(g(*opt))>;
        if (opt) return g(*opt);
        else     return OptV{};
    }

    // Function<std::optional<U> (T)> F,
    // Function<std::optional<V> (U)> G
    template<typename F, typename G>
    struct monad_compose {
        F f;
        G g;
    // Regular:
        MEMBER_OP_EQUAL_AND_LESS_THAN_2(monad_compose, f, g)
    // Function<std::optional<V> (T)>:
        template<typename T>
        auto operator()(T const& t) {
            return monad_bind(f(t), g);
        }
    };

    template<typename F, typename G>
    monad_compose<F, G> operator>>(F f, G g) {
        return {f, g};
    }
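
The propagation of empty optionals can be seen on a small, self-contained example. It uses a variant of the bind function with an explicit trailing return type; `bind_optional`, `reject_zero`, `half_if_even` and `demo_chain` are hypothetical names, and the chain is written out by hand rather than through operator>>.

```cpp
#include <cassert>
#include <optional>

// Calls g if the optional is not empty, otherwise returns an empty optional.
//
// Function<std::optional<V> (U)> G
template<typename U, typename G>
auto bind_optional(std::optional<U> const& opt, G g) -> decltype(g(*opt)) {
    if (opt) return g(*opt);
    return std::nullopt;
}

// Hypothetical steps, each of which may fail.
struct reject_zero {
    std::optional<int> operator()(int n) const {
        if (n == 0) return std::nullopt;
        return n;
    }
};
struct half_if_even {
    std::optional<int> operator()(int n) const {
        if (n % 2 != 0) return std::nullopt;
        return n / 2;
    }
};

// Equivalent of (reject_zero >> half_if_even >> half_if_even)(n).
inline std::optional<int> demo_chain(int n) {
    auto a = reject_zero{}(n);
    auto b = bind_optional(a, half_if_even{});
    return bind_optional(b, half_if_even{});
}
```

With an input of 12 every step succeeds (12, 6, 3); with 6 the last step fails because 3 is odd; with 0 the first step fails and the later steps are never called.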


c. Transformation-based iterator

    // Computes a new value on increment.
    //
    // Function<Value (Value)> Transfo
    template<typename Value, typename Transfo>
    struct transfo_iter {
        Value v;
        Transfo f;
    // Regular:
        MEMBER_OP_EQUAL_AND_LESS_THAN_2(transfo_iter, v, f)
    // ForwardIterator:
        transfo_iter& operator++() {
            v = f(v);
            return *this;
        }
        Value const& operator*() const {
            return v;
        }
        // ...

        // We want transfo_iter to be default constructible, comparable,
        // orderable (to roughly follow the definition of the Regular concept
        // in Elements of Programming, section 1.5), so Value _and_ Transfo
        // must also be.
    };

    // Example of transfo_iter use for drawing "orbits".
    struct point {
        double x, y;
    // Regular:
        MEMBER_OP_EQUAL_AND_LESS_THAN_2(point, x, y)
    };

    void draw_segment(point const& a, point const& b);

    // Draws segments for the given points.
    //
    // ForwardIterator TotallyOrdered I
    template<typename I>
    void draw_nonempty(I begin, I end) {
        // Precondition: ++begin is defined
        I previous; // _Here_: won't work if transfo_iter contains a stateful lambda
        do {
            previous = begin;
            ++begin;
            draw_segment(*previous, *begin);
        } while (begin < end); // _Here_: won't work if transfo_iter contains a lambda
    }

    // We define a few point transformations.
    struct rotate {
        double theta;
    // Regular:
        MEMBER_OP_EQUAL_AND_LESS_THAN_1(rotate, theta)
    // Transformation:
        point operator()(point const& p) const {
            using namespace std;
            return {cos(theta) * p.x - sin(theta) * p.y, 
                    sin(theta) * p.x + cos(theta) * p.y};
        }
    };

    struct scale {
        double d;
    // Regular:
        MEMBER_OP_EQUAL_AND_LESS_THAN_1(scale, d)
    // Transformation:
        point operator()(point const& p) const {
            return {d * p.x, d * p.y};
        }
    };

    struct translate {
        double x, y;
    // Regular:
        MEMBER_OP_EQUAL_AND_LESS_THAN_2(translate, x, y)
    // Transformation:
        point operator()(point const& p) const {
            return {x + p.x, y + p.y};
        }
    };

    // f is a transformation on point: it takes a point and returns a new point.
    // We reuse our function composition tool (see VI. a).
    auto f = rotate{pi / 8} | scale{1.1} | translate{0.0, 1.0};
    using I = transfo_iter<point, decltype(f)>;

    // Draw until we go beyond {10, 10}.
    draw_nonempty(I{{1.0, 1.0}, f}, I{{10.0, 10.0}, f});


VII. References

Elements of Programming, Stepanov A. & McJones P., 2009, Addison-Wesley

The Design and Evolution of C++, Stroustrup B., 1994, Addison-Wesley Professional

The C++ Programming Language (4th Edition), Stroustrup B., 2013, Addison-Wesley