taking arrays/containers as member function arguments of an API

Hicham Mouline

Dec 31, 2008, 10:48:12 AM

hello,

In the main class of the API of a library, we have a function

class M {
public:
    void Calculate( CalcStatus* status, double* result, int tag, Operation operation );
};

This calculates 1 result and returns a status for 1 tag and 1 operation.
We would like to extend this API with more functions to calculate:
N results/statuses for N tags and 1 operation,
M results/statuses for 1 tag and M operations,
M*N results/statuses for N tags and M operations for each tag.

M and N are runtime variables. M is of the order of 10; N is of the order of
up to 1000.
The question is whether to use native arrays or std::vector. Performance is
especially relevant.

// This API assumes the statuses and results arrays are allocated by the
// caller with appropriate sizes
class M {
public:
    void Calculate( CalcStatus statuses[], double results[], const int tags[],
                    size_t N, Operation operation );
    void Calculate( CalcStatus statuses[], double results[], int tag,
                    const Operation operations[], size_t M );
    // The caller interprets statuses and results linearly: the (m,n) result,
    // for operation m and tag n, is at index m*N + n, for example.
    void Calculate( CalcStatus statuses[], double results[], const int tags[],
                    size_t N, const Operation operations[], size_t M );
};
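
To make the linear layout concrete, here is a minimal sketch of how the third overload could fill the caller's arrays. It assumes a row-major layout in which the result for operation m and tag n sits at index m*N + n. The names calculateOne, calculateAll and flatIndex, and the enumerator values, are hypothetical stand-ins and not part of the library.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the library's types.
enum CalcStatus { CalcOk, CalcFailed };
enum Operation { Double, Square };

// Hypothetical single-value calculation, standing in for M::Calculate.
CalcStatus calculateOne(double* result, int tag, Operation operation)
{
    assert(result != 0);
    *result = (operation == Double) ? 2.0 * tag : 1.0 * tag * tag;
    return CalcOk;
}

// Row-major index of the result for operation m and tag n (N tags per row).
inline std::size_t flatIndex(std::size_t m, std::size_t n, std::size_t N)
{
    return m * N + n;
}

// Batch version: the caller allocates statuses and results with M*N elements.
void calculateAll(CalcStatus statuses[], double results[],
                  const int tags[], std::size_t N,
                  const Operation operations[], std::size_t M)
{
    for (std::size_t m = 0; m != M; ++m) {
        for (std::size_t n = 0; n != N; ++n) {
            const std::size_t i = flatIndex(m, n, N);
            statuses[i] = calculateOne(&results[i], tags[n], operations[m]);
        }
    }
}
```

Whatever layout is chosen, it has to be documented, since nothing in the array-based signature tells the caller which of the two indices varies fastest.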

Also, the question of clarity/elegance/cleanness arises. This API is to be
used by 3rd party users.

or

// size determined from the const input vectors
class M {
public:
    void Calculate( vector<CalcStatus>& statuses, vector<double>& results,
                    const vector<int>& tags, Operation operation );
    void Calculate( vector<CalcStatus>& statuses, vector<double>& results,
                    int tag, const vector<Operation>& operations );
    // use boost::multi_array in this last case
    void Calculate( vector<CalcStatus>& statuses, vector<double>& results,
                    const vector<int>& tags, const vector<Operation>& operations );
};
// Perhaps template these functions on a template template parameter
// (an STL container) so as not to force the user to provide a vector.
Are there any style conventions re such a case?

regards,

alfps

Dec 31, 2008, 11:24:44 AM

On 31 Des, 16:48, "Hicham Mouline" <hic...@mouline.org> wrote:
> hello,
>
> In the main class of the API of a library,

"main class"?

> we have a function
>
> class M {
> public:
> void Calculate( CalcStatus* status, double* result, int tag, Operation
> operation );
>
> };
>
> this calculates 1 result, returns a status for 1 tag and 1 operation.

This seems to mean the function returns a CalcStatus and a double.

Using out-parameters is OK.

However, unless the function supports null-pointers for those
parameters, the possibility of incorrect usage (and possibility of
time wasted on determining correct usage) is greatly reduced by using
pass by reference.
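
A small sketch of the difference, with hypothetical names (calculateByPointer, calculateByReference) and made-up enumerators: the pointer version has a precondition the caller can violate, while the reference version has no null case to document or check.

```cpp
#include <cassert>

// Hypothetical stand-ins for the library's types.
enum CalcStatus { CalcOk, CalcFailed };
enum Operation { Negate, Identity };

// Pointer version: the documentation must say whether null is allowed,
// and the implementation must check or assume it.
void calculateByPointer(CalcStatus* status, double* result,
                        int tag, Operation operation)
{
    assert(status != 0 && result != 0);  // precondition a caller can break
    *result = (operation == Negate) ? -1.0 * tag : 1.0 * tag;
    *status = CalcOk;
}

// Reference version: there is no null case, so nothing to document or check.
void calculateByReference(CalcStatus& status, double& result,
                          int tag, Operation operation)
{
    result = (operation == Negate) ? -1.0 * tag : 1.0 * tag;
    status = CalcOk;
}
```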

> We would like to extend this API with more functions to calculate N
> results/statues for N tags and 1 operation,
> M results/statues for 1 tag for M operations
> M*N results/statues for N tags and M operations for each tag.

The question is, does such wrapping simplify or complicate the client
code?

In other words, what is the perceived advantage, the reason why this
is deemed desirable?

> M and N are runtime variables. M is of the order of 10. N is of the order of
> up to 1000
> The question is whether to native arrays or std::vector. Performance is
> especially relevant

For performance nothing beats a simple loop in the client code.

That also provides the greatest flexibility.

E.g., it may be that the client doesn't need all those data points
stored anywhere, but just uses one pair of values at a time. For
another example, error/failure handling may depend on which
computation. And so on.
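
As an illustration of such client code, here is a sketch of a loop that consumes one status/value pair at a time and stores nothing per tag. The class M below is a hypothetical stand-in with a status-returning Calculate, and sumOfValidResults is an invented client function.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the library's types.
enum CalcStatus { CalcOk, CalcFailed };
enum Operation { Halve };

// Hypothetical stand-in for the library class: fails for negative tags.
struct M {
    CalcStatus Calculate(double& result, int tag, Operation) const
    {
        if (tag < 0) {
            return CalcFailed;
        }
        result = tag / 2.0;
        return CalcOk;
    }
};

// Client-side loop: sums the successful results and counts the failures.
// Each status/value pair is used immediately; no arrays are needed.
double sumOfValidResults(const M& m, const int tags[], std::size_t N,
                         Operation operation, std::size_t& failures)
{
    assert(tags != 0 || N == 0);
    double sum = 0.0;
    failures = 0;
    for (std::size_t i = 0; i != N; ++i) {
        double value = 0.0;
        if (m.Calculate(value, tags[i], operation) == CalcOk) {
            sum += value;
        } else {
            ++failures;  // the client decides how to react to each failure
        }
    }
    return sum;
}
```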

:-)

General: make the client code simple, provide flexibility, make it
hard to use incorrectly.

Instead of parallel (logical) arrays I would probably choose to
combine a status and corresponding 'double' result in a class type
object.

And instead of passing (logical) arrays it will provide greater
flexibility to provide some callback mechanism, e.g. iterators. If the
"efficiency" is with respect to memory usage then that may also
increase efficiency. Otherwise it may be in conflict with the
efficiency goal.
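
One possible shape for such a callback mechanism is sketched below: a function template that hands each tag and its combined result to a caller-supplied function object instead of writing into arrays. The names Result, calculateOne, calculateEach and FailureCounter are all hypothetical.

```cpp
#include <cassert>

// Hypothetical stand-in for the library's status type.
enum CalcStatus { CalcOk, CalcFailed };

// Status and value combined in one object instead of parallel arrays.
struct Result {
    CalcStatus status;
    double value;
};

// Hypothetical single calculation: fails for negative tags.
Result calculateOne(int tag)
{
    Result r = { tag >= 0 ? CalcOk : CalcFailed, tag >= 0 ? 0.5 * tag : 0.0 };
    return r;
}

// Calls callback(tag, result) once per tag; nothing is stored here.
template <typename InputIterator, typename Callback>
void calculateEach(InputIterator first, InputIterator last, Callback callback)
{
    for (; first != last; ++first) {
        callback(*first, calculateOne(*first));
    }
}

// Example callback: only counts failures, so no result storage is needed.
struct FailureCounter {
    explicit FailureCounter(int* count) : count_(count) { assert(count != 0); }
    void operator()(int /*tag*/, const Result& r) const
    {
        if (r.status != CalcOk) {
            ++*count_;
        }
    }
    int* count_;
};
```

A client that does want everything stored can pass a callback that appends to a container, so the storing and non-storing use cases share one entry point.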

However, consider whether typical client code is really simplified by
having these wrapper functions.

One way of going about it is to take some typical client code, re-
express it in an "ideal" simple way, and then consider whether and how
that can be implemented efficiently.

Cheers & hth.,

- Alf

James Kanze

Jan 1, 2009, 6:09:40 AM

On Dec 31 2008, 5:24 pm, alfps <alf.p.steinb...@gmail.com> wrote:
> On 31 Des, 16:48, "Hicham Mouline" <hic...@mouline.org> wrote:

> > In the main class of the API of a library,

> "main class"?

I think he means "main" in the general sense of principal, and
not in the sense of the "main" function.

> > we have a function

> > class M {
> > public:
> > void Calculate( CalcStatus* status, double* result, int tag, Operation
> > operation );
> > };

> > this calculates 1 result, returns a status for 1 tag and 1 operation.

> This seems to mean the function returns a CalcStatus and a double.

> Using out-parameters is OK.

But not very idiomatic. In this case, I'd probably return a
Fallible (but my implementation of Fallible supports extended
status codes); returning status and using an out parameter for
result is also very common.
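
For readers who have not met the idiom, a minimal sketch of a Fallible-like class with a status code follows. It is not the implementation referred to above, only an illustration of the idea: the return value carries either a valid result or the reason there is none. The enumerator names and safeDivide are invented for the example.

```cpp
#include <cassert>

// Hypothetical status type; CalcOk means "a value is present".
enum CalcStatus { CalcOk, CalcDomainError };

// Minimal Fallible-like wrapper (illustrative only).
template <typename T>
class Fallible {
public:
    static Fallible success(const T& value)
    {
        return Fallible(CalcOk, value);
    }
    static Fallible failure(CalcStatus status)
    {
        assert(status != CalcOk);
        return Fallible(status, T());
    }

    bool isValid() const { return status_ == CalcOk; }
    CalcStatus status() const { return status_; }

    // Reading the value of a failed result is a programming error.
    const T& value() const
    {
        assert(isValid());
        return value_;
    }

private:
    Fallible(CalcStatus status, const T& value)
        : status_(status), value_(value) {}

    CalcStatus status_;
    T value_;
};

// Example of a function returning status and value together.
Fallible<double> safeDivide(double numerator, double denominator)
{
    if (denominator == 0.0) {
        return Fallible<double>::failure(CalcDomainError);
    }
    return Fallible<double>::success(numerator / denominator);
}
```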

> However, unless the function supports null-pointers for those
> parameters, the possibility of incorrect usage (and
> possibility of time wasted on determining correct usage) is
> greatly reduced by using pass by reference.

> > We would like to extend this API with more functions to
> > calculate N
> > results/statues for N tags and 1 operation,
> > M results/statues for 1 tag for M operations
> > M*N results/statues for N tags and M operations for each tag.

> The question is, does such wrapping simplify or complicate the
> client code?

> In other words, what is the perceived advantage, the reason
> why this is deemed desirable?

> > M and N are runtime variables. M is of the order of 10. N is
> > of the order of up to 1000 The question is whether to native
> > arrays or std::vector. Performance is especially relevant

> For performance nothing beats a simple loop in the client
> code.

Not even a simple loop in a template function? The "idiomatic"
solution would probably be something like:

struct Result           // More likely something more
                        // complicated, perhaps a Fallible.
                        // The standard likes std::pair for
                        // this, but that's really bad
                        // engineering.
{
    CalcStatus status ;
    double value ;
} ;

// Constraints:
//
// InputIterator1::value_type convertible to int
// InputIterator2 must be multi-pass (the operations are
// traversed once per tag); its value_type can be called
// as a function with ... arguments, returning a Result
// OutputIterator supports assignment of a Result
// ---------------------------------------------------------
template< typename InputIterator1, typename InputIterator2,
          typename OutputIterator >
void Calculate(
    InputIterator1 beginTag,
    InputIterator1 endTag,
    InputIterator2 beginOperation,
    InputIterator2 endOperation,
    OutputIterator result )
{
    while ( beginTag != endTag ) {
        // Restart the operations for each tag; advancing
        // beginOperation itself would exhaust it after the
        // first tag.
        InputIterator2 operation = beginOperation ;
        while ( operation != endOperation ) {
            *result ++ = (*operation)( ... ) ;
            ++ operation ;
        }
        ++ beginTag ;
    }
}

I'll not argue for or against such a solution; you make some
very good points below concerning usability in client code (and
I'm very far from being convinced that the STL is a model of
good software design). But it is the idiomatic solution in
modern C++. And it probably won't have any performance
problems; at least no more than any other solution.
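
To show how such a template might be called, here is a self-contained variant. The arguments passed to each operation are left unspecified above, so this sketch assumes each operation is a plain function taking a single int tag. The names CalculateAll, twice, inverse and example are hypothetical.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical stand-in for the library's status type.
enum CalcStatus { CalcOk, CalcFailed };

struct Result {
    CalcStatus status;
    double value;
};

// Assumption for this sketch: an operation is a function of one int tag.
typedef Result (*Operation)(int);

Result twice(int tag)
{
    Result r = { CalcOk, 2.0 * tag };
    return r;
}

Result inverse(int tag)
{
    Result r = { tag != 0 ? CalcOk : CalcFailed, tag != 0 ? 1.0 / tag : 0.0 };
    return r;
}

// Applies every operation to every tag, writing results tag by tag.
// The operation range is traversed once per tag, so it must be multi-pass.
template <typename InputIterator, typename ForwardIterator,
          typename OutputIterator>
void CalculateAll(InputIterator beginTag, InputIterator endTag,
                  ForwardIterator beginOperation, ForwardIterator endOperation,
                  OutputIterator result)
{
    for (; beginTag != endTag; ++beginTag) {
        for (ForwardIterator op = beginOperation; op != endOperation; ++op) {
            *result++ = (*op)(*beginTag);
        }
    }
}

// Usage: arrays in, a growing vector out, with no size bookkeeping.
std::vector<Result> example()
{
    const int tags[] = { 0, 4 };
    const Operation operations[] = { twice, inverse };
    std::vector<Result> results;
    CalculateAll(tags, tags + 2, operations, operations + 2,
                 std::back_inserter(results));
    assert(results.size() == 4);
    return results;
}
```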

> That also provides the greatest flexibility.

> E.g., it may be that the client doesn't need all those data
> points stored anywhere, but just uses one pair of values at a
> time. For another example, error/failure handling may depend
> on which computation. And so on.

> :-)

Yes. All very good points, which should be considered at the
design level. In this case, the rule to apply is probably not
to bother with such a function until you find yourself having to
write the loops several times, in more or less the same format.
In which case, refactor. (Of course, if you're working on
library code and don't have access to the client code to see how
it really uses the functions, this is more difficult. But as
Alf more or less says, designing something which will handle all
of the possible use cases is far from obvious.)

--
James Kanze (GABI Software) email:james...@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Hicham Mouline

Jan 2, 2009, 6:17:36 AM

"James Kanze" <james...@gmail.com> wrote in message
news:26cdd188-98f2-4ac7...@b41g2000pra.googlegroups.com...

On Dec 31 2008, 5:24 pm, alfps <alf.p.steinb...@gmail.com> wrote:
> On 31 Des, 16:48, "Hicham Mouline" <hic...@mouline.org> wrote:

>> > In the main class of the API of a library,

>> "main class"?

>I think he means "main" in the general sense of principle, and
>not in the sense of the "main" function.

Yep

> > we have a function

> > class M {
> > public:
> > void Calculate( CalcStatus* status, double* result, int tag,
> > Operation
> > operation );
> > };

> > this calculates 1 result, returns a status for 1 tag and 1 operation.

> This seems to mean the function returns a CalcStatus and a double.

> Using out-parameters is OK.

>But not very idiomatic. In this case, I'd probably return a
>Fallible (but my implementation of Fallible supports extended
>status codes); returning status and using an out parameter for
>result is also very common.

> However, unless the function supports null-pointers for those
> parameters, the possibility of incorrect usage (and
> possibility of time wasted on determining correct usage) is
> greatly reduced by using pass by reference.

We follow Stroustrup, p. 99, middle of the page:
"reference arguments should be used only where the name of the function
gives a strong hint that the reference argument is modified"
Whether "Calculate" gives that hint or not is debated in-house.
It is also debated whether "result" is an explicit enough name for the
argument to be passed by reference.

Actually, the signature is currently:
CalcStatus Calculate( double* result, int tag, Operation operation );
I just changed it to make it similar to the other 3 signatures.

> > We would like to extend this API with more functions to
> > calculate N
> > results/statues for N tags and 1 operation,
> > M results/statues for 1 tag for M operations
> > M*N results/statues for N tags and M operations for each tag.

> The question is, does such wrapping simplify or complicate the
> client code?
> In other words, what is the perceived advantage, the reason
> why this is deemed desirable?

I realize I regrettably didn't include an essential piece of information.
Function 1 is pure virtual, while 2, 3 and 4 are virtual but not pure.
Classes derived from M override 1.
M provides a default implementation for 2, 3 and 4,
but these should be overridden by classes derived from M that
know faster/smarter ways to calculate the multiple case.
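
A rough sketch of that arrangement for functions 1 and 2 only, using the current status-returning signature for function 1. The derived classes SimpleM and FastM, the enumerators and the batchCalls counter are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the library's types.
enum CalcStatus { CalcOk, CalcFailed };
enum Operation { Scale };

class M {
public:
    virtual ~M() {}

    // Function 1: pure virtual, every derived class implements it.
    virtual CalcStatus Calculate(double* result, int tag,
                                 Operation operation) = 0;

    // Function 2: default implementation loops over function 1.
    virtual void Calculate(CalcStatus statuses[], double results[],
                           const int tags[], std::size_t N,
                           Operation operation)
    {
        for (std::size_t i = 0; i != N; ++i) {
            statuses[i] = Calculate(&results[i], tags[i], operation);
        }
    }
};

// Overrides only function 1 and inherits the default batch loop.
class SimpleM : public M {
public:
    using M::Calculate;  // keep the batch overload visible in this class
    virtual CalcStatus Calculate(double* result, int tag, Operation)
    {
        assert(result != 0);
        *result = 10.0 * tag;
        return CalcOk;
    }
};

// Also overrides function 2, standing in for a "smarter" batch algorithm.
class FastM : public SimpleM {
public:
    FastM() : batchCalls(0) {}
    using SimpleM::Calculate;
    virtual void Calculate(CalcStatus statuses[], double results[],
                           const int tags[], std::size_t N, Operation)
    {
        ++batchCalls;  // only here to show which version ran
        for (std::size_t i = 0; i != N; ++i) {
            results[i] = 10.0 * tags[i];
            statuses[i] = CalcOk;
        }
    }
    int batchCalls;
};
```

The using-declarations matter: without them, overriding one overload in a derived class hides the other overloads for callers that use the derived type directly.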

I apologize as this kind of changes everything.

Re the templated Calculate: it is what I was thinking of as a way of not
forcing vector on the user.
However, as Calculate is a virtual function, it can't be templated.

alfps

Jan 2, 2009, 8:22:59 AM

On 2 Jan, 12:17, "Hicham Mouline" <hic...@mouline.org> wrote:
> On Dec 31 2008, 5:24 pm, alfps <alf.p.steinb...@gmail.com> wrote:
>
> > However, unless the function supports null-pointers for those
> > parameters, the possibility of incorrect usage (and
> > possibility of time wasted on determining correct usage) is
> > greatly reduced by using pass by reference.
>
> We adopt stroustrup p99 middle of the page
> "reference arguments should be used only where the name of the function
> gives a
> strong hint that the reference argument is modified"
> Whether "Calculate" gives that hint or not is debated inhouse.

Argument passing conventions are not (sensibly, with any degree of
rationalism) chosen from routine names.

Nor the opposite way around.

Routine names and argument passing conventions are (rationally) chosen
from basically the same set of higher level considerations about what
the routine does.

If the routine name is then badly chosen, change it.

But don't change argument passing conventions on the basis of routine
names -- that's lunacy.

Hicham Mouline

Jan 5, 2009, 5:59:06 AM

"James Kanze" <james...@gmail.com> wrote in message
news:26cdd188-98f2-4ac7...@b41g2000pra.googlegroups.com...

The bottom line is that, as Calculate is a virtual function (to allow for
smarter operation than the default), it cannot be templated.
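
One common way around this restriction, sketched here under assumptions, is to keep the virtual function on a fixed pointer-plus-size signature and put a non-virtual member template in front of it. The template accepts any iterator range, copies the tags into contiguous storage and forwards to the virtual function. The name DoCalculate, the class PlusOne and the copy itself are choices made for this sketch, and the copy is a cost to weigh against the flexibility.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical stand-ins for the library's types.
enum CalcStatus { CalcOk, CalcFailed };
enum Operation { Increment };

class M {
public:
    virtual ~M() {}

    // Non-virtual template front end: accepts any iterator range of tags.
    template <typename InputIterator>
    void Calculate(std::vector<CalcStatus>& statuses,
                   std::vector<double>& results,
                   InputIterator beginTag, InputIterator endTag,
                   Operation operation)
    {
        const std::vector<int> tags(beginTag, endTag);  // contiguous copy
        statuses.resize(tags.size());
        results.resize(tags.size());
        if (!tags.empty()) {
            DoCalculate(&statuses[0], &results[0], &tags[0], tags.size(),
                        operation);
        }
    }

protected:
    // Virtual back end: fixed signature, so derived classes can override it.
    virtual void DoCalculate(CalcStatus statuses[], double results[],
                             const int tags[], std::size_t N,
                             Operation operation) = 0;
};

// Hypothetical derived class with its own batch algorithm.
class PlusOne : public M {
protected:
    virtual void DoCalculate(CalcStatus statuses[], double results[],
                             const int tags[], std::size_t N, Operation)
    {
        for (std::size_t i = 0; i != N; ++i) {
            results[i] = tags[i] + 1.0;
            statuses[i] = CalcOk;
        }
    }
};

// Usage: tags held in a std::list still reach the virtual function.
std::vector<double> example()
{
    std::list<int> tags;
    tags.push_back(1);
    tags.push_back(5);
    PlusOne m;
    std::vector<CalcStatus> statuses;
    std::vector<double> results;
    m.Calculate(statuses, results, tags.begin(), tags.end(), Increment);
    assert(statuses.size() == results.size());
    return results;
}
```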

Thanks,

Michael DOUBEZ

Jan 5, 2009, 7:37:01 AM

Hicham Mouline a écrit :

> "James Kanze" <james...@gmail.com> wrote in message
> news:26cdd188-98f2-4ac7...@b41g2000pra.googlegroups.com...
> On Dec 31 2008, 5:24 pm, alfps <alf.p.steinb...@gmail.com> wrote:
>> On 31 Des, 16:48, "Hicham Mouline" <hic...@mouline.org> wrote:
>
> Not even a simple loop in a template function? The "idiomatic"
> solution would probably be something like:

> [snip]

> template< typename InputIterator1, typename InputIterator2,
> typename OutputIterator >
> void Calculate(
> InputIterator1 beginTag,
> InputIterator1 endTag,
> InputIterator2 beginOperation,
> InputIterator2 endOperation,
> OutputIterator result )
> {
> while ( beginTag != endTag ) {
> while ( beginOperation != endOperation ) {
> *result ++ = (*beginOperation)( ... ) ;
> ++ beginOperation ;
> }
> ++ beginTag ;
> }
> }
>
> ---------------------------------------------------------------------------------------------------------------------------------
> The bottom line is that, as Calculate is a virtual function (to allow for
> smarter operation that the default), it cannot be templated.

It is possible to use iterators in a virtual function by using type
erasure, but it has a cost (typically an indirect call per element).
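
A hand-rolled sketch of what that could look like for the output side: the concrete output iterator type is hidden behind a small abstract interface, so the virtual function no longer depends on it. The names ResultSink, IteratorSink, DoCalculate and Doubler are hypothetical, and this is only one of several ways to erase an iterator type.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical stand-in for the library's status type.
enum CalcStatus { CalcOk, CalcFailed };

struct Result {
    CalcStatus status;
    double value;
};

// Type-erased output: the virtual function only sees this interface.
class ResultSink {
public:
    virtual ~ResultSink() {}
    virtual void put(const Result& r) = 0;
};

// Adapter that remembers the concrete output iterator type.
template <typename OutputIterator>
class IteratorSink : public ResultSink {
public:
    explicit IteratorSink(OutputIterator out) : out_(out) {}
    virtual void put(const Result& r) { *out_++ = r; }
private:
    OutputIterator out_;
};

class M {
public:
    virtual ~M() {}

    // Non-virtual template wrapper: erases the iterator type, then forwards.
    template <typename OutputIterator>
    void Calculate(const int tags[], std::size_t N, OutputIterator out)
    {
        IteratorSink<OutputIterator> sink(out);
        DoCalculate(tags, N, sink);
    }

protected:
    // Virtual, non-template: one indirect call to put() per result.
    virtual void DoCalculate(const int tags[], std::size_t N,
                             ResultSink& sink) = 0;
};

// Hypothetical derived class.
class Doubler : public M {
protected:
    virtual void DoCalculate(const int tags[], std::size_t N,
                             ResultSink& sink)
    {
        for (std::size_t i = 0; i != N; ++i) {
            Result r = { CalcOk, 2.0 * tags[i] };
            sink.put(r);
        }
    }
};

// Usage: any output iterator works, here a back_inserter.
std::vector<Result> example()
{
    const int tags[] = { 1, 2, 3 };
    Doubler d;
    M& m = d;
    std::vector<Result> results;
    m.Calculate(tags, 3, std::back_inserter(results));
    assert(results.size() == 3);
    return results;
}
```

The cost mentioned above shows up as the virtual put() call for every result, which the compiler generally cannot inline the way it can with a fully templated loop.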

--
Michael

Hicham Mouline

Jan 6, 2009, 8:16:02 AM

"Michael DOUBEZ" <michael...@free.fr> wrote in message
news:4961fcef$0$21819$426a...@news.free.fr...

Could you elaborate on how that would apply to the function above?

cheers