For example, it is common for a function f(x) to expect x to be simply
iterable, without caring about its exact type. Is it OK, though, for f to
return a list for some types/values of x, a tuple for others and a
generator for everything else (assuming it's documented), or should it
always return the most general type (an iterator in this example)?
To take it further, what if f wants to return different types,
differing even in a duck-typing sense? That's easier to illustrate in an
API-extension scenario. Say that there is an existing function
`solve(x)` that returns `Result` instances. Later someone wants to
extend solve by allowing an extra optional parameter `foo`, making the
signature `solve(x, foo=None)`. As long as the return value remains
backward compatible, everything's fine. However, what if in the extended
case solve() has to return some *additional* information apart from
`Result`, say the confidence that the result is correct? In short,
the extended API would be:
def solve(x, foo=None):
    '''
    @rtype: `Result` if foo is None; (`Result`, confidence) otherwise.
    '''
Strictly speaking, the extension is backwards compatible; previous
code that used `solve(x)` will still get back `Result`s. The problem
is that in new code you can't tell what `solve(x, y)` returns unless
you know something about `y`. My question is: is this totally
unacceptable and better replaced by a new function `solve2(x, foo=None)`
that always returns (`Result`, confidence) tuples, or might it be a
justifiable cost? Are there any other API extension approaches
applicable to such situations?
George
In your example I would possibly suggest returning a 'Result' object and
then later subclassing to give 'ConfidenceResult' which has the
additional 'confidence' attribute.
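A minimal sketch of that suggestion (the class names, the `x * 2`
computation and the 0.9 confidence are placeholders, not from any real
library):

```python
class Result:
    """Base result type returned by solve()."""
    def __init__(self, value):
        self.value = value

class ConfidenceResult(Result):
    """Result subclass carrying an extra 'confidence' attribute."""
    def __init__(self, value, confidence):
        super().__init__(value)
        self.confidence = confidence

def solve(x, foo=None):
    value = x * 2  # placeholder for the real computation
    if foo is None:
        return Result(value)
    return ConfidenceResult(value, confidence=0.9)
```

Old callers that treat the return value as a `Result` keep working;
new callers can check for the `confidence` attribute.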
I think the only time it's OK to return instances of different
classes is when one of them is None; for example, in the re module,
match() returns either a MatchObject (if successful) or None (if
unsuccessful). Apart from that, a function should always return an
instance of the same class (or perhaps a subclass) and, if it returns a
collection, then always the same type of collection (e.g. always a list,
never sometimes a list and sometimes a tuple).
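The re example in concrete form:

```python
import re

m = re.match(r'\d+', '42abc')    # returns a match object on success...
if m is not None:
    print(m.group())             # prints 42
print(re.match(r'\d+', 'abc'))   # ...or None on failure: prints None
```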
> That's more of a general API design question, but I'd like to get an idea
> of if and how things are different in a Python context. AFAIK it's
> generally considered bad form (or worse) for functions/methods to return
> values of different "type" depending on the number, type and/or values of
> the passed parameters. I'm using "type" loosely in a duck-typing sense,
> not necessarily as a concrete class and its descendants, although I'm not
> sure if even duck-typing is endorsed for return values (as opposed to
> input parameters).
>
> For example, it is common for a function f(x) to expect x to be simply
> iterable, without caring of its exact type. Is it ok though for f to
> return a list for some types/values of x, a tuple for others and a
> generator for everything else (assuming it's documented), or it should
> always return the most general (iterator in this example) ?
Arguably, if the only promise you make is that f() returns an iterable,
then you could return any of list, tuple, etc. and still meet that
promise. I'd consider that acceptable but eccentric. However, I'd
consider it bad form *not* to warn that the actual type returned is an
implementation detail that may vary.
Alternatively, I'm very fond of what the built-in filter function does:
it tries to match the return type to the input type, so that if you pass
a string as input, it returns a string, and if you pass it a tuple, it
returns a tuple.
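That behaviour (Python 2's filter(); removed in 3.0) could be emulated
roughly like this; `filter_like` is a hypothetical name, not a real
builtin:

```python
def filter_like(pred, seq):
    """Rough emulation of Python 2's filter(): try to match the
    input's type in the return value, falling back to a list."""
    result = [item for item in seq if pred(item)]
    if isinstance(seq, str):
        return ''.join(result)     # str in -> str out
    try:
        return type(seq)(result)   # e.g. tuple in -> tuple out
    except TypeError:
        return result              # e.g. generator in -> plain list out
```

The fallback branch is what makes the trick fragile: not every input
type can be rebuilt from a list of its items.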
> To take it further, what if f wants to return different types, differing
> even in a duck-typing sense? That's easier to illustrate in an
> API-extension scenario. Say that there is an existing function `solve(x)`
> that returns `Result` instances. Later someone wants to extend solve
> by allowing an extra optional parameter `foo`, making the signature
> `solve(x, foo=None)`. As long as the return value remains backward
> compatible, everything's fine. However, what if in the extended case
> solve() has to return some *additional* information apart from `Result`,
> say the confidence that the result is correct? In short, the extended
> API would be:
>
> def solve(x, foo=None):
>     '''
>     @rtype: `Result` if foo is None; (`Result`, confidence) otherwise.
>     '''
>
> Strictly speaking, the extension is backwards compatible; previous code
> that used `solve(x)` will still get back `Result`s. The problem is that
> in new code you can't tell what `solve(x, y)` returns unless you know
> something about `y`. My question is: is this totally unacceptable and
> better replaced by a new function `solve2(x, foo=None)` that always
> returns (`Result`, confidence) tuples, or might it be a justifiable
> cost? Are there any other API extension approaches applicable to such
> situations?
I dislike that, although I've been tempted to write functions like that
myself. Better, I think, to create a second function, xsolve(), which
takes the second argument, and refactor the common parts of solve/xsolve
out into a third, private function so you avoid code duplication.
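A sketch of that refactoring (the helper name `_solve_impl` and the
placeholder computation are made up for illustration):

```python
def _solve_impl(x, foo):
    """Private helper holding the logic shared by solve/xsolve."""
    result = x * 2                              # placeholder computation
    confidence = 0.9 if foo is not None else None
    return result, confidence

def solve(x):
    """Original API: always returns just the result."""
    result, _ = _solve_impl(x, None)
    return result

def xsolve(x, foo=None):
    """Extended API: always returns (result, confidence)."""
    return _solve_impl(x, foo)
```

Each function then has a single, stable return type, and the extension
doesn't disturb existing callers of solve().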
--
Steven
> In your example I would possibly suggest returning a 'Result' object and
> then later subclassing to give 'ConfidenceResult' which has the
> additional 'confidence' attribute.
That's indeed one option, but it's not very appealing if `Result` happens
to be a builtin (e.g. float or list). Technically you can subclass
builtins, but I think, in this case at least, the cure is worse than
the disease.
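To illustrate why: subclassing a builtin does work mechanically
(`ConfidentFloat` is a hypothetical example), but the extra data is
easy to lose:

```python
class ConfidentFloat(float):
    """A float carrying an extra 'confidence' attribute."""
    def __new__(cls, value, confidence):
        obj = super().__new__(cls, value)
        obj.confidence = confidence
        return obj

r = ConfidentFloat(1.5, confidence=0.9)
# Any arithmetic returns a plain float, silently dropping the attribute:
# (r + 1) == 2.5, and (r + 1) has no .confidence
```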
George
you probably want to look up substitutability:
http://www.google.cl/search?q=substitutability+principle
andrew
actually, this is better:
http://www.google.cl/search?q=substitution+principle
the idea being that if the "contract" for your function is that it returns
a certain type, then any subclass should also be ok (alternatively, that
subclasses should be written so that they can be returned when a caller
was expecting the superclass)
andrew
I'm not sure Liskov substitution addresses the same problem. The
question here is, what's the scope of the contract? Does it apply to
the original signature only, or to any future extended version of it? In
the former case, the contract is still valid: whoever calls
"solve(x)" still gets the promised type. The original contract didn't
specify what the result should be when the function is called as
"solve(x, y)" (since the function didn't support a second argument
originally). Only if one interprets the contract as applying to the
current plus all future extensions does Liskov substitution come
into play.
Perhaps that's more obvious in statically typed languages that allow
overloading. IIRC, the presence of a method with the signature

    float foo(float x);

does not preclude its overloading (in the same or a descendant class)
with a method

    char* foo(float x, int y);

The two methods just happen to share the same name; other than that
they are separate, and their return values don't have to be
substitutable. Is this considered bad practice?
George
For list/tuple/iterable the correlation with the argument's type is
purely superficial, *because* they're so compatible. Why should only
tuples and lists get special behaviour? Why shouldn't every other
argument type return a list as well?
A counterexample is Python 3.0's str/bytes functions. They're
mutually incompatible and there's no default.
> To take it further, what if f wants to return different types,
> differing even in a duck-type sense? [...] My question is, is this
> totally unacceptable and should better be replaced by a new function
> `solve2(x, foo=None)` that always returns (`Result`, confidence)
> tuples, or it might be a justifiable cost? Any other API extension
> approaches that are applicable to such situations?
At a minimum it's highly undesirable. You lose a lot of readability/
maintainability. solve2/solve_ex is a little ugly, but it's less ugly
overall, so it's the better option.
If your tuple gets to 3 or more elements, I'd start wondering whether
you should return a single instance, with the return values as
attributes. If Result is already such a thing, I'd look even at a tuple
of 2 to see if that's appropriate.
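That single-instance idea might look like this (a hypothetical sketch;
the class and field names are made up):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SolveResult:
    """Single return type whose optional fields can grow over time."""
    value: float
    confidence: Optional[float] = None   # absent in the basic case

def solve(x, foo=None):
    value = x * 2                        # placeholder computation
    if foo is None:
        return SolveResult(value=value)
    return SolveResult(value=value, confidence=0.9)
```

The return type stays stable for every call; new information becomes a
new attribute rather than a new tuple shape.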
> On Apr 6, 3:02 pm, George Sakkis <george.sak...@gmail.com> wrote:
>
> > For example, it is common for a function f(x) to expect x to be simply
> > iterable, without caring of its exact type. Is it ok though for f to
> > return a list for some types/values of x, a tuple for others and a
> > generator for everything else (assuming it's documented), or it should
> > always return the most general (iterator in this example) ?
>
> For list/tuple/iterable the correlation with the argument's type is
> purely superficial, *because* they're so compatible. Why should only
> tuples and lists get special behaviour? Why shouldn't every other
> argument type return a list as well?
That's easy: because the result might be infinite. In which case you
may ask "why shouldn't every argument type return an iterator, then?",
and the reason is usually performance: if you already need to store
the whole result sequence (e.g. sorted()), why return just an iterator
to it and force the client to copy it into another list if they need
anything more than a single pass over it?
> A counter example is python 3.0's str/bytes functions. They're
> mutually incompatible and there's no default.
As already mentioned, another example is filter() that tries to match
the input sequence type and falls back to list if it fails.
> > To take it further, what if f wants to return different types,
> > differing even in a duck-type sense? [...] Any other API extension
> > approaches that are applicable to such situations?
>
> At a minimum it's highly undesirable. You lose a lot of readability/
> maintainability. solve2/solve_ex is a little ugly, but that's less
> overall, so it's the better option.
That's my feeling too, at least in a dynamic language. For a static
language that allows overloading, that should be a smaller (or perhaps
no) issue.
George
You've got two different use cases here. sorted() clearly cannot accept
infinite input, so it might as well always return a list. Other functions
that can produce infinite output should always return an iterator.
> > A counter example is python 3.0's str/bytes functions. They're
> > mutually incompatible and there's no default.
>
> As already mentioned, another example is filter() that tries to match
> the input sequence type and falls back to list if it fails.
That's fixed in 3.0. It's always an iterator now.
> > > To take it further, what if f wants to return different types,
> > > differing even in a duck-type sense?
> >
> > At a minimum it's highly undesirable. You lose a lot of readability/
> > maintainability. solve2/solve_ex is a little ugly, but that's less
> > overall, so it's the better option.
>
> That's my feeling too, at least in a dynamic language. For a static
> language that allows overloading, that should be a smaller (or perhaps
> no) issue.
Standard practices may encourage it in a static language, but it's
still fairly confusing. Personally, I consider Python's switch to a
different operator for floor division (//) to be a major step forward
over C-like languages.
For this particular trick, I would always use a unique sentinel value so
that *only* passing an argument would change the result signature:
sentinel = object()

def solve(x, foo=sentinel):
    '''
    @rtype: `Result` if foo is sentinel; (`Result`, confidence) otherwise.
    '''
But I agree with other respondents that this is a code stink.
--
Aahz (aa...@pythoncraft.com) <*> http://www.pythoncraft.com/
Why is this newsgroup different from all other newsgroups?