do we need the default() op? can we have None-valued variables in computations?


James Bergstra

Feb 11, 2010, 9:42:40 AM
to thean...@googlegroups.com
There is an unusual op in tensor.basic called default.

It should probably be moved because it has nothing to do with tensors,
but that's not my main point. The trouble is that the Op's behaviour
is difficult to distinguish from that of a buggy Op in DebugMode, and I
wonder if we need that behaviour at all. This Op is implicated in two of
the three errors on the Theano buildbot.

What does default() do? default(x, y) is equivalent to switch(is_None(x),
y, x). We don't have an is_None Op, but you can imagine what it would do.
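
In rough Python terms, the behaviour is something like this (a sketch
only, not the actual tensor.basic code):

from theano import gof, tensor

# Sketch of default()'s behaviour -- illustrative, not the real code.
class Default(gof.Op):
    def make_node(self, x, y):
        x = tensor.as_tensor_variable(x)
        y = tensor.as_tensor_variable(y)
        return gof.Apply(self, [x, y], [y.type()])

    def perform(self, node, inputs, output_storage):
        x, y = inputs
        # None in x's storage means "missing": fall back to y.
        output_storage[0][0] = y if x is None else x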

The trouble is that DebugMode thinks that x is not provided when its
storage has the value None.

One option is to fix DebugMode so that it can distinguish an input
taking value None from an input that is not provided. That is a good
idea regardless of whether we use default(), but it is substantial
work. And there is another way to fix this problem.

There is a second question in play here that might justify a
work-around for DebugMode, and also clear up some design confusion.
The question is whether Ops in general are supposed to be ready for
None-valued inputs when their inputs are supposed to be something in
particular, such as numpy.ndarrays.

The possibilities for how to deal with None-valued inputs and outputs are:

A - inputs and outputs can be None, and Ops should raise sensible
errors if they can't deal with None-valued inputs.

B - inputs can be None, outputs cannot be None, and Ops are allowed to
crash in arbitrary ways if that happens [this is the current FAST_RUN
and FAST_COMPILE behaviour]

C - inputs can be None, outputs cannot be None. If an Op returns a
None for a TensorVariable output, then it's an error.

D - inputs cannot be None and outputs cannot be None either.

E - inputs cannot be None, but outputs can [for completeness' sake
only... this sounds crazy to me]

I like D the best, because it allows Op implementations to be simpler,
faster, and easier to test. It is also the least permissive. Is that
a problem?
In that view (D), default() is useless, because we shouldn't be able
to pass None for x. So what was default() for anyway? How is it
different from default-values for inputs when there are not any Ops
[that I know of] that could produce a None in the middle of a graph
anyway?

--
http://www-etud.iro.umontreal.ca/~bergstrj

Frédéric Bastien

Feb 11, 2010, 10:32:00 AM
to thean...@googlegroups.com
I prefer D, as I don't see the utility of allowing None as an input when we can recompile the function with different outputs/inputs depending on what is needed...

Fred

Olivier Delalleau

Feb 11, 2010, 10:35:47 AM
to theano-dev
On Feb 11, 09:42, James Bergstra <james.bergs...@gmail.com> wrote:
> D - inputs cannot be None and outputs cannot be None either.
> (...)

> I like D the best, because it allows Op implementations to be simpler,
> faster, and easier to test.  It is also the least permissive.  Is that
> a problem?

There's one option you didn't mention:

F - inputs and outputs can be None, and Ops are allowed to crash in
arbitrary ways if they get some None inputs they do not expect.

I like the idea of being able to use None in Ops. None is handy in
some situations (like default values, initialization), and is cheap.
It may look pointless in graphs manipulating data arrays, but for more
general computations, with Ops viewed as generic functions, I think it'd
be sad to prevent people from using None.

Maybe it could be possible that by default Ops are not supposed to
deal with None values (as input or output), but someone could
explicitly change this behavior for a specific Op that would benefit
from it? (Not sure exactly how to do that, I guess some class
hierarchy, or adding some attributes to the Op class could work, but
there may be better ways).
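
For instance, the attribute idea could look something like this (purely
hypothetical, nothing like it exists in Theano today):

class Op(object):
    # Hypothetical flag: by default, an Op would not accept None inputs,
    # and the machinery that calls perform() would enforce that.
    accepts_none_inputs = False

class SomeNoneFriendlyOp(Op):
    # An Op that knows how to handle None would opt in explicitly.
    accepts_none_inputs = True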

- Olivier

Yoshua Bengio

Feb 11, 2010, 10:38:29 AM
to thean...@googlegroups.com

How about 'missing inputs'? None would be a convenient way to
represent that.
If some inputs are missing, an Op could decide to raise an error, or
it could decide to output 'missing' as well, so I would vote for A.

-- Yoshua

Olivier Delalleau

Feb 11, 2010, 10:51:41 AM
to theano-dev
On Feb 11, 10:35, Olivier Delalleau <olivier.delall...@gmail.com>
wrote:

> I like the idea of being able to use None in Ops. None is handy in
> some situations (like default values, initialization), and is cheap.
> It may look pointless in graphs manipulating data arrays, but for more
> computations with Ops being viewed as generic functions, I think it'd
> be sad to prevent people from using None.

Here's an example of what I mean by "generic functions": say you
create an Op "isOp" that takes as input two objects a and b, and
outputs the boolean "a is b". "None" may be a perfectly valid value
for a and/or b, that has no special meaning (like being a missing
input).

--
Olivier

Pascal Lamblin

Feb 11, 2010, 11:06:20 AM
to thean...@googlegroups.com
On Thu, Feb 11, 2010, Olivier Delalleau wrote:
> Here's an example of what I mean by "generic functions": say you
> create an Op "isOp" that takes as input two objects a and b, and
> outputs the boolean "a is b". "None" may be a perfectly valid value
> for a and/or b, that has no special meaning (like being a missing
> input).

In that case, wouldn't 'Generic' be the thing to use?
I would think that 'None' isn't a valid value for a tensor, but can be
a valid value for other types, including Generic.

I'm not sure of what that would imply for the existing ops, however.
--
Pascal

Razvan Pascanu

Feb 11, 2010, 11:20:22 AM
to thean...@googlegroups.com
I also like the idea of using None for inputs and outputs. This would actually come in handy for what I am working on right now with Mike (and what Xavier did), namely dealing with missing inputs for SdAs.

Maybe a crazy suggestion, but using None for Ops might be a step towards some form of lazy evaluation in Theano (at least for some special Ops, like switch). Of course, more is needed besides allowing None for that.

Olivier Delalleau

Feb 11, 2010, 11:22:50 AM
to theano-dev

Right, I kinda missed the part in James' post that says "when their
inputs are supposed to be something in particular, such as
numpy.ndarrays". I'm fine with forbidding None in such a case
(although I hope that if one wants to make an Op taking either an
array or None as input, saying the input is generic is not going to
actually make a big difference in computations / optimizations,
besides the lack of type-checking and the risks associated with it).

In which case the example given here, "default", is not really a
problem since it seems to be a generic Op whose inputs should probably
be declared as generic, no?

--
Olivier

James Bergstra

Feb 11, 2010, 12:15:53 PM
to thean...@googlegroups.com
I can appreciate the utility of using None as a value, and I agree
that DebugMode should permit None.

You are also right that None is a perfectly valid output right now, as
long as that output has no clients (is not the input to anything).

So the main question is whether None is a suitable value for a
TensorVariable. On reflection I think I have to agree with you that
it is. Containers implement the logic that None can be assigned to
any .value field, regardless of Type. The C implementation of
TensorVariable handles None just fine, and most if not all Op c_code
implementations do check for NULL inputs, I think.

So I guess the fault must go to DebugMode. (Damn).

James

--
http://www-etud.iro.umontreal.ca/~bergstrj

James Bergstra

Feb 11, 2010, 12:24:17 PM
to thean...@googlegroups.com
On Thu, Feb 11, 2010 at 11:22 AM, Olivier Delalleau
<olivier....@gmail.com> wrote:
> On Feb 11, 11:06, Pascal Lamblin <lambl...@iro.umontreal.ca> wrote:
>> On Thu, Feb 11, 2010, Olivier Delalleau wrote:
>> > Here's an example of what I mean by "generic functions": say you
>> > create an Op "isOp" that takes as input two objects a and b, and
>> > outputs the boolean "a is b". "None" may be a perfectly valid value
>> > for a and/or b, that has no special meaning (like being a missing
>> > input).
>>
>> In that case, wouldn't 'Generic' be the thing to use?

Not really... if a Variable is of Type Generic, then you can hardly use
it in any Theano expression.

We would need to add an Op that converts Generic->Tensor, i.e. one that
tries to convert whatever the generic variable holds into a Tensor with
a certain number of dimensions and a certain dtype. This sounds like a
pretty reasonable Op, but we don't have it.

>> I would think that 'None' isn't a valid value for a tensor, but can be
>> a valid value for other types, including Generic.

I thought so at first too, but there are several cases
(Container.value, unused outputs, most c_code(), etc.) in which None
is de facto valid, and for good reasons.

So I think that we should just admit that None is a valid value for
*any* Type, not just Generic.

>>
>> I'm not sure of what that would imply for the existing ops, however.
>
> Right, I kinda missed the part in James' post that says "when their
> inputs are supposed to be something in particular, such as
> numpy.ndarrays". I'm fine with forbidding None in such a case
> (although I hope that if one wants to make an Op taking either an
> array or None as input, saying the input is generic is not going to
> actually make a big difference in computations / optimizations
> (besides the lack of type-checking and the risks associated with it)).
>
> In which case the example given here, "default", is not really a
> problem since it seems to be a generic Op whose inputs should probably
> be declared as generic, no?

default() is generic in the sense that it should work on any
*particular* type and have an output of the *same* type. It isn't
necessary that those types be the Generic type.

For example, default(scalar(), scalar()) should return a scalar, right?

--
http://www-etud.iro.umontreal.ca/~bergstrj

Pascal Lamblin

Feb 11, 2010, 12:58:44 PM
to thean...@googlegroups.com
On Thu, Feb 11, 2010, James Bergstra wrote:
> You are also right that None is a perfectly valid output right now, as
> long as that output has no clients (is not the input to anything).
>
> So the main question is whether None is a suitable value for a
> TensorVariable. On reflection I think I have to agree with you that
> it is. Containers implement the logic that None can be assigned to
> any .value field, regardless of Type. The C implementation of
> TensorVariable handles None just fine, and most if not all Op c_code
> implementations do check for NULL inputs, I think.

So, all the code handling Containers directly accepts None as a value
for an array, and I understand it can be useful in some circumstances.

However, if we accept None as an input at the Op level, we need to
define what the expected behavior of the Ops should be, in general, when
encountering None as an input. More importantly, this behavior has to be
natural for the users, and not get in the way. It should especially not
make debugging harder.

For me, the natural, expected result of "1 + None" is really not clear.
Should "None" be silently propagated by Ops, like "NaN" sometimes is?
I think it would be an important source of problems, and not an easily
trackable one.
In some cases, we can treat None as a "neutral" value, or 0, so 1 + None
would be equal to 1. But then how about 1 * None? a[None]? exp(None)?

In my opinion, allowing None as an input at the Op level is not
the solution to any precise problem we have, but it would bring
lots of them, and not only the "how-do-we-do-that" type, but the
"what-should-we-do" type.

> So I guess the fault must go to DebugMode. (Damn).

Regarding DebugMode, maybe the default value of an Op's output (if it's
not assigned at all in the code) should not be None if None is a missing
value. That way, DebugMode would tell the difference between "user wants
the output to be None" and "user forgot to set / improperly set the
output". This could be useful even in the other Modes, after all.

Hth,
--
Pascal

Olivier Breuleux

Feb 11, 2010, 1:01:51 PM
to thean...@googlegroups.com
On Thu, Feb 11, 2010 at 9:42 AM, James Bergstra
<james.b...@gmail.com> wrote:
> In that view (D), default() is useless, because we shouldn't be able
> to pass None for x.  So what was default() for anyway?  How is it
> different from default-values for inputs when there are not any Ops
> [that I know of] that could produce a None in the middle of a graph
> anyway?

default() is for situations like

y_0 = f(x)
y_n = g(x, y_{n-1})
-->
y = matrix()
tmpy = default(y, f(x))
newy = g(x, tmpy)

There is no appropriate default value for y since y_0 is itself a
Theano computation. Alternate solutions would be to use a degenerate
tensor with shape [0]*ndim (contrived, doesn't work for scalars, and
I'm not even sure we support these properly) or to add an n variable and
use switch() (but then you need to track n in addition to y; see the
sketch below). I found default() to be the cleanest and most economical
solution, noting that None is already the default state of an input.
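
Here is a rough sketch of that switch()-based alternative (f and g stand
in for the user's symbolic functions, as in the pseudocode above):

import theano.tensor as tensor

# Track an explicit step counter n instead of letting y's storage be
# None on the first step.
x = tensor.matrix('x')
n = tensor.iscalar('n')
y = tensor.matrix('y')
tmpy = tensor.switch(tensor.eq(n, 0), f(x), y)  # y_0 = f(x)
newy = g(x, tmpy)                               # y_n = g(x, y_{n-1})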

Olivier

Olivier Delalleau

Feb 11, 2010, 1:30:03 PM
to theano-dev
On Feb 11, 12:58, Pascal Lamblin <lambl...@iro.umontreal.ca> wrote:
> For me, the natural, expected result of "1 + None" is really not clear.
> Should "None" be silently propagated by Ops, like "NaN" sometimes is?
> I think it would be an important source of problems, and not an easily
> trackable one.

I think it should be up to the "+" Op to crash so we know something
weird's going on. Unless the Op is called something like
"AddWithNoneTreatedAsZero", so we are aware of what kind of behavior
this Op has.

--
Olivier

Olivier Breuleux

Feb 11, 2010, 1:38:41 PM
to thean...@googlegroups.com
On Thu, Feb 11, 2010 at 12:58 PM, Pascal Lamblin
<lamb...@iro.umontreal.ca> wrote:
> On Thu, Feb 11, 2010, James Bergstra wrote:
>> You are also right that None is a perfectly valid output right now, as
>> long as that output has no clients (is not the input to anything).
>>
>> So the main question is whether None is a suitable value for a
>> TensorVariable.  On reflection I think I have to agree with you that
>> it is. Containers implement the logic that None can be assigned to
>> any .value field, regardless of Type.  The C implementation of
>> TensorVariable handles None just fine, and most if not all Op c_code
>> implementations do check for NULL inputs, I think.
>
> So, all the code handling Containers directly accepts None as a value
> for an array, and I understand it can be useful in some circumstances.

It doesn't just "accept" None. It is the default value. If you don't
initialize an input or a state variable, it will be None.

> However, if we accept None as an input at the Op level, we need to
> define what the expected behavior of the Ops should be, in general, when
> encountering None as an input. More importantly, this behavior has to be
> natural for the users, and not get in the way. It should especially not
> make debugging harder.

No C Op currently accepts None, thanks to the following code:

if (py_%(name)s == Py_None) {
    // We can either fail here or set %(name)s to NULL and rely on Ops
    // using tensors to handle the NULL case, but if they fail to do so
    // they'll end up with nasty segfaults, so this is public service.
    PyErr_SetString(PyExc_ValueError, "expected an ndarray, not None");
    %(fail)s
}

In hindsight I might have done this differently now, but I think we
might be stuck with that behavior since most C Ops assume the tensors
to be well formed and we don't want to fix all of them.

Python implementations, on the other hand, can handle None.

> For me, the natural, expected result of "1 + None" is really not clear.
> Should "None" be silently propagated by Ops, like "NaN" sometimes is?
> I think it would be an important source of problems, and not an easily
> trackable one.
> In some cases, we can treat None as a "neutral" value, or 0, so 1 + None
> would be equal to 1. But then how about 1 * None? a[None]? exp(None)?

None is useful to mark an intentionally uninitialized value. "1 +
None" has to be an error and so do all the other examples you gave.
And they are already errors, so we have nothing to change. That does
not mean, however, that some specific ops can't choose to handle
uninitialized values if it makes sense for them to do it.

> In my opinion, allowing None as an input at the Op level is not
> the solution to any precise problem we have, but it would bring
> lots of them, and not only the "how-do-we-do-that" type, but the
> "what-should-we-do" type.

None is already allowed. It means "missing input", but there are
situations where it's ok for inputs to be missing because the user
can't know what to put there. For instance, if a state is meant to be
initialized by the result of a computation.

>> So I guess the fault must go to DebugMode. (Damn).
>
> Regarding DebugMode, maybe the default value of an Op's output (if it's
> not assigned at all in the code) should not be None if None is a missing
> value. That way, DebugMode would tell the difference between "user wants
> the output to be None" and "user forgot to set / improperly set the
> output". This could be useful even in the other Modes, after all.

Problem: Theano's specifications say that the output storage may
either contain what the Op previously allocated or None. This means
many Ops explicitly check for None being in their own output storage
to know if they need to allocate something. This also means Ops can
get away with not setting the output, if the output is already what
they want to return, so I don't think it is even possible to reliably
check if an Op forgets to set its output.
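
For instance, a typical perform implementing that contract looks roughly
like this (an illustrative sketch, with element-wise exp standing in for
the real computation):

import numpy

def perform(self, node, inputs, output_storage):
    x, = inputs
    z = output_storage[0]
    # z[0] is either None (nothing allocated yet) or whatever this Op
    # previously allocated.
    if z[0] is None or z[0].shape != x.shape:
        z[0] = numpy.empty_like(x)  # (re)allocate only when needed
    numpy.exp(x, z[0])              # write the result into the buffer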

Olivier

James Bergstra

Feb 11, 2010, 3:17:18 PM
to thean...@googlegroups.com

I see what you mean. I was thinking of the form of your f and g
though... I couldn't think of anything in Theano right now except for
default() itself that could produce None as an output. Of course,
you or anyone else might have written more Ops that do produce None
output. I was (and continue to be) reluctant to revise DebugMode to
accommodate what seems like a relatively hypothetical case.

Discussing with Pascal, we also came to recognize a few other problems
with allowing None as a valid value for TensorVariables.

If None is a valid TensorVariable value, then we sort of force
ourselves to document and enforce how *every Op* deals with every
possible case of None-valued inputs. Like what does Theano think is
(None + 5) ? What about None[3:5] ? Should None behave like a scalar
NaN? Should it behave like an n-dimensional NaN? should it compare
true with itself? etc. etc. These are all really dumb questions to
have to spend time on in a way, because None is not a tensor.

There is also the problem of testability. Right now, it is an easy
mistake for a new Op-writer to type out[0] = result rather than
out[0][0] = result or something like that, or maybe out=result.
DebugMode can catch that right now because None is the initial output
value, so if it is the final output value too then there is a problem.
It seems in theory an obvious fix that the initial value for outputs
should be a special UNDEFINED symbol rather than None, but it isn't
obvious how to implement it... should every Mode do this? Should
containers do this? Or should DebugMode do this internally?
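
The sentinel itself would be trivial, something like the sketch below;
the plumbing is the hard part:

# A distinct sentinel: unlike None, no computation can produce it by
# accident, so "never written" and "wrote None" stay distinguishable.
class _Undefined(object):
    def __repr__(self):
        return 'UNDEFINED'

UNDEFINED = _Undefined()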

Moreover, if all of these changes are made, then it actually *removes*
one of the reasons that made me think that de-facto, None is a valid
TensorType value. For example, right now, if you assign None to a
Container's value, then it skips the Type.filter call, and puts the
None right in. So it effectively implements the idea that None is a
valid value for any Type, regardless of what the Type.filter would
say. If we use an UNDEFINED symbol, then it implements the idea that
any Type can be UNDEFINED regardless of what type.filter would say.
In that case, None is *no longer* a valid value for that Type, unless
the filter() function said so. I don't think TensorType.filter(None)
should work.

--
http://www-etud.iro.umontreal.ca/~bergstrj

James Bergstra

Feb 11, 2010, 3:35:10 PM
to thean...@googlegroups.com
On Thu, Feb 11, 2010 at 1:38 PM, Olivier Breuleux <breu...@gmail.com> wrote:
>>> So I guess the fault must go to DebugMode. (Damn).
>>
>> Regarding DebugMode, maybe the default value of an Op's output (if it's
>> not assigned at all in the code) should not be None if None is a missing
>> value. That way, DebugMode would tell the difference between "user wants
>> the output to be None" and "user forgot to set / improperly set the
>> output". This could be useful even in the other Modes, after all.
>
> Problem: Theano's specifications say that the output storage may
> either contain what the Op previously allocated or None. This means
> many Ops explicitly check for None being in their own output storage
> to know if they need to allocate something. This also means Ops can
> get away with not setting the output, if the output is already what
> they want to return, so I don't think it is even possible to reliably
> check if an Op forgets to set its output.
>
> Olivier
>

You have a good point... How about printing a warning, though, if after
executing an Op all of its outputs are None? It's not necessarily an
error, but I'm guessing that it often would be.

Maybe DebugMode can look for an Op attribute like
_may_produce_all_None_output, or something completely specific like
that, to suppress the warning when it's the expected behaviour; see the
sketch below.
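
Something along these lines (a sketch; node and output_storage are
whatever DebugMode has at hand after running the Op, and the attribute
name is made up):

import warnings

if all(cell[0] is None for cell in output_storage):
    if not getattr(node.op, '_may_produce_all_None_output', False):
        warnings.warn('%s left all of its outputs set to None' % node.op)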

--
http://www-etud.iro.umontreal.ca/~bergstrj

Olivier Breuleux

Feb 11, 2010, 3:59:36 PM
to thean...@googlegroups.com

I don't think any Ops produce None as an output, besides those that
just pass on one of their inputs. But who knows, it might come in
handy in the future.

> Discussing with Pascal, we also came to recognize a few other problems
> with allowing None as a valid value for TensorVariables.
>
> If None is a valid TensorVariable value, then we sort of force
> ourselves to document and enforce how *every Op* deals with every
> possible case of None-valued inputs.  Like what does Theano think is
> (None + 5) ?  What about None[3:5] ?  Should None behave like a scalar
> NaN?  Should it behave like an n-dimensional NaN? should it compare
> true with itself? etc. etc.  These are all really dumb questions to
> have to spend time on in a way, because None is not a tensor.

We don't have to spend time on it. There is nothing wrong with the
current behavior.

In C, NULL is a valid pointer that functions manipulating pointers
need to take care of, and it is useful to have it. When you pass NULL
to a function that documents no support for it, you expect a failure
or a segfault, that's about it, and that's what Theano gives you. A
Theano type is like (Maybe type = Nothing | Just type) in Haskell. We
just set Nothing as a valid value for any type from the get go and an
Op can either handle it specially or reject it altogether.

> There is also the problem of testability.  Right now, it is an easy
> mistake for a new Op-writer to type out[0] = result rather than
> out[0][0] = result or something like that, or maybe out=result.
> DebugMode can catch that right now because None is the initial output
> value, so if it is the final output value too then there is a problem.
>  It seems in theory an obvious fix that the initial value for outputs
> should be a special UNDEFINED symbol rather than None, but it isn't
> obvious how to implement it... should every Mode do this? Should
> containers do this? Or should DebugMode do this internally?

Even this policy has issues: if an Op checks what's in its container
to decide whether to allocate something or not, with UNDEFINED, you
still can't test the correctness of its behavior when its last
allocation is already in the container. If you want to catch these
errors, the right policy would probably be to override __setitem__ in
the storage passed by out, though that might not work for C
implementations.

Frankly, I'm not even sure what the point is here. If you write an Op
and you mess up setting the output, it's difficult to miss. We just
need an Op writing FAQ with a "My Op returns None. What's up with
that?" entry.

> Moreover, if all of these changes are made, then it actually *removes*
> one of the reasons that made me think that de-facto, None is a valid
> TensorType value.  For example, right now, if you assign None to a
> Container's value, then it skips the Type.filter call, and puts the
> None right in.  So it effectively implements the idea that None is a
> valid value for any Type, regardless of what the Type.filter would
> say. If we use an UNDEFINED symbol, then it implements the idea that
> any Type can be UNDEFINED regardless of what type.filter would say.
> In that case, None is *no longer* a valid value for that Type, unless
> the filter() function said so.  I don't think TensorType.filter(None)
> should work.

How about

UNDEFINED = None

Olivier

Olivier Delalleau

Feb 11, 2010, 4:12:26 PM
to theano-dev
On Feb 11, 15:17, James Bergstra <james.bergs...@gmail.com> wrote:
> Discussing with Pascal, we also came to recognize a few other problems
> with allowing None as a valid value for TensorVariables.
>
> If None is a valid TensorVariable value, then we sort of force
> ourselves to document and enforce how *every Op* deals with every
> possible case of None-valued inputs.  Like what does Theano think is
> (None + 5) ?  What about None[3:5] ?  Should None behave like a scalar
> NaN?  Should it behave like an n-dimensional NaN? should it compare
> true with itself? etc. etc.  These are all really dumb questions to
> have to spend time on in a way, because None is not a tensor.

I don't know, it seems rather easy to me to say that any Op that does
not explicitly / obviously handle None should just crash with a None
input. So (None + 5) = crash, None[3:5] = crash. Should None behave
like a scalar NaN? No. Should it behave like an n-dimensional NaN? No.
Should it compare true with itself? Yes (because it's how it is in
Python). Basically, None would behave like one would expect it to
behave in Python.

Another use case of None I just thought of is the following: an Op
where you can optionally provide as input an array whose content will
be overwritten with the result of some computation by the Op (I believe
that's something is allo. And the ability to provide None to tell the
Op that this computation should not be saved is simpler and cleaner
than something like "an empty array". Now I wouldn't be surprised if
there were other ways to achieve this, but using None seems
natural to me here.

--
Olivier

Olivier Delalleau

Feb 11, 2010, 4:15:08 PM
to theano-dev
On Feb 11, 16:12, Olivier Delalleau <olivier.delall...@gmail.com>
wrote:

> be overwritten with the result of some computation by the Op (I believe
> that's something is allo.

Oops, sorry for not reading my email before hitting send, ignore the
end of this sentence (I actually went to check the doc and realized it
was possible with the destroy_map, then forgot to remove this ;).

--
Olivier

Olivier Breuleux

Feb 11, 2010, 4:18:35 PM
to thean...@googlegroups.com
This said, there *is* some elegance in the Haskell way. We could
forbid both None and UNDEFINED as valid values for a type, and add a
Maybe class that explicitly adds None/UNDEFINED to the pool of
accepted values. Thus the type signature of default would become
(Maybe TensorType -> TensorType -> TensorType), or in general (Maybe a
-> a -> a). Essentially, matrix('x') could not be None, but
Maybe(matrix)('x') could. It is less error prone and we can have nicer
error messages, so if the change can be made easily, I support it.
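
A rough sketch of such a wrapper, assuming the current Type interface
(filter plus equality); nothing like this exists yet:

from theano import gof

class Maybe(gof.Type):
    # A Type whose domain is the wrapped type's domain, plus None.
    def __init__(self, base_type):
        self.base_type = base_type

    def filter(self, data, strict=False):
        if data is None:
            return None  # None is part of Maybe(T)'s domain
        return self.base_type.filter(data, strict)

    def __eq__(self, other):
        return type(other) == Maybe and other.base_type == self.base_type

    def __hash__(self):
        return hash((Maybe, self.base_type))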

Olivier

Pascal Lamblin

Feb 11, 2010, 4:53:27 PM
to thean...@googlegroups.com
On Thu, Feb 11, 2010, Olivier Breuleux wrote:
> No C Op currently accepts None, thanks to the following code:
>
> if (py_%(name)s == Py_None) {
>     // We can either fail here or set %(name)s to NULL and rely on Ops
>     // using tensors to handle the NULL case, but if they fail to do so
>     // they'll end up with nasty segfaults, so this is public service.
>     PyErr_SetString(PyExc_ValueError, "expected an ndarray, not None");
>     %(fail)s
> }

Indeed, that's an interesting safeguard. Now, there are two problems
left:
- It is not automatic, meaning someone writing a new implementation
will not always think about it, and can assume that kind of
filtering has already been done. Indeed, in the examples on
<http://deeplearning.net/software/theano/extending/cop.html>, this code
is not called.
- Some similar code has to be added to the Python implementation of
(almost) each and every Op.

> > For me, the natural, expected result of "1 + None" is really not clear.
> > Should "None" be silently propagated by Ops, like "NaN" sometimes is?
> > I think it would be an important source of problems, and not an easily
> > trackable one.
> > In some cases, we can treat None as a "neutral" value, or 0, so 1 + None
> > would be equal to 1. But then how about 1 * None? a[None]? exp(None)?
>
> None is useful to mark an intentionally uninitialized value. "1 +
> None" has to be an error and so do all the other examples you gave.
> And they are already errors, so we have nothing to change.

OK, let's try, with linker 'py' to avoid the C implementation.
>>> m = tensor.matrix()
>>> f = theano.function([m], m+1, mode=theano.Mode(linker='py'))
>>> f(None)
[...]
AttributeError: ("'NoneType' object has no attribute 'shape'", Elemwise{add,no_inplace}(<TensorType(float64, matrix)>, InplaceDimShuffle{x,x}.0))

It crashes indeed, but it's not really understandable.

Let's try another one:
>>> g = theano.function([m], m*1, mode=theano.Mode(linker='py'))
>>> print g(None)
None

>>> h = theano.function([m], m+0, mode=theano.Mode(linker='py'))
>>> print h(None)
None

So, in some cases, 1 * None = None, and None + 0 = None.
I consider this a bug, and I don't think we "have nothing to change".

So I see two solutions:
- manually add safeguards against None to almost all Python
implementations of Ops,
- add a global mechanism preventing None as inputs to Ops, possibly
with a way of disabling it for the cases where we really want None to be
an accepted value.

--
Pascal

Olivier Breuleux

Feb 11, 2010, 5:13:03 PM
to thean...@googlegroups.com
On Thu, Feb 11, 2010 at 4:53 PM, Pascal Lamblin
<lamb...@iro.umontreal.ca> wrote:
> On Thu, Feb 11, 2010, Olivier Breuleux wrote:
>> No C Op currently accepts None, thanks to the following code:
>>
>>         if (py_%(name)s == Py_None) {
>>             // We can either fail here or set %(name)s to NULL and rely
>>             // on Ops using tensors to handle the NULL case, but if they
>>             // fail to do so they'll end up with nasty segfaults, so this
>>             // is public service.
>>             PyErr_SetString(PyExc_ValueError, "expected an ndarray, not None");
>>             %(fail)s
>>         }
>
> Indeed, that's an interesting safeguard. Now, there are two problems
> left:
>  - It is not automatic, meaning someone writing a new implementation
> will not always think about it, and can assume that kind of
> filtering has already been done. Indeed, in the examples on
> <http://deeplearning.net/software/theano/extending/cop.html>, this code
> is not called.
>  - Some similar code has to be added to the Python implementation of
> (almost) each and every Op.

It is automatic. The code is in TensorType.c_extract and it should
also be in ScalarType.c_extract, though I didn't look. That code is
always executed when transferring data from Python, before an Op gets
to do anything.

I'm pretty sure it is the optimizer's fault. m*1 and m+0 are optimized
to m. All in all it is not a big issue, unless you plan to compile
identity.

> So I see two solutions:
>  - manually add safeguards against None to almost all Python
> implementations of Ops,

It will not work, unless the optimizer simplifies m+0 to
CrashIfNone(m), and that seems sort of futile. The error messages
could be improved, though, I'll agree with that.

>  - add a global mechanism preventing None as inputs to Ops, possibly
> with a way of disabling it for the cases where we really want None to be
> an accepted value.

Or make it so that None is not an accepted value of certain types such
as TensorType. No need to make a mechanism more general than it should
be. The "way of disabling it" then becomes extremely simple: a type
constructor that wraps any other type and accepts None, i.e. matrix ->
Maybe(matrix).

Olivier

Olivier Breuleux

Feb 11, 2010, 5:18:43 PM
to thean...@googlegroups.com
In fact the tutorial on making a new Type does reject None:

http://deeplearning.net/software/theano/extending/ctype.html#defining-the-methods

See c_extract. So none of the Ops on the page after that needs to
handle this case at all.

Olivier

Pascal Lamblin

Feb 11, 2010, 5:36:00 PM
to thean...@googlegroups.com
On Thu, Feb 11, 2010, Olivier Breuleux wrote:
> >> No C Op currently accepts None, thanks to the following code:
>
> It is automatic. The code is in TensorType.c_extract and it should
> also be in ScalarType.c_extract, though I didn't look. That code is
> always executed when transferring data from Python, before an Op gets
> to do anything.

So, currently, there's no way for a C Op to accept None as a value,
right? But we still want Python Ops to accept them?

> > So, in some cases, 1 * None = None, and None + 0 = None.
> > I consider this a bug, and I don't think we "have nothing to change".
>
> I'm pretty sure it is the optimizer's fault. m*1 and m+0 are optimized
> to m. All in all it is not a big issue, unless you plan to compile
> identity.

Well, should identity(None) crash, or return None?
What about, say, first(None, 0)? first(None, None)?
If we decide to allow None as inputs to Ops, we will have to answer
those questions.

> > So I see two solutions:
> >  - manually add safeguards against None to almost all Python
> > implementations of Ops,
>
> It will not work, unless the optimizer simplifies m+0 to
> CrashIfNone(m), and that seems sort of futile. The error messages
> could be improved, though, I'll agree with that.
>
> >  - add a global mechanism preventing None as inputs to Ops, possibly
> > with a way of disabling it for the cases where we really want None to be
> > an accepted value.
>
> Or make it so that None is not an accepted value of certain types such
> as TensorType.

That's what I meant.
--
Pascal

James Bergstra

Feb 11, 2010, 5:44:41 PM
to thean...@googlegroups.com
On Thu, Feb 11, 2010 at 4:18 PM, Olivier Breuleux <breu...@gmail.com> wrote:
> This said, there *is* some elegance in the Haskell way. We could
> forbid both None and UNDEFINED as valid values for a type, and add a
> Maybe class that explicitly adds None/UNDEFINED to the pool of
> accepted values. Thus the type signature of default would become
> (Maybe TensorType -> TensorType -> TensorType), or in general (Maybe a
> -> a -> a). Essentially, matrix('x') could not be None, but
> Maybe(matrix)('x') could. It is less error prone and we can have nicer
> error messages, so if the change can be made easily, I support it.
>
> Olivier

This was my preferred solution too, but I think this requires
addressing the same technical challenges as when trying to add shape
or stride information to TensorType.

The trouble is that we don't want to say that MaybeTensorType ==
TensorType, but we do want many optimizations on TensorType to also
work on MaybeTensorType.

TensorType is slightly more restrictive than MaybeTensorType, and we
didn't plan ahead when designing Types to account for the various
set-relationships that might exist between Types.

Not that someone shouldn't try to do it this way... it would be great
to have a stronger type system.
--
http://www-etud.iro.umontreal.ca/~bergstrj

Olivier Breuleux

Feb 11, 2010, 6:08:11 PM
to thean...@googlegroups.com
On Thu, Feb 11, 2010 at 5:44 PM, James Bergstra
<james.b...@gmail.com> wrote:
> On Thu, Feb 11, 2010 at 4:18 PM, Olivier Breuleux <breu...@gmail.com> wrote:
>> This said, there *is* some elegance in the Haskell way. We could
>> forbid both None and UNDEFINED as valid values for a type, and add a
>> Maybe class that explicitly adds None/UNDEFINED to the pool of
>> accepted values. Thus the type signature of default would become
>> (Maybe TensorType -> TensorType -> TensorType), or in general (Maybe a
>> -> a -> a). Essentially, matrix('x') could not be None, but
>> Maybe(matrix)('x') could. It is less error prone and we can have nicer
>> error messages, so if the change can be made easily, I support it.
>>
>> Olivier
>
> This was my preferred solution too, but I think this requires
> addressing the same technical challenges as when trying to add shape
> or stride information to TensorType.
>
> The trouble is that we don't want to say that MaybeTensorType ==
> TensorType, but we do want many optimizations on TensorType to also
> work on MaybeTensorType.

No, not really, we don't. The point is, TT is not the same type as
Maybe TT. We don't really want to support arithmetic operators on
Maybe TT, which would only be for ops like the default op (and note
that the type signature I gave for the default op under that system
would always output TT, not Maybe TT).

Optimization-wise, there is nothing to do, much to the contrary: x+0
-> x is true only if x is a scalar or a tensor. If x is None, we
expect this to crash and in that situation optimization would change
the behavior. Currently, some optimizations are technically wrong
because one value of the domain (None) does not behave numerically,
although I would say that this is not a significant problem.

> TensorType is slightly more restrictive than MaybeTensorType, and we
> didn't plan ahead when designing Types to account for the various
> set-relationships that might exist between Types.
>
> Not that someone shouldn't try to do it this way... it would be great
> to have a stronger type system.

We don't really need to do that. TT and Maybe TT need not be
interoperable, it should only be possible to cast from one to the
other, and of course a cast from Maybe TT to TT would fail if the data
is None, so that the output is indeed guaranteed to be a tensor, as
one would expect. I believe this is relatively easy to implement.

Olivier

James Bergstra

Feb 11, 2010, 6:12:21 PM
to thean...@googlegroups.com

If you don't care about making them interoperable, then would Generic
fit the bill?
(Along with the cast you mention, from Generic->TensorType)

--
http://www-etud.iro.umontreal.ca/~bergstrj

Olivier Breuleux

Feb 11, 2010, 6:16:19 PM
to thean...@googlegroups.com
You've pretty much convinced me that removing None from TensorType's
domain is a good idea and it would be better if it worked like that.
The issue is more about implementing that change versus doing nothing.
Now, I don't know if this issue has caused any significant problems in
the past (I'd wager that it did not), but I don't think doing nothing
is a big deal.

So essentially, I think it's worth trying just to see if it can be
done easily, and if it can't, too bad, let's just leave it at that.
The current behavior is sensible, even if it's not perfect.

Olivier

Olivier Breuleux

Feb 11, 2010, 6:21:07 PM
to thean...@googlegroups.com
In theory, Generic fits any bill. The issue here, though, is what
TensorType you cast Generic to: is it a matrix, a vector or what? At
that point, you don't know.

Olivier

James Bergstra

Feb 11, 2010, 7:04:32 PM
to thean...@googlegroups.com
Perhaps slightly off-topic, I was thinking that it might be useful to
have a general cast function. This Op would work on any input-output
type pair, and the implementation of perform would look something like
this:

out[0][0] = node.outputs[0].type.filter(input[0], strict=False)

If the cast fails, then it fails. If it succeeds, then the output is
guaranteed to have the required properties.
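
Fleshed out slightly (still just a sketch, not an existing Op):

from theano import gof

class Cast(gof.Op):
    # out_type is the Type we want to cast to.
    def __init__(self, out_type):
        self.out_type = out_type

    def make_node(self, x):
        return gof.Apply(self, [x], [self.out_type()])

    def perform(self, node, inputs, output_storage):
        # filter() validates/converts, and raises if the cast fails.
        output_storage[0][0] = node.outputs[0].type.filter(
                inputs[0], strict=False)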

James

--
http://www-etud.iro.umontreal.ca/~bergstrj
