Theano.clone with Multiple Replacements


John Coolidge

Oct 14, 2016, 6:37:02 PM
to theano-users
Hello,

I'm trying to use theano.clone to implement dropout in my MLP network.  Because I want to apply dropout at multiple layers, I pass the clone call multiple key-value pairs through its replace parameter: replace={layer1:mask*layer1, layer2:mask*layer2, etc}. However, the graph that's returned seems to have made only one of the replacements.  I suspect this is because clone does the replacements sequentially: once it has done one replacement, it has generated a new graph to which the other key-value pairs no longer correspond.

Here is some example code that demonstrates the unexpected behavior:

    import theano
    import theano.tensor as T

    v = T.lscalar()
    exp1 = 2*v
    exp2 = 4*exp1
    exp3 = 6*exp2
    exp4 = 8*exp3

    print(theano.pp(exp4))
    # Ask clone to apply three replacements in a single call
    exp5 = theano.clone(exp4, replace={exp1:(3*exp1), exp2:(5*exp2), exp3:(7*exp3)})
    print(theano.pp(exp5))
    t = theano.function(inputs=[v], outputs=exp5)
    print(t(1))


The output is:
(TensorConstant{8} * (TensorConstant{6} * (TensorConstant{4} * (TensorConstant{2} * <TensorType(int64, scalar)>))))
(TensorConstant{8} * (TensorConstant{7} * (TensorConstant{6} * (TensorConstant{4} * (TensorConstant{2} * <TensorType(int64, scalar)>)))))
2688

Although the clone adds the factor of 7 to the new graph, it does not add the factors of 3 or 5, so the output for an input value of 1 is 8*7*6*4*2*1 = 2688 instead of 8! = 40320, as I would have expected.
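
As a quick sanity check of that arithmetic, here is a small plain-Python sketch (nothing Theano-specific; the variable names are made up for illustration). It multiplies the factors that actually ended up in the cloned graph and compares the result with 8!:

```python
import math
from functools import reduce
from operator import mul

# Factors present in the graph returned by clone, for an input value of 1
observed = reduce(mul, [8, 7, 6, 4, 2, 1])
# Value expected if all three replacements had been applied
expected = math.factorial(8)

print(observed, expected)      # 2688 40320
print(expected // observed)    # 15, i.e. the missing factors 3 * 5
```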

I'm guessing this is how the clone function is supposed to work, but does anyone see how to get the behavior I'm looking for?  Perhaps I could apply the replacements one at a time and, after each replacement, update the remaining key-value pairs to point to the corresponding points in the new graph, but I'm not sure how to find these corresponding points.  Or perhaps there's a function like clone that makes the replacements in place, so that the other key-value pairs would not be invalidated after the first replacement?  Any ideas would be greatly appreciated!

Pascal Lamblin

Oct 14, 2016, 7:36:03 PM
to theano...@googlegroups.com
Hi,

Yes, it is an actual problem that we never managed to fix in a
satisfactory way. The current behaviour is inconsistent.

Doing the substitution one at a time is a workaround. I think Blocks
does that for dropout, but it can be cumbersome to have everything
cloned over and over again.

Another option, still experimental, may be the `map_variables` function
in scan_module/scan_utils.

Finally, it is actually possible to replace the inputs of Apply nodes
manually. In your case, you could do something like:

>>> exp2.owner.inputs[1] = 3*exp1
>>> exp3.owner.inputs[1] = 5*exp2
>>> exp4.owner.inputs[1] = 7*exp3
>>> print(theano.pp(exp4))
(TensorConstant{8} * (TensorConstant{7} * (TensorConstant{6} * (TensorConstant{5} * (TensorConstant{4} * (TensorConstant{3} * (TensorConstant{2} * <TensorType(int64, scalar)>)))))))
>>> exp4.eval({v: 1})
array(40320)

But it can get hard to get right if the same expression is re-used
several times.

--
Pascal

Pascal Lamblin

Oct 14, 2016, 8:13:53 PM
to theano...@googlegroups.com
On Sat, Oct 15, 2016, Pascal Lamblin wrote:
> Another option, still experimental, may be the `map_variables` function
> in scan_module/scan_utils.

There seem to be some challenges regarding scalar constants with that
function, but I was able to do the following:

>>> theano.tensor.basic.constant.enable = False
>>> v = theano.tensor.lscalar('v')
>>> exp1 = 2 * v
>>> exp1.name = 'exp1'
>>> exp2 = 4 * exp1
>>> exp2.name = 'exp2'
>>> exp3 = 6 * exp2
>>> exp3.name = 'exp3'
>>> exp4 = 8 * exp3
>>> exp4.name = 'exp4'
>>> replace_dict = {'exp1': (3*exp1), 'exp2': (5*exp2), 'exp3': (7*exp3)}
>>> def replace(var):
...     return replace_dict.get(var.name, var)
>>> exp5, = theano.scan_module.scan_utils.map_variables(replace, [exp4])
>>> theano.printing.debugprint(exp5)
Elemwise{mul,no_inplace} [id A] 'exp4'
 |TensorConstant{8} [id B]
 |Elemwise{mul,no_inplace} [id C] ''
   |TensorConstant{7} [id D]
   |Elemwise{mul,no_inplace} [id C] ''

The issue is that it introduced a cycle in the graph: it replaced exp3
by 7*exp3, where exp3 is the new one...

I guess that illustrates the challenge of getting replacements right.
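
To make the cycle problem concrete, here is a minimal sketch on a toy graph in plain Python. It is not Theano code and not how `clone` or `map_variables` work: `Var`, `Mul`, `rebuild`, and `evaluate` are hypothetical names made up for this illustration. The idea is that when a key shows up inside its own replacement, it is rebuilt as the plain original node rather than replaced again, which avoids the self-reference.

```python
class Var:
    """Leaf input of the toy graph."""
    def __init__(self, name):
        self.name = name


class Mul:
    """Toy node computing const * arg."""
    def __init__(self, const, arg):
        self.const = const
        self.arg = arg


def rebuild(node, replace, in_progress=frozenset()):
    """Return a copy of the graph with every replacement applied once."""
    if isinstance(node, Var):
        return node
    if node in replace and node not in in_progress:
        # Rebuild the replacement expression. While doing so, mark `node`
        # as in progress so that its occurrence inside the replacement is
        # copied as the original node instead of being replaced again.
        return rebuild(replace[node], replace, in_progress | {node})
    return Mul(node.const, rebuild(node.arg, replace, in_progress))


def evaluate(node, env):
    """Evaluate the toy graph given a dict of input values."""
    if isinstance(node, Var):
        return env[node.name]
    return node.const * evaluate(node.arg, env)


v = Var("v")
exp1 = Mul(2, v)
exp2 = Mul(4, exp1)
exp3 = Mul(6, exp2)
exp4 = Mul(8, exp3)

replace = {exp1: Mul(3, exp1), exp2: Mul(5, exp2), exp3: Mul(7, exp3)}
exp5 = rebuild(exp4, replace)
print(evaluate(exp5, {"v": 1}))  # 40320
```

This sketch does not memoize rebuilt nodes, so a sub-expression that is used in several places would be duplicated rather than shared, which is one of the things a real implementation would have to get right.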

John Coolidge

Oct 14, 2016, 10:08:24 PM
to theano-users
I see, thanks Pascal!  Shame map_variables doesn't do the trick in this case.  I think I'll go with the manual approach you recommended, as it seems the most efficient and relatively straightforward in my case.