As I'm relatively new to Theano, I was wondering whether anybody currently
has a working RNN implementation available (BPTT or RTRL).
It would be really nice if they could post their code for study purposes,
or some tips on how to start ;) I will post my code if I manage to do this ;)
(I'm thinking about a Hessian-free (Martens) implementation as well ;),
but at the moment I'm fighting with some language (Theano) related
issues ;)
----------------------------------------------------------------------
# I think this should be the right way to start
----------------------------------------------------------------------
1. An Op that takes three arguments (hidden_step, hidden_step_grad, params).
2. The Op implements a transformation from (hidden_init, input) -> (output,).
3. The Op contains the recurrence within itself (see the rough skeleton
   sketched below).
----------------------------------------------------------------------
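A hypothetical skeleton of such an Op, just to make the three points concrete
(the class name, callables, and types are assumptions for illustration, and
the replies below recommend scan over a custom op anyway):

    import theano

    class RecurrentOp(theano.Op):
        # hypothetical skeleton for points 1-3 above; hidden_step and
        # hidden_step_grad are assumed to be plain Python callables
        def __init__(self, hidden_step, hidden_step_grad, params):
            self.hidden_step = hidden_step            # (h_tm1, x_t) -> h_t
            self.hidden_step_grad = hidden_step_grad  # for grad(), omitted here
            self.params = params

        def make_node(self, hidden_init, input):
            # point 2: (hidden_init, input) -> (output,)
            return theano.Apply(self, [hidden_init, input], [hidden_init.type()])

        def perform(self, node, inputs, output_storage):
            h, x = inputs
            for x_t in x:          # point 3: the recurrence lives inside the Op
                h = self.hidden_step(h, x_t)
            output_storage[0][0] = h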
Thanks in advance!
..
Mat
I have some code to do recurrence that is available as part of
http://code.google.com/p/pynnet/. It does BPTT only though.
But if you want to write your own thing, you should use scan [
http://deeplearning.net/software/theano/library/scan.html ] rather
than write a new recurrent op.
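For anyone following along, a minimal sketch of a vanilla RNN built with scan
(sizes and names are made up for illustration; BPTT then falls out of T.grad
through the scan):

    import numpy
    import theano
    import theano.tensor as T

    n_in, n_hid = 3, 4   # hypothetical sizes

    x = T.matrix('x')    # input sequence, shape (time, n_in)
    h0 = T.vector('h0')  # initial hidden state
    W_in = theano.shared(numpy.random.uniform(-0.1, 0.1, (n_in, n_hid))
                         .astype(theano.config.floatX), name='W_in')
    W_rec = theano.shared(numpy.random.uniform(-0.1, 0.1, (n_hid, n_hid))
                          .astype(theano.config.floatX), name='W_rec')

    def step(x_t, h_tm1, W_in, W_rec):
        # one step of the vanilla recurrence
        return T.tanh(T.dot(x_t, W_in) + T.dot(h_tm1, W_rec))

    h, _ = theano.scan(step, sequences=x, outputs_info=[h0],
                       non_sequences=[W_in, W_rec])

    cost = h[-1].sum()                   # toy cost on the last hidden state
    grads = T.grad(cost, [W_in, W_rec])  # gradients via BPTT through scan
    f = theano.function([x, h0], [cost] + grads)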
<snipped>
> Vanilla RNNs are trivial to implement and you do not need to add any new op
> to Theano. I attached the code implementing it. I think it might even make
> sense to convert this into a tutorial on RNNs .. which I will probably do
> soon.
<snipped>
Hi! =)
Big thanks for the code you offered! =)
It will help me a lot & save time ;)
<snipped>
> Doing something different than BPTT might be an interesting extension of
> the scan op, though I think a revision of scan (which is on my TODO list)
> might be much more important (scan has some issues with 2nd-order
> gradients ..). If you would be interested in helping with such a revision,
> let me know.
<snipped>
Regarding the Hessian-free approach: I will try to implement a simple
Python/Theano version. If this works, and if there is enough free time,
I will drop you a mail, as I'm interested in a version running completely
in Theano as well. But at the moment I have to finish my work on some
basic sequence-learning tasks.
Maybe this is a nice thing to have for the tutorial section as well ;),
but you will have to wait until the code is ready ;) (I'll drop you a
mail if you're interested in this)
thx. Mat
I am trying to implement a modular neural-network object library using
Theano. Implementing the non-recurrent network was simple, assuming each
Block object and Connection object knows what it is connected to, and each
Block/Connection object offers a compile() function that returns a Theano
variable calculated from the compile() of its input objects. If I were to
call compile() on the outblock below, the system walks its way back through
the compile() of connection3, hiddenblock2, connection2, hiddenblock1,
connection1 and finally the input variable in inblock. Works great for
non-recurrent models.

<inblock> ----(connection1)----> <hiddenblock1> ----(connection2)----> <hiddenblock2> ----(connection3)----> <outblock>

I am having trouble figuring out the best way to implement recurrent
connections in a generic way. From your example I sorta, kinda understood
how to implement recurrence from one node back to itself. A simplified
version of your code segment:

    def step(inputsignaldotW, hidden_tm1, recurrentW):
        return inputsignaldotW + T.dot(hidden_tm1, recurrentW)

    hidden, _ = theano.scan(step,
                            sequences=T.dot(inputsignal, W),
                            outputs_info=[hidden0],
                            non_sequences=[recurrentW])

Is there a simple, modular, local way of doing this kind of recurrent
connection when the connection is going back through multiple layers?

Sarvi
<snipped>
HTH, Razvan
Let me see if I got this right: clone helps clone a graph, replacing parts of the graph with another graph?
But I am not sure I understand your example, though. The solution I am hearing is that, for each block, we should create the output variable of that block using the signal/output variables of all its non-recurrent connection objects and their inputs, as well as dummy variables for the recurrent connections. These dummy variables representing recurrent connections are then expected to be replaced, through the theano.clone() function, with graphs created for the recurrent connections through scan()? Do I have it right?
Are there tutorial examples of theano.clone()? I am trying to understand how clone identifies which inputs get replaced, and how it relates to scan() and to implementing recurrence.
Can you point me to the recurrent implementations you refer to, so I can take a look?
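For what it's worth, theano.clone's replace argument substitutes one subgraph
for another, which is the mechanism behind the dummy-variable trick described
above; a minimal sketch (all names made up for illustration):

    import theano
    import theano.tensor as T

    x = T.vector('x')
    dummy = T.vector('dummy')   # placeholder for a not-yet-built recurrent input
    out = T.tanh(x + dummy)     # block output expressed against the placeholder

    real = T.vector('real')     # e.g. the actual output of a scan loop
    out_real = theano.clone(out, replace={dummy: real})

    f = theano.function([x, real], out_real)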
ValueError: dimension mismatch in args to gemv (50,50)x(5)->(50)
Apply node that caused the error: GpuGemv{no_inplace}(GpuGemv{inplace}.0, TensorConstant{1.0}, GpuDimShuffle{1,0}.0, <CudaNdarrayType(float32, vector)>, TensorConstant{1.0})
Inputs types: [CudaNdarrayType(float32, vector), TensorType(float32, scalar), CudaNdarrayType(float32, matrix), CudaNdarrayType(float32, vector), TensorType(float32, scalar)]
HINT: Use another linker then the c linker to have the inputs shapes and strides printed.
HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
Apply node that caused the error: forall_inplace,gpu,scan_fn&scan_fn}(Shape_i{0}.0, GpuSubtensor{int64:int64:int8}.0, GpuIncSubtensor{InplaceSet;:int64:}.0, GpuIncSubtensor{InplaceSet;:int64:}.0, Shape_i{0}.0, Shape_i{0}.0, <CudaNdarrayType(float32, matrix)>, <CudaNdarrayType(float32, matrix)>, <CudaNdarrayType(float32, matrix)>)
Inputs types: [TensorType(int64, scalar), CudaNdarrayType(float32, matrix), CudaNdarrayType(float32, matrix), CudaNdarrayType(float32, matrix), TensorType(int64, scalar), TensorType(int64, scalar), CudaNdarrayType(float32, matrix), CudaNdarrayType(float32, matrix), CudaNdarrayType(float32, matrix)]
Inputs shapes: [(), (10, 5), (1, 5), (11, 5), (), (), (5, 50), (50, 50), (50, 5)]
Inputs strides: [(), (5, 1), (0, 1), (5, 1), (), (), (50, 1), (50, 1), (5, 1)]
Inputs values: [array(10L, dtype=int64), 'not shown', <CudaNdarray object at 0x0000000022DD20F0>, 'not shown', array(10L, dtype=int64), array(10L, dtype=int64), 'not shown', 'not shown', 'not shown']
HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
ValueError: dimension mismatch in args to gemv (50,50)x(5)->(50)
This means that a dot product between a matrix and a vector has inputs with incompatible shapes: here a (50, 50) matrix is being multiplied with a length-5 vector.
There is a HINT below that I suggest you use; normally it tells you exactly where in your code this dot is coming from:
HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
Fred
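For illustration, that kind of gemv mismatch can be reproduced with a plain
dot product (the shapes are taken from the error above; the code itself is a
hypothetical minimal example):

    import numpy
    import theano
    import theano.tensor as T

    W = theano.shared(numpy.zeros((50, 50), dtype=theano.config.floatX), name='W')
    v = T.vector('v')
    f = theano.function([v], T.dot(W, v))

    # (50, 50) dot (5,) -> raises a dimension-mismatch error like the one above
    f(numpy.zeros(5, dtype=theano.config.floatX))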
I think the message is clear: replace the dict by an OrderedDict. Where did you get this code? It uses an older interface of Theano. Maybe the code could be updated at its origin to prevent other people from having this problem.
Fred
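For anyone hitting the same message, the fix Fred describes looks roughly
like this (a minimal sketch, not the original poster's code):

    from collections import OrderedDict
    import theano
    import theano.tensor as T

    W = theano.shared(0.0, name='W')
    cost = (W - 1.0) ** 2
    gW = T.grad(cost, W)

    # newer Theano expects an OrderedDict (or a list of pairs) for updates,
    # since the iteration order of a plain dict is not deterministic
    updates = OrderedDict()
    updates[W] = W - 0.1 * gW
    train = theano.function([], cost, updates=updates)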