You could use a `Mapping` transformer to do it. Define a function
```python
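def add_dimension(batch):
    # I assume that your input is batch[0] and that it is a 2D NumPy array
    # (time x batch) in which every element is an integer index into the dictionary.
    # First, put the batch as the first dim. Fuel passes NumPy arrays to the
    # mapping, so use transpose here (dimshuffle is a Theano method).
    inp = batch[0].transpose(1, 0)
    # Then add two dummy dimensions: (batch, time) -> (batch, time, 1, 1)
    inp = inp[:, :, None, None]
    # Mapping expects a tuple with one entry per source
    return inp, batch[1]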
```
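and apply the transformer: `stream = Mapping(stream, add_dimension)` (with `Mapping` imported from `fuel.transformers`; the data stream is the first argument and the function the second).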
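To check the shapes before wiring this into a stream, here is a minimal standalone sketch. It assumes the batch is a tuple of NumPy arrays and uses `transpose` for the axis swap; the sizes (5 time steps, 3 examples) and the `tokens`/`targets` names are made up for illustration.

```python
import numpy as np


def add_dimension(batch):
    # (time, batch) -> (batch, time)
    inp = batch[0].transpose(1, 0)
    # (batch, time) -> (batch, time, 1, 1)
    inp = inp[:, :, None, None]
    return inp, batch[1]


# Hypothetical batch: 5 time steps, 3 examples, integer word indices.
tokens = np.arange(15).reshape(5, 3)
targets = np.zeros(3, dtype="int64")

new_tokens, new_targets = add_dimension((tokens, targets))
print(new_tokens.shape)  # (3, 5, 1, 1)
```

The second source is passed through unchanged, so only the first one gets the extra dimensions.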
But you probably want to embed your input using a `Linear` brick first. In that case you can apply a similar transformation, but with Theano operations. Remember that the embedded input will be 3D (time, batch, feature), and you want to transform it to (batch, time, feature, dummy) or (batch, time, dummy, feature), depending on whether you want to convolve over your features.
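The two target layouts can be sketched with NumPy so the shapes are easy to inspect. The sizes and the `embedded` array are made up for illustration; on a symbolic Theano tensor the same reordering would be written with `dimshuffle`, as noted in the comments.

```python
import numpy as np

# Hypothetical sizes: 5 time steps, 3 examples, 4 embedding features.
embedded = np.zeros((5, 3, 4), dtype="float32")

# (time, batch, feature) -> (batch, time, feature, dummy)
# Theano equivalent: embedded.dimshuffle(1, 0, 2, 'x')
feature_then_dummy = embedded.transpose(1, 0, 2)[:, :, :, None]

# (time, batch, feature) -> (batch, time, dummy, feature)
# Theano equivalent: embedded.dimshuffle(1, 0, 'x', 2)
dummy_then_feature = embedded.transpose(1, 0, 2)[:, :, None, :]

print(feature_then_dummy.shape)  # (3, 5, 4, 1)
print(dummy_then_feature.shape)  # (3, 5, 1, 4)
```

Which of the two you pick depends on which axes the convolution should slide over.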