I understand how to compute the forward pass in deep learning. Now I want to understand the backward pass. Take `X(2,2)` as an example: the backward pass at position `X(2,2)` can be computed as shown in the figure below.

My question is: how do I compute `dE/dY` (for example `dE/dY(1,1)`, `dE/dY(1,2)`, ...) at the first iteration? Is it randomly initialized?
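
To make the backward step concrete, here is a minimal sketch of it in plain Python. The sizes are assumptions of mine, not taken from the figure: a 3x3 input `X`, a 2x2 kernel `W`, stride 1, no padding, so `Y` is 2x2. The function names and the numbers in `d_y` are hypothetical placeholders; `dE/dY` is simply passed in as a given input, since where those values come from is exactly what I am asking about.

```python
def conv2d_valid(x, w):
    """Forward pass: stride-1, no-padding cross-correlation of x with w."""
    kh, kw = len(w), len(w[0])
    oh, ow = len(x) - kh + 1, len(x[0]) - kw + 1
    return [
        [
            sum(x[i + a][j + b] * w[a][b] for a in range(kh) for b in range(kw))
            for j in range(ow)
        ]
        for i in range(oh)
    ]


def conv2d_backward_input(d_y, w, in_h, in_w):
    """Backward pass w.r.t. the input: accumulate dE/dX from a given dE/dY."""
    kh, kw = len(w), len(w[0])
    d_x = [[0.0] * in_w for _ in range(in_h)]
    for i in range(len(d_y)):
        for j in range(len(d_y[0])):
            for a in range(kh):
                for b in range(kw):
                    # Y(i,j) used X(i+a, j+b) with weight w[a][b] in the forward
                    # pass, so that input receives dE/dY(i,j) * w[a][b].
                    d_x[i + a][j + b] += d_y[i][j] * w[a][b]
    return d_x


# Assumed example values (not from the original figure).
x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
w = [[1.0, 0.0], [-1.0, 2.0]]
d_y = [[0.1, -0.2], [0.3, 0.4]]  # placeholder dE/dY, taken as given

y = conv2d_valid(x, w)
d_x = conv2d_backward_input(d_y, w, 3, 3)

# Index [1][1] is X(2,2) in the 1-based notation used above.
print(d_x[1][1])
```

With these sizes, `X(2,2)` is the center of the input, so it contributes to all four outputs, and its gradient is the sum of the four `dE/dY` entries, each multiplied by the kernel weight that connected it to `X(2,2)` in the forward pass.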