Hello Tunaberk,
I wasn't able to understand your question directly as I do not speak Turkish, but here is a response based on what I understood from the translation. Please let me know if I didn't address all of your questions:
Essentially, what the authors are doing is reformulating a regular MLP. Instead of flattening the input image of size [L, W] into a vector X of size [L*W], they keep the original shape as it is. In a regular MLP, we would multiply a weight matrix W of size [O, L*W], where O is the dimension of the output, by the flattened input vector to get an output vector H of size [O]. Let's assume the output has the same size as the flattened input, i.e. [L*W]; the weight matrix in this case is then of size [L*W, L*W].
In this scenario, element H_i of the output hidden vector is computed as follows (excluding bias terms):

$$H_i = \sum_{k=1}^{L \cdot W} W_{i,k} X_k$$
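To make the shapes concrete, here is a tiny NumPy sketch of this flattened version (my own illustration, not the book's code; I use Wd for the image width so it doesn't clash with the weight matrix W, and the sizes are made up just for the example):

```python
import numpy as np

rng = np.random.default_rng(0)
L, Wd = 4, 5                            # image height and width
X = rng.normal(size=(L, Wd))            # input image of shape [L, Wd]
W = rng.normal(size=(L * Wd, L * Wd))   # fully connected weights, output size O = L*Wd

x = X.reshape(-1)                       # flatten [L, Wd] -> [L*Wd]
H = W @ x                               # H_i = sum_k W[i, k] * x[k]
print(H.shape)                          # (20,)
```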
Now, they change both the input and the output to the shape [L, W] instead of [L*W]. To (hopefully) make this easier to understand, I will split it into two steps. First, let's assume we did not flatten the image and instead kept it in its original shape of [L, W]. The output hidden vector H, which is still of size [L*W], can then be expressed as follows:
$$H_i = \sum_{k=1}^{L} \sum_{l=1}^{W} W_{i,k,l} X_{k,l}$$
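Here is the same computation written with the 3-dimensional weight tensor, again just my own NumPy sketch; the np.einsum call spells out the sum over (k, l), and the check at the end confirms nothing has changed relative to the flattened version:

```python
import numpy as np

rng = np.random.default_rng(0)
L, Wd = 4, 5
X = rng.normal(size=(L, Wd))
W = rng.normal(size=(L * Wd, L * Wd))

# Reshape the weights to [L*Wd, L, Wd]: each output element i is still
# connected to every input position (k, l); the computation is unchanged.
W3 = W.reshape(L * Wd, L, Wd)
H = np.einsum('ikl,kl->i', W3, X)         # H_i = sum_{k,l} W3[i, k, l] * X[k, l]
print(np.allclose(H, W @ X.reshape(-1)))  # True: identical to the flattened version
```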
Notice here that the weights become a three-dimensional tensor. In the first example, each input element can be identified by a single index (k), and likewise each element of the output hidden vector by (i); the weight connecting these two elements is simply denoted W_{i,k}. However, when the input is kept two-dimensional, we need two indices (k, l) to identify a position in it, so the weight connecting the input element at position (k, l) to the output element at position (i) is denoted W_{i,k,l}. Next, instead of treating the output as a vector, we can also give it the same shape as the input. The output hidden representation matrix H is then of size [L, W] and can be expressed as follows:
$$H_{i,j} = \sum_{k=1}^{L} \sum_{l=1}^{W} W_{i,j,k,l} X_{k,l}$$
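And the same idea once the output is also kept as an [L, W] matrix, so the weights become a 4-dimensional tensor (still just an illustrative sketch with made-up sizes):

```python
import numpy as np

rng = np.random.default_rng(0)
L, Wd = 4, 5
X = rng.normal(size=(L, Wd))
W = rng.normal(size=(L * Wd, L * Wd))

# Reshape the output side as well: H becomes an [L, Wd] matrix and the
# weights become a 4-D tensor of shape [L, Wd, L, Wd].
W4 = W.reshape(L, Wd, L, Wd)
H = np.einsum('ijkl,kl->ij', W4, X)       # H[i, j] = sum_{k,l} W4[i, j, k, l] * X[k, l]
print(np.allclose(H.reshape(-1), W @ X.reshape(-1)))  # True: still the same computation
```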
In the next step, they simply reindex k as i+a and l as j+b, so that the input position is measured as an offset (a, b) from the output position (i, j). Substituting this into the sum above gives a formula of the form

$$H_{i,j} = \sum_{a} \sum_{b} W_{i,j,\,i+a,\,j+b} \, X_{i+a,\,j+b},$$

which matches the next formula in the chapter (there the reindexed weights are collected into a new tensor indexed by (i, j, a, b)).
Note that all of this is still a fully connected approach, i.e. every input element is connected to every output element. We are just reshaping the inputs, outputs, and weights without affecting the underlying computation. In the next steps of that chapter, they modify this representation slightly to enable translation invariance and thus convolutions (see the sketch below).
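To preview where this is heading, here is my own sketch of the weight-sharing idea (not the chapter's code): if the weights are forced to depend only on the offsets (a, b) rather than on the output position (i, j), the sum over (a, b) of V[a, b] * X[i+a, j+b] becomes a cross-correlation, which is what deep learning libraries implement as a convolution layer:

```python
import numpy as np

rng = np.random.default_rng(0)
L, Wd = 4, 5
X = rng.normal(size=(L, Wd))

# Weight sharing: one small kernel V[a, b] is reused at every output
# position (i, j), instead of a separate weight for every (output, input) pair.
K = 3
V = rng.normal(size=(K, K))

H = np.zeros((L - K + 1, Wd - K + 1))       # "valid" output so i+a and j+b stay in bounds
for i in range(H.shape[0]):
    for j in range(H.shape[1]):
        H[i, j] = np.sum(V * X[i:i + K, j:j + K])   # sum_{a,b} V[a, b] * X[i+a, j+b]
print(H.shape)                               # (2, 3)
```

Unlike the fully connected tensor, the shared kernel V has only K*K parameters, which is exactly the saving the chapter builds toward.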
Please let me know if anything is unclear.
Best,