Greg Heath  
Newsgroups: comp.ai.neural-nets
From: Greg Heath <he...@alumni.brown.edu>
Date: Fri, 18 Apr 2008 23:32:49 -0700 (PDT)
Local: Sat, Apr 19 2008 2:32 am
Subject: Basic Backpropagation Learning Strategy (Tutorial)
BASIC BACKPROPAGATION LEARNING STRATEGY

INTRODUCTION

The text below contains first-draft notes for a lecture on the basic
strategy behind backpropagation learning.

Comments and corrections are welcome.

To limit the number of subscripts, only one output layer neuron, one
hidden layer (with H nodes) and one training input vector (with I
components) are considered. This excludes consideration of batch
learning, which in many cases is superior.

Zero mean inputs and tanh sigmoid hidden node activation units are
usually recommended for fast stable learning. However, for tutorial
purposes, logistic sigmoid activation units are used below for both
layers. Therefore

s(t)     =  1 / ( 1 + exp(-t) )
ds/dt  =  s'(t)  =  s *(1-s)
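
As a quick sanity check on the derivative identity above, here is a
minimal Python sketch. The function names (s, ds, ds_numeric) and the
finite-difference step are illustrative choices, not part of the
original notes.

```python
import math

def s(t):
    """Logistic sigmoid s(t) = 1 / (1 + exp(-t))."""
    return 1.0 / (1.0 + math.exp(-t))

def ds(t):
    """Derivative expressed through the output: s'(t) = s*(1 - s)."""
    st = s(t)
    return st * (1.0 - st)

def ds_numeric(t, step=1e-6):
    """Central-difference estimate of ds/dt (step size is an assumption)."""
    return (s(t + step) - s(t - step)) / (2.0 * step)

# Compare the closed form with the numerical estimate at a few points.
for t in (-2.0, 0.0, 0.5, 3.0):
    print(t, ds(t), ds_numeric(t))
```

The two columns should agree to many decimal places, which is all the
identity s' = s*(1-s) claims.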

The governing equations for the network output z, and its error
relative to the desired output y, resulting from the applied input
training vector x = (x1, x2, ...xI) are

1. Net input to the jth hidden node

uj     =   SUM( i  =  0, I ){ w1ji *xi }

2. Output of the jth hidden node

hj     =   s(uj)

3. Net input to the output node

v      =   SUM( j  =  0, H ){ w2j *hj }

4. Output of the output node

z      =   s(v)

5. Output node error

e      =   z - y

6.  Output node squared error

SE   =   e^2
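
Equations 1 through 6 can be sketched as a single forward pass. This
is only an illustration: the function name forward, the list-based
weight layout (index 0 holds the bias weight, matching x0 == 1 and
h0 == 1) and the numbers in the usage lines are assumptions, not
values from the notes.

```python
import math

def s(t):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-t))

def forward(x, w1, w2, y):
    """Forward pass for one input vector.

    x  : list of I inputs (bias input not included)
    w1 : H rows of I+1 weights; w1[j-1][i] plays the role of w1ji
    w2 : H+1 weights; w2[j] plays the role of w2j, w2[0] is the bias
    y  : desired output
    """
    xb = [1.0] + list(x)                                   # x0 == 1
    u = [sum(w * xi for w, xi in zip(row, xb)) for row in w1]  # eq. 1
    h = [1.0] + [s(uj) for uj in u]                        # eq. 2, h0 == 1
    v = sum(w * hj for w, hj in zip(w2, h))                # eq. 3
    z = s(v)                                               # eq. 4
    e = z - y                                              # eq. 5
    return {"u": u, "h": h, "v": v, "z": z, "e": e, "SE": e * e}  # eq. 6

# Hypothetical example with I = 2 inputs and H = 2 hidden nodes.
result = forward([0.5, -1.0],
                 [[0.1, 0.2, -0.3], [-0.2, 0.4, 0.1]],
                 [0.05, 0.3, -0.6],
                 1.0)
print(result["z"], result["e"], result["SE"])
```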

The training strategy for minimizing the nonnegative squared error SE
is to choose weight changes Dw1ji and Dw2j so that, to first order,
the resulting change in SE is never positive:

( d(SE)/d(w1ji) ) *Dw1ji  <=  0,
and
( d(SE)/d(w2j) ) *Dw2j   <=  0.

The following weight changes are sufficient for that purpose:

Dw1ji   =  - eta1 *e *w2j *xi,   (0 < eta1 < 1)
and
Dw2j   =  - eta2 *e *hj,            (0 < eta2 < 1),

where eta1 and eta2 are empirical learning rates.
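
The two update rules can be sketched as one online training step
applied repeatedly to a single training pair. The helper name
train_step, the starting weights, the learning rates and the number
of repetitions are all illustrative assumptions. The hidden weights
are updated using the w2j values from before the step, which is one
reasonable reading of the rules.

```python
import math

def s(t):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-t))

def train_step(x, y, w1, w2, eta1=0.1, eta2=0.1):
    """Apply Dw1ji and Dw2j once, in place; return SE before the update."""
    xb = [1.0] + list(x)                       # x0 == 1
    h = [1.0] + [s(sum(w * xi for w, xi in zip(row, xb))) for row in w1]
    z = s(sum(w * hj for w, hj in zip(w2, h)))
    e = z - y
    # Hidden layer first, so the old w2j values are the ones used.
    for j, row in enumerate(w1, start=1):
        for i, xi in enumerate(xb):
            row[i] += -eta1 * e * w2[j] * xi   # Dw1ji = -eta1*e*w2j*xi
    for j, hj in enumerate(h):
        w2[j] += -eta2 * e * hj                # Dw2j = -eta2*e*hj
    return e * e

# Hypothetical data: I = 2 inputs, H = 2 hidden nodes.
x, y = [0.5, -1.0], 1.0
w1 = [[0.1, 0.2, -0.3], [-0.2, 0.4, 0.1]]
w2 = [0.05, 0.3, -0.6]
history = [train_step(x, y, w1, w2) for _ in range(50)]
print(history[0], history[-1])
```

For this small example the recorded SE values should shrink from one
step to the next. That is consistent with the sign argument, although
it is not a proof for arbitrary learning rates.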

The term backpropagation is used to emphasize the fact that
the error in the output layer, e, is used to modify the hidden node
weights w1ji instead of trying to use the hidden node errors which
are unknown. This is made possible by a straightforward application
of the derivative chain rule.

OUTPUT LAYER TRAINING

Net input to the output layer neuron

v   =   SUM( j  =  0, H ){ w2j *hj }

v        Net input to the output neuron
H       Number of hidden neurons
hj       Output of the jth hidden neuron
          ( j = 1, 2 ...H )
w2j    Weight applied to hj
h0      Output of a constant bias node
          h0  ==  1
w20   Output bias ( weight )

Change in v due to a change in w2j

Dv  =  ( dv/dw2j ) *Dw2j  =  hj *Dw2j

Output of the output neuron

z      =   s(v)

Change in output due to a change in v

Dz   =   ( dz/dv ) *Dv
        =  s'(v) *Dv
        =   z *( 1 - z ) *Dv

Change in output due to a change in w2j

Dz   =   ( dz/dw2j ) *Dw2j
        =   ( dz/dv ) *(dv/dw2j)  *Dw2j
        =   z *( 1 - z) *hj *Dw2j

Output Error

e      =   z - y            Output error
SE   =   e^2            Squared Error

Change in SE due to a change in w2j

DSE   =  ( dSE/dw2j ) *Dw2j

           =  2 *e *( dz/dw2j ) *Dw2j
           =  2 *e *z*( 1 - z ) *hj *Dw2j

Error Minimization Strategy

Dw2j   =  - eta2 *e *hj,    (0 < eta2 < 1)

DSE  =  -2 *eta2 *z*(1-z) *(e*hj)^2 <= 0
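
The first-order prediction for DSE can be compared with the actual
change in SE when a single output weight is updated. The sketch below
does that; the hidden outputs, weights, target and learning rate are
made-up illustrative numbers.

```python
import math

def s(t):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-t))

def squared_error(h, w2, y):
    """Return (SE, z) for hidden outputs h (h[0] == 1) and weights w2."""
    z = s(sum(w * hj for w, hj in zip(w2, h)))
    return (z - y) ** 2, z

# Illustrative values only (not from the original notes).
h = [1.0, 0.6, 0.4]        # h0 == 1 is the bias node
w2 = [0.05, 0.3, -0.6]
y = 1.0
eta2 = 0.1
j = 1                      # which output weight to change

se_before, z = squared_error(h, w2, y)
e = z - y
dw2j = -eta2 * e * h[j]                                   # Dw2j
predicted = -2.0 * eta2 * z * (1.0 - z) * (e * h[j]) ** 2  # first-order DSE

w2_new = list(w2)
w2_new[j] += dw2j
se_after, _ = squared_error(h, w2_new, y)
actual = se_after - se_before

print(predicted, actual)
```

Both numbers should be negative and close to each other. The small
gap comes from the higher-order terms that the D-notation ignores.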

HIDDEN LAYER TRAINING

Net input to the jth hidden layer neuron

uj   =   SUM( i  =  0, I ){ w1ji *xi }

uj          Net input to the jth hidden neuron
I            Number of input fan-in units
xi          Output of the ith fan-in unit
             ( i = 1, 2 ...I )
w1ji      Weight applied to xi
x0        Output of a constant bias node
            x0  ==  1
w1j0    Bias for the jth hidden node

Change in uj due to a change in w1ji

Duj  =  ( duj/dw1ji ) *Dw1ji  =  xi *Dw1ji

Output of the jth hidden neuron

hj      =   s(uj)

Change in hj due to a change in uj

Dhj   =   ( dhj/duj ) *Duj
         =  s'(uj) *Duj
        =   hj *( 1 - hj ) *Duj

Change in hj due to a change in w1ji

Dhj   =   ( dhj/dw1ji ) *Dw1ji
        =   ( dhj/duj ) *(duj/dw1ji)  *Dw1ji
        =   hj * ( 1 - hj ) * xi *Dw1ji

Output of the output neuron

z      =   s(v)

v     =   SUM( j  =  0, H ){ w2j *hj }
hj    =   s(uj)
uj   =   SUM( i  =  0, I ){ w1ji *xi }

Change in output due to a change in w1ji

Dz   =   ( dz/dw1ji ) *Dw1ji
        =   ( dz/dv ) *(dv/dhj)  *(dhj/dw1ji) * Dw1ji
        =   z *( 1 - z) *w2j *hj *(1 - hj ) *xi *Dw1ji
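
The chain-rule expression for dz/dw1ji can be checked against a
finite-difference estimate. In this sketch the function names, the
example weights and the difference step are assumptions made for
illustration.

```python
import math

def s(t):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-t))

def output(x, w1, w2):
    """Return (z, h, xb) with bias entries xb[0] == 1 and h[0] == 1."""
    xb = [1.0] + list(x)
    h = [1.0] + [s(sum(w * xi for w, xi in zip(row, xb))) for row in w1]
    z = s(sum(w * hj for w, hj in zip(w2, h)))
    return z, h, xb

def dz_dw1(x, w1, w2, j, i):
    """Chain rule: z*(1-z)*w2j*hj*(1-hj)*xi, with j in 1..H and i in 0..I."""
    z, h, xb = output(x, w1, w2)
    return z * (1.0 - z) * w2[j] * h[j] * (1.0 - h[j]) * xb[i]

def dz_dw1_numeric(x, w1, w2, j, i, step=1e-6):
    """Central-difference estimate of dz/dw1ji."""
    up = [row[:] for row in w1]
    dn = [row[:] for row in w1]
    up[j - 1][i] += step
    dn[j - 1][i] -= step
    return (output(x, up, w2)[0] - output(x, dn, w2)[0]) / (2.0 * step)

# Hypothetical example: I = 2 inputs, H = 2 hidden nodes.
x = [0.5, -1.0]
w1 = [[0.1, 0.2, -0.3], [-0.2, 0.4, 0.1]]
w2 = [0.05, 0.3, -0.6]
for j in (1, 2):
    for i in (0, 1, 2):
        print(j, i, dz_dw1(x, w1, w2, j, i), dz_dw1_numeric(x, w1, w2, j, i))
```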

Output error

e      =   z - y            Output error
SE   =   e^2    Squared Error

Change in SE due to a change in w1ji

DSE   =  ( dSE/dw1ji ) * Dw1ji

           =  2 *e *(dz/dw1ji) * Dw1ji

           =  2 *e *z*( 1 - z ) *w2j *hj *(1 - hj ) *xi *Dw1ji

Error Minimization Strategy

Dw1ji   =  - eta1 *e *w2j *xi ,  ( 0 < eta1 < 1)

DSE  =  -2 *eta1 *z*( 1 - z ) *hj *(1 - hj ) *( e *w2j *xi )^2 <= 0
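
As with the output layer, the predicted first-order DSE for a hidden
weight can be compared with the actual change in SE after updating
that one weight. The inputs, weights, target and learning rate below
are illustrative assumptions.

```python
import math

def s(t):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-t))

def forward(x, w1, w2):
    """Return (xb, h, z) with bias entries xb[0] == 1 and h[0] == 1."""
    xb = [1.0] + list(x)
    h = [1.0] + [s(sum(w * xi for w, xi in zip(row, xb))) for row in w1]
    z = s(sum(w * hj for w, hj in zip(w2, h)))
    return xb, h, z

# Illustrative values only (not from the original notes).
x, y = [0.5, -1.0], 1.0
w1 = [[0.1, 0.2, -0.3], [-0.2, 0.4, 0.1]]
w2 = [0.05, 0.3, -0.6]
eta1 = 0.1
j, i = 1, 1                # hidden node j, input component i

xb, h, z = forward(x, w1, w2)
e = z - y
dw1ji = -eta1 * e * w2[j] * xb[i]                        # Dw1ji
predicted = (-2.0 * eta1 * z * (1.0 - z) * h[j] * (1.0 - h[j])
             * (e * w2[j] * xb[i]) ** 2)                 # first-order DSE

w1_new = [row[:] for row in w1]
w1_new[j - 1][i] += dw1ji
_, _, z_new = forward(x, w1_new, w2)
actual = (z_new - y) ** 2 - e ** 2

print(predicted, actual)
```

The predicted and actual changes should both be negative and nearly
equal, since the weight change is small.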

Hope this helps.

Greg

