Step 5:
New steps are shown in green (here, the lines in the BACKPROPAGATION block marked below).
function [W1, W2] = BackpropXOR(W1, W2, X, D)
alpha = 0.9; %learning rate (used when the weights are updated, in a later step)
[R, C] = size(X); %Get the number of rows and columns of the input matrix X
%R = number of training trials. C = number of input nodes.
for k = 1:R %each row of X is one training trial
    x = X(k,:)'; %Extract training trial k (row k of X). The transpose turns
    %the row into a column vector so that the product W1*x is defined.
    d = D(k); %Extract the correct answer for that trial.
    v1 = W1*x; %Calculate the weighted sums entering the hidden-layer (1st-layer) nodes
    y1 = 1./(1+exp(-v1)); %Sigmoid activation function gives the hidden-node outputs
    v = W2*y1; %Calculate the weighted sum entering the output node
    y = 1./(1+exp(-v)); %Sigmoid activation function gives the network's output
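    %The sigmoid maps any weighted sum into the range (0,1); for example,
    %v = 0 gives y = 0.5. The elementwise operator ./ lets the same formula
    %work whether v is a scalar (as here) or a vector (as with v1 above).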
    e = d - y; %Calculate the error of the output node
    delta = y.*(1-y).*e; %Calculate lower-case delta:
    %the network's error times the derivative of the activation function.
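    %Why y.*(1-y)? For the sigmoid y = 1/(1+exp(-v)), the derivative is
    %dy/dv = exp(-v)/(1+exp(-v))^2 = y*(1-y), so the slope can be computed
    %from the output y itself, with no separate derivative function.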
    %********************** BACKPROPAGATION *************************
    %Start the backprop process here
    e1 = W2'*delta; %Calculate the error of the hidden-layer nodes.
    % Note that you don't technically need a transpose here (W2') since there
    % is only one output node, and therefore delta is just a single number
    % (a scalar). However, to allow generalization to a network with multiple
    % nodes in the output layer, we'll add the transpose here (it doesn't hurt).
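    %Dimension check (illustrative; say there are H hidden nodes): W2 is
    %1-by-H, so W2' is H-by-1 and e1 = W2'*delta is H-by-1 -- one propagated
    %error per hidden node, matching the shape of y1.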
    delta1 = y1.*(1-y1).*e1; %Calculate the deltas of the hidden-layer nodes
end % for k = 1:R
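For reference, here is a minimal sketch of how the function might be called at this step. The sizes and initial values below are assumptions, not given above: four XOR training trials with two inputs each, four hidden nodes, and weights drawn uniformly from [-1, 1].

X = [0 0; 0 1; 1 0; 1 1]; %input matrix: one training trial per row
D = [0; 1; 1; 0]; %correct XOR answer for each trial
W1 = 2*rand(4,2) - 1; %hidden-layer weights: 4 hidden nodes x 2 inputs
W2 = 2*rand(1,4) - 1; %output-layer weights: 1 output node x 4 hidden nodes
[W1, W2] = BackpropXOR(W1, W2, X, D); %one pass through all training trials

Note that, as of this step, the function computes the deltas but does not yet change W1 or W2; the weight update, which is where the learning rate alpha comes in, is left for a later step.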