Step 6:
New steps are the ones inside the BACKPROPAGATION block marked below
function [W1, W2] = BackpropXOR(W1, W2, X, D)
alpha=0.9; %learning rate
[R C]=size(X); %Get the number of rows and columns of the input matrix X
%R = number of training trials. C = number of input nodes.
for k=1:R %each row is a training trial
x=X(k,:)'; %Extract each training trial (row of X). Note the transpose symbol.
d=D(k); %Extract the correct answer for that trial.
v1=W1*x; %calculate the value of the nodes of the hidden layer (1st layer)
y1=1./(1+exp(-v1)); %Sigmoid activation function
v=W2*y1; %calculate the value of the output node
y=1./(1+exp(-v)); %Sigmoid activation function
e=d-y; %calculate the error of the output node
delta=y.*(1-y).*e; %calculate lower-case delta.
%This is just the network's error times the derivative of the activation function.
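%(Check: for the sigmoid y = 1/(1+exp(-v)), the derivative is
% dy/dv = y*(1-y), which is exactly the y.*(1-y) factor above.)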
%********************** BACKPROPAGATION *************************
%Start the backprop process here
e1=W2'*delta; %Propagate the output error back to the hidden layer nodes. Note the transpose symbol on W2.
delta1= y1.*(1-y1).*e1; %Calculate the deltas of the hidden layer nodes
%now that you have all the lower case deltas calculated (for the output
%and hidden layer nodes), determine how the weights of the connections
%should be updated.
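%(General pattern, for reference: every weight change below is an instance of
% the delta rule, dW = alpha * (delta at the destination node) * (output of the
% source node)'. dw1 and dw2 are both of this form.)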
dw1 = alpha*delta1*x'; %note the transpose symbol on x
W1=W1+dw1; %updated weights for connections between input layer and hidden layer
dw2=alpha*delta*y1'; %Note that y1 is the output of the hidden layer nodes which...
%serve as the INPUT to the output layer node. Also note the transpose symbol on y1
W2=W2+dw2; %updated weights for connections between hidden layer and output layer
end % for k=1:R
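To see the function in action, a minimal driver can be sketched as below. The hidden layer size (4 nodes), input size (2), epoch count (10,000), and random weight initialization are assumptions chosen for illustration; the function above only fixes the learning rate.

X = [0 0; 0 1; 1 0; 1 1];   %the four XOR input patterns, one trial per row
D = [0; 1; 1; 0];           %correct XOR answer for each trial
W1 = 2*rand(4, 2) - 1;      %random initial weights in [-1,1], input layer -> hidden layer
W2 = 2*rand(1, 4) - 1;      %random initial weights in [-1,1], hidden layer -> output node
for epoch = 1:10000         %each call makes one pass over all R training trials
    [W1, W2] = BackpropXOR(W1, W2, X, D);
end
y1 = 1./(1+exp(-(W1*X')));  %hidden-layer outputs for all four trials at once
y  = 1./(1+exp(-(W2*y1)))   %trained network outputs; should approach [0 1 1 0]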