Step 6:  

New steps in this version are the backpropagation block, marked below

function [W1, W2] = BackpropXOR(W1, W2, X, D)

alpha=0.9;  %learning rate

[R C]=size(X);  %Get the number of rows and columns of the input matrix X
%R = number of training trials. C = number of input nodes

for k=1:R  %each row is a training trial.

    x=X(k,:)';  %Extract each training trial (row of X). Note the transpose symbol.
    d=D(k);  %Extract the correct answer for that trial.
   
    v1=W1*x;  %calculate the weighted sum of inputs to the hidden layer (1st layer) nodes
    y1=1./(1+exp(-v1));  %Sigmoid activation function
   
    v=W2*y1; %calculate the weighted sum of inputs to the output node
    y=1./(1+exp(-v));  %Sigmoid activation function

    e=d-y; %calculate the error of the output node
   
    delta=y.*(1-y).*e;  %calculate lower-case delta for the output node.
    %This is the network's error times the derivative of the activation
    %function; for the sigmoid, that derivative is simply y.*(1-y).

%********************** BACKPROPAGATION *************************
    %Start the backprop process here
    e1=W2'*delta;  %Calculate the error of the hidden layer nodes.
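    %(each hidden node's error is the output delta weighted by that node's
    %connection to the output; this is the chain rule applied backward)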
   
    delta1 = y1.*(1-y1).*e1; %Calculate the delta of the hidden layer nodes
 
    %now that you have all the lower case deltas calculated (for the output
    %and hidden layer nodes), determine how the weights of the connections
    %should be updated.
   
    dw1 = alpha*delta1*x';  %note the transpose symbol on x
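    %(delta1*x' is an outer product: a column of hidden deltas times a row
    %of input values, so dw1 has the same dimensions as W1)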
    W1=W1+dw1;  %updated weights for connections between input layer and hidden layer
   
    dw2=alpha*delta*y1';  %Note that y1 is the output of the hidden layer nodes which...
    %serve as the INPUT to the output layer node. Also note the transpose symbol on y1
    W2=W2+dw2; %updated weights for connections between hidden layer and output layer

end  % for k=1:R
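
The listing above defines only the training function, so a short driver script can show how it might be called. The sketch below is illustrative, not part of the original: it assumes XOR inputs with a constant third column acting as a bias input, a hidden layer of 4 nodes, weights initialized uniformly in [-1, 1], and 10,000 passes over the training set.

%Sketch of a driver script for BackpropXOR (the data layout, network size,
%initialization, and epoch count are illustrative assumptions)
X = [0 0 1; 0 1 1; 1 0 1; 1 1 1];  %XOR inputs; third column is a constant bias input
D = [0; 1; 1; 0];                  %correct answer for each row of X

W1 = 2*rand(4,3) - 1;  %hidden layer: 4 nodes, 3 inputs each
W2 = 2*rand(1,4) - 1;  %output layer: 1 node, 4 hidden inputs

for epoch = 1:10000  %repeated passes over the training set
    [W1, W2] = BackpropXOR(W1, W2, X, D);
end

%Check the trained network on all four trials
for k = 1:4
    x  = X(k,:)';
    y1 = 1./(1+exp(-W1*x));   %hidden layer output
    y  = 1./(1+exp(-W2*y1));  %network output; should approach D(k)
    fprintf('input [%d %d] -> %.4f (target %d)\n', X(k,1), X(k,2), y, D(k));
end

With these assumed settings the outputs typically end up close to the targets, though the exact values depend on the random initialization.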