Step 9:  

Add a standard fully connected (FC) layer with 10 hidden layer nodes.  The FC layer automatically knows how many input values it needs based on the previous layers.  It's output value should be specified to be 10 so that it can categorize 10 digits.

clear all;
close all;
clc;


[XTrain,YTrain] = digitTrain4DArrayData;
size(XTrain)      %images
size(YTrain)      %correct answer labels


XTrain=1-XTrain;  % Reverse the black and white colors.  Save and run the program to see the difference.

perm = randperm(size(XTrain,4),20);  % Randomize the order of images in XTrain
for i = 1:20
subplot(4,5,i);
imshow(XTrain(:,:,:,perm(i)));
end
 

layers = [
imageInputLayer([28 28 1])   
convolution2dLayer(3,8,'Padding','same')
batchNormalizationLayer
reluLayer
maxPooling2dLayer(2,'Stride',2)

convolution2dLayer(3,16,'Padding','same')
batchNormalizationLayer
reluLayer
maxPooling2dLayer(2,'Stride',2)

convolution2dLayer(3,32,'Padding','same')
batchNormalizationLayer
reluLayer
averagePooling2dLayer(7)
 
fullyConnectedLayer(10)     % 10 output layer nodes

In the next step, add a a softmax layer and a categorization layer. The categorization layer will decide which digit was given to the network to classify and will provide it's best guess.