Welcome to Part II of the CNN architecture series. I hope you liked Part I, which covered VGG-16. Today we will discuss AlexNet, so let’s dive in without wasting any time. If you are new and haven’t read Part I, here is a short recap: Convolutional Neural Networks (CNNs) are used across most computer vision tasks today, so it is important to understand how their architectures are built and how they work.
In case you missed Part I of this series, you can always go back, read it, and come back again.
CNN Architecture Series — Hands on with VGG-16
What is AlexNet?
AlexNet competed in the ImageNet Large Scale Visual Recognition Challenge in 2012. The network achieved a top-5 error of 15.3%, more than 10.8 percentage points lower than that of the runner-up.
Note: If you want a head start on Convolutional Neural Networks, do go through my article given below.
AlexNet is an 8-layer convolutional neural network architecture built from convolutional layers, activation layers, max pooling layers and dense layers. The 8 layers that carry weights are 5 convolutional layers followed by 3 dense layers, the last of which is the softmax output layer. The network also uses 3 max pooling layers.
Conv_1 uses 11x11 filters, Conv_2 uses 5x5 filters, and Conv_3, Conv_4 and Conv_5 each use 3x3 filters.
As you can see above, the first layer is a convolution layer with a filter size of 11x11, which is unusually large. (Note: VGG-16, which came after AlexNet, instead stacked many small 3x3 filters, which improved the model’s accuracy.)
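To see how these filter sizes shape the network, here is a small sketch that traces the spatial size of the feature maps through the convolution and pooling layers. The strides and padding values are not stated in this article; they are assumed from the commonly used AlexNet configuration (stride 4 for Conv_1, 3x3 pooling with stride 2), and the function and layer names are only for illustration.

```python
def output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding); strides and padding are assumptions
LAYERS = [
    ("Conv_1", 11, 4, 0),
    ("Pool_1", 3, 2, 0),
    ("Conv_2", 5, 1, 2),
    ("Pool_2", 3, 2, 0),
    ("Conv_3", 3, 1, 1),
    ("Conv_4", 3, 1, 1),
    ("Conv_5", 3, 1, 1),
    ("Pool_3", 3, 2, 0),
]


def trace_sizes(input_size=227):
    """Return (layer name, feature map side length) for every layer."""
    sizes = []
    size = input_size
    for name, kernel, stride, padding in LAYERS:
        size = output_size(size, kernel, stride, padding)
        sizes.append((name, size))
    return sizes


for name, size in trace_sizes():
    print(f"{name}: {size}x{size}")
```

Under these assumptions, a 227x227 input becomes 55x55 after Conv_1 and 6x6 after the last pooling layer, which is what gets flattened and passed to the dense layers.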
Now we will train AlexNet on an image dataset in Keras. So, let’s get started hands-on.
Since we all need to run our models to see them work, I am going to demo how to run the model on Google Colab, so you can just download the notebook and run it there. I am using a cats and dogs dataset from Kaggle, so let’s get started.
So, we have all the packages required to run our model. We need Keras and TensorFlow to run the model, and the matplotlib library for visualization.
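A minimal sketch of such a setup cell, assuming the TensorFlow 2 packaging of Keras (`tensorflow.keras`); the original notebook may have imported the standalone `keras` package instead. The GPU check is an extra that is handy on Colab, and `describe_environment` is a hypothetical helper name.

```python
import matplotlib
import tensorflow as tf
from tensorflow import keras  # Keras API bundled with TensorFlow


def describe_environment():
    """Return library versions and the GPUs TensorFlow can see."""
    return {
        "tensorflow": tf.__version__,
        "matplotlib": matplotlib.__version__,
        "gpus": [gpu.name for gpu in tf.config.list_physical_devices("GPU")],
    }


print(describe_environment())
```

If the `gpus` list is empty on Colab, switching the runtime type to GPU should speed up training considerably.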
Download the cats and dogs dataset from Kaggle through this link. Put it into the right folder and then we are good to go.
This is a Keras preprocessing class called ImageDataGenerator, which we use to preprocess our data. Here, we set the size of all the images to 227x227, as that is the standard input size for an AlexNet network. (Note: for VGG-16, the standard input size was 224x224.)
Note: Please give the right directories for the train and test images.
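The preprocessing step could look roughly like the sketch below. It assumes each directory contains one sub-folder per class (for example `cats/` and `dogs/`); the rescaling to [0, 1], the batch size of 32, the `categorical` class mode and the helper name `make_generators` are my assumptions rather than settings taken from the notebook.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

IMAGE_SIZE = (227, 227)  # AlexNet input size used in this article


def make_generators(train_dir, test_dir, batch_size=32):
    """Build train/test iterators that resize every image to 227x227."""
    # Rescaling pixels to [0, 1] is an assumption, not taken from the article.
    datagen = ImageDataGenerator(rescale=1.0 / 255)
    train_gen = datagen.flow_from_directory(
        train_dir,
        target_size=IMAGE_SIZE,
        batch_size=batch_size,
        class_mode="categorical",  # one-hot labels, one column per class
    )
    test_gen = datagen.flow_from_directory(
        test_dir,
        target_size=IMAGE_SIZE,
        batch_size=batch_size,
        class_mode="categorical",
        shuffle=False,  # keep a stable order for evaluation
    )
    return train_gen, test_gen
```

You would call it as `train_gen, test_gen = make_generators("path/to/train", "path/to/test")`, replacing the placeholder paths with your own folders. Class indices are assigned from the sub-folder names in alphabetical order.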
Now we will define our model using the Keras Sequential API. We will build the AlexNet network from convolution layers, max pooling layers and dense layers.
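Here is a sketch of what such a Sequential definition can look like. The filter sizes come from the description above; the filter counts (96, 256, 384, 384, 256) and the 4096-unit dense layers follow the original AlexNet paper, while the strides, padding and the two-class output are my assumptions for this cats and dogs setup. The paper’s local response normalization and dropout are left out, since this article lists only convolution, activation, pooling and dense layers.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_alexnet(input_shape=(227, 227, 3), num_classes=2):
    """AlexNet-style Sequential model: 5 conv, 3 max-pool, 3 dense layers."""
    return keras.Sequential(
        [
            keras.Input(shape=input_shape),
            layers.Conv2D(96, 11, strides=4, activation="relu"),       # Conv_1
            layers.MaxPooling2D(pool_size=3, strides=2),
            layers.Conv2D(256, 5, padding="same", activation="relu"),  # Conv_2
            layers.MaxPooling2D(pool_size=3, strides=2),
            layers.Conv2D(384, 3, padding="same", activation="relu"),  # Conv_3
            layers.Conv2D(384, 3, padding="same", activation="relu"),  # Conv_4
            layers.Conv2D(256, 3, padding="same", activation="relu"),  # Conv_5
            layers.MaxPooling2D(pool_size=3, strides=2),
            layers.Flatten(),
            layers.Dense(4096, activation="relu"),
            layers.Dense(4096, activation="relu"),
            layers.Dense(num_classes, activation="softmax"),           # output
        ],
        name="alexnet_sketch",
    )
```

Calling `model = build_alexnet()` followed by `model.summary()` prints the layer-by-layer output shapes, which is a quick way to check the network against the architecture described earlier.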
Now we need to define the optimizer and the loss function.
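As a sketch, compiling the model might look like this. The article does not say which optimizer or learning rate was used, so Adam with a learning rate of 1e-4 is purely an assumption; categorical cross-entropy is a natural fit for one-hot labels with a softmax output. `compile_classifier` is a hypothetical helper name.

```python
from tensorflow import keras


def compile_classifier(model, learning_rate=1e-4):
    """Attach an optimizer, a loss and an accuracy metric to a Keras model."""
    model.compile(
        # Assumed optimizer and learning rate; swap in your own choice.
        optimizer=keras.optimizers.Adam(learning_rate=learning_rate),
        loss="categorical_crossentropy",  # expects one-hot encoded labels
        metrics=["accuracy"],
    )
    return model
```

With `metrics=["accuracy"]`, recent Keras versions report the metric as `accuracy` and `val_accuracy`; older versions printed `acc` and `val_acc`.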
Now we need to train our model. We use two callbacks (ModelCheckpoint and EarlyStopping). I am not getting into the details of those two, but if you have any doubts, ask in the comment section and I will help you out.
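A rough sketch of training with the two callbacks follows. ModelCheckpoint saves the model whenever the monitored validation accuracy improves, and EarlyStopping ends training when it stops improving. The `patience` value and the helper names are assumptions, and the model is assumed to have been compiled with `metrics=["accuracy"]` so that `val_accuracy` is available.

```python
from tensorflow import keras


def make_callbacks(checkpoint_path, patience=5):
    """ModelCheckpoint keeps the best model; EarlyStopping halts stalled runs."""
    checkpoint = keras.callbacks.ModelCheckpoint(
        checkpoint_path,
        monitor="val_accuracy",
        mode="max",
        save_best_only=True,  # overwrite only when validation accuracy improves
        verbose=1,
    )
    early_stop = keras.callbacks.EarlyStopping(
        monitor="val_accuracy",
        mode="max",
        patience=patience,  # assumed: epochs to wait without improvement
        verbose=1,
    )
    return [checkpoint, early_stop]


def train(model, train_data, val_data, checkpoint_path, epochs=10, **fit_kwargs):
    """Fit the model with both callbacks and return the Keras History."""
    return model.fit(
        train_data,
        validation_data=val_data,
        epochs=epochs,
        callbacks=make_callbacks(checkpoint_path),
        **fit_kwargs,
    )
```

With the data generators, a call could look like `train(model, train_gen, test_gen, "alexnet_1.h5", steps_per_epoch=10, validation_steps=10)`. The file name matches the one in the training log; some recent Keras releases may expect a `.keras` extension for full-model checkpoints instead.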
The training starts...
Epoch 1/10
10/10 [==============================] - 83s 8s/step - loss: 8.0139 - acc: 0.4875 - val_loss: 8.4620 - val_acc: 0.4750
Epoch 00001: val_acc improved from -inf to 0.47500, saving model to alexnet_1.h5
Epoch 2/10
10/10 [==============================] - 80s 8s/step - loss: 7.7065 - acc: 0.5219 - val_loss: 7.8072 - val_acc: 0.5156
Epoch 00002: val_acc improved from 0.47500 to 0.51562, saving model to alexnet_1.h5
Epoch 3/10
10/10 [==============================] - 80s 8s/step - loss: 8.6131 - acc: 0.4656 - val_loss: 7.8576 - val_acc: 0.5125
Epoch 00003: val_acc did not improve from 0.51562
Epoch 4/10
10/10 [==============================] - 80s 8s/step - loss: 7.8072 - acc: 0.5156 - val_loss: 8.4620 - val_acc: 0.4750
Epoch 00004: val_acc did not improve from 0.51562
Epoch 5/10
10/10 [==============================] - 81s 8s/step - loss: 8.3613 - acc: 0.4813 - val_loss: 8.3109 - val_acc: 0.4844
Epoch 00005: val_acc did not improve from 0.51562
Epoch 6/10
10/10 [==============================] - 79s 8s/step - loss: 7.5554 - acc: 0.5312 - val_loss: 7.6561 - val_acc: 0.5250
Epoch 00006: val_acc improved from 0.51562 to 0.52500, saving model to alexnet_1.h5
Epoch 7/10
10/10 [==============================] - 78s 8s/step - loss: 8.3109 - acc: 0.4844 - val_loss: 8.3049 - val_acc: 0.4847
Epoch 00007: val_acc did not improve from 0.52500
Epoch 8/10
10/10 [==============================] - 79s 8s/step - loss: 8.1094 - acc: 0.4969 - val_loss: 8.3109 - val_acc: 0.4844
Epoch 00008: val_acc did not improve from 0.52500
Epoch 9/10
10/10 [==============================] - 80s 8s/step - loss: 7.2531 - acc: 0.5500 - val_loss: 7.1524 - val_acc: 0.5563
Epoch 00009: val_acc improved from 0.52500 to 0.55625, saving model to alexnet_1.h5
Epoch 10/10
10/10 [==============================] - 80s 8s/step - loss: 8.3109 - acc: 0.4844 - val_loss: 7.7568 - val_acc: 0.5188
Epoch 00010: val_acc did not improve from 0.55625
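So we have trained our model for 10 epochs. In the log above, validation accuracy peaks at about 0.556, which is still close to chance for a two-class problem, so you will want to train for more epochs to try for better accuracy.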
Now we will perform inference on this model.
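Inference on a single image can be sketched as follows. The image is resized to the same 227x227 input size and scaled the same way as during training (the division by 255 is an assumption that must match your own preprocessing). The class order `("cat", "dog")` assumes the class indices were assigned alphabetically from the folder names, and both helper names are hypothetical.

```python
import numpy as np
from tensorflow.keras.preprocessing import image


def load_image_batch(image_path, target_size=(227, 227)):
    """Load one image, resize it, scale to [0, 1] and add a batch axis."""
    img = image.load_img(image_path, target_size=target_size)
    array = image.img_to_array(img) / 255.0  # must match training preprocessing
    return np.expand_dims(array, axis=0)


def predict_label(model, batch, class_names=("cat", "dog")):
    """Return the most probable class name and the raw probabilities."""
    probabilities = model.predict(batch, verbose=0)[0]
    return class_names[int(np.argmax(probabilities))], probabilities
```

A call such as `label, probs = predict_label(model, load_image_batch("path/to/image.jpg"))` then gives the predicted class name together with the softmax probabilities; the image path here is a placeholder.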
So this model gives the right prediction that this is a dog. The model is now built, and the next steps may be to deploy it on GCP or with Flask, which we will cover in future articles.
So, this is all for this article. If you liked it, do give it a clap and watch for the next part, which will cover other CNN architectures. If you have any problems with any of this, please leave a comment or contact me on Twitter.
The Google Colab notebook for this article is given in this link. Do check it out and run it on Google Colab with dark mode on; it’s really awesome.