Deep Learning Architectures Series: Hands-On with VGG-16 (Part I)

Updated: Nov 29, 2020


VGG16 Architecture

Welcome to this journey through Convolutional Neural Networks, where we will take an in-depth look at the standard CNN architectures in deep learning, from the relatively simple VGG-16 to the more complex Xception. Convolutional Neural Networks have come a long way in terms of capability. They can handle image classification tasks that are hard even for humans, such as telling apart rare breeds of dogs. In Part I of this series, we will take a look at the VGG-16 architecture.

What is VGG-16?

VGG-16 is a convolutional neural network architecture. Its name comes from the fact that it has 16 layers with weights. It is built from convolutional layers, max pooling layers, activation layers, and fully connected layers.

There are 13 convolutional layers, 5 max pooling layers, and 3 dense layers, which add up to 21 layers, but only 16 of them (the convolutional and dense layers) have weights. The convolutional layers are grouped into five blocks: the layers in block 1 have 64 filters, those in block 2 have 128, those in block 3 have 256, and those in blocks 4 and 5 have 512. VGG-16 was trained on ImageNet, a dataset of over 14 million images, using the 1000-class subset from the ImageNet classification challenge, and it achieves 92.7% top-5 accuracy. It improves on AlexNet by replacing the large 11x11 and 5x5 filters in the first and second convolutional layers with stacks of small 3x3 filters. Now, we will train VGG-16 on an image dataset in Keras. So, let's get started hands-on.
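The layer arithmetic above, and the reason small filters help, can be checked in a few lines of Python. This is a minimal sketch and not part of the original notebook. The block layout is the one described in this section, while the channel count C = 512 and the helper names are only illustrative choices. The comparison follows the usual argument for VGG: n stacked 3x3 convolutions cover the same receptive field as one larger filter, with fewer weights.

```python
# Number of convolutional layers in each of the five VGG-16 blocks,
# paired with the number of filters per layer in that block.
blocks = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]

conv_layers = sum(n for n, _ in blocks)                   # 13
pool_layers = len(blocks)                                 # one max pool per block: 5
dense_layers = 3
weight_layers = conv_layers + dense_layers                # pooling has no weights: 16
total_layers = conv_layers + pool_layers + dense_layers   # 21


def stacked_3x3(n, channels):
    """Receptive field and weight count (biases ignored) of n stacked
    3x3, stride-1 convolutions with `channels` inputs and outputs."""
    receptive_field = 2 * n + 1
    weights = n * 3 * 3 * channels * channels
    return receptive_field, weights


def single_conv(kernel, channels):
    """Weight count (biases ignored) of one kernel x kernel convolution."""
    return kernel * kernel * channels * channels


C = 512  # illustrative channel count, an assumption for this sketch
rf, stacked = stacked_3x3(3, C)
print(weight_layers, total_layers)       # 16 21
print(rf, stacked, single_conv(7, C))    # 7 7077888 12845056
```

Three stacked 3x3 layers see a 7x7 window with 27·C² weights instead of 49·C², and they apply three ReLU non-linearities instead of one.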

Implementation


The next few sections are a tutorial on running models in Google Colab, so that people without a GPU in their own system can run their deep learning models on a GPU provided by Google. We will be using the cats and dogs dataset from Kaggle, so let's get started.

!pip3 install keras
!pip3 install tensorflow-gpu==1.14
import keras
import matplotlib.pyplot as plt
from keras import backend as K
from keras.models import Sequential
from keras.layers import Dense, Conv2D, Flatten, MaxPool2D
from keras.optimizers import Adam,SGD,Adagrad,Adadelta,RMSprop
from keras.utils import to_categorical

With all the packages installed on our Google Colab server, we are ready to run our model. We use the Keras and TensorFlow libraries to run the deep learning model, and the Matplotlib library for visualization. Download the cats and dogs dataset from Kaggle through this link. Put it in the right folder, and then we can preprocess our data using ImageDataGenerator.

from keras.preprocessing.image import ImageDataGenerator
import numpy as np
trdata = ImageDataGenerator()
traindata = trdata.flow_from_directory(directory="training_set/training_set",target_size=(224,224))
tsdata = ImageDataGenerator()
testdata = tsdata.flow_from_directory(directory="test_set/test_set", target_size=(224,224))

ImageDataGenerator is a Keras preprocessing class that loads images from disk and feeds them to the model in batches. Here we resize all the images to 224x224, as that is the standard input size for a VGG-16 network. Note: make sure the directory paths for the train and test images match where you put the dataset. We will now define our deep learning model using the Keras Sequential API. We will use the VGG-16 network, which has convolution layers, max pooling layers, and dense layers in it.

model = Sequential()
# Block 1: two 3x3 convolutions with 64 filters, then 2x2 max pooling
model.add(Conv2D(input_shape=(224,224,3),filters=64,kernel_size=(3,3),padding="same", activation="relu"))
model.add(Conv2D(filters=64,kernel_size=(3,3),padding="same", activation="relu"))
model.add(MaxPool2D(pool_size=(2,2),strides=(2,2)))
# Block 2: two 3x3 convolutions with 128 filters
model.add(Conv2D(filters=128, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=128, kernel_size=(3,3), padding="same", activation="relu"))
model.add(MaxPool2D(pool_size=(2,2),strides=(2,2)))
# Block 3: three 3x3 convolutions with 256 filters
model.add(Conv2D(filters=256, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=256, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=256, kernel_size=(3,3), padding="same", activation="relu"))
model.add(MaxPool2D(pool_size=(2,2),strides=(2,2)))
# Block 4: three 3x3 convolutions with 512 filters
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(MaxPool2D(pool_size=(2,2),strides=(2,2)))
# Block 5: three 3x3 convolutions with 512 filters
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(MaxPool2D(pool_size=(2,2),strides=(2,2)))
# Classifier: flatten, two 4096-unit dense layers, 2-way softmax (cat vs dog)
model.add(Flatten())
model.add(Dense(units=4096,activation="relu"))
model.add(Dense(units=4096,activation="relu"))
model.add(Dense(units=2, activation="softmax"))
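To see what this stack does to an image, the sketch below traces the feature map size and counts the trainable parameters in plain Python, without needing Keras. It is an illustration under the assumptions visible in the model above (3x3 kernels with "same" padding, 2x2 pooling with stride 2, a 2-unit output layer); the function and variable names are made up for this example. The totals should agree with what model.summary() reports for the model defined above.

```python
# (convolutions per block, filters) for the five blocks of the model above.
blocks = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]
dense_units = [4096, 4096, 2]  # two hidden dense layers plus the 2-way softmax


def trace(input_size=224, in_channels=3):
    """Return (sizes after each pool, flattened length, parameter count)."""
    size, channels, params = input_size, in_channels, 0
    sizes = []
    for n_convs, filters in blocks:
        for _ in range(n_convs):
            # 3x3 kernel weights plus one bias per filter;
            # "same" padding leaves height and width unchanged.
            params += 3 * 3 * channels * filters + filters
            channels = filters
        size //= 2  # 2x2 max pooling with stride 2 halves height and width
        sizes.append(size)
    flat = size * size * channels  # length of the vector produced by Flatten
    features = flat
    for units in dense_units:
        params += features * units + units  # weights plus biases
        features = units
    return sizes, flat, params


sizes, flat, params = trace()
print(sizes)   # [112, 56, 28, 14, 7]
print(flat)    # 25088
print(params)  # 134268738
```

Most of the roughly 134 million parameters sit in the first dense layer (25088 x 4096 weights), not in the convolutions.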

Now we have to define what kind of optimizer we will use along with the type of loss function.

from keras.optimizers import Adam
opt = Adam(lr=0.001)
model.compile(optimizer=opt, loss=keras.losses.categorical_crossentropy, metrics=['accuracy'])

Now we will train our model. We use two callbacks, ModelCheckpoint and EarlyStopping. We will not go into their details right now, but if you have doubts, do post a comment and I will surely help.

from keras.callbacks import ModelCheckpoint, EarlyStopping
checkpoint = ModelCheckpoint("vgg16_1.h5", monitor='val_acc', verbose=1, save_best_only=True, save_weights_only=False, mode='auto', period=1)
early = EarlyStopping(monitor='val_acc', min_delta=0, patience=20, verbose=1, mode='auto')
hist = model.fit_generator(steps_per_epoch=10,generator=traindata, validation_data= testdata, validation_steps=10,epochs=10,callbacks=[checkpoint,early])

Now as we can see our model has started training.

Epoch 1/10
10/10 [==============================] - 14s 1s/step - loss: 6.8730 - acc: 0.5125 - val_loss: 7.8576 - val_acc: 0.5125
Epoch 00001: val_acc improved from -inf to 0.51250, saving model to vgg16_1.h5
Epoch 2/10
10/10 [==============================] - 3s 342ms/step - loss: 7.1524 - acc: 0.5563 - val_loss: 8.4620 - val_acc: 0.4750
Epoch 00002: val_acc did not improve from 0.51250
Epoch 3/10
10/10 [==============================] - 3s 343ms/step - loss: 8.5124 - acc: 0.4719 - val_loss: 7.5050 - val_acc: 0.5344
Epoch 00003: val_acc improved from 0.51250 to 0.53438, saving model to vgg16_1.h5
Epoch 4/10
10/10 [==============================] - 3s 343ms/step - loss: 8.1598 - acc: 0.4938 - val_loss: 8.1598 - val_acc: 0.4938
Epoch 00004: val_acc did not improve from 0.53438
Epoch 5/10
10/10 [==============================] - 3s 340ms/step - loss: 8.5627 - acc: 0.4688 - val_loss: 8.3109 - val_acc: 0.4844
Epoch 00005: val_acc did not improve from 0.53438
Epoch 6/10
10/10 [==============================] - 3s 341ms/step - loss: 7.4043 - acc: 0.5406 - val_loss: 7.9583 - val_acc: 0.5062
Epoch 00006: val_acc did not improve from 0.53438
Epoch 7/10
10/10 [==============================] - 4s 370ms/step - loss: 8.6635 - acc: 0.4625 - val_loss: 7.9225 - val_acc: 0.5085
Epoch 00007: val_acc did not improve from 0.53438
Epoch 8/10
10/10 [==============================] - 3s 347ms/step - loss: 8.3613 - acc: 0.4813 - val_loss: 8.1094 - val_acc: 0.4969
Epoch 00008: val_acc did not improve from 0.53438
Epoch 9/10
10/10 [==============================] - 3s 342ms/step - loss: 8.5124 - acc: 0.4719 - val_loss: 7.6561 - val_acc: 0.5250
Epoch 00009: val_acc did not improve from 0.53438
Epoch 10/10
10/10 [==============================] - 3s 342ms/step - loss: 8.6131 - acc: 0.4656 - val_loss: 7.9079 - val_acc: 0.5094
Epoch 00010: val_acc did not improve from 0.53438
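The "improved" and "did not improve" messages in this log come from the two callbacks. As a rough illustration of their logic, here is a plain-Python sketch that replays the val_acc values from the log. It is a simplification written for this article, not the Keras implementation: the function name simulate is made up, and it folds both callbacks into one loop, assuming "higher is better" and min_delta=0 as configured above.

```python
# val_acc per epoch, copied from the training log above.
val_acc = [0.5125, 0.4750, 0.5344, 0.4938, 0.4844,
           0.5062, 0.5085, 0.4969, 0.5250, 0.5094]


def simulate(history, patience):
    """Mimic save_best_only checkpointing and patience-based early stopping.

    Returns (epochs at which a checkpoint would be written, epoch at which
    training would stop early, or None). Epochs are 1-based.
    """
    best = float("-inf")
    wait = 0
    saved = []
    for epoch, acc in enumerate(history, start=1):
        if acc > best:      # improvement: save the model, reset the counter
            best = acc
            wait = 0
            saved.append(epoch)
        else:               # no improvement: count towards patience
            wait += 1
            if wait >= patience:
                return saved, epoch
    return saved, None


print(simulate(val_acc, patience=20))  # ([1, 3], None)
print(simulate(val_acc, patience=5))   # ([1, 3], 8)
```

With patience=20 and only 10 epochs, early stopping can never trigger in this run, and the saved vgg16_1.h5 file holds the model from epoch 3. A smaller patience such as 5 would have stopped training at epoch 8.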

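We stop after training our deep learning model for 10 epochs. Note that in this short run both the training and the validation accuracy stay close to 50%, which is chance level for two classes, and each epoch covers only 10 batches (steps_per_epoch=10). For better accuracy, you can train the model for more epochs and with more steps per epoch.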
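The loss values in the log are worth a closer look, because they follow the accuracy almost exactly. One possible reading, sketched below, is that the softmax outputs have saturated to 0 and 1, so every wrong prediction costs the maximum clipped cross-entropy. This is an interpretation, not something stated in the original notebook, and it assumes that Keras clips probabilities at its default backend epsilon of 1e-7 before taking the logarithm.

```python
import math

EPSILON = 1e-7  # assumed clipping value (Keras' default backend epsilon)
worst_case = -math.log(EPSILON)  # loss of one confidently wrong sample, about 16.118


def saturated_loss(accuracy):
    """Mean cross-entropy if every prediction is a clipped 0/1 probability:
    correct samples cost almost nothing, wrong samples cost -log(EPSILON)."""
    return (1.0 - accuracy) * worst_case


# (val_acc, val_loss) pairs copied from the first four epochs of the log above.
logged = [(0.5125, 7.8576), (0.4750, 8.4620), (0.5344, 7.5050), (0.4938, 8.1598)]
for acc, loss in logged:
    print(acc, loss, round(saturated_loss(acc), 4))
```

The predicted and logged losses agree to about three decimal places, which suggests the network is outputting hard 0/1 guesses rather than learning useful features. Common things to try in that situation are rescaling the pixel values to the 0 to 1 range and lowering the learning rate, though neither is tested here.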

Inferencing


Now we will perform inference on our deep learning model.

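from keras.preprocessing import image
from keras.models import load_model
# Load the test image and resize it to the 224x224 input size of the network
img = image.load_img("website-donate-mobile.jpg",target_size=(224,224))
img = np.asarray(img)
plt.imshow(img)
# Add a batch dimension: (224, 224, 3) -> (1, 224, 224, 3)
img = np.expand_dims(img, axis=0)
# Load the best checkpoint saved during training
saved_model = load_model("vgg16_1.h5")
output = saved_model.predict(img)
# output[0] holds the two class probabilities: index 0 is cat, index 1 is dog
if output[0][0] > output[0][1]:
    print("cat")
else:
    print("dog")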
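The comparison above relies on index 0 meaning cat and index 1 meaning dog. As far as I know, flow_from_directory assigns class indices to the sub-folders in sorted order, and the real mapping is available as traindata.class_indices. The sketch below shows the same decoding step in plain Python; the folder names "cats" and "dogs" and the probability values are assumptions for illustration, so check them against your own dataset.

```python
# Hypothetical folder names; check your own training_set directory.
class_names = ["dogs", "cats"]

# Class indices follow the sorted order of the folder names.
class_indices = {name: i for i, name in enumerate(sorted(class_names))}
index_to_class = {i: name for name, i in class_indices.items()}


def decode(probabilities):
    """Return the class name with the highest predicted probability."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return index_to_class[best]


print(class_indices)       # {'cats': 0, 'dogs': 1}
print(decode([0.2, 0.8]))  # dogs
```

Looking the label up through the mapping instead of hard-coding the indices keeps the inference code correct if the folders are renamed or more classes are added.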

VGG16 Prediction Result

So this computer vision model gives the right prediction: this is a dog. The deep learning model is now built, and the next steps could be to deploy it in the cloud or build an Android app out of it, which we will get to in future artificial intelligence and computer vision blogs.

That is all for this article. If you liked it, do give it a clap and watch for the next part, which will cover other CNN architectures. If you have any problems with this, please comment in the comment section or contact me on Twitter. The Google Colab notebook for this article is given in this link. Run it on Google Colab with dark mode on, it's really awesome. Do check my other articles. For any help, mail me at subham.tiwari186@gmail.com, and don't forget to subscribe to this blog for more content on deep learning, data science, computer vision, and AI. Thank you.



 

© Subham Tewari