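VGG16 is a convolutional neural network (CNN) model designed for image classification. It was developed by the Visual Geometry Group at Oxford and entered in the 2014 ImageNet competition (ILSVRC 2014), where the VGG team placed first in the localization task and second in the classification task. The model reaches 92.7% top-5 test accuracy on the ImageNet benchmark, which covers 1000 different categories. The VGG model is freely available, so you can use it in your own projects, and the TensorFlow implementation of VGG16 makes it easier to code.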
A short introduction to Convolutional Neural Networks (CNNs):
A convolutional neural network takes an image as input, finds patterns inside the image with its convolutional layers, and uses those patterns as features during training so that it can later classify unseen images. The final classification is typically performed by fully connected layers placed after the convolutional ones. The architecture of a ConvNet is analogous to the connectivity pattern of neurons in the human brain and was inspired by the organization of the visual cortex.
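To make the pattern-finding step concrete, here is a minimal sketch of a single 2D convolution in plain Python. The tiny image and the hand-picked vertical-edge filter are illustrative assumptions; in a trained network the filter values are learned from data rather than written by hand.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and
    return the map of weighted sums, one per kernel position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 4 x 5 grayscale image: dark on the left, bright on the right.
image = [
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]

# Hand-picked 3 x 3 filter that responds to dark-to-bright vertical edges.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = conv2d(image, vertical_edge)
print(feature_map)  # [[0, 3, 3], [0, 3, 3]]
```

The output is zero over the flat dark region and positive wherever the window covers the edge, which is the kind of feature later layers build on.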
The architecture of VGG16:
The input to the conv1 layer is a fixed-size 224 × 224 RGB image. The image is passed through a stack of convolutional layers that use filters with a very small receptive field: 3×3. One of the configurations also uses 1×1 convolution filters, which can be seen as a linear transformation of the input channels. The convolution stride is fixed to 1 pixel, and the spatial padding of the conv. layer input is chosen so that the spatial resolution is preserved after convolution, i.e. the padding is 1 pixel for 3×3 conv. layers. Spatial pooling is carried out by five max-pooling layers, which follow some of the conv. layers (not all conv. layers are followed by max-pooling). Max-pooling is performed over a 2×2-pixel window, with stride 2.
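The effect of these settings on the spatial resolution can be checked with a few lines of arithmetic. The sketch below uses the standard output-size formula for convolution and pooling; the function names are hypothetical, and each of the five stages is simplified to one 3×3 convolution followed by one max-pooling, since additional padded 3×3 convolutions in a stage do not change the size.

```python
def output_size(size, kernel, stride, padding):
    """Spatial size after a convolution or pooling layer:
    floor((size + 2 * padding - kernel) / stride) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_spatial_sizes(input_size=224, num_pools=5):
    """Return the feature-map side length at the input and after each pooling."""
    sizes = [input_size]
    size = input_size
    for _ in range(num_pools):
        # 3x3 convolution, stride 1, padding 1: resolution is preserved.
        size = output_size(size, kernel=3, stride=1, padding=1)
        # 2x2 max-pooling, stride 2, no padding: resolution is halved.
        size = output_size(size, kernel=2, stride=2, padding=0)
        sizes.append(size)
    return sizes


print(trace_spatial_sizes())  # [224, 112, 56, 28, 14, 7]
```

So a 224 × 224 input reaches the end of the convolutional stack as a 7 × 7 feature map.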
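Layers of VGG16:
Three fully connected (FC) layers follow the stack of convolutional layers. The first two have 4096 channels each, and the third performs 1000-way ILSVRC classification and therefore has 1000 channels, one per class. The final layer is the softmax layer. All hidden layers are equipped with the ReLU (rectified linear unit) non-linearity.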
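As a rough illustration of how large these last three layers are, the sketch below counts their weights and biases, treating them as fully connected layers as in the VGG design. It assumes the flattened input to the first of them is a 7 × 7 feature map with 512 channels; the 512 comes from the published VGG16 configuration rather than from the description above, and the helper names are hypothetical.

```python
def dense_params(inputs, outputs):
    """Weights plus biases of one fully connected layer."""
    return inputs * outputs + outputs


def fc_param_counts(inputs, widths):
    """Parameter count of each fully connected layer in sequence."""
    counts = []
    for width in widths:
        counts.append(dense_params(inputs, width))
        inputs = width  # this layer's output feeds the next layer
    return counts


# Assumption: 7 x 7 spatial map with 512 channels, flattened to a vector.
flattened = 7 * 7 * 512
layer_widths = [4096, 4096, 1000]

counts = fc_param_counts(flattened, layer_widths)
print(counts)       # [102764544, 16781312, 4097000]
print(sum(counts))  # 123642856
```

Under these assumptions the first of the three layers alone holds over 100 million parameters, which is why these layers dominate the size of the model.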