Motivation
In the previous lecture, we introduced the components of a CNN. However, we still don’t know how we can combine these components into a good CNN architecture. Hence, this lecture walks through each year’s winning model in the ImageNet classification challenge and introduces some common rules for building CNN architectures.
Resource Computation
Memory Usage
We define the memory usage of a layer as the memory needed to store the layer’s output (its activations).
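As a rule of thumb (assuming the output is stored as 32-bit floats, i.e., 4 bytes per element), a conv layer’s output of shape $C_{out} \times H' \times W'$ costs

$$\text{memory (KB)} = \frac{C_{out} \cdot H' \cdot W' \cdot 4}{1024}$$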
Parameters
We define the number of parameters of a layer as the number of “learnable parameters” it has (e.g., a conv layer’s filter weights and biases).
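For example, a conv layer with $C_{in}$ input channels and $C_{out}$ filters of size $K \times K$, plus one bias per filter, has

$$\#\text{params} = C_{out} \cdot C_{in} \cdot K \cdot K + C_{out}$$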
FLOPs
FLOPs stands for floating point operations. It counts the number of arithmetic operations the layer performs (here, one multiply-add pair is counted as a single operation).
An easy way to calculate FLOPs for a conv layer is to multiply the number of output elements by the operations needed per output element: $(C_{out} \cdot H' \cdot W') \times (C_{in} \cdot K \cdot K)$
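A minimal Python sketch tying the three quantities together for a single conv layer. The AlexNet-conv1 configuration in the comment (3×227×227 input, 64 filters of 11×11, stride 4, pad 2) is used only as a sanity check:

```python
def conv_costs(c_in, h, w, c_out, k, stride, pad, bytes_per_elem=4):
    """Memory, parameter, and FLOP counts for one conv layer.
    One multiply-add is counted as a single operation."""
    h_out = (h - k + 2 * pad) // stride + 1
    w_out = (w - k + 2 * pad) // stride + 1
    memory_kb = c_out * h_out * w_out * bytes_per_elem / 1024
    params = c_out * c_in * k * k + c_out            # weights + biases
    flops = c_out * h_out * w_out * c_in * k * k     # output elements x ops per element
    return memory_kb, params, flops

# AlexNet conv1: 3x227x227 input, 64 filters of 11x11, stride 4, pad 2
print(conv_costs(3, 227, 227, 64, 11, 4, 2))  # (784.0, 23296, 72855552)
```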
Models
AlexNet
AlexNet demonstrates that deep CNN models can achieve great performance on large-scale image recognition tasks.
VGG
In AlexNet, researchers used trial and error to find a model with better performance. VGG instead introduces some general rules for designing deep network architectures: all conv layers are 3×3 with stride 1 and pad 1, all pooling layers are 2×2 max pools with stride 2, and the number of channels doubles after each pool (see the sketch below).
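A minimal PyTorch sketch of one VGG-style stage under those rules; the function name and stage layout are illustrative, not VGG’s original code:

```python
import torch.nn as nn

def vgg_stage(c_in, c_out, num_convs):
    """One VGG stage: 3x3 convs (stride 1, pad 1) followed by a 2x2 max pool."""
    layers = []
    for i in range(num_convs):
        layers += [nn.Conv2d(c_in if i == 0 else c_out, c_out,
                             kernel_size=3, stride=1, padding=1),
                   nn.ReLU(inplace=True)]
    layers.append(nn.MaxPool2d(kernel_size=2, stride=2))  # halves H and W
    return nn.Sequential(*layers)

# Channels double every time the spatial size halves:
stage1 = vgg_stage(3, 64, 2)    # 3x224x224  -> 64x112x112
stage2 = vgg_stage(64, 128, 2)  # 64x112x112 -> 128x56x56
```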
GoogLeNet
GoogLeNet introduces many innovations for boosting efficiency, i.e., reducing parameter count, memory usage, and computation: an aggressive stem that downsamples the input quickly, Inception modules that use 1×1 “bottleneck” convolutions to shrink the channel count before expensive convs, and global average pooling in place of large fully connected layers.
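To see why the 1×1 bottleneck helps, compare the FLOPs of a 3×3 conv applied directly to a 256-channel feature map versus after first reducing to 64 channels (the shapes here are illustrative, not GoogLeNet’s exact ones):

```python
def conv_flops(c_in, h_out, w_out, c_out, k):
    # one multiply-add counted as a single operation
    return c_out * h_out * w_out * c_in * k * k

h = w = 28
direct = conv_flops(256, h, w, 256, 3)          # 3x3 conv directly
bottleneck = (conv_flops(256, h, w, 64, 1)      # 1x1 conv to reduce channels
              + conv_flops(64, h, w, 256, 3))   # 3x3 conv on fewer channels
print(direct, bottleneck)  # ~462M vs ~128M: roughly 3.6x cheaper
```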
Residual Networks (ResNet)
Intuitively, a deeper model can emulate a shallower one (e.g., by setting its extra layers to the identity), so we would expect deeper networks to be at least as good. However, researchers found that deeper models had both larger training error and larger test error, which suggests an optimization problem rather than overfitting. Residual networks resolve this optimization problem: each block computes F(x) + x, so it can represent the identity simply by driving F toward zero.
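A minimal PyTorch sketch of the “basic” residual block, assuming the input and output channel counts match so the shortcut is a plain identity (real ResNets use a projection shortcut when shapes change):

```python
import torch.nn as nn
import torch.nn.functional as F

class BasicBlock(nn.Module):
    """Basic residual block: output = relu(F(x) + x)."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + x)  # identity shortcut keeps the block easy to optimize
```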
ResNeXt
ResNeXt improves ResNet by introducing “grouped convolution”: the bottleneck’s 3×3 conv is split into G parallel groups, each of which sees only a slice of the input channels.
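In PyTorch, grouped convolution is just the groups argument of nn.Conv2d: with groups=G, each filter sees only C_in / G input channels, cutting weights and FLOPs by a factor of G (the channel counts below are illustrative):

```python
import torch.nn as nn

dense = nn.Conv2d(64, 64, kernel_size=3, padding=1)              # each filter sees all 64 channels
grouped = nn.Conv2d(64, 64, kernel_size=3, padding=1, groups=8)  # 8 groups of 8 channels each

n_params = lambda m: sum(p.numel() for p in m.parameters())
print(n_params(dense), n_params(grouped))  # 36928 vs 4672: ~8x fewer weights
```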