References:
- https://www.ibm.com/topics/convolutional-neural-networks
- https://d2l.ai/chapter_natural-language-processing-applications/sentiment-analysis-cnn.html
- https://developer.ibm.com/articles/introduction-to-convolutional-neural-networks/
What are CNNs?
- Three main layer types: convolutional, pooling and fully-connected (FC)
- Convolutional Layer:
- Core building block
- Requires input data, filter and feature map.
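- Pooling layer – downsampling, dimensionality reduction
- FC layer – performs the final task (e.g. classification) based on the features extracted by the earlier layers
- Types (architectures) – AlexNet, VGGNet, GoogLeNet, ResNet, LeNet-5 (the classic CNN architecture)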
Different steps:
- Image channels
- A way to represent an image in numerical format.
- An image is made up of pixels. Each pixel value is mapped to a number from 0 to 255 (the colour intensity).
- A colour image is represented as a 3D array, with one 2D layer (channel) each for the red, green and blue values.
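To make the channel idea concrete, here is a minimal sketch in plain Python. The 2x2 `image` and the helper `split_channels` are made up for illustration; real code would normally load an image through a library and hold it in an array type rather than nested lists.

```python
# A hypothetical 2x2 RGB image: image[row][col] = [R, G, B], each value in 0..255.
image = [
    [[255, 0, 0], [0, 255, 0]],
    [[0, 0, 255], [255, 255, 255]],
]

def split_channels(img):
    """Return one 2D grid per colour channel, in the order red, green, blue."""
    return [
        [[pixel[c] for pixel in row] for row in img]
        for c in range(3)
    ]

red, green, blue = split_channels(image)
print(red)    # [[255, 0], [0, 255]]
print(green)  # [[0, 255], [0, 255]]
print(blue)   # [[0, 0], [255, 255]]
```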
- Convolution
- Extracts key features from the image
- Uses a filter (also called a kernel)
- The filter slides (strides) across the input array to produce an output array called a feature map.
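The sliding-filter step can be sketched as below. This is an illustrative, dependency-free version: the function name `convolve2d`, the 4x4 `image` and the 2x2 `kernel` are assumptions, not taken from the references. Like most deep learning libraries, it does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
def convolve2d(image, kernel, stride=1):
    """Slide `kernel` over `image` (no padding) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(image) - kh) // stride + 1
    out_w = (len(image[0]) - kw) // stride + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the kernel element-wise with the patch under it and sum.
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(total)
        feature_map.append(row)
    return feature_map

# Hypothetical single-channel input and a 2x2 diagonal-difference filter.
image = [
    [1, 2, 3, 0],
    [4, 5, 6, 1],
    [7, 8, 9, 2],
    [0, 1, 2, 3],
]
kernel = [[1, 0], [0, -1]]

feature_map = convolve2d(image, kernel)
print(feature_map)                           # [[-4, -4, 2], [-4, -4, 4], [6, 6, 6]]
print(convolve2d(image, kernel, stride=2))   # [[-4, 2], [6, 6]]
```

A larger stride moves the filter further at each step, so the feature map shrinks: 3x3 with stride 1 versus 2x2 with stride 2 in this example.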
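- Pooling
- Downsamples the feature map by summarising each small window (e.g. taking its maximum or average), which reduces its dimensions.
- Padding (a separate operation, usually applied before convolution)
- Adding extra rows and columns of zeros around the outer edge of the input array.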
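Zero-padding and pooling are easy to mix up, so here is a rough sketch of both side by side. `zero_pad` adds a border of zeros around the input, while `max_pool` downsamples by keeping the largest value in each window. Both helper names and the sample grids are hypothetical, and max pooling is only one of several pooling variants.

```python
def zero_pad(grid, pad=1):
    """Surround `grid` with `pad` rows and columns of zeros on every side."""
    width = len(grid[0]) + 2 * pad
    padded = [[0] * width for _ in range(pad)]
    for row in grid:
        padded.append([0] * pad + list(row) + [0] * pad)
    padded += [[0] * width for _ in range(pad)]
    return padded

def max_pool(grid, size=2, stride=2):
    """Downsample `grid` by taking the maximum of each size x size window."""
    out_h = (len(grid) - size) // stride + 1
    out_w = (len(grid[0]) - size) // stride + 1
    return [
        [
            max(
                grid[i * stride + di][j * stride + dj]
                for di in range(size)
                for dj in range(size)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

padded = zero_pad([[1, 2], [3, 4]])
print(padded)  # [[0, 0, 0, 0], [0, 1, 2, 0], [0, 3, 4, 0], [0, 0, 0, 0]]

feature_map = [
    [1, 3, 2, 1],
    [4, 6, 5, 0],
    [7, 2, 9, 8],
    [1, 0, 3, 4],
]
pooled = max_pool(feature_map)
print(pooled)  # [[6, 5], [7, 9]]
```

Padding makes the array larger (2x2 becomes 4x4 here), whereas pooling makes it smaller (4x4 becomes 2x2).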
- Flattening
- Makes the outputs compatible with an artificial neural network
- Converts the multidimensional array into a matrix with one column and n rows (n×1), also called a column vector.
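Flattening can be illustrated with a short recursive helper. The name `flatten` and the two 2x2 sample maps are assumptions for the sake of the example; array libraries usually offer a built-in reshape for this.

```python
def flatten(array):
    """Recursively unroll nested lists into one flat list of numbers."""
    flat = []
    for item in array:
        if isinstance(item, list):
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat

# Two hypothetical 2x2 pooled feature maps (shape 2 x 2 x 2).
pooled_maps = [
    [[6, 5], [7, 9]],
    [[1, 0], [2, 3]],
]

flat = flatten(pooled_maps)
column_vector = [[value] for value in flat]  # n rows, 1 column

print(flat)                                       # [6, 5, 7, 9, 1, 0, 2, 3]
print(len(column_vector), len(column_vector[0]))  # 8 1
```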
- Full Connection
- Typically placed towards the end of the neural network architecture
- Also known as dense layers
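As a minimal sketch of what a dense layer computes, each output is a weighted sum of every input plus a bias. The `dense` helper, the weights, the biases and the input features below are all made-up values for illustration; in a trained network the weights and biases are learned, and an activation function is usually applied afterwards.

```python
def dense(inputs, weights, biases):
    """Fully-connected layer: every output is connected to every input."""
    return [
        sum(w * x for w, x in zip(weight_row, inputs)) + b
        for weight_row, b in zip(weights, biases)
    ]

# Hypothetical flattened features feeding two output neurons (two classes).
features = [1.0, 2.0, 3.0]
weights = [
    [0.5, -1.0, 0.0],  # weights for class 0
    [0.0, 0.5, 0.5],   # weights for class 1
]
biases = [0.0, -1.0]

scores = dense(features, weights, biases)
predicted_class = scores.index(max(scores))

print(scores)           # [-1.5, 1.5]
print(predicted_class)  # 1
```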