What Are Autoencoders?
Difference with Regular Encoders
In regular (supervised) encoders, we need labels to optimize the encoder.
However, most data in the world is unlabeled. Hence, the goal of an autoencoder is to optimize the encoder without the help of labels.
How Does the Encoder Work?
We pass in our input data, and the encoder compresses and reorganizes it, then outputs a feature vector.
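As a minimal sketch of this idea (using NumPy, with illustrative sizes and names I chose for the example), an encoder can be as simple as a learned projection from the input dimension down to a smaller feature dimension:

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, feature_dim = 8, 3  # hypothetical sizes: 8-dim input, 3-dim feature vector
W = rng.normal(size=(feature_dim, input_dim)) * 0.1  # encoder weights (untrained here)

def encode(x):
    # compress the input into a lower-dimensional feature vector
    return np.tanh(W @ x)

x = rng.normal(size=input_dim)
z = encode(x)
print(z.shape)  # (3,)
```

In practice the encoder is usually a deeper network, but the interface is the same: input in, compressed feature vector out.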
Implementation
How do we optimize without labels?
After encoding, we end up with a feature vector; this feature vector should contain compressed information about the input.
Hence, we decompress the feature vector with a decoder, then compare the decoder's output with the encoder's input to compute a reconstruction loss.
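The encode → decode → compare loop above can be sketched end to end. This is a deliberately tiny linear autoencoder trained with a mean-squared reconstruction loss and hand-written gradients (all sizes, learning rate, and variable names are illustrative choices, not a prescribed implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
N, d, k = 200, 10, 4                 # hypothetical: 200 samples, 10-dim input, 4-dim code
X = rng.normal(size=(N, d))

We = rng.normal(size=(d, k)) * 0.1   # encoder weights
Wd = rng.normal(size=(k, d)) * 0.1   # decoder weights
lr = 0.05

def loss(We, Wd):
    Z = X @ We                       # encode: compress input to feature vectors
    Xhat = Z @ Wd                    # decode: reconstruct input from features
    return np.mean((Xhat - X) ** 2)  # compare reconstruction with original input

start = loss(We, Wd)
for _ in range(500):
    Z = X @ We
    Xhat = Z @ Wd
    G = 2 * (Xhat - X) / X.size      # gradient of the MSE loss w.r.t. Xhat
    dWd = Z.T @ G                    # backprop through the decoder
    dWe = X.T @ (G @ Wd.T)           # backprop through the encoder
    We -= lr * dWe
    Wd -= lr * dWd

print(loss(We, Wd) < start)          # reconstruction error decreased
```

Note that no labels appear anywhere: the input itself serves as the training target.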
The feature vector should have a lower dimension than the input
Reason 1: Prevent Trivial Solutions
If the feature vector's dimension is too large, the network can simply learn the identity function, reconstructing the input perfectly without learning any useful features.
Reason 2: Force Compression and Feature Learning
If the vector has a small dimension, we force the encoder to extract the important features so that they fit into the feature vector.
Reason 3: Enable Generalization
With limited dimensions, the encoder must extract generalizable features rather than memorize specific training examples.

How do we use the trained encoder?
After training, we discard the decoder and use the encoder for a downstream task (transfer learning).
An autoencoder can also be used to initialize a supervised model.
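A sketch of this transfer-learning step, continuing the NumPy setup from before (the pretrained weights, toy labels, and sizes here are illustrative assumptions): the encoder is kept frozen and a small supervised head is trained on its features using the now-available labels.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 10, 4

# hypothetical pretrained encoder weights (in practice, from autoencoder training)
We = rng.normal(size=(d, k)) * 0.5

def encode(X):
    # frozen encoder: reused as-is for the downstream task
    return X @ We

# small labeled downstream dataset (toy binary labels)
N = 100
X = rng.normal(size=(N, d))
y = (X[:, 0] > 0).astype(float)

# train a logistic-regression head on top of the frozen features
w = np.zeros(k)
b = 0.0
lr = 0.5
Z = encode(X)
for _ in range(300):
    p = 1 / (1 + np.exp(-(Z @ w + b)))
    g = p - y                        # gradient of cross-entropy w.r.t. the logits
    w -= lr * (Z.T @ g) / N          # only the head is updated; We stays frozen
    b -= lr * g.mean()

acc = (((1 / (1 + np.exp(-(Z @ w + b)))) > 0.5) == (y > 0.5)).mean()
print(acc)
```

Alternatively, instead of freezing, the encoder weights can serve as the initialization of a supervised model and be fine-tuned together with the head.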
