Introduction

It works like convolutional layer. However, it doesn’t have a filter, instead, it has a pooling function which will be perform on the input

Most importantly, it doesn’t have learnable parameters

Its intention is reducing the spatial dimensions while preserving the most important information given in the image

Common Pooling Function

Max Pooling

Only preserve the maximum value in each kernel

Average Pooling

Averaging all the values in the input to give output

What is it?

A pooling layer downsamples feature maps by applying a simple mathematical operation (like finding the maximum or average) across small regions of the input. Unlike convolutional layers, pooling layers have no learnable parameters - they use fixed operations that never change during training.

Main purpose: Reduce image size while keeping the most important information.

How it works

Think of it as sliding a small window across your image and summarizing what’s inside each window with a single number. This makes the image smaller but preserves key features.

Common Types

Max Pooling

Takes the highest value from each window region. This preserves the strongest features - like keeping the brightest part of an edge or the most activated pixel in a pattern.

Average Pooling

Takes the mean of all values in each window. This gives a smoother representation that captures overall texture and patterns rather than just the strongest signals.

Key Benefits

  • Smaller data: Reduces computational load for later layers
  • Translation invariance: Makes the network less sensitive to small shifts in object position
  • Prevents overfitting: Reduces the model’s ability to memorize specific pixel arrangements