A Brief Timeline of Deep Learning

This section walks through the evolution of deep learning for image analysis, highlighting how convolutional neural networks (CNNs) became the core of modern segmentation tools. We will cover the key breakthroughs that led to U-Net, and show how specialized tools like StarDist and Cellpose build upon it.

CNN in Practice (2010-)#

Before CNNs, machine learning in image analysis relied on hand-crafted features. These features were designed manually:

Texture filters
Edge detectors
Shape descriptors

This approach had several problems:

Features worked only for certain types of images.
Human intuition was often biased.
Models didn’t generalize well to new data.

Breakthrough: Let the Model Learn the Features

CNNs themselves date back to the late 1980s, but in the early 2010s larger labeled datasets and GPU training made it practical to train a model that learns features directly from raw image pixels instead of relying on features designed by hand. This is where Convolutional Neural Networks (CNNs) came in.

What is a CNN?#

A convolutional neural network (CNN) is a type of neural network specially designed for images.

Key Ideas in CNNs:

Convolutional layers: Filters (or “kernels”) slide over the image and detect local patterns like edges, lines, or textures.
Pooling layers: Downsample the image to retain important information while reducing size.
Stacked layers: Combine many layers so the network can build up from simple features to complex ones (e.g., from edges to full cell shapes).
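
The first two ideas can be illustrated with a few lines of NumPy. This is a minimal sketch, not how a deep learning library implements these layers: the function names `conv2d` and `max_pool2d`, the toy image, and the hand-written edge kernel are all assumptions made for illustration. In a trained CNN the kernel weights are learned from data rather than written by hand.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode sliding-window filter (cross-correlation, as in most CNN libraries)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the kernel with the local window and sum the result.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    h = feature_map.shape[0] // size * size
    w = feature_map.shape[1] // size * size
    blocks = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

# Toy 6x6 image: dark left half, bright right half (a vertical edge).
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-written vertical-edge kernel (illustrative only).
edge_kernel = np.array([[-1.0, 0.0, 1.0],
                        [-1.0, 0.0, 1.0],
                        [-1.0, 0.0, 1.0]])

features = np.maximum(conv2d(image, edge_kernel), 0.0)  # ReLU non-linearity
pooled = max_pool2d(features)

print(features.shape)  # (4, 4)
print(pooled.shape)    # (2, 2)
```

The filter responds only where its window straddles the edge, and pooling keeps that response while halving the resolution. Stacking several such layers is what lets later filters respond to larger, more complex patterns.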

Figure: CNN

Source: https://medium.com/@RaghavPrabhu/understanding-of-convolutional-neural-network-cnn-deep-learning-99760835f148

CNNs became popular after the success of AlexNet in 2012, which won the ImageNet competition by a large margin. This moment marked the beginning of the modern deep learning era.

Segmentation as a Pixel Classifier#

While CNNs were first used for image classification (e.g., “Does this image contain a cat?”), biomedical researchers needed more: per-pixel predictions like “Which pixels belong to which cell?”

Early solutions tried to adapt classification networks to segmentation tasks by:

Applying CNNs to small image patches
Upsampling coarse predictions

But this was inefficient and inaccurate for fine biological structures.
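
To see where the inefficiency comes from, the patch-based idea can be sketched as follows. This is an illustration under stated assumptions: `classify_patch` is a hypothetical stand-in for a trained classification CNN (here just a mean threshold), and `segment_by_patches` is a name invented for this example.

```python
import numpy as np

def classify_patch(patch):
    """Hypothetical stand-in for a trained CNN: foreground if the patch mean exceeds 0.5."""
    return patch.mean() > 0.5

def segment_by_patches(image, patch_size=5):
    """Label every pixel by classifying the patch centered on it."""
    half = patch_size // 2
    padded = np.pad(image, half, mode="reflect")
    mask = np.zeros(image.shape, dtype=bool)
    n_calls = 0
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            patch = padded[i:i + patch_size, j:j + patch_size]
            mask[i, j] = classify_patch(patch)
            n_calls += 1
    return mask, n_calls

image = np.zeros((32, 32))
image[8:24, 8:24] = 1.0  # one bright square "cell"

mask, n_calls = segment_by_patches(image)
print(n_calls)  # 1024 classifier calls for a 32x32 image
```

Each pixel needs its own classifier call, and neighboring patches overlap almost entirely, so most of the computation is repeated. The patch size also forces a trade-off between context (large patches) and localization accuracy (small patches).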

Source: SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation

Introducing U-Net (2015)#

The real breakthrough for biomedical image segmentation came with the introduction of U-Net in 2015 by Ronneberger et al. It was developed specifically for biomedical images and was evaluated on neuronal structures in electron microscopy as well as on cell tracking data from light microscopy.

Key Features#

Component	Function
Encoder (contracting path)	Learns what features are present. Downsamples the image while extracting key patterns.
Decoder (expanding path)	Reconstructs the segmentation mask. Upsamples to original image size.
Skip connections	Link encoder and decoder layers to preserve spatial details like edges and boundaries.
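
How the three components fit together can be followed at the level of array shapes. The sketch below is a deliberately simplified assumption, not the architecture from the paper: `mix_channels` replaces the learned convolution blocks with a random 1x1 convolution, `upsample` uses nearest-neighbor repetition where the paper uses learned up-convolutions, and there are only two resolution levels. All function names are made up for this example.

```python
import numpy as np

def downsample(x):
    """2x2 max pooling on a (channels, height, width) array."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))

def upsample(x):
    """Nearest-neighbor 2x upsampling along height and width."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def mix_channels(x, out_channels, rng):
    """Stand-in for a learned convolution block: random 1x1 convolution plus ReLU."""
    weights = rng.standard_normal((out_channels, x.shape[0]))
    return np.maximum(np.einsum("oc,chw->ohw", weights, x), 0.0)

rng = np.random.default_rng(0)
image = rng.random((1, 64, 64))                        # one-channel 64x64 input

# Encoder (contracting path): fewer pixels, more channels.
enc1 = mix_channels(image, 8, rng)                     # (8, 64, 64)
enc2 = mix_channels(downsample(enc1), 16, rng)         # (16, 32, 32)
bottleneck = mix_channels(downsample(enc2), 32, rng)   # (32, 16, 16)

# Decoder (expanding path): upsample, then concatenate the matching encoder
# output along the channel axis. That concatenation is the skip connection.
dec2 = np.concatenate([upsample(bottleneck), enc2], axis=0)  # (48, 32, 32)
dec2 = mix_channels(dec2, 16, rng)                            # (16, 32, 32)
dec1 = np.concatenate([upsample(dec2), enc1], axis=0)         # (24, 64, 64)
dec1 = mix_channels(dec1, 8, rng)                             # (8, 64, 64)

# Final 1x1 convolution: one score per class (here 2) for every pixel.
logits = np.einsum("oc,chw->ohw", rng.standard_normal((2, 8)), dec1)
print(logits.shape)  # (2, 64, 64)
```

The output has the same height and width as the input, which is what makes per-pixel prediction possible. The concatenated encoder features give the decoder access to full-resolution detail that pooling would otherwise have discarded.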

Figure: U-Net architecture

Source: U-Net: Convolutional Networks for Biomedical Image Segmentation

Why U-Net Works So Well for Bioimages:#

Works well even with small datasets, in part because of heavy data augmentation during training.
Preserves fine details through skip connections.
The same architecture can be retrained for different staining protocols and microscopy types.
Can be trained end-to-end from raw images to segmentation masks.

StarDist (2018): Instance Segmentation With Shape Priors#

StarDist was developed specifically to address domain-specific challenges such as touching nuclei.

Instead of labeling every pixel directly, StarDist models each object as a star-convex polygon: a shape in which every boundary point can be connected to a center point by a straight line segment that stays inside the shape.

What StarDist Does:#

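Predicts, for each pixel, the probability that it lies inside an object (highest near object centers).
Predicts, for each pixel, the distances to the object boundary along a fixed set of rays (lines).
Builds candidate polygons from these rays and uses non-maximum suppression to discard overlapping duplicates.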
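
The ray-based representation is easy to reproduce by hand. The sketch below is an illustration only, not StarDist's own code: the function names, the angle convention (equally spaced rays starting along the x axis), and the example distances are assumptions, and the non-maximum suppression step is left out.

```python
import numpy as np

def rays_to_polygon(center, distances):
    """Convert distances along equally spaced rays into polygon vertices as (y, x) rows."""
    n_rays = len(distances)
    angles = np.linspace(0.0, 2.0 * np.pi, n_rays, endpoint=False)
    ys = center[0] + distances * np.sin(angles)
    xs = center[1] + distances * np.cos(angles)
    return np.stack([ys, xs], axis=1)

def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula."""
    y, x = vertices[:, 0], vertices[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

# A round nucleus: 32 rays, each 10 pixels long, around the point (50, 50).
center = np.array([50.0, 50.0])
distances = np.full(32, 10.0)

vertices = rays_to_polygon(center, distances)
print(vertices.shape)          # (32, 2)
print(polygon_area(vertices))  # slightly below the area of a circle of radius 10
```

Varying the individual distances stretches the polygon along the corresponding rays, so a single center plus a short vector of distances is enough to describe any star-convex outline.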

Figure: StarDist architecture

Source: stardist/stardist

Why It Works:#

Excellent at separating closely packed nuclei.
Uses U-Net as its backbone, but adds geometric reasoning.
Requires moderate annotation effort: training needs label images in which each object is outlined and given its own label.

StarDist is ideal for: nuclei segmentation in crowded 2D images where objects are mostly round or convex.

Cellpose (2020): A Generalist Model Using Flow Fields#

What if your images contain not just nuclei, but whole cells? What if those cells are irregular in shape—elongated, branched, or overlapping?

Cellpose was designed to solve these broader challenges.

What’s New in Cellpose:#

Introduces a novel idea: flow fields
- For each pixel, the model predicts the direction it should move to reach the object center.
- After computing this flow, pixels are grouped into objects.
Trained on a diverse set of cell and tissue types—generalist approach.
Works out of the box on many image types, without retraining.
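
The flow-field idea can be sketched on a toy example. Everything here is an assumption made for illustration: the flow is built directly from known labels instead of being predicted by a network, it uses simple one-pixel steps toward each object's centroid rather than the diffusion-based flows described in the Cellpose paper, and `follow_flow` and `group_by_endpoint` are made-up names.

```python
import numpy as np

# Toy ground truth: two 7x7 "cells" on a 20x20 grid (0 = background).
labels = np.zeros((20, 20), dtype=int)
labels[2:9, 2:9] = 1
labels[11:18, 10:17] = 2

# Stand-in for the network output: at each foreground pixel, a step of
# -1, 0, or +1 per axis that points toward the centroid of its object.
flow = np.zeros((2, 20, 20), dtype=int)
for obj in (1, 2):
    ys, xs = np.nonzero(labels == obj)
    cy, cx = int(round(ys.mean())), int(round(xs.mean()))
    flow[0, ys, xs] = np.sign(cy - ys)
    flow[1, ys, xs] = np.sign(cx - xs)

def follow_flow(flow, foreground, n_steps=20):
    """Move every foreground pixel along the flow for a fixed number of steps."""
    ys, xs = np.nonzero(foreground)
    start = np.stack([ys, xs])
    pos = start.copy()
    for _ in range(n_steps):
        pos = pos + flow[:, pos[0], pos[1]]
    return start, pos

def group_by_endpoint(start, end, shape):
    """Pixels that end at the same location receive the same object label."""
    mask = np.zeros(shape, dtype=int)
    _, inverse = np.unique(end.T, axis=0, return_inverse=True)
    mask[start[0], start[1]] = inverse.reshape(-1) + 1
    return mask

start, end = follow_flow(flow, labels > 0)
recovered = group_by_endpoint(start, end, labels.shape)
print(np.unique(recovered))  # [0 1 2]
```

The grouping step never looks at object shape, only at where each pixel ends up. This is why a flow-based approach is not restricted to round or star-convex objects.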

Figure: Cellpose

Source: Cellpose: a generalist algorithm for cellular segmentation

Advantages of Cellpose:#

Doesn’t assume round shapes.
Segments cytoplasm, cells, or nuclei—flexible across tasks.
Minimal parameter tuning needed.
Often gives usable segmentations even on image types that are not closely represented in its training data.

Timeline Summary#

Year	Model	Key Innovation	Use Case
2012	AlexNet	CNNs for image classification	General computer vision
2015	U-Net	Encoder-decoder with skip connections	Biomedical image segmentation
2018	StarDist	Star-convex polygons for instance segmentation	Nuclei in dense images
2020	Cellpose	Flow field-based generalist model	Cytoplasm, nuclei, irregularly shaped cells