// COMPUTER VISION

Alzheimer's Classification

Vision Transformer from scratch — COMP3710

TensorFlow · Keras · Vision Transformers · Python · Medical Imaging

Overview

Built for UQ's COMP3710 (Pattern Recognition and Analysis). The constraint that made this interesting: implement the Vision Transformer from scratch — patch extraction, embeddings, positional encoding, multi-head self-attention — not just import it.
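To make those components concrete, here is a minimal sketch of patch extraction, linear patch embedding, learned-style positional embeddings, and multi-head self-attention. It is written in plain NumPy rather than the project's TF/Keras code, uses random weights, and leaves out the class and distillation tokens, layer norm, and MLP blocks. The function names, the 32×32 input, and the sizes chosen are illustrative assumptions, not values from the coursework.

```python
import numpy as np


def extract_patches(image, patch_size):
    """Split an (H, W, C) image into flattened, non-overlapping patches."""
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    gh, gw = h // patch_size, w // patch_size
    # (gh, p, gw, p, c) -> (gh, gw, p, p, c) -> (num_patches, p * p * c)
    patches = image.reshape(gh, patch_size, gw, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)
    return patches.reshape(gh * gw, patch_size * patch_size * c)


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_self_attention(tokens, w_q, w_k, w_v, w_o, num_heads):
    """Scaled dot-product self-attention over (num_tokens, dim) inputs."""
    n, d = tokens.shape
    head_dim = d // num_heads

    def split_heads(x):
        # (n, d) -> (heads, n, head_dim)
        return x.reshape(n, num_heads, head_dim).transpose(1, 0, 2)

    q, k, v = (split_heads(tokens @ w) for w in (w_q, w_k, w_v))
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(head_dim)  # (heads, n, n)
    weights = softmax(scores, axis=-1)
    context = weights @ v                                  # (heads, n, head_dim)
    merged = context.transpose(1, 0, 2).reshape(n, d)      # concatenate heads
    return merged @ w_o, weights


# Illustrative sizes only (assumed, not the project's configuration).
rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32, 1))
patch_size, embed_dim, num_heads = 8, 16, 4

patches = extract_patches(image, patch_size)
w_embed = rng.standard_normal((patches.shape[1], embed_dim)) * 0.02
pos_embed = rng.standard_normal((patches.shape[0], embed_dim)) * 0.02
tokens = patches @ w_embed + pos_embed  # patch embedding + positional embedding

w_q, w_k, w_v, w_o = (
    rng.standard_normal((embed_dim, embed_dim)) * 0.02 for _ in range(4)
)
out, attn = multi_head_self_attention(tokens, w_q, w_k, w_v, w_o, num_heads)
print(patches.shape, out.shape, attn.shape)
```

Each row of `attn` is a probability distribution over patches, which is the part a pre-built model hides: every patch decides how much to read from every other patch.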

Pipeline

  • Data — ADNI MRI dataset, binary classification (Alzheimer's vs. Normal Control)
  • Preprocessing — slice extraction, normalisation, train/val split with patient-level grouping (no leakage)
  • Model — DeiT-small architecture, all components written from scratch in TF/Keras
  • Training — early stopping to prevent overfitting on a relatively small dataset
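Two of the steps above are easy to get subtly wrong, so here is a rough sketch of each using only the standard library. `patient_level_split` assigns whole patients to train or validation so that slices from one scan never sit on both sides, and `early_stopping_epoch` shows the patience rule behind early stopping. The function names, the slice ID format, the 20% validation fraction, and the patience of 3 are all assumptions for illustration, not the project's actual settings.

```python
import random
from collections import defaultdict


def patient_level_split(slice_ids, patient_of, val_fraction=0.2, seed=0):
    """Split slices so every patient lands wholly in train or in val."""
    by_patient = defaultdict(list)
    for s in slice_ids:
        by_patient[patient_of[s]].append(s)

    # Shuffle patients, not slices, so the split happens at patient level.
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)
    n_val = max(1, round(len(patients) * val_fraction))

    val = [s for p in patients[:n_val] for s in by_patient[p]]
    train = [s for p in patients[n_val:] for s in by_patient[p]]
    return train, val


def early_stopping_epoch(val_losses, patience=3):
    """Return the epoch index at which training would stop."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, wait = loss, 0  # improvement: reset the counter
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return len(val_losses) - 1  # never triggered: ran to the end


# Hypothetical data: 10 patients with 5 slices each.
slices = [f"p{p}_s{s}" for p in range(10) for s in range(5)]
patient_of = {s: s.split("_")[0] for s in slices}
train, val = patient_level_split(slices, patient_of)

train_patients = {patient_of[s] for s in train}
val_patients = {patient_of[s] for s in val}
print(len(train), len(val), train_patients & val_patients)

stop = early_stopping_epoch([0.9, 0.7, 0.6, 0.65, 0.62, 0.61, 0.5], patience=3)
print(stop)
```

Splitting by slice instead would put near-identical neighbouring slices from one patient in both sets, which inflates validation accuracy without the model generalising to new patients.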

Result

  • 81.42% validation accuracy — a solid result given the dataset size and the from-scratch implementation
  • More importantly: a real understanding of what's happening inside a ViT, not just model = tf.keras.applications.X()

What I'd add now

The ViT was the wrong choice for this dataset size: CNN baselines with strong augmentation would likely have outperformed it. Worth doing once for the learning, but not again.
