// COMPUTER VISION

Alzheimer's Classification

Vision Transformer from scratch — COMP3710

TensorFlow · Keras · Vision Transformers · Python · Medical Imaging

Overview

Built for UQ's COMP3710 (Pattern Recognition and Analysis). The constraint that made this interesting: implement the Vision Transformer from scratch — patch extraction, embeddings, positional encoding, multi-head self-attention — not just import it.
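
To make those components concrete, here is a minimal NumPy sketch of patch extraction, patch embedding with a positional encoding, and multi-head self-attention. It is not the project's TF/Keras code: the function names, the toy 32×32 single-channel input, the patch size, the embedding width, the head count, and the use of a class token (standard in ViT, but not confirmed by this write-up) are all assumptions chosen for illustration, and the weights are random rather than trained.

```python
import numpy as np


def extract_patches(image, patch_size):
    """Split an (H, W, C) image into flattened, non-overlapping square patches."""
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0, "image must tile evenly"
    gh, gw = h // patch_size, w // patch_size
    # (H, W, C) -> (gh, p, gw, p, C) -> (gh, gw, p, p, C) -> (gh*gw, p*p*C)
    patches = image.reshape(gh, patch_size, gw, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)
    return patches.reshape(gh * gw, patch_size * patch_size * c)


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_self_attention(tokens, w_q, w_k, w_v, w_o, num_heads):
    """Scaled dot-product self-attention. tokens: (N, D); weights: (D, D)."""
    n, d = tokens.shape
    head_dim = d // num_heads

    def split_heads(x):  # (N, D) -> (heads, N, head_dim)
        return x.reshape(n, num_heads, head_dim).transpose(1, 0, 2)

    q = split_heads(tokens @ w_q)
    k = split_heads(tokens @ w_k)
    v = split_heads(tokens @ w_v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(head_dim)  # (heads, N, N)
    weights = softmax(scores, axis=-1)                     # each row sums to 1
    out = (weights @ v).transpose(1, 0, 2).reshape(n, d)   # merge heads
    return out @ w_o, weights


# Toy example with assumed sizes (not the project's real configuration).
rng = np.random.default_rng(0)
image = rng.random((32, 32, 1))           # stand-in for one MRI slice
patch_size, dim, heads = 8, 16, 4

patches = extract_patches(image, patch_size)             # (16, 64)
w_embed = rng.normal(0, 0.02, (patches.shape[1], dim))   # linear patch embedding
cls_token = np.zeros((1, dim))                           # assumed class token
pos_embed = rng.normal(0, 0.02, (patches.shape[0] + 1, dim))

# Prepend the class token, then add the positional embedding.
tokens = np.concatenate([cls_token, patches @ w_embed], axis=0) + pos_embed

w_q, w_k, w_v, w_o = (rng.normal(0, 0.02, (dim, dim)) for _ in range(4))
attended, attn = multi_head_self_attention(tokens, w_q, w_k, w_v, w_o, heads)
print(patches.shape, tokens.shape, attended.shape, attn.shape)
```

A full encoder block would wrap the attention in layer normalisation, residual connections, and an MLP. Those parts are left out here to keep the sketch short.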

Pipeline

Data — ADNI MRI dataset, binary classification (Alzheimer's vs. Normal Control)
Preprocessing — slice extraction, normalisation, train/val split with patient-level grouping (no leakage)
Model — DeiT-small architecture, all components written from scratch in TF/Keras
Training — early stopping to prevent overfitting on a relatively small dataset
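
Two of the steps above, the patient-level split and early stopping, can be sketched in plain Python. This is a hedged illustration, not the project's code: `patient_level_split`, `EarlyStopper`, the patient IDs, the 20% validation fraction, the patience of 3, and the loss values are all hypothetical.

```python
import random


def patient_level_split(patient_ids, val_fraction=0.2, seed=0):
    """Return (train_idx, val_idx) so that no patient appears in both sets."""
    unique = sorted(set(patient_ids))
    random.Random(seed).shuffle(unique)
    n_val = max(1, round(len(unique) * val_fraction))
    val_patients = set(unique[:n_val])
    train_idx = [i for i, p in enumerate(patient_ids) if p not in val_patients]
    val_idx = [i for i, p in enumerate(patient_ids) if p in val_patients]
    return train_idx, val_idx


class EarlyStopper:
    """Signal a stop once validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def should_stop(self, val_loss):
        if val_loss < self.best - self.min_delta:
            self.best = val_loss      # improvement: reset the counter
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1      # no improvement this epoch
        return self.bad_epochs >= self.patience


# Toy data: 10 hypothetical patients with 4 slices each.
patient_ids = [f"P{n:02d}" for n in range(10) for _ in range(4)]
train_idx, val_idx = patient_level_split(patient_ids, val_fraction=0.2)
train_patients = {patient_ids[i] for i in train_idx}
val_patients = {patient_ids[i] for i in val_idx}
print("shared patients:", train_patients & val_patients)

# Made-up validation losses standing in for a real training loop.
losses = [0.70, 0.62, 0.58, 0.59, 0.60, 0.61, 0.55]
stopper = EarlyStopper(patience=3)
stopped_at = None
for epoch, loss in enumerate(losses):
    if stopper.should_stop(loss):
        stopped_at = epoch
        break
print("stopped at epoch", stopped_at, "with best loss", stopper.best)
```

Splitting on patient IDs rather than on individual slices is what prevents leakage: slices from the same scan are highly correlated, so a slice-level split would inflate validation accuracy. In a Keras workflow the early-stopping logic would more likely be the built-in `tf.keras.callbacks.EarlyStopping` callback. This write-up does not say which mechanism the project used.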

Result

81.42% validation accuracy — a solid result given the dataset size and the from-scratch implementation
More importantly: a real understanding of what's happening inside a ViT, not just model = tf.keras.applications.X()

What I'd add now

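The ViT was the wrong choice for this dataset size: transformers have weaker built-in inductive biases than convolutions and tend to need more data, so CNN baselines with strong augmentation would likely have outperformed it. Worth doing once for the learning, but not again.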
