// COMPUTER VISION
Alzheimer's Classification
Vision Transformer from scratch — COMP3710
TensorFlow, Keras, Vision Transformers, Python, Medical Imaging
Overview
Built for UQ's COMP3710 (Pattern Recognition and Analysis). The constraint that made this interesting: implement the Vision Transformer from scratch — patch extraction, embeddings, positional encoding, multi-head self-attention — not just import it.
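To make those components concrete, here is a minimal NumPy sketch of the forward pass: patch extraction, a linear patch embedding, a positional embedding, and multi-head self-attention. It is an illustration, not the project's TF/Keras code. The image size, patch size, embedding width, head count, and random weights are all assumptions chosen to keep the example small, and the class token, MLP block, and layer norm are left out.

```python
import numpy as np


def extract_patches(image, patch_size):
    """Split an (H, W, C) image into flattened, non-overlapping patches."""
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0, "image must tile evenly"
    gh, gw = h // patch_size, w // patch_size
    # (gh, p, gw, p, c) -> (gh, gw, p, p, c) -> (num_patches, p * p * c)
    patches = image.reshape(gh, patch_size, gw, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)
    return patches.reshape(gh * gw, patch_size * patch_size * c)


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_self_attention(tokens, w_q, w_k, w_v, w_o, num_heads):
    """Scaled dot-product self-attention over (num_tokens, dim) inputs."""
    n, d = tokens.shape
    head_dim = d // num_heads

    def split_heads(x):
        # (n, d) -> (heads, n, head_dim)
        return x.reshape(n, num_heads, head_dim).transpose(1, 0, 2)

    q, k, v = (split_heads(tokens @ w) for w in (w_q, w_k, w_v))
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(head_dim)  # (heads, n, n)
    weights = softmax(scores, axis=-1)
    context = weights @ v                                   # (heads, n, head_dim)
    merged = context.transpose(1, 0, 2).reshape(n, d)       # concatenate heads
    return merged @ w_o, weights


# Assumed toy sizes: a 32x32 single-channel slice, 8x8 patches, 16-dim tokens.
rng = np.random.default_rng(0)
image = rng.random((32, 32, 1))
patch_size, embed_dim, num_heads = 8, 16, 4

patches = extract_patches(image, patch_size)                # (16, 64)
w_embed = rng.normal(scale=0.02, size=(patches.shape[1], embed_dim))
pos_embed = rng.normal(scale=0.02, size=(patches.shape[0], embed_dim))
tokens = patches @ w_embed + pos_embed                      # (16, 16)

w_q, w_k, w_v, w_o = (
    rng.normal(scale=0.02, size=(embed_dim, embed_dim)) for _ in range(4)
)
out, attn = multi_head_self_attention(tokens, w_q, w_k, w_v, w_o, num_heads)
```

In a trained model the embedding and projection matrices are learned parameters; here they are random so the shapes and data flow can be checked in isolation.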
Pipeline
- Data — ADNI MRI dataset, binary classification (Alzheimer's vs. Normal Control)
- Preprocessing — slice extraction, normalisation, train/val split with patient-level grouping (no leakage)
- Model — DeiT-small architecture, all components written from scratch in TF/Keras
- Training — early stopping to prevent overfitting on a relatively small dataset
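Two steps in the pipeline above are easy to get subtly wrong, so here is a rough sketch of both in plain Python: a split that groups slices by patient, and the patience rule behind early stopping. The function names, patient IDs, validation fraction, loss values, and patience setting are all hypothetical; the project's actual split and callback settings are not shown here.

```python
import random


def patient_level_split(patient_ids, val_fraction=0.2, seed=0):
    """Return (train_idx, val_idx) so that no patient appears in both sets.

    patient_ids[i] is the patient that slice i came from.
    """
    unique_patients = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(unique_patients)
    n_val = max(1, round(len(unique_patients) * val_fraction))
    val_patients = set(unique_patients[:n_val])
    train_idx = [i for i, p in enumerate(patient_ids) if p not in val_patients]
    val_idx = [i for i, p in enumerate(patient_ids) if p in val_patients]
    return train_idx, val_idx


def early_stopping_epoch(val_losses, patience=3):
    """Return the epoch index at which training would stop.

    Training stops once the validation loss has failed to improve on its
    best value for `patience` consecutive epochs.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses) - 1  # never triggered: ran to the last epoch


# Hypothetical data: ten slices from five patients, two slices each.
patient_ids = ["p1", "p1", "p2", "p2", "p3", "p3", "p4", "p4", "p5", "p5"]
train_idx, val_idx = patient_level_split(patient_ids, val_fraction=0.2)
train_patients = {patient_ids[i] for i in train_idx}
val_patients = {patient_ids[i] for i in val_idx}

# Hypothetical validation losses; the late improvement is never reached.
stop_epoch = early_stopping_epoch(
    [0.70, 0.62, 0.58, 0.59, 0.60, 0.61, 0.55], patience=3
)
```

Splitting on slices instead of patients would put near-identical slices from one scan on both sides of the split and inflate validation accuracy. In Keras, the patience rule is normally handled by the `tf.keras.callbacks.EarlyStopping` callback, typically with `restore_best_weights=True`, rather than written by hand.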
Result
- 81.42% validation accuracy — a solid result given the dataset size and the from-scratch implementation
- More importantly: a real understanding of what's happening inside a ViT, not just calling model = tf.keras.applications.X()
What I'd add now
The ViT was the wrong choice for this dataset size: CNN baselines with strong augmentation would likely have outperformed it. Worth doing once for the learning, but not again.