capstone spine

Projects That Prove the Ideas

A course about machine learning should end in artifacts. These projects are designed to be small enough to finish, serious enough to review, and precise enough to reveal whether the reader understands the machinery.

Phase 1 The Learning Loop

Prediction Autopsy

Choose a real ML feature and reverse-engineer the learning loop behind it.

Deliverables

  • Input and output schema.
  • Candidate training signal.
  • Loss choice and one harmful alternative.
  • Three failure cases the product team should monitor.
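
To make the loss deliverable concrete, here is a minimal sketch that assumes the feature is a binary classifier emitting a probability. It compares log loss with squared error on the probability, which is one plausible candidate for the "harmful alternative" and not the only one. The function names are illustrative.

```python
import math

def log_loss(y, p, eps=1e-12):
    """Negative log-likelihood of label y (0 or 1) under predicted probability p."""
    p = min(max(p, eps), 1.0 - eps)  # clip so log() never sees 0
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def squared_error(y, p):
    """Squared distance between the label and the predicted probability."""
    return (y - p) ** 2

# True label is 1; the model grows more confidently wrong from left to right.
for p in (0.6, 0.1, 0.001):
    print(f"p={p}: log loss={log_loss(1, p):.3f}, squared error={squared_error(1, p):.3f}")
```

Squared error on a probability can never exceed 1, so a confidently wrong prediction costs little more than a mildly wrong one. Log loss grows without bound as the probability assigned to the true label approaches zero. Whether that matters depends on the product, which is the argument the autopsy should make.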

Stretch

Build a tiny baseline model for a related public dataset.

Phase 2 Generalization and Measurement

Honest Evaluation Report

Take one dataset and show how the reported result changes when it is evaluated carelessly versus carefully.

Deliverables

  • Train, validation, and test split policy.
  • Metric choice with cost-of-error argument.
  • Leakage audit.
  • One plot or table showing the generalization gap.
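
One way to approach the split policy and leakage audit is sketched below. It assumes each row belongs to a group (here a hypothetical user id) that must not straddle splits. The helper names, the fractions, and the toy records are assumptions for illustration.

```python
import hashlib

def assign_split(group_id, val_frac=0.1, test_frac=0.1):
    """Deterministically map a group id to 'train', 'val', or 'test'."""
    digest = hashlib.sha256(group_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # 32 hash bits scaled to [0, 1)
    if bucket < test_frac:
        return "test"
    if bucket < test_frac + val_frac:
        return "val"
    return "train"

def leakage_audit(rows):
    """Return the group ids that appear in more than one split."""
    seen = {}
    for group_id, split in rows:
        seen.setdefault(group_id, set()).add(split)
    return sorted(g for g, splits in seen.items() if len(splits) > 1)

# Hypothetical records: 50 users with 10 rows each.
records = [(f"user{i % 50}", i) for i in range(500)]

# Good practice: split on the user id, so all of a user's rows move together.
grouped = [(uid, assign_split(uid)) for uid, _ in records]
print("grouped split, leaking users:", leakage_audit(grouped))

# Bad practice: split on the row index, which scatters each user across splits.
row_level = [(uid, assign_split(str(row))) for uid, row in records]
print("row-level split, leaking users:", len(leakage_audit(row_level)))
```

Hashing the group id keeps the assignment stable across reruns and across new data, which a random shuffle does not. The audit itself is independent of how the split was made, so it can be run against any existing pipeline.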

Stretch

Add calibration analysis and threshold selection.

Phase 3 Linear Models and Optimization

Classifier From Scratch

Implement logistic regression with gradient descent and explain every term.

Deliverables

  • Loss derivation.
  • Gradient update loop.
  • Learning-rate experiment.
  • Boundary plot before and after training.
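
As a starting point, the gradient update loop and learning-rate experiment might look like the sketch below. It uses plain Python lists, full-batch updates, and a four-point toy dataset. The dataset and the learning rates tried are assumptions chosen only to keep the example short.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(w, b, x):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def mean_log_loss(w, b, X, y, eps=1e-12):
    total = 0.0
    for x_i, y_i in zip(X, y):
        p = min(max(predict(w, b, x_i), eps), 1.0 - eps)
        total -= y_i * math.log(p) + (1 - y_i) * math.log(1.0 - p)
    return total / len(X)

def train(X, y, lr=0.5, steps=200):
    """Full-batch gradient descent on the mean log loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(steps):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x_i, y_i in zip(X, y):
            err = predict(w, b, x_i) - y_i  # d(loss)/d(logit) for one example
            for j, xj in enumerate(x_i):
                grad_w[j] += err * xj / n
            grad_b += err / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

# Toy, linearly separable data (illustrative only).
X = [[0.0, 0.2], [0.3, 0.1], [0.9, 0.8], [1.0, 0.7]]
y = [0, 0, 1, 1]

for lr in (0.01, 0.5, 5.0):
    w, b = train(X, y, lr=lr)
    print(f"lr={lr}: final loss={mean_log_loss(w, b, X, y):.4f}")
```

The line computing `err` is where the loss derivation pays off: for log loss with a sigmoid output, the derivative with respect to the logit reduces to prediction minus label.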

Stretch

Compare full-batch and mini-batch updates.

Phase 4 Deep Networks

Tiny Autodiff Engine

Build a small scalar autodiff system and train a two-layer network.

Deliverables

  • Value object with forward operations.
  • Backward pass using local derivatives.
  • Training trace.
  • Explanation of one vanishing-gradient failure.
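
A minimal sketch of the Value object and backward pass follows. It supports only addition, multiplication, and tanh. The class layout and attribute names are assumptions, one of many reasonable designs. The demo at the end shows a saturated tanh shrinking a gradient, which is one route into the vanishing-gradient deliverable.

```python
import math

class Value:
    """A scalar that remembers how it was computed."""

    def __init__(self, data, parents=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # leaf nodes have nothing to propagate

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad            # d(a+b)/da = 1
            other.grad += out.grad           # d(a+b)/db = 1
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad   # d(a*b)/da = b
            other.grad += self.data * out.grad   # d(a*b)/db = a
        out._backward = _backward
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))
        def _backward():
            self.grad += (1.0 - t * t) * out.grad  # d tanh(z)/dz = 1 - tanh(z)^2
        out._backward = _backward
        return out

    def backward(self):
        """Apply the chain rule in reverse topological order."""
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            v._backward()

# A single neuron whose pre-activation is far into tanh's flat region.
x, w, b = Value(2.0), Value(-3.0), Value(1.0)
y = (x * w + b).tanh()
y.backward()
print("output:", y.data, "dy/dx:", x.grad)
```

Gradients accumulate with `+=` so that a value used more than once receives the sum of its contributions. Here the pre-activation is -5, so the local tanh derivative is close to zero and very little gradient reaches `x`.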

Stretch

Add residual connections and compare gradient flow.

Phase 5 Representations Across Space and Time

Inductive Bias Lab

Compare a dense model, a convolutional model, and a sequence model, each on a task that its inductive bias favors.

Deliverables

  • Parameter-count comparison.
  • Accuracy or loss comparison.
  • One failure example for each architecture.
  • Plain-language explanation of the inductive bias.
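
For the parameter-count comparison, the standard formulas can be written down directly. The sketch below assumes one bias per output unit and a vanilla recurrent layer with a single bias vector. Some frameworks store two bias vectors for recurrent layers, which would add another `hidden` parameters. The layer sizes are illustrative.

```python
def dense_params(n_in, n_out):
    """Fully connected layer: one weight per input-output pair, plus biases."""
    return n_in * n_out + n_out

def conv2d_params(kernel, c_in, c_out):
    """Square 2D convolution: weights are shared across all spatial positions."""
    return kernel * kernel * c_in * c_out + c_out

def rnn_params(n_in, hidden):
    """Vanilla RNN cell: weights are shared across all time steps."""
    return hidden * (n_in + hidden) + hidden

# Three ways to produce 32 features from a 28x28 grayscale image.
print("dense, flattened image :", dense_params(28 * 28, 32))
print("conv 3x3, 1 -> 32 chans:", conv2d_params(3, 1, 32))
print("rnn over 28 rows of 28 :", rnn_params(28, 32))
```

The gap between the counts is the inductive bias expressed in numbers: the convolution and the recurrent cell reuse the same weights across positions or steps, while the dense layer learns a separate weight for every pixel.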

Stretch

Add a toy attention model and inspect the weights.

Phase 6 Foundation Models

Retrieval-Augmented Tutor

Build a small system that retrieves notes, answers questions, and cites the retrieved evidence.

Deliverables

  • Chunking policy.
  • Retrieval ranking method.
  • Prompt or synthesis rule.
  • Eval set with at least ten questions.
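
The chunking policy and ranking method can start very simply. The sketch below assumes overlapping word-window chunks and bag-of-words cosine similarity, a deliberately crude baseline and not a recommendation. The function names, window sizes, and sample notes are all illustrative.

```python
import math
import re
from collections import Counter

def chunk_words(text, size=40, overlap=10):
    """Split text into windows of `size` words that share `overlap` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, chunks, k=2):
    """Return up to k (chunk_index, score) pairs, best first, dropping zero scores."""
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(c))), i) for i, c in enumerate(chunks)]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [(i, score) for score, i in scored[:k] if score > 0]

notes = [
    "Gradient descent updates weights using the gradient of the loss.",
    "Convolution shares weights across spatial positions.",
    "Attention computes a weighted sum of values.",
]
print(retrieve("How does gradient descent use the loss?", notes, k=1))
print(retrieve("quantum", notes))  # no overlap, so nothing is returned
```

Returning chunk indices alongside scores gives the synthesis step something to cite. An empty result is a natural hook for refusing to answer when no evidence is found.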

Stretch

Add a tool call, and make the system refuse to answer when the retrieved evidence is missing.

Phase 7 Agents, Alignment, and Evaluation

Agent Reliability Trial

Create a small agent task and evaluate the agent on it before and after a reward or instruction change.

Deliverables

  • Task environment.
  • Reward or preference specification.
  • Failure taxonomy.
  • Eval report showing regressions and improvements.
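
The core of the eval report is a per-case comparison, not a single aggregate score. A minimal sketch follows, assuming each eval case has a boolean pass/fail outcome in both runs. The case names and results are made up for illustration.

```python
def compare_runs(before, after):
    """Bucket each eval case by how its pass/fail outcome changed."""
    report = {"improved": [], "regressed": [], "still_passing": [], "still_failing": []}
    for case in sorted(before.keys() & after.keys()):
        was_ok, is_ok = before[case], after[case]
        if is_ok and not was_ok:
            report["improved"].append(case)
        elif was_ok and not is_ok:
            report["regressed"].append(case)
        elif is_ok:
            report["still_passing"].append(case)
        else:
            report["still_failing"].append(case)
    return report

# Hypothetical outcomes before and after an instruction change.
before = {"book_flight": True, "cancel_order": False, "refund_loop": False, "summarize": True}
after = {"book_flight": False, "cancel_order": True, "refund_loop": False, "summarize": True}

report = compare_runs(before, after)
net = len(report["improved"]) - len(report["regressed"])
for bucket, cases in report.items():
    print(f"{bucket}: {cases}")
print("net change in passing cases:", net)
```

In this toy data the pass rate is identical before and after, yet one case improved and another regressed. A report that shows only the aggregate would hide exactly the change the trial is meant to surface.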

Stretch

Add adversarial cases that expose reward hacking.