Changelog
Every content addition, design change, and platform update.
v0.22.0 April 2026 Content + Platform
- Topic 4 (Data Splits & Evaluation Foundations) COMPLETED, which closes Domain 2 at 19 of 19 chapters. New routes: /topics/data-features/data-splits-evaluation/overfitting-underfitting and /imbalanced-data. Topic 4 chain wired end-to-end (Train/Val/Test → CV → Overfitting → Imbalanced).
- Two new FunctionPlayground modes, both backed by live in-browser computation rather than pre-baked inline data. polynomial-fit mode: user drags polynomial degree (1-15), noise level, ridge λ, and number of training points; the playground refits the polynomial in real time using new polyfit/polyval/mse helpers added to browser-math.ts, renders the fitted curve against a dashed "true function" reference, and shows a corner readout of train MSE + val MSE + regime classification (underfit / good fit / overfit). confusion-threshold mode: user drags a probability threshold across 200 pre-predicted loan applicants (8% default rate), playground recomputes the 2×2 confusion matrix + precision/recall/F1/cost-weighted total live and marks the current operating point on a PR curve.
- New browser-math helpers: polyfit (ridge-regularised least-squares polynomial fit with Gauss-Jordan solve, handles degrees up to 15 stably), polyval (polynomial evaluation), mse. Used by both the polynomial-fit playground and the Overfitting-chapter figures so displayed curves correspond to real least-squares fits rather than hand-tuned approximations.
- Chapter 3 Overfitting & Underfitting — 3 new figures. PolyFitGalleryFigure renders the same 24 noisy points fitted by degrees 1/3/12 side-by-side. TrainValErrorCurveFigure plots training and validation MSE against polynomial degree 1 to 14, showing the U-shaped validation curve with sweet-spot marker. RegularizationEffectFigure shows the same degree-12 polynomial with and without L2 regularisation (λ=0 vs λ=0.4).
- Chapter 4 Imbalanced Data — 3 new figures including the first hover-driven INTERACTIVE figure in the codebase. AccuracyParadoxFigure shows a trivial "predict-everyone-repays" classifier scoring 92% accuracy while catching zero of 16 defaulters, beside a real classifier scoring 87% but catching 11 of 16. ConfusionMatrixInteractiveFigure renders the four quadrants and responds to mouse hover and touch: each cell promotes on hover, fades the others, and reveals a prose tooltip explaining what that quadrant costs in bank-loan terms. PRROCCurveFigure compares PR-AUC 0.42 vs ROC-AUC 0.91 for the same classifier, footnoted to explain why PR is the honest picture for imbalanced data.
- Chapter content (both chapters): full v3.3 playground fields throughout. Each chapter carries 4 learning objectives, 5 FAQ items, narrative with 3 FigureCanvas calls (client:load on own line) + ComparisonTable + CodeBlock wrapped in JSX template literal, intuition section with standalone thought-experiment questions, reflection with 5 keyPoints + 4 prose paragraphs fully self-contained, checkpoint with 4 questions each linked to a learning objective.
- Narrative highlights: Chapter 3 opens with a decision tree reporting 99.8% training / 71% validation accuracy as the clean case of overfitting, develops the bias-variance trade-off, and closes with three remedies and code. Chapter 4 opens with a fraud team whose 99.7%-accuracy model flagged zero fraudulent transactions and lost £2.3M, develops the confusion-matrix-to-metrics pipeline, and closes with three remedies (threshold adjustment, class weighting, resampling / SMOTE) and code comparing all three.
- figures/registry.ts grown from 63 to 69 entries. validate:figures confirms 69 registered figureIds; validate:client-load confirms 88 MDX files clean. Build at 52 pages, 3.2s. Topic 4 totals 76 minutes (18+16+21+21). Domain 2 totals 5.5 hrs dynamically computed.
- PROJECT_STATUS.md updated: Topic 4 marked 4/4 COMPLETE, Domain 2 closed at 19/19. Current task moved to Domain 3 Classical ML with a proposed 10-chapter structure across Regression and Classification topics.
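The ridge-regularised fit behind the polynomial-fit playground can be sketched as follows. This is a minimal Python illustration, not the browser-math.ts source: the function signatures, the coefficient order (lowest power first), the partial pivoting, and the choice to penalise every coefficient including the intercept are all assumptions made for the example.

```python
def polyfit(xs, ys, degree, lam=0.0):
    """Ridge-regularised least-squares polynomial fit.

    Returns coefficients ordered from the constant term upwards.
    Assumption: lam is added to every diagonal entry, intercept included.
    """
    n = degree + 1
    # Normal equations (V^T V + lam * I) c = V^T y, with V[i][j] = xs[i] ** j.
    A = [[sum(x ** (r + c) for x in xs) + (lam if r == c else 0.0)
          for c in range(n)] for r in range(n)]
    b = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(n)]

    # Gauss-Jordan elimination on the augmented matrix, with partial pivoting.
    M = [row + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]  # zero here means the system is singular
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]


def polyval(coeffs, x):
    """Evaluate the polynomial at x using Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def mse(coeffs, xs, ys):
    """Mean squared error of the fitted polynomial on (xs, ys)."""
    return sum((polyval(coeffs, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Noise-free points from y = 1 + 2x + 3x^2: lam = 0 recovers the coefficients,
# while lam > 0 shrinks them and raises the training error.
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [1 + 2 * x + 3 * x * x for x in xs]
exact = polyfit(xs, ys, degree=2)
shrunk = polyfit(xs, ys, degree=2, lam=1.0)
print(exact, mse(exact, xs, ys))
print(shrunk, mse(shrunk, xs, ys))
```

Comparing train and validation MSE across degrees with a helper like this is what produces the underfit / good fit / overfit readout described above.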
v0.21.0 April 2026 Content
- Domain 2, Topic 3 (Dimensionality): all four chapters rebuilt from scratch in a single pass. New routes: /topics/data-features/dimensionality/curse-of-dimensionality, /correlation-redundancy, /principal-component-analysis, /other-reduction-techniques. Topic 3 is now 4/4 complete and Domain 2 stands at 17 of 19 chapters done.
- Chapter 1 The Curse of Dimensionality — 3 new animated figures. DimVolumeCollapseFigure plots the volume of the inscribed hypersphere as a fraction of the unit cube on a log scale from d=1 to d=20, bars fading in left-to-right with an annotation at d=10 (0.25% of cube volume). DimShellConcentrationFigure shows the fraction of a ball's volume inside the outer 10% shell rising from 10% at d=1 to 99.5% at d=50 via a smooth curve with reference dots. DimSparsityGridFigure displays the same 10 points in 1D/2D/3D panels with a bottom annotation calling out the 10-billion-point requirement to match 1D density at d=10. Playground reuses FunctionPlayground's existing distance-concentration mode with hand-tuned inline data for Euclidean n=50/120/300 and Manhattan n=120.
- Chapter 2 Correlation and Redundancy — 3 new animated figures. CorrelationEllipseFigure renders four panels at r = 0.00, 0.50, 0.85, 0.99 each showing 80 Cholesky-generated Gaussian samples with a tilted confidence ellipse; below each panel an r² interpretation ("independent", "moderate", "strong", "near-duplicate"). CorrelationMatrixFigure shows a 10×10 Pearson correlation heatmap for a housing dataset with rows revealing top-to-bottom and strong (|r|≥0.7) cells outlined in white on completion. RedundancyTwinFigure contrasts sqft-vs-rooms at r=0.91 with sqft-vs-crime_rate at r=0.08, each with a shared/unique variance bar underneath. Playground uses ScatterBoundaryPlayground classification mode with interpretations mapping cluster geometry to coefficient-stability behaviour.
- Chapter 3 Principal Component Analysis — 3 new animated figures. PCAVarianceDirectionFigure shows 60 correlated points in original space with PC1 (solid) and PC2 (dashed) arrows and a demo-point projection onto PC1, then reveals a rotated panel with PC1 on the horizontal axis. PCAScreePlotFigure is a two-panel scree display: bars with sequential reveal (left) and a cumulative curve with 80%/95% reference lines and a vertical annotation at k=3 marking the 87% elbow (right). PCAHousesFigure projects 5 labelled houses A–E coloured by price class through PCA, showing both the original sqft×rooms scatter and the PC1×PC2 rotation side by side. Playground uses existing MatrixTransformPlayground eigenvector mode with symmetric/rotation/reflection/shear presets.
- Chapter 4 Other Reduction Techniques — 3 new animated figures. MethodsComparisonFigure shows the same 120-song catalogue via PCA (overlapping classes along a diagonal), t-SNE (tight separated clusters), and UMAP (clusters in a ring with global structure). TSNEPerplexityFigure compares perplexity=5/30/80 with the middle panel highlighted as recommended default; p=5 over-fragments, p=80 over-smooths. LDASupervisedFigure contrasts unsupervised PCA (three classes overlapping along within-class spread) with supervised LDA (three cleanly separated bands). Playground uses ScatterBoundaryPlayground classification mode with interpretations covering the decision tree: labels + class separation → LDA, downstream ML → PCA, visualisation → UMAP, legacy work → t-SNE.
- All four chapters follow the same structure: chapter.json with 4 learning objectives and 5 FAQ items, narrative with 3 FigureCanvas calls (each client:load on its own line), intuition with standalone thought-experiment questions, playground JSON with full v3.3 fields (context/scenario/elementMeaning/whatToNotice, 3–4 insights, 4–5 interpretations with AND-logic conditions, noscriptFallback, a11yDescription, scenarioLabel/tooltip on every parameter), reflection with 5 keyPoints + 4 prose paragraphs self-contained, checkpoint with 4 questions (2 multiple-choice, 1 ordering with scrambled correctOrder, 1 true-or-false) each linked to a learning objective with substantive wrong-option explanations.
- No new playground code was written this week: all four chapters reused existing playground modes (FunctionPlayground distance-concentration for Chapter 1, ScatterBoundaryPlayground classification for Chapters 2 and 4, MatrixTransformPlayground eigenvector for Chapter 3). This supports the audit hypothesis that the existing 5 playgrounds can carry substantial new content when paired with strong figures.
- figures/registry.ts grown from 51 to 63 entries with static imports for all 12 new figure components. Build increased from 46 to 50 pages. validate:figures confirms 63 registered figureIds; validate:client-load confirms 82 MDX files clean.
- PROJECT_STATUS.md updated: Topic 3 marked 4/4 COMPLETE, each chapter annotated with its Week 5 rebuild spec. Historical reversion note condensed now that both rollback gaps (Dimensionality and Train/Val/Test Splits) are closed. Current task moved to Topic 4 Chapter 3 (Overfitting & Underfitting), with Classical ML flagged as the biggest remaining syllabus gap after Domain 2 finishes.
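The three headline numbers quoted for the Curse of Dimensionality figures follow from closed-form expressions. A short sketch that reproduces them (the function names are illustrative, not taken from the codebase):

```python
import math


def inscribed_sphere_fraction(d):
    """Volume of the radius-0.5 ball divided by the unit cube's volume (which is 1)."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1) * 0.5 ** d


def outer_shell_fraction(d, shell=0.10):
    """Share of a d-ball's volume lying in the outer `shell` fraction of its radius."""
    return 1.0 - (1.0 - shell) ** d


def points_to_match_density(d, points_per_axis=10):
    """Points needed in d dimensions to keep the per-axis density of 10 points in 1D."""
    return points_per_axis ** d


print(f"{inscribed_sphere_fraction(10):.2%}")  # 0.25%
print(f"{outer_shell_fraction(1):.1%}")        # 10.0%
print(f"{outer_shell_fraction(50):.1%}")       # 99.5%
print(points_to_match_density(10))             # 10000000000
```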
v0.20.0 April 2026 Content
- Domain 2, Topic 4 (Data Splits & Evaluation Foundations), Chapter 1: Train, Validation, and Test Splits — rebuilt from scratch after the April rollback. New route: /topics/data-features/data-splits-evaluation/train-validation-test-splits. Topic 4 is now 2/4 chapters and the Cross-Validation chapter finally has its prerequisite in place.
- Three new animated FigureCanvas components added. SplitPipelineFigure animates a single 10,000-application bar separating into 70% training (accent), 15% validation (warm amber), 15% test (green) with a padlock icon fading in on the test section — each section labelled with its purpose and absolute count. LeakageDiagramFigure renders two horizontal pipeline rows stacked vertically: a WRONG row with a red cross badge (fit scaler on all data → split → train) and a CORRECT row with a green check badge (split → fit scaler on train only → train), each row building left-to-right with a single-sentence annotation explaining the failure or the fix. LearningCurveFigure plots training error (accent) and validation error (amber) against training-set size from 500 to 7,500, revealing the two curves left-to-right via an animated clip path, with convergence and overfitting gap annotations fading in last.
- Chapter JSON wiring: 4 learning objectives covering why training accuracy means nothing, the three splits and their distinct jobs, common data-leakage patterns, and split-ratio choice at varying dataset sizes. 5 FAQ items. previousChapterId null (first chapter in the topic) and nextChapterId wired to cross-validation, which now renders a working back-link from the CV chapter.
- Playground JSON (scatter-boundary type, classification mode): 4 controls — pointsPerClass slider (50–400), noiseLevel slider (0.05–0.9), trainRatio slider (0.5–0.9) and stratify toggle as conceptual parameters that drive the interpretation panel rather than the scatter render. 5 interpretations with AND-logic conditions: default 70/15/15, small-training, large-training-tiny-holdout, no-stratify warning, and tiny-dataset forward thread to cross-validation. 3 insights, context/scenario/defaultStateDescription/whatToNotice, noscriptFallback, a11yDescription — full v3.3 fields throughout.
- Narrative: 4 concept sections (Why Training Accuracy Means Nothing, Three Splits Three Purposes, Data Leakage, Choosing Split Ratios) grounded in the 10,000-application bank scenario. ComparisonTable for split-ratio recommendations across four dataset size brackets. CodeBlock wrapping train_test_split + stratified Pipeline + StandardScaler + LogisticRegression in a JSX template literal to avoid the MDX import-parse trap. Three FigureCanvas calls each with client:load on its own line.
- Intuition: three standalone thought-experiment questions — defaulter count in a 1,500-applicant test set at 8% base rate, direction of bias when a StandardScaler is fitted on full data before splitting, and what breaks if 200 applicants are forced into a 70/15/15 split.
- Reflection: 5 keyPoints in frontmatter and 4 prose paragraphs, fully self-contained (no references to the playground, no UI instructions), with the final paragraph threading forward into cross-validation as the next-chapter resolution for small-n datasets.
- Checkpoint: 4 questions (tvt-obj-1 through tvt-obj-4). Multiple-choice on why a 100%-training-accurate decision tree tells you nothing about generalisation. Multiple-choice on why fifteen test-set evaluations of fifteen hyperparameter combinations produce a biased reported score. Ordering question with correctOrder [1, 0, 3, 2] (scrambled; tests the leakage-free pipeline order: split → fit scaler on train → fit and tune model → evaluate once on test). True-or-false on whether 70/15/15 is appropriate for a 180-record medical dataset (false: CV is correct at that scale).
- figures/registry.ts grown from 48 to 51 entries with static imports for split-pipeline, leakage-diagram, learning-curve. validate:figures and validate:client-load both clean.
- Post-review polish (same release): all three new figures had a broken loop mechanism that reset elapsedMs to 0 after each animation cycle while the requestAnimationFrame loop had already exited, causing the figure to go blank until the IntersectionObserver re-fired on scroll. Removed the loop and made each animation one-shot so the figure holds its final frame indefinitely, matching the existing KFoldDiagramFigure convention.
- Post-review polish: extended ScatterBoundaryPlayground with a real splits-aware rendering path dispatched by the presence of trainRatio in values. Each applicant is now visually assigned to train/validation/test (solid / faded / dashed-outline dots), the nearest-centroid boundary is computed from training points only (so moving the train ratio slider visibly shifts the line), and a top-right legend shows per-split counts with per-class counts in parentheses. Stratify toggle now has a visible effect: with balanced synthetic data the stratified path always shows matched per-class counts, while the unstratified path uses a deterministic pseudo-shuffle that lets class ratios drift at extreme train ratios.
- PROJECT_STATUS.md reconciled: Topic 4 now 2/4, Dimensionality still 0/4 with unchanged reversion note, Week 4 recorded in the completed-work list.
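The leakage-free ordering that this release's figure and checkpoint describe (split, then fit the scaler on the training rows only) can be sketched with the standard library alone. The data below is synthetic and hypothetical; the chapter's own CodeBlock uses scikit-learn, which this sketch does not attempt to reproduce.

```python
import random
import statistics


def stratified_split(labels, ratios=(0.70, 0.15, 0.15), seed=0):
    """Return (train, val, test) index lists that preserve class proportions."""
    rng = random.Random(seed)
    train, val, test = [], [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_train = round(ratios[0] * len(idx))
        n_val = round(ratios[1] * len(idx))
        train.extend(idx[:n_train])
        val.extend(idx[n_train:n_train + n_val])
        test.extend(idx[n_train + n_val:])
    return train, val, test


# Hypothetical data: 10,000 applications, 8% defaulters, one numeric feature.
rng = random.Random(42)
labels = [1] * 800 + [0] * 9200
feature = [rng.gauss(30 if y else 45, 10) for y in labels]

# 1. Split first, so validation and test rows never influence preprocessing.
train, val, test = stratified_split(labels)

# 2. Fit the scaler (mean and standard deviation) on training rows only.
train_values = [feature[i] for i in train]
mu = statistics.fmean(train_values)
sigma = statistics.pstdev(train_values)

# 3. Apply the same training statistics to every split.
scaled_train = [(feature[i] - mu) / sigma for i in train]
scaled_test = [(feature[i] - mu) / sigma for i in test]

test_defaulters = sum(labels[i] for i in test)
print(len(train), len(val), len(test), test_defaulters)
```

With stratification, the 1,500-row test set carries 8% of 1,500 = 120 defaulters, which is the count the first intuition question asks for.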
v0.19.0 April 2026 Content + Platform
- Major three-week polish pass. Week 1 was SEO and accessibility: created /public/og/default.png (warm-dark 1200×630 branded placeholder) plus a pure-Node generator script at scripts/generate-og-png.mjs, so social shares no longer 404. Added og:image / twitter:image / image dimensions / og:locale tags to every page via BaseLayout. Wired Person (Muhammad Aashir Irshad) and Organization JSON-LD schemas with stable @id references; the homepage now carries WebSite + Organization + Person, chapters carry author + publisher + datePublished + dateModified. Excluded the dev-only /playground-test route from the generated sitemap. Expanded three short meta descriptions (Eigenvectors, The Gradient, The Chain Rule) so every chapter hits the 120-character target.
- Week 1 accessibility: every parameter control (slider, select, text, number, toggle, info-tooltip) now has a visible focus-visible ring in the accent colour. KeyTerm popovers and the sidebar glossary pills are now keyboard-accessible — the trigger takes tabindex 0, the popover has role=tooltip with an aria-describedby link, and a CSS group-focus-within rule opens the popover on either hover or focus. ParameterTooltip now closes on Escape and returns focus to its button. Fixed the GradientDescentPlayground hiker marker that was using hardcoded orange and white; it now reads CSS vars so both light and dark themes render it correctly. Bumped --color-text-subtle (#8a7d72 → #a39688 dark, #7a6d65 → #6a5d55 light) so subtle body text passes WCAG AA contrast on subtle backgrounds.
- Week 1 mobile: the collapsible chapter list that appears on mobile inside a chapter page got a prominent accent-coloured left border, a list icon, and a two-line summary reading "Chapter N of M · tap to browse". It is no longer easy to miss.
- Week 2 architecture: extracted the 300 lines of pure probability-distribution math from ProbabilitySamplerPlayground into src/lib/utils/distributions.ts (sampling, PDFs, PMFs, presets, valueToBin). The playground file dropped from 757 to 449 lines (−41%) with no behaviour changes. Extracted 120 lines of pure helpers from ScatterBoundaryPlayground into src/lib/utils/scatterBoundary.ts (MARGIN, ASPECT, music/categorical/patient constants, computeCentroids, bisectorSegments, computeOutlierFlags). Added a shared <Arrow /> component to figureUtils.tsx so new figures can import a single arrow with dashed support, opacity, and configurable head size.
- Week 3 content: rewrote the Chain Rule narrative so the formal dy/dx = (dy/du)·(du/dx) statement appears after the worked sigmoid-neuron example rather than before — intuition-before-notation is now enforced structurally with a new "The chain rule, stated formally" subheading that references the 0.105 × 2 = 0.210 sigmoid result. Added concrete intuitive primers in Probability Distributions for σ (68% of values fall within μ ± σ, with a 15±4°C city-temperature example) and λ (mean equals 1/λ, with a rainfall example). Eigenvectors chapter had its Hessian paragraph rewritten into plain English, cutting the "second partial derivatives" leap and framing eigenvalues as "measurements of pull along a direction"; the covariance matrix mention now carries a parenthetical explaining what it encodes.
- Syllabus reconciliation: the Gradient Descent topic.json (under Neural Networks) now lists Derivatives and Vectors as essential prerequisites, so learners know exactly what they need before entering. PROJECT_STATUS.md was corrected to reflect reality — the four Dimensionality chapters and the Train/Validation/Test Splits chapter described in v0.16.14–v0.17.0 were reverted in April 2026 when their figure files did not survive a git push; rebuild is pending. The document now marks these chapters as ⬜ rather than ✅ and gives a short reversion note so the state of the site is transparent.
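The 0.105 × 2 = 0.210 result referenced in the Chain Rule rewrite can be checked numerically. The sketch assumes a single sigmoid neuron y = σ(wx + b) with w = 2, b = 0 and x = 1; the changelog gives only the product, so these parameter values are an assumption chosen because they reproduce it.

```python
import math


def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))


# Assumed neuron parameters (not stated in the changelog).
w, b, x = 2.0, 0.0, 1.0

u = w * x + b                            # inner function u(x)
dy_du = sigmoid(u) * (1.0 - sigmoid(u))  # derivative of the sigmoid at u
du_dx = w                                # derivative of the inner function
dy_dx = dy_du * du_dx                    # chain rule: dy/dx = (dy/du) * (du/dx)

# Central finite difference as an independent check of the analytic result.
h = 1e-6
numeric = (sigmoid(w * (x + h) + b) - sigmoid(w * (x - h) + b)) / (2 * h)

print(round(dy_du, 3), round(dy_dx, 3))  # 0.105 0.21
```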
v0.18.0 March 2026 Content
- Domain 2, Topic 4 (Data Splits & Evaluation Foundations), Chapter 2: Cross-Validation — full chapter built. New route: /topics/data-features/data-splits-evaluation/cross-validation. Topic 4 now at 2/4 chapters.
- Three animated FigureCanvas figures added: KFoldDiagramFigure (5-row rotation diagram — each row shows one training–validation iteration with the validation fold highlighted in accent colour; rows animate in 300ms apart; test set shown below with dashed border and lock annotation), StratifiedFoldFigure (two-panel comparison of random vs stratified fold assignment for 64 defaulters; bars animate first, orange defaulter squares animate in second to highlight uneven vs even distribution; seeded positions), CVComparisonFigure (two-curve log-scale chart: estimate variance in orange falls from k=2 to LOO, compute time in accent rises; both curves draw left-to-right simultaneously; vertical reference lines at k=5 and k=10 with LOO annotation).
- FunctionPlayground.tsx: added cv-error mode detected via playgroundMode="cv-error" in initialState. Reads inline data from config.dataSource.inlineData (kValues, logistic/tree/knn mean and std arrays). Renders mean CV error line (accent), ±1 std band (shaded + dashed bounds), vertical hairline at current k with exact mean±std readout. Responsive to all three parameters: kFolds (slider 2–20), modelType (select logistic/tree/knn), showStdBand (toggle).
- Playground JSON (function-2d type): inline data embeds precomputed mean/std for logistic regression, decision tree, and k-NN at k=2,3,4,5,6,7,8,9,10,12,15,20. Full v3.3 fields: context, 4 interpretations (default empty-conditions + low-k + high-k + tree), 3 insights, scenarioLabel and tooltip on all parameters, noscriptFallback, a11yDescription. kFolds slider (2–20), modelType select, showStdBand toggle, each with group field.
- Narrative: 4 concept sections (The Problem With One Split, How k-Fold Works, Choosing k and Stratification, Leave-One-Out and When to Use It). Two ComparisonTables (worked 5-fold example with per-fold accuracies, k trade-off table). CodeBlock with sklearn Pipeline + StratifiedKFold + cross_val_score in JSX template literal. Three FigureCanvas calls with client:load on own line.
- Checkpoint: 4 questions (cv-obj-1 through cv-obj-4). Ordering question correctOrder: [1, 0, 3, 2] (scrambled, tests CV pipeline order). True/false on LOO appropriateness for 60-record medical dataset with correct explanation. All wrong-answer explanations substantive.
- Reflection: 4 prose paragraphs (why CV exists, k controls estimate not model, stratification as default, forward thread to overfitting/underfitting). 5 keyPoints complete sentences, no headings, no UI references. Self-contained prose.
- registry.ts updated to 76 entries with 3 new figure imports (kfold-diagram, stratified-fold, cv-comparison).
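The rotation shown in KFoldDiagramFigure and the even defaulter spread shown in StratifiedFoldFigure can be sketched as a small generator. This is a standard-library illustration rather than the scikit-learn StratifiedKFold used in the chapter's CodeBlock, and the dataset size of 800 records is an assumption; only the 64 defaulters come from the figure description.

```python
import random


def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_indices, val_indices) for each of k stratified folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal each class round-robin so every fold gets a near-equal share.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    for f in range(k):
        val = folds[f]
        train = [i for g in range(k) if g != f for i in folds[g]]
        yield train, val


# Hypothetical dataset: 800 records, 64 of them defaulters.
labels = [1] * 64 + [0] * 736
splits = list(stratified_kfold(labels, k=5))
defaulters_per_fold = [sum(labels[i] for i in val) for _, val in splits]
print(defaulters_per_fold)
```

Each record lands in exactly one validation fold, and the 64 defaulters are spread 13/13/13/13/12 instead of clustering by chance.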
v0.17.0 March 2026 Content
- Domain 2, Topic 4 (Data Splits & Evaluation Foundations), Chapter 1: Train, Validation & Test Splits — full chapter built. New routes: /topics/data-features/data-splits-evaluation and /topics/data-features/data-splits-evaluation/train-validation-test-splits. Topic 4 now started (1/4 chapters). Topic JSON updated with correct prerequisites (features-representations) and estimatedMinutes: 60.
- Three animated FigureCanvas figures added: SplitPipelineFigure (full dataset rectangle splits into three proportional boxes, train 75%, val 12.5%, test 12.5%, with purpose labels and lock annotation on test set), LeakageDiagramFigure (two-panel diagram: wrong scaler-before-split vs correct split-before-scaler, with orange leakage warning and accent success label), LearningCurveFigure (training error vs validation error as training set size grows from 500 to 7,500; the gap narrows, with annotations for the large-gap and convergence regions and a dashed vertical line at bank dataset size).
- Playground uses existing scatter-boundary type with randomClassificationDataset generator. Parameters: trainRatio (slider 0.50–0.90), splitView (select all/split/boundary), stratify (toggle), numApplicants (slider 50–400). Full v3.3 fields: context, 4 interpretations (default + small-train + large-train + boundary), 3 insights, scenarioLabel and tooltip on all parameters, noscriptFallback, a11yDescription.
- Narrative: 4 concept sections (Why Training Accuracy Means Nothing, Three Splits Three Purposes, Data Leakage, Choosing Split Ratios). ComparisonTable for split ratio recommendations by dataset size. CodeBlock with train_test_split + stratify + StandardScaler pipeline wrapped in JSX template literal. Three FigureCanvas calls with client:load.
- Checkpoint: 4 questions (tvt-obj-1 through tvt-obj-4). Ordering question correctOrder: [1, 0, 2, 3] (scrambled, tests preprocessing leakage pipeline order). True/false question on test set contamination via post-hoc feature engineering. All wrong-answer explanations substantive.
- Reflection: 4 prose paragraphs (one-shot guarantee, preprocessing leakage as silent failure, stratification as default, forward thread to cross-validation). 5 keyPoints, no headings, no UI references. Self-contained prose.
- registry.ts updated to 73 entries with 3 new figure imports.
v0.16.17 March 2026 Content
- Domain 2, Topic 3 (Dimensionality), Chapter 4: Other Reduction Techniques — full chapter built. New route: /topics/data-features/dimensionality/other-reduction-techniques. Topic 3 now complete (4/4 chapters).
- Three animated FigureCanvas figures added: PCAvsMethodsFigure (3-panel: PCA oval cloud vs t-SNE clusters vs UMAP circular arrangement, animated left-to-right reveal), TSNEPerplexityFigure (3-panel: perplexity 5/30/80 with perplexity=30 highlighted in green), LDAvsUMAPFigure (2-panel: UMAP circular arrangement vs LDA supervised separation with LD1 axis annotation).
- New generator reductionMethodsDataset.ts: deterministic pre-computed 2D coordinates for 120 songs under 6 scenarios (pca, tsne5, tsne30, tsne80, umap, lda) using Mulberry32 PRNG and genre cluster centres. Added reductionMethodsDataset to GeneratorKey type and registered in lib/generators/registry.ts.
- ScatterBoundaryPlayground.tsx: added ORT mode (detected by reductionMethod in values). Imports from reductionMethodsDataset, renders genre-coloured dots in pre-computed 2D space, optional cluster labels at genre centroids, genre legend. Method-specific axis labels (PC1/PC2, t-SNE 1/2, UMAP 1/2, LD1/LD2).
- Playground JSON: scatter-boundary type with reductionMethod select (PCA/t-SNE/UMAP/LDA), tsnePerplexity select (5/30/80) with showWhen condition, showGenreLabels toggle. Full v3.3 fields: context, 6 interpretations (default empty-conditions + 5 method-specific), 3 insights, scenarioLabel and tooltip on all parameters, noscriptFallback, a11yDescription.
- Narrative: 5 concept sections (What PCA Misses, t-SNE, UMAP, LDA, Method Comparison). ComparisonTable for all 4 methods across linearity/supervision/global-distances/downstream-use/speed. CodeBlock with all 4 sklearn methods in JSX template literal. Three FigureCanvas calls with client:load.
- Checkpoint: 4 questions (ort-obj-1 through ort-obj-4). Ordering question correctOrder: [1, 3, 0, 2] (scrambled). All wrong-answer explanations substantive. Reflection: 4 prose paragraphs, 5 keyPoints, no headings, forward thread to Data Splits topic.
- registry.ts updated to 70 entries with 3 new figure imports.
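A seeded generator is what makes pre-computed coordinates like these reproducible across builds. Below is a Python port of the widely circulated Mulberry32 routine, followed by a hypothetical use that scatters points around cluster centres. The centre coordinates and the jitter scheme are invented for illustration and are not the contents of reductionMethodsDataset.ts.

```python
def mulberry32(seed):
    """Return a function producing deterministic floats in [0, 1)."""
    state = seed & 0xFFFFFFFF

    def next_float():
        nonlocal state
        # All arithmetic is masked to 32 bits to mirror the JavaScript original.
        state = (state + 0x6D2B79F5) & 0xFFFFFFFF
        t = state
        t = ((t ^ (t >> 15)) * (t | 1)) & 0xFFFFFFFF
        t ^= (t + ((t ^ (t >> 7)) * (t | 61))) & 0xFFFFFFFF
        return ((t ^ (t >> 14)) & 0xFFFFFFFF) / 4294967296

    return next_float


def cluster_points(centres, per_cluster, seed, spread=0.5):
    """Scatter `per_cluster` points uniformly within `spread` of each centre."""
    rand = mulberry32(seed)
    points = []
    for cx, cy in centres:
        for _ in range(per_cluster):
            dx = (rand() * 2 - 1) * spread
            dy = (rand() * 2 - 1) * spread
            points.append((cx + dx, cy + dy))
    return points


# Hypothetical genre centres: 4 clusters of 30 points gives 120 songs.
centres = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0), (3.0, 3.0)]
songs_a = cluster_points(centres, per_cluster=30, seed=7)
songs_b = cluster_points(centres, per_cluster=30, seed=7)
print(len(songs_a), songs_a == songs_b)
```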
v0.16.16 March 2026 Content
- Domain 2, Topic 3 (Dimensionality), Chapter 3: Principal Component Analysis — full chapter built. New route: /topics/data-features/dimensionality/principal-component-analysis.
- Three animated FigureCanvas figures added: PCAVarianceDirectionFigure (2D scatter with PC1/PC2 arrows growing from centroid, demo point slides to PC1 projection), PCAScreePlotFigure (two-panel: scree bar chart + cumulative variance curve with 87%/k=3 annotation), PCAProjectionFigure (two-panel before/after: original sqft×rooms correlated cloud vs orthogonal PC1×PC2 space, 5 houses coloured by price class).
- Playground uses existing MatrixTransformPlayground in eigenvector mode. JSON includes full v3.3 fields: context, 4 interpretations (default empty-conditions, high/low correlation, one-component), scenarioLabel and tooltip on all parameters. Parameters: correlation (slider −0.95–0.95), showDataCloud (toggle), numComponents (select 1/2).
- Narrative: 4 concept sections following intuition→example→formula→ML-consequence order. Section 1: 5-house PC1/PC2 score table from Chapter 2 data, plain-English projection explanation. Section 2: variance-maximisation argument with speechiness/energy contrast. Section 3: 2×2 covariance matrix formula, eigenvector equation (no derivation). Section 4: 87% retention, interpretability trade-off, sklearn PCA code with fit-on-train-only rule.
- Checkpoint: 4 questions covering all learning objectives (pca-obj-1 through pca-obj-4). Ordering question correctOrder: [3, 1, 0, 2] (scrambled). True/false question on 45° eigenvector direction with full worked explanation.
- registry.ts updated to 67 entries with 3 new PCA figure imports.
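For a 2×2 covariance matrix the eigen-decomposition has a closed form, which is enough to see where a 45° principal direction comes from. The sketch assumes two standardised features (unit variance) with the sqft/rooms correlation r = 0.91 quoted elsewhere in this changelog; the chapter's own numbers, such as the 87% figure at k=3, come from a different, higher-dimensional example.

```python
import math


def pca_2x2(var_x, var_y, cov_xy):
    """Eigenvalues and unit eigenvectors of [[var_x, cov_xy], [cov_xy, var_y]]."""
    mean = (var_x + var_y) / 2
    spread = math.hypot((var_x - var_y) / 2, cov_xy)
    lam1, lam2 = mean + spread, mean - spread
    # Angle of the first principal axis: tan(2*theta) = 2*cov / (var_x - var_y).
    theta = 0.5 * math.atan2(2 * cov_xy, var_x - var_y)
    pc1 = (math.cos(theta), math.sin(theta))
    pc2 = (-math.sin(theta), math.cos(theta))
    return (lam1, lam2), (pc1, pc2), math.degrees(theta)


# Assumption: standardised features, so both variances are 1 and covariance = r.
r = 0.91
(lam1, lam2), (pc1, pc2), angle = pca_2x2(1.0, 1.0, r)
explained = lam1 / (lam1 + lam2)
print(round(angle, 1), round(lam1, 2), round(lam2, 2), f"{explained:.1%}")
```

With equal variances the first component always points along the 45° diagonal, and its eigenvalue is 1 + r, so PC1 alone carries (1 + r) / 2 of the total variance.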
v0.16.15 March 2026 Content
- Domain 2, Topic 3 (Dimensionality), Chapter 2: Correlation & Redundancy — full chapter built. New route: /topics/data-features/dimensionality/correlation-redundancy.
- Three animated FigureCanvas figures added: CorrelationEllipseFigure (4-panel scatter+ellipse for r=0/0.50/0.85/0.99), CorrelationMatrixFigure (10×10 heatmap with row-by-row animation and strong-cell pulse), RedundancyDiagramFigure (2-panel Venn diagram comparing r=0.91 vs r=0.08 feature pairs).
- ScatterBoundaryPlayground.tsx: added "correlation" mode dispatched via showCorrelationBadge in values. Renders housing dataset scatter coloured by above/below median price, nearest-centroid decision boundary, Pearson r badge (live-computed from data), axis labels per selected feature pair, and legend. New generator housingDataset.ts generates 200 houses with sqft/rooms correlation r≈0.91 via Cholesky decomposition.
- Playground JSON includes full v3.3 fields: context, 5 interpretations (including one default with empty conditions), scenarioLabel and tooltip on all parameters, noscriptFallback, a11yDescription. Parameters: featureX (select), featureY (select), showCorrelationBadge (toggle), numHouses (slider 50–400).
- Narrative covers: Pearson r formula with 5-row worked example, redundancy as r²=0.83 information overlap, multicollinearity with coefficient instability table, VIF formula, distance double-counting in KNN. Three ComparisonTable components (worked example, coefficient instability, feature pair summary). CodeBlock with pandas correlation matrix analysis wrapped in JSX template literal.
- Checkpoint: 4 questions covering all learning objectives (cr-obj-1 through cr-obj-4). Ordering question with correctOrder: [1, 3, 0, 2]. All wrong-answer explanations substantive.
- registry.ts updated to 64 entries with 3 new figure imports.
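The link between the r = 0.91 feature pair, the r² = 0.83 overlap, and the variance inflation factor can be made concrete in a few lines. The five-row table below is hypothetical and is not the chapter's worked example; the VIF helper uses the two-predictor special case, where the R² of one feature regressed on the other equals r².

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def vif_two_features(r):
    """VIF = 1 / (1 - R^2); with exactly two predictors, R^2 equals r^2."""
    return 1.0 / (1.0 - r ** 2)


# Hypothetical five-row housing sample (sqft, rooms).
sqft = [850, 1200, 1500, 2100, 2600]
rooms = [2, 3, 3, 5, 5]
print(round(pearson_r(sqft, rooms), 2))

# Figures quoted in the changelog: r = 0.91 gives r^2 of about 0.83.
print(round(0.91 ** 2, 2), f"{vif_two_features(0.91):.2f}")  # 0.83 5.82
```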
v0.16.14 March 2026 Content
- Domain 2, Topic 3 (Dimensionality), Chapter 1: The Curse of Dimensionality — full chapter built. New route: /topics/data-features/dimensionality/curse-of-dimensionality.
- Four animated FigureCanvas figures added: CurseVolumeCollapseFigure (unit hypersphere volume collapse d=1–20), CurseShellFractionFigure (outer-shell fraction rising to 99.5% at d=50), CurseDistanceRatioFigure (nearest/farthest neighbour convergence with KNN failure region), CurseSparsityGridFigure (10 points in 1D/2D/3D panels showing exponential sparsity).
- FunctionPlayground.tsx: added "distance-concentration" mode dispatched via initialState.playgroundMode field. Reads precomputed inline distance statistics for Euclidean/Manhattan metrics and 3 catalogue sizes; renders two-curve or single contrast-ratio view with reference threshold line.
- Playground JSON includes full v3.3 fields: context, 5 interpretations (including one default with empty conditions), scenarioLabel and tooltip on all parameters, noscriptFallback, a11yDescription.
- Narrative covers 4 geometric failures: volume collapse, shell concentration, distance concentration, and exponential sparsity. Two comparison tables: distance-vs-dimensions and algorithm sensitivity. Python code snippet computing shell_fraction() and relative_contrast_approx() — CodeBlock children wrapped in JSX template literal.
- Checkpoint: 4 questions covering all learning objectives (cod-obj-1 through cod-obj-4). Ordering question items in scrambled order; correctOrder: [1, 3, 2, 0].
- registry.ts updated to 61 entries with 4 new figure imports.
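Distance concentration, the third of the four geometric failures listed above, can be seen with a small simulation. This is an illustrative sketch using uniform random points, not the chapter's shell_fraction() or relative_contrast_approx() functions, whose bodies are not shown in this changelog.

```python
import math
import random


def nearest_farthest_ratio(d, n=200, seed=0):
    """Ratio of nearest to farthest Euclidean distance from a random query point."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(d)]
    dists = [
        math.dist(query, [rng.random() for _ in range(d)])
        for _ in range(n)
    ]
    return min(dists) / max(dists)


# As d grows the ratio climbs towards 1: the nearest neighbour is barely
# closer than the farthest, which is what undermines KNN.
for d in (2, 10, 50, 200):
    print(d, round(nearest_farthest_ratio(d), 3))
```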
v0.16.13 March 2026 Content
- Feature Engineering chapter: removed unnecessary named imports for KeyTerm, Callout, CodeBlock, RealWorldNote, CommonMistake from narrative — these components are globally available via the page components prop; only FigureCanvas requires a direct import as a React island.
- Log transformation explanation: replaced log₁₀(1.1) ≈ 0.041 technical detail with a plain-English doubling explanation — doubling the price always adds the same gap in log space whether it is £200k→£400k or £1M→£2M.
- Interaction terms section: added sentence explaining that tree-based models like gradient boosting discover feature interactions automatically through conditional splits; explicit interaction terms are most valuable for linear models that can only learn additive effects.
- Putting it together section: added featuretools forward-thread at end — explains that libraries like featuretools can generate hundreds of candidate interaction terms automatically; challenge shifts from creating features to selecting them (bridges to next chapter).
- Intuition section rewritten: concrete log₁₀ doubling question using £250k/£500k and £500k/£1M examples; asks what pattern the equal 0.30 gap reveals and how to invert a log₁₀(price) prediction back to pounds.
- Playground: interaction scatter mode option added to transformation select (bathroom/bedroom ratio scatter view); fe-interp-interaction interpretation added; rendering in FunctionPlayground.tsx is a follow-up task documented in PROJECT_STATUS.md Known Issues.
- validate-client-load.ts extended with Check 5: flags unnecessary import statements from @components/mdx/ in any MDX file. Caught 8 violations across feature-selection/01-narrative.mdx (7 imports) and feature-selection/04-reflection.mdx (1 import); all fixed.
- playgroundMode non-standard field documented in PROJECT_STATUS.md: FunctionPlayground.tsx reads this field from initialState to switch to histogram rendering mode; field is live and must be kept.
- MDX component resolution architecture documented in PROJECT_STATUS.md: clarifies that 10 standard components are globally available, FigureCanvas is the sole exception requiring a direct import.
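The log-space doubling pattern described in this release can be verified in a few lines. This is a minimal sketch using the price pairs quoted above; the helper name invert_log10_price is hypothetical.

```python
import math

# Prices in pounds, taken from the doubling examples in the entries above.
pairs = [(250_000, 500_000), (500_000, 1_000_000)]

gaps = [math.log10(high) - math.log10(low) for low, high in pairs]
print(gaps)  # both gaps equal log10(2), about 0.30

def invert_log10_price(log_price: float) -> float:
    """Map a log10(price) prediction back to pounds (hypothetical helper)."""
    return 10 ** log_price

print(invert_log10_price(math.log10(400_000)))
```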
v0.16.12 March 2026 Content
- Similarity & Distance chapter: Euclidean distance formula updated from sigma notation to expansion form with ellipsis (⋯), making the n-dimensional pattern explicit without requiring knowledge of summation notation.
- Added KNN sentence after the Euclidean distance introduction: explains that KNN is the core application of Euclidean distance, classifying a song by finding its k nearest songs and letting them vote on the label.
- Added approximate nearest neighbours paragraph at end of narrative: explains that at production scale (millions of songs, hundreds of features) exact search is too slow; FAISS and Annoy use indexing structures optimised for the chosen metric; metric choice constrains available approximation methods.
- Removed playground UI forward-reference from narrative end ("The playground makes this concrete: switch metrics and watch which songs appear as nearest neighbours").
- Intuition section rewritten as a standalone thought experiment: quiet vs loud acoustic jazz song framing, asks whether straight-line distance or angular similarity better captures similarity and why the two might disagree.
- Playground: show_disagreements boolean toggle added (default false, display group); interp-disagreements interpretation added explaining which songs change neighbourhood membership under metric switching; initialState updated.
- Playground show_disagreements rendering is not yet implemented in ScatterBoundaryPlayground.tsx — JSON and interpretation are present; visual ring overlay is a follow-up task documented in PROJECT_STATUS.md Known Issues.
- Checkpoint Q4 explanation extended: added note that articles vs songs ranking is close — articles rank first because document length variation is typically more extreme and systematic than recording volume variation; either could reasonably rank first.
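The KNN vote mentioned in this release can be sketched as follows. The song vectors and labels are made up for illustration and do not come from the chapter's dataset.

```python
import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(query, songs, k=3):
    """Label the query by majority vote among its k nearest songs."""
    nearest = sorted(songs, key=lambda song: euclidean(query, song[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical (energy, acousticness) vectors; values are made up for illustration.
songs = [
    ((0.20, 0.90), "jazz"),
    ((0.25, 0.85), "jazz"),
    ((0.30, 0.80), "jazz"),
    ((0.80, 0.10), "pop"),
    ((0.85, 0.15), "pop"),
    ((0.90, 0.20), "pop"),
]

print(knn_predict((0.28, 0.82), songs))
```

This is exact search: every stored song is compared with the query, which is the cost that approximate nearest-neighbour libraries try to avoid at scale.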
v0.16.11 March 2026 Content
- Normalisation-Scaling chapter: replaced legacy static SVG figure (feature-scale-mismatch.svg) with an animated horizontal bar chart where six audio features grow sequentially, making Duration's 510-unit range dwarf all others — annotation reads "Duration contributes 260,000× more to squared distance than energy" (feature-range-comparison).
- Added two-panel animated scatter figure showing the same 120 songs in raw tempo×energy space (left) vs z-score scaled space (right); genre clusters are invisible in the raw panel and distinct in the scaled panel (raw-vs-scaled-scatter).
- Added scaler-leakage pipeline figure: two pipeline diagrams stacked — WRONG (fit scaler on full data before split, red X marker) vs CORRECT (split first, fit scaler on training set only, green check mark) — emphasising data leakage as a practical pitfall (scaler-leakage-pipeline).
- Added log transformation sentence after the min-max outlier warning: explains that log or Box-Cox transforms compress extreme values before the scaler sees them.
- Added batch normalisation sentence at the end of the narrative: positions BN as a neural network layer rather than a preprocessing step, bridging to deep learning content.
- Intuition section rewritten as two concrete questions about tempo/energy dominance in KNN distance — no playground UI references.
- Playground: added interp-default interpretation (conditions: []) as required first entry; added show_nearest_neighbours boolean toggle; added interp-nn-scaling interpretation triggered when show_nearest_neighbours is true, explaining how scaling changes which songs are considered nearest neighbours.
- Checkpoint Q4: scrambled item order so the correct step sequence is no longer listed top-to-bottom; updated correctOrder to [2, 1, 3, 0]; updated explanation to reinforce why fitting on test data inflates test performance.
- validate-client-load.ts extended with Check 4: scans all playground JSONs and reports any file missing a default interpretation (empty "conditions" array). Detected and fixed two pre-existing violations — feature-engineering and similarity-distance playgrounds now each have an interp-default entry.
- registry.ts updated to 57 entries with static imports for all three new figures.
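The WRONG versus CORRECT ordering shown in the scaler-leakage figure can be sketched with the standard library alone. The tempo values and the z_score helper are hypothetical; a real pipeline would typically use a library scaler.

```python
import statistics

# Hypothetical tempo values (BPM); the last one is an extreme test-set song.
tempos = [90.0, 100.0, 110.0, 120.0, 130.0, 200.0]

# CORRECT order: split first, then fit the scaler on the training part only.
train, test = tempos[:4], tempos[4:]
train_mean = statistics.mean(train)
train_std = statistics.pstdev(train)

def z_score(values, mean, std):
    return [(v - mean) / std for v in values]

train_scaled = z_score(train, train_mean, train_std)
test_scaled = z_score(test, train_mean, train_std)  # reuses training statistics

# WRONG order for comparison: statistics computed on all rows, test included.
leaky_mean = statistics.mean(tempos)
print(train_mean, leaky_mean)
```

The two means differ because the leaky version has already seen the test rows.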
v0.16.10 March 2026 Content
- Categorical Encoding chapter: replaced both legacy static SVG figures with animated React figures — a 3-phase number line where genre dots appear, Jazz↔Pop and Jazz↔Rock distance brackets materialise, then dots permute to a new arbitrary assignment (integer-encoding-number-line); and a 3-phase expanding table where the genre column appears first, binary headers and zeros fill in, then each row activates with a genre-coloured "1" highlight (one-hot-encoding-table).
- Added pd.get_dummies Python code snippet after the one-hot KeyTerm showing pandas one-hot encoding of four genres with column names and expected output.
- Added concrete target encoding example in the cardinality section: skip-rate means replace genre labels (jazz=0.23, pop=0.41, classical=0.18, rock=0.37), accompanied by a note on the leakage risk and the cross-validation mitigation.
- Removed standalone self-closing <KeyTerm id="integer-encoding" /> from narrative body — the term is already defined inline at first mention.
- Intuition section rewritten as a standalone thought experiment about the false distances implied by integer encoding — no playground description.
- Playground: added target encoding option to encodingMethod parameter; added ce-interp-target interpretation explaining skip-rate means and task-dependence; added proxy-note sentence to the one-hot energy interpretation body; updated encodingMethod tooltip to cover target encoding.
- registry.ts updated to 54 entries with static imports for both new figures.
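The false distances implied by integer encoding, and how one-hot encoding removes them, can be illustrated without pandas. This sketch is a plain-Python stand-in rather than the chapter's pd.get_dummies snippet, and the integer assignment is arbitrary.

```python
import math

genres = ["jazz", "pop", "rock", "classical"]

# Integer encoding: an arbitrary assignment that implies an ordering.
integer_code = {genre: index for index, genre in enumerate(genres)}

# One-hot encoding: one binary column per genre (plain-Python stand-in for pd.get_dummies).
one_hot = {
    genre: [1 if other == genre else 0 for other in genres] for genre in genres
}

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Integer codes claim jazz is closer to pop than to rock.
print(abs(integer_code["jazz"] - integer_code["pop"]))   # 1
print(abs(integer_code["jazz"] - integer_code["rock"]))  # 2

# One-hot vectors place every pair of distinct genres the same distance apart.
print(euclidean(one_hot["jazz"], one_hot["pop"]))
print(euclidean(one_hot["jazz"], one_hot["rock"]))
```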
v0.16.9 March 2026 Content
- Feature Vectors chapter: replaced both legacy static SVG figures with three animated React figures: a 4-phase table revealing three songs as feature vectors with Jazz+Classical highlighted as similar and Pop highlighted as different (feature-vector-table), a genre-by-genre scatter reveal on energy×acousticness axes (feature-space-genre-clusters), and a two-panel contrast showing the same 120 songs plotted by energy×acousticness vs loudness×tempo (feature-choice-contrast).
- Added feature scaling preview paragraph after the jazz/pop vector comparison: explains how the 56-unit tempo gap dominates Euclidean distance and why normalisation is required before computing distances.
- Added Python code snippet computing np.linalg.norm distance between jazz and pop vectors — distance ~57.5 dominated by the tempo component.
- Added learned embeddings sentence after the Spotify RealWorldNote: explains that deep learning systems learn dense representations directly from raw audio, extending the engineered-features narrative.
- Added FigureCanvas (feature-choice-contrast) to the "Why Feature Choice Matters" section: illustrates concretely how feature selection determines cluster separability.
- Removed the "Bridge to the Playground" section from the narrative; the chapter ending now reads cleanly without playground description.
- Intuition section fully rewritten as a standalone thought experiment: asks which two features would best separate rock from classical, then whether the same pair works for jazz vs pop.
- Reflection section: added scaling note after the distance paragraph clarifying that raw Euclidean distance is dominated by the widest numerical range regardless of information content.
- ScatterBoundaryPlayground: extended music-mode hover tooltip to show all six feature values with units (energy, acousticness, tempo BPM, danceability, loudness dB, duration s) using a pre-formatted multi-line string; tooltip div now uses whitespace-pre rendering.
- validate-client-load.ts extended with two new checks: (1) Bridge to Playground heading detection in any MDX file, (2) playground UI forward-reference detection in 01-narrative.mdx files. Detected and removed a Bridge section from the categorical-encoding chapter.
- Categorical Encoding chapter: Bridge to the Playground section removed from narrative.
v0.16.8 March 2026 Content
- Data Quality & Outliers chapter: replaced both legacy static SVG figures with animated React figures — a 5-phase scatter showing all four data quality problem types (data-quality-types) and a two-panel figure where five outliers are added sequentially and the regression line corrupts with colour interpolating from accent to red (regression-outlier-corruption).
- Added Python code snippet showing a minimal data quality audit: null checks, impossible value checks, exact duplicate detection, and describe() for distribution inspection.
- Added production leakage guard paragraph: explains the strict temporal split pattern and the requirement to fit all transformations on training data only.
- Removed the "Bridge to the Playground" section from the narrative — the chapter ending now reads cleanly without playground description.
- Intuition section: replaced "The playground starts with a clean patient dataset..." with a playground-neutral framing using "Imagine a clean hospital dataset...".
- Added show_trend toggle to the playground: overlays a least-squares regression line with 95% confidence band; two new interpretations cover the clean-data case and the corrupted-slope case when outlierCount > 5.
- Added show_imputation toggle (visible only when missingFraction > 0): fills missing blood pressure values with the column mean and overlays a dashed horizontal line; new interpretation explains the horizontal cluster artifact and mean-imputation bias.
- Q4 checkpoint explanation extended: clarifies the distinction between exact duplicates (trivial hash detection) and near-duplicates requiring entity resolution with string distance metrics.
- Understanding Data topic marked COMPLETE — all five chapters revised.
v0.16.7 March 2026 Content
- Probability Distributions chapter: fixed last figure appearing twice — text that belonged in a paragraph following the normal-vs-exponential FigureCanvas was incorrectly placed on the same line as the closing />, causing MDX to render the component inline and produce a second blank container. Moved text to its own paragraph.
- Descriptive Statistics chapter: removed the standard deviation sigma-notation MathBlock; replaced with a plain-English description noting that standard deviation is expressed in the same units as the original data.
- Added a Python code snippet showing mean vs median on a 10-salary dataset with one outlier — one outlier shifts the mean 57% above the median.
- Added ML hooks to both the Measures of Centre and Measures of Spread sections: batch normalisation and MSE loss use the mean; feature standardisation divides by standard deviation and breaks when SD is computed on skewed data.
- Added an IQR forward-thread: the 1.5 × IQR outlier rule is the most common automated outlier detection method in data preprocessing, developed fully in the next chapter.
- Replaced both legacy static SVG figures with animated React figures: a number line showing mean jumping when an outlier appears (mean-median-divergence), and a step-by-step box plot construction from scatter to five-number summary (box-plot-construction).
- Added "What This Means for Machine Learning" section to the narrative covering right-skewed features, distance-based algorithm failure, log transformation, and feature standardisation.
- Removed playground-priming sentence from the narrative.
- Intuition section fully rewritten as a standalone thought experiment about a handful of tech executives moving to a city — no playground description.
- Fixed all five empty leadsTo strings in playground interpretations; removed the leadsTo key from interp-default, which needs no forward link.
- showStatisticsOverlay confirmed as a live component field — ProbabilitySamplerPlayground reads it to gate the statistics overlay render.
- Added show_iqr toggle to the descriptive-statistics playground: draws Q1 and Q3 dashed lines with a shaded IQR box and labels, with a new interpretation explaining IQR vs SD on skewed data.
- Reflection ML paragraph trimmed: the full skewed-feature explanation moved to the narrative; reflection now points forward to Normalisation and Scaling.
- Q4 checkpoint explanation extended: clarifies why standard deviation is actually more outlier-sensitive than the mean in absolute terms, while the mean is considered least robust in reporting contexts.
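The mean-versus-median gap and the 1.5 × IQR rule from this release can be sketched as below. The salary values are hypothetical and are not the chapter's 10-salary dataset, so the percentages will differ from the one quoted above.

```python
import statistics

# Hypothetical salaries in thousands of pounds, with one extreme value.
salaries = [30, 32, 35, 36, 38, 40, 41, 43, 45, 250]

mean = statistics.mean(salaries)
median = statistics.median(salaries)

# Quartiles from the standard library; other tools may interpolate differently.
q1, _, q3 = statistics.quantiles(salaries, n=4)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [s for s in salaries if s < lower_fence or s > upper_fence]

print(mean, median, outliers)
```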
v0.16.6 March 2026 Content
- Probability Distributions chapter: replaced both legacy static figures with two new animated figures — falling dots building an exponential histogram with dashed PDF overlay, and an animated bell curve that morphs through σ = 2, 5, and 9 showing area conservation.
- Added a third animated figure contrasting temperature (symmetric bell) and rainfall (exponential decay) side by side, with a red-highlighted region marking where a normal model applied to rainfall would assign probability to impossible negative values.
- Removed the normal distribution PDF formula; replaced with a plain-English explanation of how σ controls bell shape and why area always equals 1.
- Added ML hook to the normal distribution section: residuals, calibrated confidence intervals, and why regression analysts check residual shape against the normal reference.
- Added ML hook to the exponential section: survival analysis, time-to-event modelling, and the exponential family as the foundation for churn, failure, and fraud-timing models.
- Fixed two broken forward links in playground interpretations: sunshine hours now correctly describes uninformative priors; small-sample now points to Descriptive Statistics and Data Quality chapters.
- Corrected bimodal parameters in the ComparisonTable from μ₁, μ₂, σ to μ₁, σ₁, μ₂, σ₂, w with a clarifying note about mixture components.
- Added a Python code example showing what goes wrong when a normal distribution is fitted to exponential data — probability of negative rainfall under the fitted model.
- Added "Show mismatched model" toggle to the playground: overlays a normal curve regardless of fit, illustrating why distribution selection matters.
- Removed playground-priming sentence from the narrative; closing thought experiment now stands on its own.
- Intuition section fully rewritten as a standalone thought experiment about rainfall vs temperature shape — no playground description.
- Q3 checkpoint explanation softened: "roughly a quarter" replaces the specific 23% figure.
- Q4 checkpoint explanation extended: clarifies why Uniform ranks above Normal despite both being perfectly symmetric.
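The mismatch between a normal model and exponential data can be quantified directly. This sketch is not the chapter's own code example: it assumes a moment-matched normal fit and a hypothetical rate, and relies on the fact that an exponential distribution's mean and standard deviation are both 1/rate.

```python
import math

def normal_cdf(x: float, mean: float, std: float) -> float:
    """Cumulative probability of a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

# An exponential distribution with rate r has mean 1/r and standard deviation 1/r.
rate = 0.5  # hypothetical rate, chosen only for illustration
fitted_mean = 1.0 / rate
fitted_std = 1.0 / rate

# A normal model matched to those moments puts this much mass below zero.
p_negative = normal_cdf(0.0, fitted_mean, fitted_std)
print(p_negative)  # about 0.159, whatever the rate
```

Because mean and standard deviation are equal, the result is always the normal probability of falling more than one standard deviation below the mean.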
v0.16.5 March 2026 Platform
- Fixed the Show Data Table toggle in the Feature Space Explorer playground — toggling it on now reveals a side-by-side data table showing the first ten rows of the generated dataset, with bidirectional hover: mousing over a table row highlights the corresponding scatter point, and mousing over a scatter point highlights the corresponding row.
v0.16.4 March 2026 Content
- Types of Data chapter: replaced both legacy static figures with two new animated figures — a continuous number line with falling dots and a traced KDE curve, and a two-panel categorical-vs-continuous comparison showing a bell curve against four isolated bars.
- Added a third animated figure showing integer encoding creating false distances versus correct one-hot encoding, placed in the "Why this matters for ML" section.
- Added a Python dtype CodeBlock after the hospital intake thesis sentence, with a note clarifying that storage type and statistical type are different concepts.
- Added "## Count data" heading to give count variables their own section, consistent with the other data types.
- Added an embeddings forward-thread after the one-hot CommonMistake, pointing readers toward compact dense embeddings for high-cardinality categorical features.
- Removed playground-priming sentence from the final narrative section.
- Rewrote the intuition section as a standalone thought experiment that no longer references the playground.
- Fixed four broken forward links in playground interpretations: blood-type, admissions, and small-sample all now point to correct chapters in the actual syllabus.
- Fixed reflection forward link: "Features and Representations domain" corrected to "Features & Representations topic".
- Simplified checkpoint Q4 from four items to three (removing Binary), resolving an ordering ambiguity; updated correctOrder and explanation.
v0.16.3 March 2026 Content
- What Is a Dataset? chapter: replaced static figures with two new animated figures — a three-phase table walkthrough (instances → features → label) and a split-screen table-to-scatter animation showing the correspondence between tabular and geometric representations.
- Added a high-dimensional bridge paragraph explaining that the feature-space geometry extends identically to hundreds or thousands of dimensions, with a language model embedding as a concrete example.
- Revised key point 3 in the chapter reflection to describe the general principle of measurement precision rather than referencing specific noise thresholds.
- Added a fourth checkpoint question testing the feature-space objective: identifying a 2D scatter plot as a feature space.
- Added a Show Data Table toggle to the Feature Space Explorer playground to let readers compare the table and scatter representations side by side.
v0.16.2 March 2026 Platform
- Added internal checks that automatically catch figure rendering issues before they reach the site.
- Fixed syntax highlighting on code examples in the Eigenvectors and Normalisation & Scaling chapters.
v0.16.1 March 2026 Platform
- Fixed a bug in the Eigenvectors chapter where three animated figures were rendering as blank white boxes.
v0.16.0 March 2026 Content
- Eigenvectors chapter: worked through the eigenvector equation with real numbers — applied a specific 2×2 matrix to three test directions and showed which ones come out pointing the same way they went in.
- Three new animated figures: a side-by-side comparison of eigenvector and non-eigenvector directions, a five-panel gallery showing what five different eigenvalues do (stretch, shrink, leave unchanged, flip, collapse to zero), and a data cloud with principal component arrows.
- The playground now uses data and transformation language throughout — you are exploring how a matrix stretches or rotates data, not material stress.
- Eigenvalue readout added to the playground canvas: λ₁ and λ₂ update live as you adjust the matrix, colour-coded by magnitude.
- Complex rotation eigenvalues explained in plain English rather than with exponential notation.
- New checkpoint question testing whether you can match the eigenvalue symbol to its geometric meaning.
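The "same direction in, same direction out" test for eigenvectors can be sketched with integer arithmetic. The matrix and the three test directions below are hypothetical and are not the ones used in the chapter.

```python
def apply(matrix, vector):
    """Multiply a 2x2 matrix by a 2D vector."""
    (a, b), (c, d) = matrix
    x, y = vector
    return (a * x + b * y, c * x + d * y)

def is_eigen_direction(matrix, vector):
    """True when the output is parallel to the input (2D cross product is zero)."""
    out_x, out_y = apply(matrix, vector)
    x, y = vector
    return x * out_y - y * out_x == 0

# Hypothetical matrix and test directions, not the ones used in the chapter.
matrix = ((2, 1), (1, 2))
for direction in [(1, 1), (1, -1), (1, 0)]:
    print(direction, apply(matrix, direction), is_eigen_direction(matrix, direction))
```

Here (1, 1) is stretched by 3, (1, -1) is left unchanged, and (1, 0) is knocked off its line, so only the first two are eigenvector directions.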
v0.15.0 March 2026 Content
- Matrix Multiplication chapter: three new animated figures showing the two-step geometry of AB, why AB and BA produce different results, and why stacking two linear layers is mathematically the same as one.
- The playground now displays both AB and BA simultaneously — you can see the non-commutativity side by side rather than toggling between them.
- Intermediate step toggle added: highlights what the first matrix alone produces, as a semi-transparent overlay, before the second matrix acts on it.
- Connections to transformer attention and residual networks added to the narrative.
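Both claims in this release, that AB and BA differ and that two stacked linear layers collapse into one, can be checked with a small sketch. The matrices A and B and the input x are hypothetical choices.

```python
def matmul(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [
        [sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]

def apply(m, v):
    """Multiply a 2x2 matrix by a 2D vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

# Hypothetical matrices: A is a shear, B is a 90-degree rotation.
A = [[1, 1], [0, 1]]
B = [[0, -1], [1, 0]]

print(matmul(A, B))  # [[1, -1], [1, 0]]
print(matmul(B, A))  # [[0, -1], [1, 1]]

# Two stacked linear layers (A first, then B) equal the single layer BA.
x = [2, 3]
print(apply(B, apply(A, x)), apply(matmul(B, A), x))
```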
v0.14.0 March 2026 Content
- What Does a Matrix Do? chapter: opened with a concrete multiplication example — every step from input vector to output vector shown with actual numbers.
- Three new animated figures: a row-by-column multiplication walkthrough, a column-interpretation animation showing what each matrix column geometrically represents, and a gallery of four fundamental transformations (rotation, scale, shear, reflection).
- Determinant section added: what it means when the determinant is large, small, or zero — and what it means for the transformation to be singular.
- Test-vector feature added to the playground: enter any vector and see exactly where the matrix sends it.
v0.13.0 March 2026 Content
- Vector Spaces and Basis chapter: added a worked numeric linear combination example (3 times one vector plus 2 times another) and a connection to how dense layers in neural networks operate.
- Three new animated figures: linear combination dials showing how scalar coefficients move the result, a basis grid collapse showing what linear dependence looks like geometrically, and a basis change diagram showing the same point described in two different coordinate systems.
- Target point feature added to the playground: click any point on the canvas and watch the basis vectors decompose it into coordinates.
- Independence score added to the canvas: shows how close your current basis is to becoming dependent, with colour coding.
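A linear combination and its inverse, decomposing a target point into basis coordinates, can be sketched as follows. The basis vectors are hypothetical, and Cramer's rule is one assumed way to do the decomposition, not necessarily what the playground uses.

```python
def combine(c1, b1, c2, b2):
    """Linear combination c1*b1 + c2*b2 of two 2D vectors."""
    return (c1 * b1[0] + c2 * b2[0], c1 * b1[1] + c2 * b2[1])

def decompose(point, b1, b2):
    """Coordinates of point in the basis (b1, b2), using Cramer's rule."""
    det = b1[0] * b2[1] - b2[0] * b1[1]
    if det == 0:
        raise ValueError("basis vectors are linearly dependent")
    c1 = (point[0] * b2[1] - b2[0] * point[1]) / det
    c2 = (b1[0] * point[1] - point[0] * b1[1]) / det
    return (c1, c2)

# Hypothetical basis vectors; the coefficients 3 and 2 follow the entry above.
b1, b2 = (2, 1), (-1, 1)
point = combine(3, b1, 2, b2)
print(point)                     # (4, 5)
print(decompose(point, b1, b2))  # (3.0, 2.0)
```

The determinant in decompose is zero exactly when the two basis vectors are linearly dependent, which is the case the basis grid collapse figure illustrates.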
v0.12.0 March 2026 Content
- Vector Operations chapter: restructured the dot product section to lead with the sign rule before the formula — knowing whether the result is positive, zero, or negative tells you the rough geometry before any arithmetic.
- Connections to residual networks (addition section) and gradient descent (scalar multiplication section) added to the narrative.
- Four new animated figures: vector addition, scalar multiplication across four scales including negative, a rotating dot product sign demonstration, and cosine similarity staying constant as one vector grows.
- Scalar multiplication mode added to the playground.
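The dot product sign rule and the scale-invariance of cosine similarity can be illustrated with a short sketch. The vectors are hypothetical examples.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

reference = (1.0, 0.0)
# Sign rule: positive for an acute angle, zero at a right angle, negative when obtuse.
print(dot(reference, (1.0, 1.0)), dot(reference, (0.0, 1.0)), dot(reference, (-1.0, 1.0)))

# Scaling one vector changes the dot product but not the cosine similarity.
other = (3.0, 4.0)
for scale in (1, 2, 10):
    grown = tuple(scale * component for component in other)
    print(dot(reference, grown), cosine_similarity(reference, grown))
```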
v0.11.0 March 2026 Content
- What Is a Vector? chapter: opened with a neural network framing before the geometric intuition — a layer output as a point in high-dimensional space — to establish immediately why vectors matter for ML.
- Four new animated figures: vector as arrow with components, right-triangle magnitude construction, a three-panel illustration from thermometer to wind arrow to a 768-dimensional word embedding, and unit normalisation showing the tip moving inward.
- Python code example added for magnitude and normalisation in NumPy.
- Comparison feature added to the playground: show a second vector alongside the first and read off the angle between them.
v0.10.0 March 2026 Content
- The Gradient chapter: rewritten throughout with ML variable names — loss function L, weights w₁ and w₂ — so the connection to training is immediate rather than retrofitted at the end.
- Walk animation added to the playground: press play and watch 30 gradient descent steps animate on the terrain. On a bowl surface the path spirals inward smoothly. On a ravine it zigzags. On a saddle it stalls near the middle.
- Partial arrows added: toggle them on to see the two partial derivatives as horizontal and vertical components that combine into the gradient vector.
- Three new animated figures covering partial derivative slices, a contour map with gradient arrows, and a single gradient descent step on a contour map.
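A 30-step gradient descent walk on a bowl-shaped loss can be sketched as below. The loss function, starting weights, and learning rate are assumptions for illustration; the playground's actual surfaces are not reproduced here.

```python
def loss(w1, w2):
    """Bowl-shaped loss L(w1, w2) = w1^2 + w2^2 (assumed surface)."""
    return w1 ** 2 + w2 ** 2

def gradient(w1, w2):
    """Vector of partial derivatives (dL/dw1, dL/dw2)."""
    return (2 * w1, 2 * w2)

w1, w2 = 3.0, 4.0       # hypothetical starting weights
learning_rate = 0.1     # hypothetical step size
history = [loss(w1, w2)]

for _ in range(30):
    g1, g2 = gradient(w1, w2)
    w1 -= learning_rate * g1
    w2 -= learning_rate * g2
    history.append(loss(w1, w2))

print(history[0], history[-1])
```

Each step moves against the gradient, so on this surface the loss shrinks at every step.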
v0.9.0 March 2026 Content
- Chain Rule chapter: the worked example is now a real neuron — σ(wx) — rather than an abstract function. The full backward pass is computed with specific numbers, step by step.
- Composition depth slider added to the playground: drag from one layer to ten and watch how the gradient shrinks layer by layer. The vanishing gradient problem, made visible.
- Three new animated figures: nudge propagation through a two-layer chain, a forward and backward pass flowing through a network diagram, and a side-by-side comparison of sigmoid and ReLU gradient magnitudes across ten layers.
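The backward pass through a single neuron σ(wx) and the shrinking gradient across stacked sigmoids can be sketched as follows. The weight and input values are hypothetical, and the ten-layer bound ignores the weights between layers.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neuron_grad_w(w, x):
    """d/dw of sigmoid(w * x) by the chain rule: sigmoid'(z) * x with z = w * x."""
    s = sigmoid(w * x)
    return s * (1.0 - s) * x

w, x = 0.5, 2.0  # hypothetical weight and input
analytic = neuron_grad_w(w, x)
h = 1e-6
numeric = (sigmoid((w + h) * x) - sigmoid((w - h) * x)) / (2 * h)
print(analytic, numeric)

# sigmoid'(z) never exceeds 0.25, so a chain of ten sigmoids scales the
# gradient by at most 0.25 ** 10 (ignoring weights).
print(0.25 ** 10)
```

The central-difference estimate is included only as a sanity check on the chain-rule result.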
v0.8.0 March 2026 Content
- The Derivative chapter: opens with a training loss scenario — what the slope of a loss curve tells you at a given epoch — before introducing any formal definition.
- Derivatives table expanded with a column showing where each rule appears in ML.
- Three new animated figures: a secant line converging to the tangent on y = x², a minimum versus saddle point comparison, and a training loss curve with annotated tangent slopes at three points.
- Gradient descent step overlay added to the playground: shows exactly where the next step lands for whatever function and starting point you choose.
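The secant-to-tangent convergence on y = x² can be reproduced numerically. This is a minimal sketch; the evaluation point x = 1 and the step sizes are arbitrary choices.

```python
def f(x):
    return x ** 2

def secant_slope(x, h):
    """Slope of the line through (x, f(x)) and (x + h, f(x + h))."""
    return (f(x + h) - f(x)) / h

# At x = 1 the secant slope equals 2 + h, so it approaches the tangent slope 2.
for h in (1.0, 0.1, 0.01, 0.001):
    print(h, secant_slope(1.0, h))
```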
v0.7.0 March 2026 Content
- Feature Selection chapter published, completing the Features & Representations topic.
- Covers the three families of selection methods: filter methods (Pearson correlation, mutual information), wrapper methods (forward and backward search), and embedded methods (Lasso, decision tree importance).
- The playground uses a hospital readmission dataset with six features ranging from highly predictive to pure noise. Select any feature pair and compare importance scores across three scoring methods side by side.
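A filter method based on Pearson correlation can be sketched in plain Python. The toy data and the feature names are hypothetical and are not the playground's hospital readmission dataset.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy data: one informative feature, one unrelated feature.
target = [0, 0, 1, 1, 1, 0]
features = {
    "prior_admissions": [0, 1, 3, 4, 5, 1],
    "noise": [2, 5, 3, 2, 5, 3],
}

# Filter method: rank features by absolute correlation with the target.
ranked = sorted(features, key=lambda name: abs(pearson(features[name], target)), reverse=True)
print(ranked)
```

Filter methods score each feature independently of any model, which is what distinguishes them from the wrapper and embedded families.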
v0.6.0 March 2026 Platform
- Rebuilt the animation system from scratch. Pages are now roughly 1 MB lighter and animations run more smoothly.
- All animated figures respond to theme changes without a page reload.
v0.5.0 March 2026 Content + Platform
- Four chapters published in the Features & Representations topic.
- Feature Vectors: how a set of measurements becomes a point in a geometric space, and why the geometry matters for finding similar items.
- Categorical Encoding: what goes wrong when you represent categories as integers, and how one-hot encoding avoids false distance relationships.
- Normalisation & Scaling: why raw feature values mislead distance-based algorithms, and how z-score and min-max scaling put features on equal footing.
- Similarity & Distance: Euclidean, Manhattan, and cosine distance compared — with animated figures showing when each one is the right choice.
v0.4.0 March 2026 Content
- Understanding Data topic complete — five chapters: What Is a Dataset?, Types of Data, Probability Distributions, Descriptive Statistics, and Data Quality & Outliers.
- New probability distribution playground: watch samples fall one by one, build a histogram live, and see the theoretical curve emerge as the count grows.
- Each chapter is grounded in a consistent scenario, from loan applicant screening to hospital patient records.
v0.3.0 March 2026 Design
- All playgrounds now open with a plain-English description of the scenario you are exploring and what you should notice.
- As you move controls, a panel below updates to describe what your current settings mean — no formulas, just the situation.
- Every control now has a tooltip explaining what it does and why it matters.
v0.2.0 March 2026 Content
- Data & Features domain started.
- First chapter published: What Is a Dataset? — exploring a loan applicant scenario to build intuition for feature spaces and decision boundaries.
v0.1.0 March 2026 Platform
- SeeingML launched.
- Domain 1 — Mathematical Foundations — is complete: nine chapters across Vectors (three chapters), Matrices (three chapters), and Derivatives (three chapters).
- Three interactive playgrounds: a function explorer, a gradient descent terrain, and a matrix transformation canvas.
- Two themes: warm dark (default) and warm light, with instant switching.