berangerthomas/SchoolOfStatistics

SOS - School of Statistics

License: MIT

Interactive visualizations for exploring statistical and machine learning concepts. Each page runs entirely in the browser (TypeScript compiled with Vite, Chart.js for rendering), so the built version can be served as static files with no backend server and no further build step.

Available Pages

Classification

  • Direct Classification: Generate synthetic 2D datasets and observe how class separation affects Gaussian Naive Bayes classifier performance. Displays ROC curve, AUC, confusion matrix, and standard metrics (accuracy, precision, recall, specificity, F1-score).

  • Inverse Classification: Directly set confusion matrix values (TP, FP, TN, FN) and observe resulting metrics, ROC curve, and simulated score distributions. Parameters can be locked to constrain totals.

  • Logistic Regression: Logistic decision boundary and gradient-based training. Adjust learning rate, iterations, and regularization (L1/L2).

  • k-Nearest Neighbors: Interactive k-NN playground with decision boundary visualization. Adjust k and the distance metric.

  • Support Vector Machine: Interactive SVM visualization with decision boundary and margin display. Adjust the regularization parameter C to see its effect on margin width and support vectors. Support vectors are highlighted in yellow, which makes the trade-off between margin maximization and classification errors visible.
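
The metrics listed for the Direct and Inverse Classification pages all derive from the four confusion matrix counts. The sketch below shows the standard formulas; the type and function names are hypothetical and not taken from the repository, and returning 0 for an empty denominator is an assumed convention.

```typescript
// Hypothetical types for illustration; the repository may model these differently.
interface ConfusionMatrix {
  tp: number; // true positives
  fp: number; // false positives
  tn: number; // true negatives
  fn: number; // false negatives
}

interface ClassificationMetrics {
  accuracy: number;
  precision: number;
  recall: number;
  specificity: number;
  f1: number;
}

// Assumed convention: a ratio with a zero denominator is reported as 0.
function safeDivide(numerator: number, denominator: number): number {
  return denominator === 0 ? 0 : numerator / denominator;
}

function computeMetrics({ tp, fp, tn, fn }: ConfusionMatrix): ClassificationMetrics {
  const precision = safeDivide(tp, tp + fp);
  const recall = safeDivide(tp, tp + fn);
  return {
    accuracy: safeDivide(tp + tn, tp + fp + tn + fn),
    precision,
    recall,
    specificity: safeDivide(tn, tn + fp),
    // F1 is the harmonic mean of precision and recall.
    f1: safeDivide(2 * precision * recall, precision + recall),
  };
}

console.log(computeMetrics({ tp: 40, fp: 10, tn: 45, fn: 5 }));
```

Changing a single count, as the Inverse Classification page allows, moves several metrics at once, which is why precision and recall can diverge while accuracy barely changes.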

Regression

  • Linear Regression: Interactive point placement on canvas with linear or polynomial regression fitting. Displays residuals, coefficient of determination (R²), and regression diagnostics. Supports zoom, point dragging, and confidence band display.

  • Regularization Playground: Ridge, Lasso, and Elastic-Net regularization visualization. Explore coefficient paths and shrinkage effects.
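
As a rough illustration of what the Linear Regression page reports, the sketch below fits a straight line by ordinary least squares and computes R² from the residuals. The names `Point` and `fitLine` are hypothetical, and the code assumes at least two distinct x values and a non-constant y; it does not cover the polynomial or regularized fits.

```typescript
// Hypothetical helper for illustration; not the repository's implementation.
interface Point {
  x: number;
  y: number;
}

interface LineFit {
  slope: number;
  intercept: number;
  r2: number;
}

function fitLine(points: Point[]): LineFit {
  const n = points.length;
  const meanX = points.reduce((sum, p) => sum + p.x, 0) / n;
  const meanY = points.reduce((sum, p) => sum + p.y, 0) / n;

  // Closed-form least squares: slope = cov(x, y) / var(x).
  let sxy = 0;
  let sxx = 0;
  for (const p of points) {
    sxy += (p.x - meanX) * (p.y - meanY);
    sxx += (p.x - meanX) * (p.x - meanX);
  }
  const slope = sxy / sxx;
  const intercept = meanY - slope * meanX;

  // R² = 1 - (residual sum of squares) / (total sum of squares).
  let ssRes = 0;
  let ssTot = 0;
  for (const p of points) {
    const residual = p.y - (slope * p.x + intercept);
    ssRes += residual * residual;
    ssTot += (p.y - meanY) * (p.y - meanY);
  }
  return { slope, intercept, r2: 1 - ssRes / ssTot };
}

console.log(fitLine([{ x: 0, y: 0 }, { x: 1, y: 1 }, { x: 2, y: 1 }]));
```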

Optimization

  • Gradient Descent: Explore optimization algorithms (SGD, Momentum, Adam, RMSprop) navigating loss surfaces in real time.
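
To make the update rule concrete, here is a minimal sketch of gradient descent with optional momentum. The loss surface f(x, y) = x² + 10y², the function names, and the hyperparameter values are all assumptions chosen for illustration; Adam and RMSprop are not shown.

```typescript
type Vec2 = [number, number];

// Assumed example loss: f(x, y) = x^2 + 10 * y^2 (an elongated bowl).
function loss([x, y]: Vec2): number {
  return x * x + 10 * y * y;
}

function gradient([x, y]: Vec2): Vec2 {
  return [2 * x, 20 * y];
}

// Returns every visited position so the path can be drawn on the surface.
// With momentum = 0 this reduces to plain gradient descent.
function descend(start: Vec2, learningRate: number, momentum: number, steps: number): Vec2[] {
  const path: Vec2[] = [start];
  let position: Vec2 = start;
  let velocity: Vec2 = [0, 0];
  for (let i = 0; i < steps; i++) {
    const g = gradient(position);
    velocity = [
      momentum * velocity[0] - learningRate * g[0],
      momentum * velocity[1] - learningRate * g[1],
    ];
    position = [position[0] + velocity[0], position[1] + velocity[1]];
    path.push(position);
  }
  return path;
}

const plainPath = descend([4, 2], 0.05, 0, 100);
const momentumPath = descend([4, 2], 0.05, 0.9, 100);
console.log(loss(plainPath[plainPath.length - 1]), loss(momentumPath[momentumPath.length - 1]));
```

Plotting both paths on the same surface shows how the velocity term changes the trajectory; which variant reaches the minimum sooner depends on the surface and the chosen hyperparameters.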

Evaluation

  • Metrics Comparison: Visual comparison of regression and classification metrics with D3-based interactive charts.

  • Bias-Variance Tradeoff: Analyze bias-variance decomposition with polynomial fitting of increasing degree.

Dimensionality Reduction

  • PCA Step-by-Step: Interactive PCA visualization with Gaussian clouds, eigen decomposition, and 2D projections.
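
The PCA steps (centering, covariance, eigen decomposition, projection) can be sketched for the 2D case, where the eigen decomposition of the symmetric 2×2 covariance matrix has a closed form. The names below are hypothetical, and the use of the sample covariance (dividing by n - 1) is an assumption.

```typescript
type Point2 = [number, number];

interface Pca2D {
  mean: Point2;
  eigenvalues: [number, number]; // variances along each axis, largest first
  components: [Point2, Point2]; // unit-length principal axes
}

// Hypothetical helper for illustration; assumes at least two points.
function pca2d(points: Point2[]): Pca2D {
  const n = points.length;
  const meanX = points.reduce((sum, p) => sum + p[0], 0) / n;
  const meanY = points.reduce((sum, p) => sum + p[1], 0) / n;

  // Sample covariance matrix [[sxx, sxy], [sxy, syy]] of the centered data.
  let sxx = 0;
  let sxy = 0;
  let syy = 0;
  for (const [x, y] of points) {
    sxx += (x - meanX) * (x - meanX);
    sxy += (x - meanX) * (y - meanY);
    syy += (y - meanY) * (y - meanY);
  }
  sxx /= n - 1;
  sxy /= n - 1;
  syy /= n - 1;

  // Closed-form eigenvalues of a symmetric 2x2 matrix.
  const root = Math.sqrt((sxx - syy) * (sxx - syy) + 4 * sxy * sxy);
  const eigenvalues: [number, number] = [(sxx + syy + root) / 2, (sxx + syy - root) / 2];

  // Orientation of the major axis; the minor axis is perpendicular to it.
  const angle = 0.5 * Math.atan2(2 * sxy, sxx - syy);
  const first: Point2 = [Math.cos(angle), Math.sin(angle)];
  const second: Point2 = [-Math.sin(angle), Math.cos(angle)];
  return { mean: [meanX, meanY], eigenvalues, components: [first, second] };
}

// Projects each centered point onto one axis, giving its 1D coordinate.
function projectOnto(points: Point2[], mean: Point2, axis: Point2): number[] {
  return points.map(([x, y]) => (x - mean[0]) * axis[0] + (y - mean[1]) * axis[1]);
}

const cloud: Point2[] = [[0, 0], [1, 1], [2, 2]];
const result = pca2d(cloud);
console.log(result.eigenvalues, projectOnto(cloud, result.mean, result.components[0]));
```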

Signal Processing

  • Fourier Transform: Compose signals from sine waves and visualize their frequency spectrum. Up to 4 components with frequency, amplitude, and phase control. Displays time-domain signal, magnitude spectrum, phase spectrum, and signal metrics.
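
The relationship between the sine components and the magnitude spectrum can be sketched with a naive discrete Fourier transform. The names, the sampling parameters, and the single-sided amplitude scaling are assumptions; a production implementation would more likely use an FFT instead of this O(N²) loop.

```typescript
// Hypothetical component description: frequency in Hz, phase in radians.
interface SineComponent {
  frequency: number;
  amplitude: number;
  phase: number;
}

// Samples the sum of all sine components at a fixed sample rate.
function composeSignal(components: SineComponent[], sampleRate: number, sampleCount: number): number[] {
  return Array.from({ length: sampleCount }, (_, n) => {
    const t = n / sampleRate;
    return components.reduce(
      (sum, c) => sum + c.amplitude * Math.sin(2 * Math.PI * c.frequency * t + c.phase),
      0,
    );
  });
}

// Naive DFT, returning single-sided magnitudes for bins 0..floor(N/2).
// Scaled so a sine of amplitude A that falls exactly on a bin reads as A.
function magnitudeSpectrum(signal: number[]): number[] {
  const n = signal.length;
  const half = Math.floor(n / 2);
  const magnitudes: number[] = [];
  for (let k = 0; k <= half; k++) {
    let re = 0;
    let im = 0;
    signal.forEach((value, t) => {
      const angle = (2 * Math.PI * k * t) / n;
      re += value * Math.cos(angle);
      im -= value * Math.sin(angle);
    });
    // DC and (for even N) Nyquist bins have no mirrored partner to fold in.
    const isUnpaired = k === 0 || (n % 2 === 0 && k === half);
    magnitudes.push((isUnpaired ? 1 : 2) * Math.hypot(re, im) / n);
  }
  return magnitudes;
}

// With 64 samples at 64 Hz, bin k corresponds to k Hz.
const signal = composeSignal(
  [
    { frequency: 5, amplitude: 1, phase: 0 },
    { frequency: 12, amplitude: 0.5, phase: Math.PI / 4 },
  ],
  64,
  64,
);
console.log(magnitudeSpectrum(signal).map((m) => m.toFixed(2)));
```

When a component's frequency does not land on an exact bin, its energy spreads across neighboring bins (spectral leakage), so the peaks no longer match the amplitudes exactly.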

Embeddings

  • Embedding Distances: Explore 2D vector distance and similarity intuition with cosine similarity, Euclidean distance, and dot product.
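
The three measures compared on this page can be written in a few lines each. The sketch below uses hypothetical function names and assumes both vectors have the same length and, for cosine similarity, are non-zero.

```typescript
// Hypothetical helpers for illustration; vectors are plain number arrays.
function dotProduct(a: number[], b: number[]): number {
  return a.reduce((sum, value, i) => sum + value * b[i], 0);
}

function euclideanDistance(a: number[], b: number[]): number {
  return Math.sqrt(a.reduce((sum, value, i) => sum + (value - b[i]) * (value - b[i]), 0));
}

// Cosine similarity is the dot product of the two vectors after normalizing
// each to unit length, so it depends on direction only, not magnitude.
function cosineSimilarity(a: number[], b: number[]): number {
  const normA = Math.sqrt(dotProduct(a, a));
  const normB = Math.sqrt(dotProduct(b, b));
  return dotProduct(a, b) / (normA * normB);
}

const u = [3, 4];
const v = [6, 8]; // same direction as u, twice the length
console.log(dotProduct(u, v), euclideanDistance(u, v), cosineSimilarity(u, v));
```

The pair `u`, `v` illustrates the intuition: the vectors point the same way, so cosine similarity is at its maximum, while Euclidean distance and dot product both still reflect the difference in magnitude.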

Project Structure

.
├── src/                          # TypeScript source code
│   ├── core/                     # Shared core modules
│   │   ├── math/                 # Mathematical utilities (linear algebra, stats, distributions)
│   │   ├── types/                # TypeScript type definitions
│   │   └── utils/                # Utility functions (DOM, canvas, color, animation)
│   ├── features/                 # Feature modules
│   │   ├── bias_variance/        # Bias-variance tradeoff analysis
│   │   ├── classification/       # Metrics, ROC, data generation
│   │   ├── embedding_distances/  # Embedding distance computations
│   │   ├── logistic_regression/  # Logistic regression implementation
│   │   ├── metrics_comparison/   # Metrics comparison tool
│   │   ├── optimization/         # Optimizers (gradient descent variants)
│   │   ├── pca/                  # Principal Component Analysis
│   │   ├── regression/           # Polynomial regression, diagnostics
│   │   ├── regularization/       # Ridge/Lasso regularization
│   │   └── signal_processing/    # Signal processing utilities
│   ├── pages/                    # Per-module pages (main.ts for each)
│   ├── styles/                   # Global styles (Tailwind)
│   ├── ui/                       # Reusable UI components
│   └── vite-env.d.ts
├── public/                       # Static assets
├── docs/                         # VitePress documentation
├── todo/                         # Planning & tracking
├── .github/workflows/            # CI/CD pipelines
├── index.html                    # Entry point
└── package.json

Development

# Install dependencies
npm install

# Start dev server with hot-reload
npm run dev

# Type-check
npm run typecheck

# Lint
npm run lint

# Fix lint issues automatically
npm run lint:fix

# Format code
npm run format

Testing

# Run all tests
npm test

# Watch mode
npm run test:watch

# With coverage
npm run test:coverage

Current test coverage: over 95% on src/core/math/ (linear algebra, statistics, Gaussian distributions).

Build

# Build for production (includes docs)
npm run build

# Preview production build
npm run preview

Versioning

See CHANGELOG.md for release history.

Roadmap

Upcoming

  • Clustering Algorithms Visualizer: k-Means and DBSCAN comparison on various dataset shapes
  • Neural Network Architecture & Forward Pass Visualizer: layer-by-layer fully-connected network construction
  • Tokenization & Embedding Visualizer: tokenization and 2D embedding space projection
  • Attention Mechanism Visualizer: Transformer attention mechanism visualization
  • Probability Distributions Explorer: exploration of standard distributions (Normal, Uniform, Exponential, Poisson, Binomial, Beta, Gamma, Chi-squared)
  • Markov Chain Text Generator: Markov chain construction and text generation
  • A/B Testing Calculator: statistical tool for hypothesis testing
  • Voice Signal Waveform Analyzer: audio recording, waveform display, spectrogram computation, and dominant frequency identification

License

See the LICENSE file.
