Data engineer based in the UK. I break things, read source code, and ship fixes upstream.
MSc Data Analytics from Aston. Day job is in food manufacturing; on the side I maintain four small Python tools on PyPI and contribute to OSS projects I use. I believe compliance shouldn't mean spreadsheets and AI shouldn't require the cloud.
- 🔭 Maintaining sql-sop, sql-sop-mcp, pr-sop, morning-brief
- 🌱 Currently learning T-SQL internals, on-prem AI orchestration, dbt's semantic layer, and async signal handling for Kubernetes-deployed Python services
- 👯 Happy to collaborate on manufacturing data, SQL safety tooling, or on-prem AI work
- 💬 Ask me about Python, SQL Server, dbt, FastAPI, LangGraph
sql-sop — A Python SQL linter. 39 rules, 152 tests, libCST-based injection scanner, inline disable directives, SARIF output. v0.6.2 is on PyPI with 500+ monthly downloads; the v0.7 milestone (Performance Rules Pack) is in progress, with a ROADMAP and a scaffold tool for new contributors. The browser playground runs on Pyodide, so no data leaves the page. pip install sql-sop
sql-sop-mcp — Model Context Protocol server wrapping sql-sop's linter. Two stdio tools (lint_sql, list_rules) callable from Claude Desktop, Cursor, ChatGPT desktop, or any MCP-aware LLM client. Built on FastMCP. Trusted-Publishing release pipeline. pip install sql-sop-mcp
pr-sop — A small PR governance checker. Three configurable checks: CHANGELOG drift, version mismatch between pyproject.toml and __init__.py, and stale rev: pins in pre-commit configs. CLI, pre-commit hook, or GitHub Action. pip install pr-sop
morning-brief — Rule-based daily Gmail triage. Read-only OAuth, no LLM. Classifies recent mail into HIGH / MEDIUM / LOW / SPAM via YAML rules and writes a markdown digest. v0.3.0 added sub-day windows, thread collapse, and preview / why commands for working on the rules. pip install morning-brief
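Rule-based triage of this kind can be sketched in a few lines. The rule fields below (`sender_contains`, `subject_contains`) and the first-match-wins ordering are assumptions for illustration, not morning-brief's real YAML schema; rules are written as Python objects here to keep the example dependency-free.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One hypothetical triage rule; empty conditions act as wildcards."""
    priority: str  # one of HIGH / MEDIUM / LOW / SPAM
    sender_contains: str = ""
    subject_contains: str = ""

    def matches(self, sender, subject):
        # Case-insensitive substring checks; "" is contained in every string.
        return (
            self.sender_contains.lower() in sender.lower()
            and self.subject_contains.lower() in subject.lower()
        )


def classify(sender, subject, rules, default="LOW"):
    """Return (priority, matched_rule); the first matching rule wins."""
    for rule in rules:
        if rule.matches(sender, subject):
            return rule.priority, rule
    return default, None


# Example rule set; all addresses and keywords are made up.
RULES = [
    Rule("SPAM", subject_contains="unsubscribe"),
    Rule("HIGH", sender_contains="@supplier.example"),
    Rule("MEDIUM", subject_contains="invoice"),
]
```

Returning the matched rule alongside the priority is what makes an explanation command possible: the caller can report which rule fired for a given message.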
Production Analytics Pipeline — Incremental ETL from a fish-production ERP. Around 15K rows/day, FastAPI plus Next.js plus Power BI on the consumption side, Prefect for orchestration. 53 tests.
UK Crime Pipeline — Police UK API into PostgreSQL and BigQuery. 99,675 records, 6 dbt marts, 65 tests, Polars on the ingestion side. Streamlit · Looker Studio · Hugging Face
OpsMind — On-prem AI for manufacturing. NL-to-SQL via a LangGraph agent, MCP server architecture, pgvector + ChromaDB RAG, Gemma 3 12B served by Ollama. Includes a small golden-set eval harness. docs
Manufacturing Compliance Dashboard — BRC/HACCP food-safety compliance. MCP server exposes 5 compliance tools to LLM agents, plus an NL query interface for auditors and a /metrics endpoint following the Four Golden Signals. live
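The Four Golden Signals are latency, traffic, errors, and saturation. As a rough sketch of what a /metrics endpoint might aggregate, the in-memory collector below tracks all four. The class, metric names, and the `capacity` parameter are hypothetical and are not taken from the dashboard's code.

```python
from dataclasses import dataclass, field
from statistics import median


@dataclass
class GoldenSignals:
    """In-memory counters for the four signals (illustrative only)."""
    capacity: int  # assumed maximum number of concurrent requests
    in_flight: int = 0
    requests: int = 0
    errors: int = 0
    durations_ms: list = field(default_factory=list)

    def start(self):
        """Call when a request begins."""
        self.in_flight += 1

    def finish(self, duration_ms, status):
        """Call when a request ends, with its duration and HTTP status."""
        self.in_flight -= 1
        self.requests += 1
        self.durations_ms.append(duration_ms)
        if status >= 500:  # count server-side failures as errors
            self.errors += 1

    def snapshot(self):
        """Return the current value of each signal as a plain dict."""
        return {
            "traffic_requests_total": self.requests,
            "errors_total": self.errors,
            "latency_ms_median": median(self.durations_ms) if self.durations_ms else 0.0,
            "saturation_ratio": self.in_flight / self.capacity,
        }
```

In a web service, `start` and `finish` would usually be called from request middleware, and the endpoint handler would serialise `snapshot()` into whatever format the monitoring system expects.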
SQL Ops Reviewer — A GitHub Action that reviews .sql files in PRs using a local LLM and posts structured comments on the PR. One YAML file to wire up; it runs on the CI runner and needs no API keys.
MediAsk — Health Q&A platform for factory workers. NHS-verified guidance, voice input, 18 languages. Flask + PostgreSQL, Dockerised. live



