All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Default-on normalization pipeline: `detect()` now normalizes homoglyphs, zero-width characters, whitespace-split tokens, leetspeak, common typos, and phonetic substitutions before matching injection patterns
- Configurable normalization controls: `DetectOptions.normalization` allows callers to disable normalization entirely or tune individual repair stages
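
To illustrate what a pre-match normalization pass of this kind can look like, here is a minimal sketch. It is not the library's implementation: `normalizeForMatching`, `NormalizeStages`, and the lookup tables are hypothetical names invented for this example, and only four of the listed stages are shown (zero-width removal, homoglyph mapping, leetspeak mapping, and rejoining whitespace-split tokens).

```typescript
// Hypothetical sketch; none of these names come from the library itself.
const ZERO_WIDTH = /[\u200B-\u200D\u2060\uFEFF]/g;

// Tiny illustrative homoglyph table (Cyrillic look-alikes mapped to Latin).
const HOMOGLYPHS: Record<string, string> = {
  "\u0430": "a",
  "\u0435": "e",
  "\u043E": "o",
  "\u0440": "p",
  "\u0441": "c",
  "\u0456": "i",
};

// Common leetspeak substitutions. Note this also rewrites genuine digits,
// so the result is only suitable for pattern matching, not for display.
const LEET: Record<string, string> = {
  "0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "7": "t", "@": "a", "$": "s",
};

// Assumed shape for per-stage toggles; every stage defaults to enabled.
interface NormalizeStages {
  zeroWidth?: boolean;
  homoglyphs?: boolean;
  leetspeak?: boolean;
  splitTokens?: boolean;
}

function normalizeForMatching(input: string, stages: NormalizeStages = {}): string {
  const on = { zeroWidth: true, homoglyphs: true, leetspeak: true, splitTokens: true, ...stages };
  let text = input.toLowerCase();
  if (on.zeroWidth) text = text.replace(ZERO_WIDTH, "");
  if (on.homoglyphs) text = Array.from(text, (ch) => HOMOGLYPHS[ch] ?? ch).join("");
  if (on.leetspeak) text = Array.from(text, (ch) => LEET[ch] ?? ch).join("");
  if (on.splitTokens) {
    // Rejoin runs of three or more single characters separated by single
    // spaces, e.g. "i g n o r e" becomes "ignore".
    text = text.replace(/\b(?:\w ){2,}\w\b/g, (m) => m.replace(/ /g, ""));
  }
  return text;
}
```

Under these assumptions, `normalizeForMatching("i g n o r e all")` collapses the spaced-out token, and passing `{ leetspeak: false }` leaves digits untouched, which mirrors the idea of tuning individual repair stages.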
- Provider hardening tests: Updated provider expectations to assert against the current `harden()` output instead of the removed `"Security Rules"` marker
- `excludeCategories`: Skip detection for categories (e.g. `["social_engineering"]`) to reduce false positives
- `allowPhrases`: Whitelist phrases; input containing one suppresses detection
- `secondaryDetector`: Optional async verifier for LLM-based override of heuristic detection
- `detectAsync`: Async variant supporting `secondaryDetector`
- `streamingSanitize: "chunked"`: Process streams in 8 KB chunks to limit memory for long outputs
- `streamingChunkSize`: Configurable chunk size for chunked mode (default 8192)
- `shieldLanguageModelMiddleware`: AI SDK middleware for automatic hardening, detection, and output sanitization (no manual `sanitizeOutput`)
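
The interaction between category exclusion, allow-listed phrases, and a secondary verifier can be sketched as follows. This is an assumption-laden illustration, not the library's API: `detectSketch`, `detectAsyncSketch`, `SketchOptions`, the two rules, and the verifier signature are all hypothetical, chosen only to show one plausible order of evaluation.

```typescript
// Hypothetical types and helpers; the library's real signatures may differ.
interface Finding { category: string; pattern: string; }
interface DetectResult { detected: boolean; findings: Finding[]; }
interface SketchOptions {
  excludeCategories?: string[];
  allowPhrases?: string[];
  // Assumed signature: resolves to true when the verifier confirms the hit.
  secondaryDetector?: (input: string, result: DetectResult) => Promise<boolean>;
}

// Two illustrative rules standing in for a real pattern set.
const RULES: { category: string; pattern: RegExp }[] = [
  { category: "instruction_override", pattern: /ignore (all )?previous instructions/i },
  { category: "social_engineering", pattern: /i am your (developer|administrator)/i },
];

// Synchronous heuristic pass.
function detectSketch(input: string, options: SketchOptions = {}): DetectResult {
  const lowered = input.toLowerCase();
  // allowPhrases: any allow-listed phrase in the input suppresses detection.
  if ((options.allowPhrases ?? []).some((p) => lowered.includes(p.toLowerCase()))) {
    return { detected: false, findings: [] };
  }
  // excludeCategories: rules in these categories are skipped entirely.
  const excluded = new Set(options.excludeCategories ?? []);
  const findings = RULES
    .filter((r) => !excluded.has(r.category) && r.pattern.test(input))
    .map((r) => ({ category: r.category, pattern: r.pattern.source }));
  return { detected: findings.length > 0, findings };
}

// Async variant: the secondary detector only runs after a heuristic hit
// and may override it.
async function detectAsyncSketch(input: string, options: SketchOptions = {}): Promise<DetectResult> {
  const result = detectSketch(input, options);
  if (!result.detected || !options.secondaryDetector) return result;
  const confirmed = await options.secondaryDetector(input, result);
  return { ...result, detected: confirmed };
}
```

In this sketch the verifier is consulted only when the heuristics fire, which keeps the common clean-input path synchronous and cheap; whether the library makes the same trade-off is not stated in this changelog.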
- Dependencies: Upgraded to ai ^6, openai ^6, @ai-sdk/openai ^3, @anthropic-ai/sdk ^0.78, groq-sdk ^0.37
- Providers: Use `detectAsync` when `secondaryDetector` is configured
- Core functions: `harden`, `detect`, `sanitize`, `sanitizeObject`
- Provider wrappers: OpenAI, Anthropic, Groq, Vercel AI SDK
- Injection detection: Pattern-based detection with 10+ categories (instruction override, role hijack, prompt extraction, authority exploit, tool hijacking, etc.)
- Leak sanitization: N-gram matching with paraphrased leak detection
- Typed errors: `InjectionDetectedError`, `LeakDetectedError`, `ShieldError`
- Multi-part messages: Text extraction from `ContentPart[]` for OpenAI/Groq (text + images)
- System prompt derivation: Auto-derive from params when `systemPrompt` not provided
- Streaming: Sanitized content yielded in chunks to preserve streaming UX
- `throwOnLeak` option: Throw `LeakDetectedError` instead of redacting when leak detected
- AI SDK system array: Harden `system` when passed as array of parts
- Integration tests: Opt-in tests for OpenAI (Anthropic, Groq when keys configured)
- Benchmarks: `bun run benchmark` for performance verification
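
The n-gram leak check and the redact-versus-throw choice listed above can be sketched roughly as below. Everything here is hypothetical: `sanitizeSketch`, `wordNgrams`, and `LeakDetectedErrorSketch` are stand-ins written for this example, the n-gram size of 4 is an arbitrary assumption, and the sketch covers exact word n-gram overlap only, not paraphrased leaks.

```typescript
// Hypothetical stand-in for a typed leak error.
class LeakDetectedErrorSketch extends Error {
  readonly matches: string[];
  constructor(matches: string[]) {
    super(`Leak detected: ${matches.length} matching n-gram(s)`);
    this.name = "LeakDetectedErrorSketch";
    this.matches = matches;
  }
}

// Split text into lowercase words and return every run of n consecutive words.
function wordNgrams(text: string, n: number): string[] {
  const words = text.toLowerCase().split(/\W+/).filter(Boolean);
  const grams: string[] = [];
  for (let i = 0; i + n <= words.length; i++) {
    grams.push(words.slice(i, i + n).join(" "));
  }
  return grams;
}

// Compare model output against the system prompt. On overlap, either throw
// or replace the whole output with a placeholder (a deliberately crude
// redaction strategy chosen to keep the sketch short).
function sanitizeSketch(
  output: string,
  systemPrompt: string,
  opts: { n?: number; throwOnLeak?: boolean } = {},
): string {
  const n = opts.n ?? 4;
  const promptGrams = new Set(wordNgrams(systemPrompt, n));
  const leaked = wordNgrams(output, n).filter((g) => promptGrams.has(g));
  if (leaked.length === 0) return output;
  if (opts.throwOnLeak) throw new LeakDetectedErrorSketch(leaked);
  return "[REDACTED]";
}
```

Throwing instead of redacting lets a caller decide how to respond (retry, log, or return a fixed refusal), at the cost of requiring a `try`/`catch` around every call.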
- Heuristic-based; use as defense-in-depth, not sole protection
- See README Threat Model for limitations