D1a0y1bb/PitcherPlant

PitcherPlant

Native macOS WriteUP audit workbench for security competitions.

macOS 26+ · Swift 6.2+ · SwiftUI · SQLite · GRDB · XcodeGen

Overview

PitcherPlant is a local-first macOS app for auditing security competition WriteUP submissions. It scans a submission directory, extracts text, code snippets, Office/PDF metadata, embedded images, standalone images, and SimHash fingerprints, then builds a structured evidence report for human review.

The app is built for organizers and reviewers who need to find suspicious reuse, rewritten reports, duplicated submissions, reused images, shared metadata, and cross-batch repetition across historical fingerprints. All core data is stored locally in SQLite through GRDB.

Features

| Area | Capability |
| --- | --- |
| Native desktop | SwiftUI macOS app with a workspace dashboard, audit queue, report center, evidence inspector, fingerprint library, whitelist library, and settings surface. |
| Local persistence | Reports, jobs, evidence review state, fingerprints, whitelist rules, imports, and export records are stored in a local SQLite database. |
| Document ingestion | Recursively scans pdf, docx, pptx, md, txt, html, htm, rtf, source files, and standalone images. Office temporary files such as ~$draft.docx are skipped. |
| DOCX/PPTX parsing | Extracts document text, author metadata, last-modified-by metadata, slide text, and embedded media from Office archives. |
| PDF parsing | Extracts text and author metadata with PDFKit, reads embedded image streams, and uses page thumbnails as a fallback. |
| Text similarity | Uses TF-IDF with word and character n-grams plus cosine similarity to detect suspiciously similar WriteUP text. |
| Code similarity | Extracts fenced code and heuristic code blocks, then compares lexical shingles, structural tokens, and shared token coverage. |
| Image reuse | Computes perceptual hash, average hash, and difference hash values, then compares image evidence with Hamming distance. |
| Metadata collisions | Groups suspicious overlaps in author and last-modified-by fields. |
| Cross-batch reuse | Stores SimHash fingerprints and compares new submissions against historical batches with exact Hamming-distance semantics. |
| Whitelist workflow | Supports author, filename, text fragment, code template, image hash, metadata, and path rules. Matches can be marked or hidden. |
| Batch import | Imports ZIP files, nested folders, and team directories into queued audit jobs. Jobs run serially and can be retried. |
| Evidence review | Evidence rows support confirmed, false positive, ignored, favorite, watched, severity override, notes, and whitelist actions. |
| Report center | Shows overview, text, code, image, metadata, deduplication, fingerprint, and cross-batch sections with risk sorting and detail inspection. |
| Export | Exports HTML, PDF, CSV, JSON, Markdown, and Evidence Bundle ZIP packages. |
| Large workspace handling | Uses paged database reads, incremental job event writes, audit preflight checks, cancellable parsing loops, report filtering caches, image caches, code diff caches, and bounded graph rendering. |
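
To make the "lexical shingles" idea from the code similarity row concrete, here is a minimal sketch. It is not the app's implementation: the function names, the crude tokenizer, the shingle size of 3, and the use of Jaccard similarity as the score are all assumptions chosen for illustration.

```swift
/// Splits source text into crude lexical tokens: runs of letters, digits, and
/// underscores become one token; every other non-whitespace character is its own token.
/// (Hypothetical helper, not taken from the PitcherPlant sources.)
func lexicalTokens(_ source: String) -> [String] {
    var tokens: [String] = []
    var current = ""
    for ch in source {
        if ch.isLetter || ch.isNumber || ch == "_" {
            current.append(ch)
        } else {
            if !current.isEmpty {
                tokens.append(current)
                current = ""
            }
            if !ch.isWhitespace {
                tokens.append(String(ch))
            }
        }
    }
    if !current.isEmpty {
        tokens.append(current)
    }
    return tokens
}

/// Builds the set of k-token shingles (contiguous token windows joined by a space).
/// The default k = 3 is an assumed value.
func shingles(_ tokens: [String], k: Int = 3) -> Set<String> {
    guard k > 0, tokens.count >= k else { return [] }
    var result = Set<String>()
    for start in 0...(tokens.count - k) {
        result.insert(tokens[start..<(start + k)].joined(separator: " "))
    }
    return result
}

/// Jaccard similarity of two shingle sets: |A ∩ B| / |A ∪ B|.
/// Returns 0 when both sets are empty.
func jaccard(_ a: Set<String>, _ b: Set<String>) -> Double {
    let unionCount = a.union(b).count
    guard unionCount > 0 else { return 0 }
    return Double(a.intersection(b).count) / Double(unionCount)
}
```

Because whitespace is discarded during tokenization, two snippets that differ only in indentation or line breaks produce identical shingle sets in this sketch, which is the property that makes shingling useful against cosmetic rewrites.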

Supported Inputs

| Type | Extensions |
| --- | --- |
| Documents | pdf, docx, pptx, md, txt, html, htm, rtf |
| Source code | py, c, cc, cpp, h, hpp, java, go, js, jsx, ts, tsx, swift, sh, bash, zsh, rb, rs, php, cs, kt, sql, m, mm |
| Images | png, jpg, jpeg, gif, bmp, tiff, webp |
| Batch imports | ZIP archives and nested submission directories |
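
As a rough illustration of how a scanner might apply the table above, the sketch below classifies a file URL by extension and skips Office temporary files such as ~$draft.docx. The `InputKind` enum, the `classify` function, and the case-insensitive extension match are assumptions for this example, not names or behavior confirmed by the project.

```swift
import Foundation

/// Hypothetical input categories mirroring the Supported Inputs table.
enum InputKind: Equatable {
    case document, sourceCode, image
}

// Extension sets copied from the table above.
let documentExtensions: Set<String> = ["pdf", "docx", "pptx", "md", "txt", "html", "htm", "rtf"]
let sourceExtensions: Set<String> = [
    "py", "c", "cc", "cpp", "h", "hpp", "java", "go", "js", "jsx", "ts", "tsx",
    "swift", "sh", "bash", "zsh", "rb", "rs", "php", "cs", "kt", "sql", "m", "mm",
]
let imageExtensions: Set<String> = ["png", "jpg", "jpeg", "gif", "bmp", "tiff", "webp"]

/// Returns the category for a file, or nil when the file should be ignored.
func classify(_ url: URL) -> InputKind? {
    // Office writes lock/temporary files whose names start with "~$"; skip them.
    if url.lastPathComponent.hasPrefix("~$") { return nil }
    // Assumption: extensions are matched case-insensitively.
    let ext = url.pathExtension.lowercased()
    if documentExtensions.contains(ext) { return .document }
    if sourceExtensions.contains(ext) { return .sourceCode }
    if imageExtensions.contains(ext) { return .image }
    return nil
}
```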

Quick Start

Open the workspace from the repository root:

open PitcherPlant.xcworkspace

In Xcode, select the PitcherPlantApp scheme, choose My Mac as the run destination, and launch the app.

Build and run from the command line:

cd PitcherPlantApp
./script/build_and_run.sh

Build, launch, and verify that the app process is running:

cd PitcherPlantApp
./script/build_and_run.sh --verify

The helper script builds the Debug app with xcodebuild and launches:

PitcherPlantApp/.build/xcode/Build/Products/Debug/PitcherPlant.app

Workflow

flowchart LR
    A["Choose audit directory"] --> B["Parse documents and images"]
    B --> C["Analyze text, code, image, metadata, and fingerprints"]
    C --> D["Generate structured report"]
    D --> E["Review evidence in Report Center"]
    E --> F["Maintain whitelist and historical fingerprints"]
    E --> G["Export HTML, PDF, CSV, JSON, Markdown, or Bundle"]
  1. Open PitcherPlant and review workspace counts for jobs, reports, fingerprints, and whitelist rules.
  2. Open New Audit.
  3. Choose the audit directory, output directory, and report file name template.
  4. Adjust text similarity, deduplication, image hash distance, SimHash distance, Vision OCR, and whitelist behavior.
  5. Start the audit job.
  6. Review findings in Report Center by evidence type, risk score, section, and search query.
  7. Open evidence in the inspector to compare text, code, image attachments, metadata, source references, and review notes.
  8. Mark evidence as confirmed, false positive, ignored, favorite, or watched.
  9. Add whitelist rules when repeated legitimate templates or known sources appear.
  10. Export the selected report as HTML, PDF, CSV, JSON, Markdown, or Evidence Bundle ZIP.

Detection Model

| Analyzer | Input | Method | Output |
| --- | --- | --- | --- |
| TextSimilarityAnalyzer | Normalized document text | TF-IDF, word n-grams, character n-grams, cosine similarity | Similar WriteUP pairs with shared context attachments |
| CodeSimilarityAnalyzer | Fenced code and heuristic code blocks | Lexical shingles, structural signatures, shared token ratio | Similar code pairs with token and structure details |
| ImageReuseAnalyzer | Embedded and standalone images | pHash, aHash, dHash, Hamming distance | Reused-image evidence with thumbnails and source references |
| MetadataCollisionAnalyzer | Author and last-modified-by metadata | Field grouping with common-author filtering | Metadata collision rows |
| DedupAnalyzer | Normalized document text | Stricter text similarity threshold | Near-duplicate file pairs |
| FingerprintAnalyzer | Parsed document text | SimHash fingerprinting | Current-batch fingerprint records |
| CrossBatchReuseAnalyzer | Current and historical SimHash records | Hamming distance with direct scan or BK-tree index | Cross-batch reuse matches |
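
For readers unfamiliar with the method listed for TextSimilarityAnalyzer, the following is a small sketch of TF-IDF weighting followed by cosine similarity. It is not the analyzer's code: the function names, the choice of word unigrams plus character trigrams as features, and the smoothed IDF formula are assumptions made for illustration.

```swift
import Foundation

/// Lowercased word tokens plus character trigrams.
/// Trigrams are prefixed with "#" so they never collide with word tokens.
func textFeatures(_ text: String) -> [String] {
    let lower = text.lowercased()
    let words = lower.split(whereSeparator: { !$0.isLetter && !$0.isNumber }).map { String($0) }
    let chars = Array(lower)
    var trigrams: [String] = []
    if chars.count >= 3 {
        for start in 0...(chars.count - 3) {
            trigrams.append("#" + String(chars[start..<(start + 3)]))
        }
    }
    return words + trigrams
}

/// One sparse TF-IDF vector per document.
/// Assumed weighting: raw term count times smoothed IDF, log((1 + N) / (1 + df)) + 1.
func tfidfVectors(_ documents: [String]) -> [[String: Double]] {
    let termLists = documents.map { textFeatures($0) }
    var documentFrequency: [String: Int] = [:]
    for terms in termLists {
        for term in Set(terms) {
            documentFrequency[term, default: 0] += 1
        }
    }
    let n = Double(documents.count)
    var vectors: [[String: Double]] = []
    for terms in termLists {
        var counts: [String: Double] = [:]
        for term in terms {
            counts[term, default: 0] += 1
        }
        var weighted: [String: Double] = [:]
        for (term, count) in counts {
            let df = Double(documentFrequency[term] ?? 0)
            weighted[term] = count * (log((1 + n) / (1 + df)) + 1)
        }
        vectors.append(weighted)
    }
    return vectors
}

/// Cosine similarity of two sparse vectors; 0 when either vector is empty.
func cosineSimilarity(_ a: [String: Double], _ b: [String: Double]) -> Double {
    let dot = a.reduce(0.0) { $0 + $1.value * (b[$1.key] ?? 0) }
    let normA = sqrt(a.values.reduce(0.0) { $0 + $1 * $1 })
    let normB = sqrt(b.values.reduce(0.0) { $0 + $1 * $1 })
    guard normA > 0, normB > 0 else { return 0 }
    return dot / (normA * normB)
}
```

A pair of documents would then be flagged when its cosine score meets the configured text similarity threshold; DedupAnalyzer is described as applying the same kind of comparison with a stricter threshold.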

Evidence is decision support for reviewers. Final enforcement decisions should consider competition rules, context, and manual verification.
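
The FingerprintAnalyzer and CrossBatchReuseAnalyzer rows can be pictured with the sketch below: a 64-bit SimHash over word tokens, compared by Hamming distance with a direct scan. Everything here is an assumption for illustration, including the function names, the 64-bit width, the tokenization, and the FNV-1a token hash (chosen only because Swift's built-in `Hasher` is randomly seeded per process and so cannot produce fingerprints that stay stable across runs). The BK-tree index mentioned in the table is not shown.

```swift
/// 64-bit FNV-1a hash of a string's UTF-8 bytes (deterministic across runs).
func fnv1a64(_ string: String) -> UInt64 {
    var hash: UInt64 = 0xcbf29ce484222325
    for byte in string.utf8 {
        hash ^= UInt64(byte)
        hash = hash &* 0x100000001b3
    }
    return hash
}

/// SimHash: each token votes +1 or -1 on every bit position according to its hash;
/// the fingerprint keeps the bits whose vote total is positive.
func simHash(_ text: String) -> UInt64 {
    var votes = [Int](repeating: 0, count: 64)
    let tokens = text.lowercased().split(whereSeparator: { !$0.isLetter && !$0.isNumber })
    for token in tokens {
        let hash = fnv1a64(String(token))
        for bit in 0..<64 {
            if (hash >> UInt64(bit)) & 1 == 1 {
                votes[bit] += 1
            } else {
                votes[bit] -= 1
            }
        }
    }
    var fingerprint: UInt64 = 0
    for bit in 0..<64 where votes[bit] > 0 {
        fingerprint |= UInt64(1) << UInt64(bit)
    }
    return fingerprint
}

/// Number of differing bits between two fingerprints.
func hammingDistance(_ a: UInt64, _ b: UInt64) -> Int {
    (a ^ b).nonzeroBitCount
}

/// Direct scan: indices of historical fingerprints within maxDistance bits of the current one.
/// The default of 4 mirrors the documented SimHash distance threshold.
func crossBatchMatches(current: UInt64, history: [UInt64], maxDistance: Int = 4) -> [Int] {
    history.indices.filter { hammingDistance(current, history[$0]) <= maxDistance }
}
```

A direct scan is linear in the size of the history; a BK-tree keyed on Hamming distance returns the same matches while pruning most comparisons, which is presumably why the table lists both options.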

Configuration

Default values come from AuditConfiguration.defaults(for:).

| Setting | Default |
| --- | --- |
| Input directory | Fixtures/WriteupSamples/date |
| Output directory | GeneratedReports/full |
| Report name template | {dir}_PitcherPlant_{date}.html |
| Text similarity threshold | 0.75 |
| Deduplication threshold | 0.85 |
| Image hash distance threshold | 5 |
| SimHash distance threshold | 4 |
| Vision OCR | Enabled |
| Whitelist mode | Mark matches |
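
To show what an image hash distance threshold of 5 means in practice, here is a minimal sketch of an average hash (aHash) compared by Hamming distance. It assumes the image has already been reduced to an 8x8 grayscale grid, which is outside the scope of the example; the function names and the 64-bit hash width are illustrative assumptions, not the app's API.

```swift
/// Average hash of an 8x8 grayscale grid (64 values, row-major):
/// bit i is set when pixel i is brighter than the integer mean of all pixels.
func averageHash(_ pixels: [UInt8]) -> UInt64 {
    precondition(pixels.count == 64, "expects an 8x8 grayscale grid")
    let mean = pixels.reduce(0) { $0 + Int($1) } / 64
    var hash: UInt64 = 0
    for (index, value) in pixels.enumerated() where Int(value) > mean {
        hash |= UInt64(1) << UInt64(index)
    }
    return hash
}

/// Two hashes are treated as the same image when at most maxDistance bits differ.
/// The default of 5 mirrors the documented image hash distance threshold.
func isLikelySameImage(_ a: UInt64, _ b: UInt64, maxDistance: Int = 5) -> Bool {
    (a ^ b).nonzeroBitCount <= maxDistance
}
```

Raising the threshold tolerates more recompression, cropping, or recoloring at the cost of more false positives, which matches how the Deep and Quick profiles are described as loosening or tightening it.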

The toolbar also exposes scan profiles:

| Profile | Behavior |
| --- | --- |
| Standard | Uses the default thresholds above. |
| Deep | Lowers text and deduplication thresholds, increases image and SimHash distance, keeps OCR enabled. |
| Quick | Raises text and deduplication thresholds, tightens image and SimHash distance, disables OCR. |
| Evidence Review | Tunes thresholds for broader review and uses the {dir}_EvidenceReview_{date}.html report template. |
| Fast Screening | Uses quick scanning and the {dir}_QuickScreen_{date}.html report template. |
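
The report name templates use {dir} and {date} placeholders. The sketch below shows one way such a template could be expanded. The function name is hypothetical, and two details are assumptions because the README does not specify them: that {dir} is the last path component of the audit directory, and that {date} is formatted as yyyyMMdd.

```swift
import Foundation

/// Expands "{dir}" and "{date}" in a report name template.
/// Assumptions: {dir} is the audit directory's last path component,
/// and {date} uses the "yyyyMMdd" format.
func expandReportName(
    template: String,
    auditDirectory: URL,
    date: Date,
    timeZone: TimeZone = .current
) -> String {
    let formatter = DateFormatter()
    formatter.locale = Locale(identifier: "en_US_POSIX") // stable, locale-independent digits
    formatter.timeZone = timeZone
    formatter.dateFormat = "yyyyMMdd"
    return template
        .replacingOccurrences(of: "{dir}", with: auditDirectory.lastPathComponent)
        .replacingOccurrences(of: "{date}", with: formatter.string(from: date))
}
```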

Architecture

PitcherPlant/
├── Docs/
├── Fixtures/
├── GeneratedReports/
├── PitcherPlant.xcworkspace/
└── PitcherPlantApp/
    ├── Package.swift
    ├── Package.resolved
    ├── project.yml
    ├── PitcherPlantApp.xcodeproj
    ├── Resources/
    ├── script/
    ├── Sources/PitcherPlantApp/
    │   ├── App/
    │   ├── Core/
    │   ├── Features/
    │   ├── Models/
    │   ├── Persistence/
    │   └── Support/
    └── Tests/PitcherPlantAppTests/

| Path | Responsibility |
| --- | --- |
| App/ | SwiftUI app entry point, shared app state, commands, and window setup. |
| Core/ | Document ingestion, analyzers, audit runner, risk scoring, report assembly, export, batch import, and fingerprint packaging. |
| Features/ | Main window, workspace dashboard, report center, evidence inspector, libraries, and settings views. |
| Models/ | Audit configuration, jobs, reports, evidence, settings, fingerprints, and whitelist models. |
| Persistence/ | GRDB-backed SQLite store, schema migrations, paged reads, event writes, and review-state persistence. |
| Support/ | Workspace discovery, localization, theme, layout surfaces, typography, filtering, and environment helpers. |
| Tests/PitcherPlantAppTests/ | Unit and integration tests for ingestion, analyzers, reports, persistence, imports, cancellation, caching, and performance-sensitive helpers. |

Audit pipeline:

flowchart TD
    A["AuditConfiguration"] --> B["AuditRunPreflight"]
    B --> C["DocumentIngestionService"]
    C --> D["DocumentFeatureStore"]
    D --> E["TextSimilarityAnalyzer"]
    D --> F["CodeSimilarityAnalyzer"]
    C --> G["ImageReuseAnalyzer"]
    C --> H["MetadataCollisionAnalyzer"]
    D --> I["DedupAnalyzer"]
    D --> J["FingerprintAnalyzer"]
    J --> K["CrossBatchReuseAnalyzer"]
    E --> L["ReportAssembler"]
    F --> L
    G --> L
    H --> L
    I --> L
    K --> L
    L --> M["ReportExporter"]
    L --> N["DatabaseStore"]

Data Storage

PitcherPlant resolves a workspace root at launch. The preferred database location is:

.pitcherplant-macos/PitcherPlantMac.sqlite

When the workspace is read-only, the fallback location is:

~/Library/Application Support/PitcherPlant/.pitcherplant-macos/PitcherPlantMac.sqlite
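
The preferred and fallback locations above could be resolved along these lines. This is a sketch only: `databaseURL(workspaceRoot:fileManager:)` is a hypothetical name, and using `FileManager.isWritableFile(atPath:)` as the read-only check is an assumption about how the app decides.

```swift
import Foundation

/// Returns the database location for a workspace root, falling back to
/// Application Support when the workspace is not writable.
func databaseURL(workspaceRoot: URL, fileManager: FileManager = .default) -> URL {
    let relativePath = ".pitcherplant-macos/PitcherPlantMac.sqlite"
    // Assumed check: a missing or read-only workspace root triggers the fallback.
    if fileManager.isWritableFile(atPath: workspaceRoot.path) {
        return workspaceRoot.appendingPathComponent(relativePath)
    }
    return fileManager.homeDirectoryForCurrentUser
        .appendingPathComponent("Library/Application Support/PitcherPlant")
        .appendingPathComponent(relativePath)
}
```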

Local generated paths:

.pitcherplant-macos/
GeneratedReports/
PitcherPlantApp/.build/
PitcherPlantApp/.pitcherplant-macos/
PitcherPlantApp/reports/
PitcherPlantApp/build/

These paths are ignored by Git.

Development

Install XcodeGen:

brew install xcodegen

Regenerate the Xcode project:

cd PitcherPlantApp
xcodegen generate

Run SwiftPM tests:

cd PitcherPlantApp
swift test

Run Xcode scheme tests:

cd PitcherPlantApp
xcodebuild -project PitcherPlantApp.xcodeproj -scheme PitcherPlantApp -destination 'platform=macOS' test

Build the Release app:

cd PitcherPlantApp
xcodebuild -project PitcherPlantApp.xcodeproj -scheme PitcherPlantApp -destination 'platform=macOS' -configuration Release build

Check whitespace before committing:

git diff --check

Project naming:

| Context | Name |
| --- | --- |
| SwiftPM package | PitcherPlantApp |
| SwiftPM executable product | PitcherPlantApp |
| Xcode scheme | PitcherPlantApp |
| App bundle and product | PitcherPlant |
| Bundle identifier | com.pitcherplant.desktop |

Dependencies:

| Dependency | Version policy | Current resolved version |
| --- | --- | --- |
| GRDB.swift | from 7.0.0 | 7.10.0 |
| ZIPFoundation | from 0.9.19 | 0.9.20 |
Release

Create local ad-hoc release artifacts:

cd PitcherPlantApp
./script/package_release.sh --distribution ad-hoc

Artifacts are written to:

PitcherPlantApp/build/export/PitcherPlant.app
PitcherPlantApp/build/dist/PitcherPlant-macOS.zip
PitcherPlantApp/build/dist/PitcherPlant-macOS.dmg
PitcherPlantApp/build/dist/PitcherPlant.xcarchive.zip
PitcherPlantApp/build/dist/PitcherPlant-dSYMs.zip
PitcherPlantApp/build/dist/PitcherPlant-macOS-checksums.txt
PitcherPlantApp/build/dist/release-notes.md

The release script performs archive, export, ZIP/DMG packaging, code-sign verification, DMG verification, unpack checks, mount checks, and SHA-256 checksum generation.

Sparkle appcast generation requires SPARKLE_ED_PRIVATE_KEY. For real tagged release packaging, also provide RELEASE_TAG and set REQUIRE_RELEASE_NOTES=true; the script will fail if PitcherPlantApp/ReleaseNotes/<tag>.md is missing.

Developer ID distribution is available through:

cd PitcherPlantApp
./script/package_release.sh --distribution developer-id --notarize

GitHub tag publishing accepts tags of the form vX.Y.Z, vX.Y.Z-beta, and vX.Y.Z-rc.N. Pushing a tag publishes through the release workflow: it uses Developer ID signing and notarization when all signing secrets are available; otherwise it publishes ad-hoc signed artifacts that are not notarized. Required release notes and Developer ID variables are documented in Docs/RELEASE.md.
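
A quick way to check a tag locally against the three accepted forms is sketched below. The release workflow's actual pattern is not shown in this README, so the function name and the regular expression (including the assumption that X, Y, Z, and N may each be multi-digit numbers) are illustrative only.

```swift
/// Returns true for tags shaped like vX.Y.Z, vX.Y.Z-beta, or vX.Y.Z-rc.N.
/// Assumption: X, Y, Z, and N are one or more decimal digits.
func isPublishableTag(_ tag: String) -> Bool {
    // "v", three dot-separated numbers, then an optional "-beta" or "-rc.N" suffix.
    let pattern = #/v\d+\.\d+\.\d+(-beta|-rc\.\d+)?/#
    return tag.wholeMatch(of: pattern) != nil
}
```

`wholeMatch(of:)` requires the entire string to match, so tags with extra suffixes such as v1.2.3-beta.1 are rejected by this sketch.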

License

PitcherPlant is released under the MIT License. See LICENSE.
