OpenLine

Open-source contact discovery and prospect research powered by Jungle Grid.

OpenLine discovers organizations, investigates public sources, identifies publicly available contact channels, and produces structured prospect data for reviewable outreach workflows.

Powered by Jungle Grid for managed, asynchronous research workloads. OpenLine remains open source and self-hostable.

What is OpenLine?

OpenLine is an open-source, self-hostable prospect intelligence and contact discovery system. Developers can clone it, configure campaigns and sources, and adapt it for their own public research workflows.

OpenLine can collect leads and contact channels from supported public sources, preserve evidence, score fit, produce structured prospect data, and create reviewable outreach drafts. Jungle Grid is used as the execution layer for compute-heavy, asynchronous discovery, scraping, enrichment, extraction, model inference, research, and validation jobs.

Users must supply their own credentials, API keys, source configuration, and Jungle Grid access wherever required. OpenLine should only process public or explicitly authorized information, and operators are responsible for respecting platform terms, robots.txt, privacy rules, rate limits, and applicable laws.

Who it is for

OpenLine is for teams and developers who want a self-hosted research system that keeps evidence, contacts, drafts, approvals, and delivery controls in their own application instead of a black-box outbound platform.

It is designed for:

prospect research from public sources;
evidence-bound contact discovery;
campaign-specific scoring and qualification;
reviewable first-touch outreach drafts;
durable conversation tracking and approval gates;
developers adding new source adapters or contact methods.

What is implemented

Business profile setup and saved campaigns in the UI.
Configurable campaign contracts for offer, ICP, qualification, scoring, messaging, and model choices.
Public-source adapter registry with health, retries, rate limits, source signals, and deterministic fixtures.
Prospect research with public contact provenance and evidence URLs.
Fit scoring, proof-of-value artifacts, and structured JSON exports.
Managed Qwen/Ollama drafting and semantic validation inside the worker workload.
Strict artifact contracts and backend revalidation before local persistence.
Next.js dashboard, route handlers, SQLite persistence, run logs, artifact views, and exports.
Seed prospect import from CSV or JSON with preview validation.
Suppression import from CSV or JSON with preview validation.
Internal draft review, editing, approval/rejection, suppression, blocklist controls, and policy-gated ZeptoMail sending.
Conversation history, inbound reply ingestion, scheduled follow-up evaluation, and approval-required messages.
Docker worker image workflow, CI, release automation, and sample dry-run data.

Supported Sources

config/sources.yaml controls the adapter registry. Adapter hits are reported as source_signals in run_summary.json. A source signal becomes a prospect only when it resolves to an official repository or public official site with acceptable contact provenance.

Fully supported sources

These adapters are implemented and enabled by default, subject to public endpoint availability and local configuration:

Source	Access method	Notes
GitHub	API	`GITHUB_TOKEN` is optional and improves rate limits.
Official websites	Public webpages	Used for public contact method discovery and provenance.
News/RSS	Feeds, APIs, public webpages	Discovery and why-now signals from configured feeds.
Hugging Face	API, public webpages	Models, spaces, and datasets.
Hacker News	API	Discovery signal; posters are not treated as contact targets.
GitLab	API	Project discovery.
npm	API	Package discovery with generic-package filtering.
PyPI	Public search page	Discovery signal from the public search page.
Docker Hub	API	Repository discovery.
Stack Exchange	API	Discovery signal; question authors are not treated as contact targets.
arXiv	API/feed	Research discovery signal.
Job listings	APIs, feeds, public webpages	Includes configured URLs and Remotive API.
Accelerators	APIs, feeds, public webpages	Uses configured public URLs.
Open Collective	Public search page	Funding/community discovery signal.

Experimental or credential-gated sources

These adapters exist in code but are disabled when required credentials are missing or authorization is not configured:

Source	Required credentials	Status
Reddit	`REDDIT_CLIENT_ID`, `REDDIT_CLIENT_SECRET`	Implemented, credential-gated.
YouTube	`YOUTUBE_API_KEY`	Implemented, credential-gated.
Discord	`DISCORD_BOT_TOKEN`	Restricted; authorized bot, visible channels only, no DMs.
Slack	`SLACK_BOT_TOKEN`	Restricted; authorized app, visible channels only, no DMs.
LinkedIn	`LINKEDIN_ENRICHMENT_API_KEY`	Restricted; approved API or user-supplied data only.
Facebook Pages	`META_ACCESS_TOKEN`	Restricted; approved Meta API or user-supplied public page URLs only.
Product Hunt	`PRODUCT_HUNT_TOKEN`	Restricted; approved API and commercial permission required.

Planned sources

No additional planned source adapters are advertised in this README because the repository does not currently contain an implementation contract for them. See docs/source-adapters.md before adding a new source.

High-level workflow

Campaign criteria
  -> source adapter discovery
  -> public page/API/feed retrieval
  -> evidence extraction and normalization
  -> entity resolution and contact provenance checks
  -> qualification and fit scoring
  -> draft generation and semantic validation
  -> backend artifact validation
  -> SQLite persistence, review, export, and optional approved sending

How Jungle Grid is used

OpenLine submits managed workloads through the Jungle Grid REST API from src/providers/junglegrid-workload-provider.ts. The main run is asynchronous: the API route accepts a run, persists it locally, submits a remote job, and the UI polls local run state while the orchestrator polls Jungle Grid.

The production lead workflow submits the worker image and runs:

python /app/workers/outreach/outreach_worker.py \
  --job full-run-qwen \
  --target <count> \
  --output /workspace/artifacts

The template mode uses the same worker with --job full-run-template. Inbound replies and scheduled follow-ups submit a separate conversation-turn-qwen workload that returns conversation_result.json.

The backend handles remote execution by:

estimating the workload before submission;
storing each execution attempt in junglegrid_jobs;
polling queued, starting, running, and terminal phases;
recording remote status, events, logs, artifacts, failure reasons, retry count, and job IDs;
retrying failed or timed-out attempts up to the configured bound;
resuming non-terminal jobs after server restart without submitting duplicates;
downloading required artifacts only after completion;
validating every artifact bundle locally before saving drafts;
allowing operators to cancel active remote jobs.

Required and relevant environment variables include:

Variable	Purpose
`JUNGLEGRID_API_KEY`	Required to submit, poll, cancel, and inspect managed jobs.
`JUNGLEGRID_API_BASE`	API base URL; defaults to `https://api.junglegrid.dev`.
`JUNGLEGRID_DEFAULT_IMAGE`	Worker image submitted for managed runs.
`JUNGLEGRID_DEFAULT_WORKLOAD_TYPE`	Workload type used in job payloads.
`JUNGLEGRID_OPTIMIZE_FOR`	Routing preference for managed jobs.
`JUNGLEGRID_MAXIMUM_ATTEMPTS`	Maximum retry attempts.
`JUNGLEGRID_RETRY_BACKOFF_SECONDS`	Delay between retry attempts.
`JUNGLEGRID_POLL_INTERVAL_MS`	Polling interval for remote status.
`JUNGLEGRID_JOB_TIMEOUT_MS`	Local polling timeout.
`JUNGLEGRID_RESEARCH_BATCH_SIZE`	Research batch size passed in the job contract.
`JUNGLEGRID_SCORING_BATCH_SIZE`	Scoring batch size passed in the job contract.
`JUNGLEGRID_DRAFTING_BATCH_SIZE`	Drafting batch size passed in the job contract.
`JUNGLEGRID_VALIDATION_BATCH_SIZE`	Validation batch size passed in the job contract.
`JUNGLEGRID_REGISTRY_CREDENTIAL_ID`	Optional registry credential for private worker images.
`GITHUB_TOKEN`	Optional source credential for higher GitHub API limits.
`REDDIT_CLIENT_ID`, `REDDIT_CLIENT_SECRET`, `YOUTUBE_API_KEY`	Required only for their credential-gated adapters.
`ZEPTOMAIL_*`	Required only for approved email delivery. ZeptoMail credentials are not passed to the worker.

See docs/architecture.md, docs/worker-image.md, and docs/junglegrid-dogfooding.md for implementation details.

Safety principles

Publicly listed professional contact points only; no guessed or private contacts.
No guessed, leaked, hidden, brokered, or unrelated commit addresses.
Every personalization claim requires public evidence.
Draft bodies are 70-140 words and must include exactly one allowed campaign link.
Semantic validation statuses are send_ready, manual_review_required, regeneration_required, and excluded; fallback drafts are never send_ready.
No attachments, tracking, pixels, or unapproved channel automation.
Policy-autonomous email is allowed only after qualification, provenance, semantic validation, campaign, limit, opt-out, escalation, and managed-execution gates all pass.
Internal drafts are stored only after artifact validation.
ZeptoMail sending is disabled by default and requires a manual dashboard click.
Bulk sending requires approved drafts plus the confirmation phrase SEND APPROVED DRAFTS.
First setup defaults to dry-run.

See docs/safety.md.

Architecture

Next.js dashboard/API
  -> managed workload API
  -> outreach worker
  -> public research + scoring
  -> Qwen/Ollama or deterministic templates inside the managed workload
  -> /workspace/artifacts/*.json
  -> backend validation + persistence
  -> internal draft review database
  -> manually approved ZeptoMail send

ZeptoMail credentials are not available inside the worker. See docs/architecture.md.

Inbound replies and scheduled follow-up evaluations run as separate managed conversation jobs. The backend persists their job IDs, applies deterministic policy gates, and sends only through an approved channel adapter.

Quickstart

npm run setup
npm run dev

Open http://localhost:3000. The UI can be inspected without credentials, but campaign execution requires a working JUNGLEGRID_API_KEY.

Recommended first-run flow inside the app:

Open Settings and save the business profile.
Open Campaigns and create a saved campaign from a preset.
Open Prospects and import seed rows if you already have them.
Import suppressions in Settings if you have existing opt-outs or exclusions.
Configure only the provider credentials you plan to use.
Start a manual run.

npm run setup installs dependencies, prompts for the API key when run in a terminal, verifies API reachability, estimates and submits a one-item template workload, waits for completion, and checks its events, logs, and six artifacts. The verification job uses managed capacity and may incur a small charge. Create an account at junglegrid.dev, issue an API key from the account dashboard, and provide it when setup prompts.

Manual setup

Required for production:

A Jungle Grid API key.
A ZeptoMail send-mail token.
A verified ZeptoMail sender domain/address.

Import formats

Seed prospects CSV

Use headers like:

name,email,company,project,category,websiteUrl,githubUrl,projectDescription
Jane Maintainer,jane@example.com,Acme,Acme Platform,saas,https://acme.example,https://github.com/acme/platform,Durable workflow platform

Seed prospects JSON

[
  {
    "name": "Jane Maintainer",
    "email": "jane@example.com",
    "company": "Acme",
    "project": "Acme Platform",
    "category": "saas",
    "websiteUrl": "https://acme.example"
  }
]

Suppressions CSV

email,domain,reason,source
,no-contact.example,manual suppression,operator_import
blocked@example.com,,customer opt-out,crm_import

Suppressions JSON

[
  {
    "domain": "no-contact.example",
    "reason": "manual suppression",
    "source": "operator_import"
  }
]

A compliant email use case before enabling EMAIL_SEND_MODE=manual_approval_only.

Optional:

A GitHub token for higher public API rate limits.
Source-specific credentials for adapters such as Reddit and YouTube.

No paid hosted model provider credential is required or supported.

Development fixtures

python3 workers/outreach/outreach_worker.py \
  --job full-run-template \
  --target 2 \
  --output ./artifacts \
  --input ./examples/sample-worker-input.json

Direct worker execution is only for deterministic fixture and image testing. It is not production eligible. The legacy outreach:run:local command remains for CLI compatibility but now submits the Qwen workload through the managed pipeline.

Running managed jobs

Set JUNGLEGRID_API_KEY, then:

npm run outreach:run:junglegrid -- --count 17

The backend submits the configured worker image, polls the job, downloads all required artifacts, validates them, and stores local drafts.

For Qwen/Ollama workloads:

npm run outreach:test:junglegrid:qwen
npm run outreach:run:junglegrid:qwen -- --count 17

The first command performs an estimate only. The full run uses qwen2.5:3b by default. Production submissions force LLM_FALLBACK_MODE=disabled: a failed model workload fails or degrades explicitly and never silently switches to a local or external provider. Worker-only fixture tests may exercise fallback reporting, but those outputs are not production eligible.

Campaigns

Campaign files under config/campaigns/ configure the offer, ICP, qualification signals, scoring interpretation, messaging, and per-stage model choices. The manual-run screen can select any valid campaign. The included generic-saas.json example targets release observability rather than AI infrastructure while still executing through the managed worker. See docs/campaign-configuration.md.

ZeptoMail setup

Configure a ZeptoMail send-mail token, verified sender domain, sender address, reply-to address, and a test recipient. ZEPTOMAIL_API_BASE is required and configurable because account regions can differ. Sending remains disabled until EMAIL_SEND_MODE=manual_approval_only and DRY_RUN=false are set intentionally. See docs/zeptomail-setup.md and docs/zeptomail-implementation-notes.md.

Frontend dashboard

The dashboard includes:

Daily progress, latest run, managed job, validation, ZeptoMail, and model status.
Prospect filters, evidence, score breakdown, and blocklist controls.
Research approval/rejection.
Draft editing, validation, approval/rejection, suppression, and manual ZeptoMail sending.
Runs with job IDs, events, logs, artifacts, JSON/CSV export, and retry metadata.
Settings for mode, worker image, model, thresholds, dry-run, and safety limits.
Manual runs with mode, category, target, and live phase polling.
Conversation history, opportunity state, follow-up dates, managed job status, and approval-required messages.

Add release screenshots under docs/screenshots/ and link them here.

CLI commands

Command	Purpose
`npm run dev`	Start the Next.js development server.
`npm run build`	Build the production application.
`npm run start`	Start the production server.
`npm run lint`	Run ESLint.
`npm run typecheck`	Run TypeScript without emitting files.
`npm run test`	Run TypeScript unit and UI tests.
`npm run test:python`	Run worker smoke tests.
`npm run outreach:run:local`	Legacy alias; submit the Qwen pipeline through the managed worker.
`npm run outreach:run:junglegrid`	Run the template worker on Jungle Grid.
`npm run outreach:run:junglegrid:qwen`	Run the Qwen/Ollama worker on Jungle Grid.
`npm run outreach:test:junglegrid:qwen`	Estimate the Qwen worker contract without starting a job.
`npm run outreach:discover`	Legacy alias for the unified managed pipeline.
`npm run outreach:research`	Legacy alias for the unified managed pipeline.
`npm run outreach:score`	Legacy alias for the unified managed pipeline.
`npm run outreach:draft`	Legacy alias for the unified managed pipeline.
`npm run outreach:list`	List stored prospects.
`npm run outreach:status`	Show dashboard totals and recent runs.
`npm run outreach:export -- --format json`	Export prospects as JSON or CSV.
`npm run outreach:zeptomail:test`	Send a controlled ZeptoMail test message to `ZEPTOMAIL_TEST_RECIPIENT`.
`npm run outreach:jg:logs -- --job-id ID`	Print managed job logs.
`npm run outreach:jg:artifacts -- --job-id ID`	List managed job artifacts.

Worker image

./scripts/build-worker-image.sh
./scripts/test-worker-local.sh
./scripts/push-worker-image.sh

The release image is junglegrid/outreach-qwen-worker:latest. Two supported strategies are documented in docs/worker-image.md: pull the model at job start, or publish a larger image with the model preloaded.

Adding a new source or contact method

Source adapters live in workers/outreach/source_adapters.py, with configuration in config/sources.yaml. To add a source:

Add a stable ID to CORE_SOURCES or RESTRICTED_SOURCES.
Declare credentials, access methods, authorization requirements, and permissions in default_capabilities.
Add a _discover_<source_id> method that uses an official API, feed, or permitted public page.
Return SourceCandidate records with source IDs, URLs, publication dates, retrieval methods, and metadata.
Reuse the shared fetch, normalize, retry, throttling, caching, and redacted error behavior.
Add deterministic fixture tests and credential-disabled tests.

Do not add anti-bot bypasses, private-channel collection, guessed contacts, or undocumented endpoints. Contact methods must preserve their public source URL, confidence, listing status, and authorization status.

Execution persistence

Each managed run stores its estimate, remote job ID, workspace and campaign IDs, pipeline stage, execution phase, timestamps, retry count, log cursor, artifacts, workload metadata, and failure reason. Restarting a run with an already-submitted job resumes polling that job instead of submitting a duplicate. Run summaries report executionBackend: "jungle_grid" and keep local and external AI fallback counts at zero.

Active jobs are resumed automatically after a server restart. Failed and timed-out attempts are retried up to the configured bound; each attempt remains visible in the run detail page. Operators can cancel an active remote job from the same page.

For Qwen campaigns, the managed worker also executes batched semantic research, qualification, score explanations, angle selection, draft generation, and semantic draft validation. Production ingestion fails closed if an applicable semantic stage is missing or unsuccessful.

Privacy and retention

Only public or explicitly authorized source data may be collected. Restricted adapters do not process DMs, hidden channels, private groups, member lists, or unapproved profile scraping. Contacts must retain their public source URL and person-project relationship evidence. Source errors redact configured secrets.

Set DATA_RETENTION_DAYS to prune terminal runs, their dependent events and execution records, stale prospects and dependent research/drafts, and old audit logs at application startup. 0 disables automatic pruning. See docs/migration.md for artifact and stored-data compatibility notes.

Releases

The project uses semantic versioning and Keep a Changelog. Planned milestones are v0.1.0 MVP, v0.2.0 worker mode, v0.3.0 Qwen mode, and v1.0.0 stable. See docs/releasing.md.

Contributing

Read CONTRIBUTING.md and CODE_OF_CONDUCT.md. Use dev for active integration and feature branches for pull requests.

Security

Report validation bypasses, token exposure, private contact data, or any email send path privately as described in SECURITY.md.

License

Apache-2.0. It provides permissive reuse plus an explicit patent grant, which is useful for a public infrastructure example intended for commercial and open-source adoption.

Compatibility command

The historical jungle-grid-leads Python command remains available as a compatibility adapter. Its former independent local stages were removed; each legacy stage name submits the same OpenLine managed pipeline used by the web application and TypeScript CLI.

Name		Name	Last commit message	Last commit date
Latest commit History 30 Commits
.github		.github
app		app
apps		apps
components		components
config		config
data		data
docs		docs
examples		examples
jungle_grid_leads		jungle_grid_leads
packages/shared		packages/shared
prompts		prompts
public		public
scripts		scripts
src		src
test		test
workers/outreach		workers/outreach
.dockerignore		.dockerignore
.env.example		.env.example
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
Dockerfile		Dockerfile
Dockerfile.worker		Dockerfile.worker
LICENSE		LICENSE
README.md		README.md
SECURITY.md		SECURITY.md
docker-compose.yml		docker-compose.yml
eslint.config.mjs		eslint.config.mjs
instrumentation.ts		instrumentation.ts
next-env.d.ts		next-env.d.ts
next.config.ts		next.config.ts
package-lock.json		package-lock.json
package.json		package.json
postcss.config.mjs		postcss.config.mjs
pyproject.toml		pyproject.toml
tailwind.config.ts		tailwind.config.ts
tsconfig.json		tsconfig.json
vitest.config.ts		vitest.config.ts

Folders and files

Latest commit

History

Repository files navigation

OpenLine

What is OpenLine?

Who it is for

What is implemented

Supported Sources

Fully supported sources

Experimental or credential-gated sources

Planned sources

High-level workflow

How Jungle Grid is used

Safety principles

Architecture

Quickstart

Manual setup

Import formats

Seed prospects CSV

Seed prospects JSON

Suppressions CSV

Suppressions JSON

Development fixtures

Running managed jobs

Campaigns

ZeptoMail setup

Frontend dashboard

CLI commands

Worker image

Adding a new source or contact method

Execution persistence

Privacy and retention

Releases

Contributing

Security

License

Compatibility command

About

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages