
topsrv

Server monitoring agent. A single binary that replaces node_exporter + postgres_exporter + process-exporter + nginx-exporter, with auto-discovery and Prometheus-compatible metrics.

Install

curl -fsSL https://topsrv.io/install.sh | TOPSRV_TOKEN=xxx sudo -E bash

Options via environment variables:

| Variable | Default | Description |
|---|---|---|
| TOPSRV_TOKEN | — | Push token |
| TOPSRV_ENDPOINT | https://push.topsrv.io/v1/write | Push endpoint |
| VERSION | latest | Specific version to install |
| INSTALL_DIR | /usr/local/bin | Binary install path |

Manual install:

# Download from https://github.com/vmkteam/topsrv/releases
tar -xzf topsrv_*_linux_amd64.tar.gz
sudo install -m 0755 topsrv /usr/local/bin/

Quick start

Run:

# Scrape mode — Prometheus pulls /metrics
topsrv -config /etc/topsrv/topsrv.toml -verbose

curl localhost:9100/metrics

Minimal config (/etc/topsrv/topsrv.toml):

[Server]
Listen = ":9100"

That's it. System, disk, network, netstat, process, and S.M.A.R.T. metrics are collected automatically. PostgreSQL and Nginx are auto-discovered if running.
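
To keep the agent running across reboots you can run it under systemd — a minimal sketch; the install script may already register a service, and the unit name, paths, and restart policy here are illustrative:

```shell
# Write a minimal unit file (illustrative — adjust paths to your install).
cat > topsrv.service <<'EOF'
[Unit]
Description=topsrv monitoring agent
After=network-online.target
Wants=network-online.target

[Service]
ExecStart=/usr/local/bin/topsrv -config /etc/topsrv/topsrv.toml
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF
grep ExecStart topsrv.service
```

Copy it to /etc/systemd/system/topsrv.service, then `systemctl daemon-reload && systemctl enable --now topsrv`.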

Features

| Collector | Metrics | Replaces |
|---|---|---|
| System | CPU per-core, memory, load, swap, uptime, host info, context switches | node_exporter |
| Disk | IO (read/write bytes/ops/time), filesystem (space/inodes) | node_exporter |
| Network | IO per interface, interface info (MAC/IP/MTU/status) | node_exporter |
| Netstat | TCP connections by state/direction/port and remote-peer scope (loopback/private/public), TCP+UDP listening ports by scope and owning process, TCP retransmits, UDP/IP errors | node_exporter |
| Process | CPU, memory, disk IO, threads, FDs, worst_fd_ratio per process group | process-exporter |
| PostgreSQL | Connections (by state/addr/app), transactions, longest transaction age, checkpoints, bgwriter, locks, replication, WAL, wraparound, pg_stat_statements, tables (top 50) | postgres_exporter |
| Nginx | stub_status, access log parsing (text & JSON log_format, response time histogram, status codes, cache, 4xx/5xx URIs, bytes by URI) | nginx-exporter + mtail |
| Angie | JSON API (server zones, upstreams, SSL, caches, rate limiting, slabs) + access log parsing | — |
| S.M.A.R.T. | Disk health (ATA attributes, NVMe health log, temperature, wear, errors) | smartctl_exporter |
| SSL Certificates | Certificate expiry monitoring (auto-discovered from nginx/angie config) | — |
| Bot-logs (opt-in) | Ships UA-classified bot events from nginx access logs to topsrv.io as gzipped ndjson with a disk-backed WAL. 38 families: global/RU/Asian search, AI 2026 crawlers, SEO tools, social link previews, archive | — |
| Packages | Installed-package inventory: dpkg/rpm/apk parsed in pure Go (no shell-out). Aggregates on /metrics; full snapshot (NEVRA, vendor, GPG key, signature digest, modularityLabel, autoInstalled, repoOrigin, licenses) pushed to /v1/inventory for CVE matching | apt-prom-exporter / pkg-exporter |

Auto-discovery

On startup topsrv scans running processes and detects known services — no configuration needed:

| Process | Type | Action |
|---|---|---|
| postgres / postmaster | postgresql | Auto-connects as topsrv user (password derived from Push.Token); falls back to hint if no token |
| nginx | nginx | Parses nginx.conf → finds log_format + access_log |
| angie | angie | Parses angie.conf → finds api /status/, log_format + access_log |
| redis-server | redis | Detected |
| pgbouncer | pgbouncer | Detected |
| php-fpm | php-fpm | Detected |

For Nginx, auto-discovery parses nginx.conf including include directives, extracts log_format and access_log paths with $request_time. Both text and JSON (escape=json) log formats are auto-detected.
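
For reference, a format that auto-discovery can pick up looks like the following — a sketch based on the default LogFormat shown in the configuration reference; the format name and log path are illustrative:

```nginx
# $request_time is what enables the response-time histogram.
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                '$status $body_bytes_sent "$http_referer" '
                '"$http_user_agent" $request_time $upstream_response_time';
access_log /var/log/nginx/access.log main;
```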

Configuration

Operating modes

Pull mode (default) — Prometheus scrapes /metrics:

[Server]
Listen = ":9100"
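
On the Prometheus side this is an ordinary scrape job — a sketch; the job name and target host are illustrative:

```yaml
# prometheus.yml fragment
scrape_configs:
  - job_name: topsrv
    static_configs:
      - targets: ["server1.example.com:9100"]
```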

Push mode — agent pushes metrics to topsrv.io, VictoriaMetrics, or any compatible remote-write endpoint:

[Push]
Endpoint = "https://push.topsrv.io/v1/write"   # or your VictoriaMetrics URL
Token    = "ts_xxx"         # get your token at https://topsrv.io
Interval = "30s"
SpoolDir = "/var/lib/topsrv/spool"   # disk buffer for retries on network failure

Push-only mode — no HTTP server, only push. Omit [Server] or set Listen = "".

Both modes can work simultaneously.
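
A combined configuration for running both modes at once might look like this (token and spool path are illustrative):

```toml
[Server]
Listen = ":9100"                                 # Prometheus can still scrape /metrics

[Push]
Endpoint = "https://push.topsrv.io/v1/write"
Token    = "ts_xxx"
Interval = "30s"
SpoolDir = "/var/lib/topsrv/spool"               # buffer pushes across network failures
```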

Full reference

[Server]
Listen = ":9100"            # Prometheus /metrics endpoint

[Push]
Endpoint = ""               # Push URL
Token    = ""               # Bearer token for push auth
Interval = "30s"            # Push interval
SpoolDir = ""               # Disk buffer for retries

[Update]
Enabled  = false            # Auto-update via control plane
Interval = "15m"            # Check interval
Channel  = "stable"         # stable / beta

# PostgreSQL (optional — auto-discovery detects the process, DSN needed for connection)
# [Postgres]
# DSN      = "postgres://topsrv:pass@localhost:5432/postgres?sslmode=disable"
# Disabled = true    # set to skip PG monitoring even if discovery finds postgres

# Nginx (optional — auto-discovery parses nginx.conf automatically)
# [Nginx]
# StubStatusURL = "http://127.0.0.1/stub_status"
# LogFormat     = '$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" $request_time $upstream_response_time'
# ExtraLabels   = ["server_name"]
# AccessLogs    = ["/var/log/nginx/access.log"]

# Angie (optional — auto-discovery parses angie.conf automatically)
# [Angie]
# StatusURL     = "http://127.0.0.1:8080/status/"
# StubStatusURL = "http://127.0.0.1/stub_status"
# LogFormat     = '$remote_addr ...'
# ExtraLabels   = ["server_name"]
# AccessLogs    = ["/var/log/angie/access.log"]

# S.M.A.R.T. disk health (always enabled, requires CAP_SYS_RAWIO + CAP_SYS_ADMIN)
# [Smart]
# Disabled = true    # set to disable
# Interval = "5m"

# Package inventory (optional — auto-discovery detects dpkg/rpm/apk via files).
# On /metrics only aggregates are exposed (counts, scan duration, error counter).
# Full per-package snapshot is pushed to /v1/inventory with kind="packages".
# [Packages]
# Disabled      = false      # set to fully skip the collector
# Interval      = "6h"       # snapshot scan period (±10% jitter to avoid herds)
# Managers      = []         # auto-detect by default; e.g. ["dpkg","rpm"] to force a subset
# DisablePush   = false      # set to skip POSTing snapshots to /v1/inventory (keep /metrics only)
# MaxPackages   = 10000      # safety cap; logs a warning and truncates if exceeded

# Bot-logs (optional — ships UA-classified bot events to topsrv.io for analytics)
# [BotLogs]
# Enabled       = true
# Token         = ""        # required — issued by topsrv.io per project
# Endpoint      = ""        # default: [Push].Endpoint with /v1/bot-logs path
# BatchSize     = 5000      # events per batch
# BatchInterval = "30s"     # flush interval
# SpoolDir      = ""        # default: [Push].SpoolDir; a "botlog/" subdir is created inside
# MaxSpoolMB    = 200       # WAL disk budget
# UATruncate    = 1024      # truncate user-agent at this length
# URITruncate   = 2048      # truncate request URI at this length
# ExtraUAPatterns = ["MyCustomCrawler/"]  # local additions to the bot list
# Field aliases — only set when discovery cannot infer the right name
# (e.g. operator-defined `set $custom $http_referer;`). Empty falls back.
# [BotLogs.FieldAliases]
# Referer = "ref"

| Parameter | Default | Description |
|---|---|---|
| Server.Listen | :9100 | HTTP listen address for /metrics |
| Push.Endpoint | — | Remote write URL |
| Push.Token | — | Bearer token |
| Push.Interval | 30s | Push frequency |
| Push.SpoolDir | — | Disk spool path (retries on network failure) |
| Update.Enabled | false | Enable auto-update |
| Update.Interval | 15m | Update check interval |
| Update.Channel | stable | Update channel (stable/beta) |
| Postgres.DSN | — | PostgreSQL connection string |
| Postgres.Disabled | false | Skip PostgreSQL monitoring even if discovery finds a local postgres process |
| Nginx.StubStatusURL | — | nginx stub_status URL |
| Nginx.LogFormat | — | nginx log_format string (gonx format) |
| Nginx.ExtraLabels | [] | Log fields to add as metric labels |
| Nginx.AccessLogs | [] | Paths to access log files |
| Angie.StatusURL | — | Angie JSON API URL (e.g. http://127.0.0.1:8080/status/) |
| Angie.StubStatusURL | — | Fallback stub_status URL |
| Angie.LogFormat | — | angie log_format string (gonx format) |
| Angie.ExtraLabels | [] | Log fields to add as metric labels |
| Angie.AccessLogs | [] | Paths to access log files |
| Smart.Disabled | false | Disable S.M.A.R.T. collector |
| Smart.Interval | 5m | S.M.A.R.T. polling interval |
| Packages.Disabled | false | Disable package inventory collector |
| Packages.Interval | 6h | Inventory scan period (±10% jitter); pushed to /v1/inventory |
| Packages.Managers | [] | Force subset of managers (dpkg/rpm/apk); empty = auto-detect |
| Packages.DisablePush | false | Skip POSTing snapshots to /v1/inventory (keeps /metrics aggregates) |
| Packages.MaxPackages | 10000 | Truncate snapshot if exceeded (logs a warning) |
| BotLogs.Enabled | false | Ship UA-classified bot events to topsrv.io |
| BotLogs.Token | — | Bot-logs ingest bearer token (separate from Push.Token) |
| BotLogs.Endpoint | derived | Ingest URL; defaults to [Push].Endpoint with /v1/bot-logs path |
| BotLogs.BatchSize | 5000 | Events per batch |
| BotLogs.BatchInterval | 30s | Flush interval |
| BotLogs.SpoolDir | derived | Parent dir for WAL spool; a botlog/ subdir is created inside. Defaults to [Push].SpoolDir |
| BotLogs.MaxSpoolMB | 200 | Disk budget for the spool subdir |
| BotLogs.UATruncate | 1024 | Max UA length per event |
| BotLogs.URITruncate | 2048 | Max URI length per event |
| BotLogs.ExtraUAPatterns | [] | Local additions to known-bots UA patterns |
| BotLogs.FieldAliases.* | auto | Per-format field-name overrides (UserAgent/Host/ServerName/RemoteAddr/Referer). Discovery auto-detects from log_format; only set when the operator uses non-standard variables in the nginx config |

Environment variables

All config flags can be set via environment variables with TOPSRV_ prefix:

TOPSRV_CONFIG=/etc/topsrv/topsrv.toml topsrv

PostgreSQL

Supports PG15+ (version-gated features: pg_stat_wal PG14+, pg_stat_statements.toplevel PG14+, pg_stat_checkpointer PG17, shared_blk_*_time PG17).

Coverage:

  • Connections — by state, by client_addr, by application_name, max
  • Transactions — commit/rollback, deadlocks, temp files/bytes per database, longest transaction age
  • Checkpoints & bgwriter — timed/requested, checkpoint time, buffers (checkpoint/bgwriter/backend)
  • Autovacuum — common vs wraparound workers, max_workers
  • Locks — by mode + granted label, blocked_backends (via pg_blocking_pids()), max lock wait duration
  • Wait events — sampled from pg_stat_activity (backend_type, datname, application_name, wait_event_type, wait_event, state) — pganalyze/APM-style "why is it slow right now"
  • Replication — lag bytes, lag seconds by stage (write/flush/replay), sync_state, slots (retained bytes, active/inactive)
  • WAL — LSN position, files count via pg_ls_waldir(), plus pg_stat_wal (records, FPI, buffers_full, write/sync time) on PG14+
  • Archiver — pg_stat_archiver with result=archived|failed, timestamp pattern (time() - last_timestamp_seconds for age)
  • Wraparound — xid_age vs autovacuum_freeze_max_age, per-database + cluster max
  • Database sizes — per-database byte size
  • Tables (top 50 by size) — seq/idx scans, tuple ops, dead tuples, autovacuum count, last_maintenance_timestamp{op=vacuum|analyze}, mod_since_analyze — with database, schema, table labels
  • Indexes (top 50) — scans_total, size_bytes — with database, schema, table, index labels
  • Bloat estimation (ioguix heuristic, refreshed every 15 min) — top 50 tables + top 50 btree indexes by wasted bytes. Catalog-only (never reads heap pages) — safe on multi-TB databases. Points at tables for pg_repack / CLUSTER and indexes for REINDEX CONCURRENTLY
  • pg_stat_statements — union of top 20 by time, calls, blocks read, blocks dirtied (DML pressure), and WAL bytes (PG13+, if available) — ~80–100 unique queries covering both read-heavy and write-heavy workloads; outlier-aware duration histogram; full query text pushed separately to control plane (/v1/meta)
  • Settings — curated GUCs (shared_buffers, work_mem, max_connections etc.) normalized to bytes/seconds
  • Stats reset timestamps — pg_stat_database, pg_stat_bgwriter, pg_stat_wal (PG17), pg_stat_archiver — for detecting unexpected resets

topsrv connects with application_name=topsrv (visible in pg_stat_activity) and selects the largest database on startup for per-DB views (pg_stat_user_tables/pg_stat_user_indexes).

Setup

1. Create a monitoring role. The built-in pg_monitor role (PG10+) grants read access to all statistics views and functions including pg_ls_waldir() — no custom functions or schemas needed:

sudo -u postgres psql -d postgres

CREATE ROLE topsrv LOGIN PASSWORD 'CHANGE_ME';
GRANT pg_monitor TO topsrv;

2. Allow connections in pg_hba.conf:

# local agent (typical setup)
host all topsrv 127.0.0.1/32 scram-sha-256
local all topsrv scram-sha-256

Reload config:

SELECT pg_reload_conf();

3. Configure topsrv — add DSN to config:

[Postgres]
DSN = "postgres://topsrv:CHANGE_ME@localhost:5432/postgres?sslmode=disable"

4. (Optional) Enable pg_stat_statements for query-level metrics (top 20 queries, duration histogram):

Add to postgresql.conf:

shared_preload_libraries = 'pg_stat_statements'   # requires restart
pg_stat_statements.max = 500
pg_stat_statements.track = top
track_io_timing = on                               # recommended for block read/write time

Restart PostgreSQL, then create the extension in the same database that topsrv connects to (the one specified in Postgres.DSN):

-- connect to the database from DSN (e.g. postgres)
\c postgres
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
GRANT SELECT ON pg_stat_statements TO topsrv;

Note: pg_stat_statements view is only visible in the database where the extension is created. If your DSN points to mydb, run CREATE EXTENSION in mydb, not in postgres. The topsrv role also needs an explicit SELECT grant on the view — pg_monitor alone is not sufficient.

If pg_stat_statements is not installed, topsrv silently skips query metrics — everything else works.

Behaviour when PG is unreachable

NewCollector is lazy — the connection pool is created on first Collect(). If PostgreSQL is down at topsrv startup (e.g. systemd boot ordering), topsrv_pg_up will report 0 and the non-PG collectors continue working. The next scrape retries the connection automatically; no topsrv restart is needed once PG comes back.

To skip PG monitoring entirely (even when auto-discovery finds a local process), set Postgres.Disabled = true in the config.

Nginx

Two collectors:

stub_status — connections, requests, nginx_up (0/1).

access log — tail + parsing via configurable LogFormat. Supports both text (gonx) and JSON (escape=json) formats:

  • topsrv_nginx_request_duration_seconds — histogram (buckets: 5ms…10s)
  • topsrv_nginx_upstream_duration_seconds — upstream histogram
  • topsrv_nginx_http_requests_total{status} — by status code
  • topsrv_nginx_cache_requests_total{status} — HIT/MISS/EXPIRED
  • topsrv_nginx_5xx_requests_total{status,uri} — 5xx with URL normalization (/users/123 → /users/:id)
  • topsrv_nginx_4xx_requests_total{status,uri} — 4xx with URL normalization
  • topsrv_nginx_response_bytes_total — total response bytes
  • topsrv_nginx_response_bytes_by_uri_total{uri} — response bytes by normalized URI
  • Custom labels — ExtraLabels adds log fields as metric labels (server_name, http_platform, etc.)
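
The URL normalization used for the 4xx/5xx and bytes-by-URI metrics can be approximated like this — illustrative only; the real normalization is internal to topsrv, and this sed one-liner only mimics the numeric-segment case:

```shell
# Collapse numeric path segments to :id (rough imitation of topsrv's normalization).
echo "/users/123/orders/456" | sed -E 's#/[0-9]+#/:id#g'
# → /users/:id/orders/:id
```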

SSL certificates — auto-discovered ssl_certificate paths from config. topsrv_ssl_certificate_expiry_seconds{path,cn,issuer} exposes NotAfter as Unix timestamp (one series per cert file); topsrv_ssl_certificate_san_info{path,domain} enumerates every DNS name in CN ∪ SANs so multi-host (SAN) certs aren't reduced to their CN in dashboards. Re-reads every 5 minutes to detect Let's Encrypt renewals.
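
What the expiry metric encodes is the certificate's NotAfter field, exposed as a Unix timestamp. You can inspect the same field by hand with openssl — a sketch using a throwaway self-signed cert (in a real setup the paths come from the discovered nginx/angie config):

```shell
# Generate a short-lived self-signed cert just to have something to inspect.
openssl req -x509 -newkey rsa:2048 -nodes -days 30 -subj "/CN=example.com" \
  -keyout /tmp/topsrv-demo.key -out /tmp/topsrv-demo.crt 2>/dev/null
# Print the NotAfter date and subject that the metric labels (cn, issuer) are built from.
openssl x509 -in /tmp/topsrv-demo.crt -noout -enddate -subject
```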

Angie

Angie (nginx fork) is supported with a dedicated JSON API collector providing detailed per-zone, per-upstream, SSL, cache, and rate limiting metrics — features not available in nginx free.

Three collectors:

JSON API (/status/) — connections, server zones, upstreams (per-peer state, health, requests), caches, rate limiting, shared memory slabs. Requires api /status/; directive in Angie config.

stub_status — fallback when API is not configured (same 7 metrics as nginx).

access log — same as nginx: request duration histogram, status codes, cache status, 4xx/5xx by URI, bytes by URI.

Auto-discovery parses angie.conf and detects both api /status/; and stub_status directives. If Angie API is available, it takes priority over stub_status.

Angie config for monitoring

http {
    server {
        listen 80;
        server_name example.com;
        status_zone http_main;        # enable per-zone metrics

        location / {
            proxy_pass http://backend;
        }
    }

    upstream backend {
        zone backend_zone 64k;        # required for upstream metrics
        server 10.0.0.1:8080;
    }

    server {
        listen 127.0.0.1:8080;
        location /status/ {
            api /status/;
            allow 127.0.0.1;
            deny all;
        }
    }
}

Bot-logs (optional)

When [BotLogs].Enabled = true, every parsed nginx access-log line is matched against a built-in UA fingerprint table (38 families, 100+ patterns) covering:

  • Search engines — Googlebot, Bingbot, DuckDuckBot, Applebot, YandexBot (10 subtypes), Mail.RU_Bot, StackRambler, SputnikBot, Baiduspider, Sogou, 360Spider, PetalBot, Naver Yeti, Daum
  • AI / LLM — GPTBot / OAI-SearchBot / ChatGPT-User, ClaudeBot / Claude-User / Claude-SearchBot, PerplexityBot / Perplexity-User, MistralAI-User, cohere-ai, Amazonbot, Diffbot, AI2Bot, YouBot, Timpibot, Meta-ExternalAgent / Fetcher, Bytespider, TikTokSpider, CCBot
  • SEO crawlers — AhrefsBot, SemrushBot (+ subtypes), MJ12bot, DotBot, rogerbot, BLEXBot, DataForSeoBot
  • Social / messengers — Twitterbot, LinkedInBot, Pinterestbot, Slackbot, Discordbot, TelegramBot, WhatsApp, vkShare, SkypeUriPreview
  • Archive — ia_archiver, archive.org_bot

Matched events become JSON records with botFamily / botName / serverName / remoteAddr / uri / status / timing, are batched into gzipped ndjson, and POSTed to the topsrv.io /v1/bot-logs endpoint. On transient send failures the batch lands in SpoolDir/botlog/ for replay on the next interval; permanent rejections (HTTP 4xx, except 408/429) are discarded without retry to avoid corrupted batches blocking the queue.
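
The wire format can be sketched as follows — field names are taken from the description above, while the values and file name are invented:

```shell
# One JSON object per line (ndjson), gzip-compressed — the framing used for bot-log batches.
printf '%s\n' '{"botFamily":"search","botName":"Googlebot","serverName":"example.com","remoteAddr":"66.249.66.1","uri":"/blog/post-1","status":200,"timing":0.012}' > batch.ndjson
gzip -f batch.ndjson          # produces batch.ndjson.gz, the payload that gets POSTed
gunzip -c batch.ndjson.gz     # round-trips to the original line
```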

Enabling [BotLogs] extends nginx auto-discovery to log_formats without $request_time so bot events from API/auxiliary vhosts flow through too — timing histograms are simply skipped per-line for those. Installs without [BotLogs] keep the original behavior.

Required log_format variables: $http_user_agent, $host / $server_name, $remote_addr, $http_referer. Field names are auto-detected from each tailed log_format — common alternates work out of the box (http_host, realip_remote_addr, http_x_real_ip, http_x_forwarded_for, the common http_referrer misspelling, custom JSON keys). The resolution and its provenance (override / detected / default) are emitted as a "botlog: resolved field aliases" log line at startup. When discovery cannot infer the right name (e.g. an operator-defined set $custom $http_referer;), override it via [BotLogs.FieldAliases]. If a format genuinely lacks a UA field, a botlog_no_ua_field warning is raised and topsrv_collector_config_warnings_total{kind=...} ticks. Operator-supplied [Nginx].ExtraLabels are kept as metric labels; required botlog fields are unioned into ExtractFields for parsing.

Metrics: topsrv_botlog_events_total{state=enqueued|sent|spooled|dropped}, topsrv_botlog_match_total{family}, topsrv_botlog_send_errors_total{kind}, topsrv_botlog_batch_duration_seconds, topsrv_botlog_spool_files, topsrv_botlog_spool_bytes.

Metrics reference

Full list of all metrics: docs/metrics.md

Development

Requires GOEXPERIMENT=jsonv2 (set automatically by Makefile and goreleaser).

make build              # Build binary
make test               # Unit tests
make test-integration   # Integration tests (requires Docker)
make fmt                # Format code (golangci-lint fmt)
make lint               # Lint (golangci-lint run)
make run                # Run with local config
make demo               # Start demo (VictoriaMetrics + topsrv)

From source:

git clone https://github.com/vmkteam/topsrv.git
cd topsrv
make build
sudo install -m 0755 bin/topsrv /usr/local/bin/

License

Apache-2.0
