Fix angie ssl-listen pick, expose cert SAN domains #14
Merged
- Auto-discovery for angie api/stub_status now prefers non-ssl
listen when a server block has both ssl and non-ssl. Picking an
ssl-only listen (e.g. `listen 443 ssl`) sent plain http to an
HTTPS port and got HTTP 400 — same URL also broke the ACME
collector. When only ssl listen exists, log a warning at
startup so the misconfig is visible before scrape errors land
- Expose SAN domains as topsrv_ssl_certificate_san_info{path,
domain}: one info series per DNS name in CN ∪ SANs (dedup, CN
first). Multi-host certs (typical for angie ACME, which packs
subdomains into one .pem) are no longer reduced to the CN in
dashboards. expiry metric stays one series per cert — its value
(NotAfter) isn't duplicated across N domain rows. Convention
follows blackbox_exporter's probe_ssl_last_chain_info
- Tests: ssl-flag among other listen params (`listen 443 ssl
http2`), ssl-first listen order, SAN dedup with CN-in-DNSNames,
pathological no-CN-no-SAN cert (san_info suppressed, expiry
still visible)
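The listen-selection rule described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `listenAddr`, `parseListen`, and `pickListen` are hypothetical names, the warning text is invented for the example, and the parsing only handles bare ports and `host:port` forms (no `unix:` sockets).

```go
package main

import (
	"fmt"
	"strings"
)

// listenAddr is a hypothetical parsed form of one `listen` directive.
type listenAddr struct {
	Port string
	SSL  bool
}

// parseListen parses the arguments of a `listen` directive, for example
// "443 ssl http2" or "127.0.0.1:80". The ssl flag may appear anywhere
// after the address.
func parseListen(args string) (listenAddr, bool) {
	fields := strings.Fields(strings.TrimSuffix(strings.TrimSpace(args), ";"))
	if len(fields) == 0 {
		return listenAddr{}, false
	}
	port := fields[0]
	if i := strings.LastIndex(port, ":"); i >= 0 {
		port = port[i+1:]
	}
	l := listenAddr{Port: port}
	for _, f := range fields[1:] {
		if f == "ssl" {
			l.SSL = true
		}
	}
	return l, true
}

// pickListen returns the first non-ssl listen regardless of declaration
// order. If only ssl listens exist, it returns the first of them and sets
// sslOnly so the caller can log a warning.
func pickListen(directives []string) (chosen listenAddr, sslOnly bool, ok bool) {
	var fallback listenAddr
	haveFallback := false
	for _, d := range directives {
		l, valid := parseListen(d)
		if !valid {
			continue
		}
		if !l.SSL {
			return l, false, true
		}
		if !haveFallback {
			fallback, haveFallback = l, true
		}
	}
	if haveFallback {
		return fallback, true, true
	}
	return listenAddr{}, false, false
}

func main() {
	blocks := [][]string{
		{"80", "443 ssl"},
		{"443 ssl", "80"},
		{"443 ssl http2"},
	}
	for _, b := range blocks {
		l, sslOnly, ok := pickListen(b)
		if !ok {
			continue
		}
		if sslOnly {
			// Hypothetical warning text; the real log line may differ.
			fmt.Printf("warning: only ssl listen found, plain http to port %s will likely fail\n", l.Port)
		}
		fmt.Printf("http://127.0.0.1:%s/status/\n", l.Port)
	}
}
```

Returning early on the first non-ssl listen is what makes the result independent of declaration order; the ssl listen is only kept as a fallback.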
Replace real customer hostnames with example.com placeholders in TestSSLCollectorSANInfo — same assertion, no behavior change.
Summary
Two related fixes to angie discovery and SSL reporting, surfaced by production logs showing `angie: API returned error status status=400` and a UI listing only 2 domains for a host with many access logs.
- When a server block has both `listen 80;` and `listen 443 ssl;`, `findDirective` used to overwrite the port with the last `listen` seen, ending up with `http://127.0.0.1:443/status/`, which makes angie return HTTP 400 («plain HTTP to HTTPS port»). Both the api collector and the ACME collector (which reuses the same URL) failed. Now a non-ssl listen always wins; an ssl-only listen is still recorded (better than port 0) but logs a warning at startup so operators see the misconfig before scrape errors land.
- Angie ACME packs subdomains into one `.pem` via SAN. The collector previously emitted only `Subject.CommonName`, so the UI showed the primary CN but missed every other SAN entry. The new info-style metric `topsrv_ssl_certificate_san_info{path,domain}` enumerates every DNS name in `CN ∪ SANs` (dedup, CN first). `topsrv_ssl_certificate_expiry_seconds` stays one series per cert file, so `NotAfter` isn't duplicated across N rows. The convention follows blackbox_exporter's `probe_ssl_last_chain_info`. Operators join by `path` to get «days until expiry per served domain».
Docs: `README.md`, `docs/metrics.md`, and `docs/promql-recipes.md` updated with the new metric and a join example.
Test plan
- `make fmt lint`: 0 issues
- `GOEXPERIMENT=jsonv2 go test ./...`: full suite green
- `TestDiscoverAngieEdgeCases/multiple_listen_—_non-ssl_wins_over_ssl`
- `TestDiscoverAngieEdgeCases/ssl_listen_declared_first_—_non-ssl_still_wins`
- `TestDiscoverAngieEdgeCases/ssl_flag_among_other_listen_params_(http2)`
- `TestSSLCollectorSANInfo`: 3-domain SAN (example.com placeholders), CN-in-DNSNames dedup
- `TestSSLCollectorNoCNorSAN`: pathological cert, `san_info` suppressed, expiry still visible
- `angie: auto-detected API url=http://127.0.0.1:80/status/` (not 443), no more HTTP 400 from collectors, and the SSL Certificates UI lists all SAN domains for every multi-host ACME cert