auth-log-scraper


Search Linux auth logs for usernames that look like accidentally typed passwords.


A Rust CLI tool that scrapes Linux authentication log files for usernames, flagging any that appear to be accidentally entered passwords (e.g., when a user types their password in the username field during SSH login).

How It Works

  1. Discovers log files recursively in a list of common default log directories (such as /var/log) or user-specified paths. Gzipped (.gz) log files are transparently decompressed. Binary files are automatically skipped.
  2. Parses log lines using more than 100 pluggable, service-specific parsers plus a generic catch-all parser. Files are scanned in parallel using rayon.
  3. Classifies each extracted value as a valid username or a potential password.
  4. Extracts timestamps from log lines when available (ISO 8601, Apache/Nginx CLF, syslog BSD, and simple datetime formats).
  5. Deduplicates results by (value, source file). When both a specific parser and the generic parser match the same line, the specific parser's finding is preferred.
  6. Outputs results in human-readable text (with color), machine-parsable JSON, or CSV — or a compact summary with --summary.
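As a sketch of step 5 (not the tool's actual code; the `Finding` type, its field names, and `dedup` are illustrative assumptions), deduplication can be modeled as keeping one finding per (value, source file) key, letting a specific parser's finding replace the generic catch-all's:

```rust
use std::collections::HashMap;

// Illustrative finding type; field names are assumptions.
#[derive(Clone, Debug, PartialEq)]
struct Finding {
    value: String,
    source_file: String,
    parser: String,
}

// Keep one finding per (value, source_file) pair; when both a specific
// parser and the generic parser matched, prefer the specific one.
fn dedup(findings: Vec<Finding>) -> Vec<Finding> {
    let mut seen: HashMap<(String, String), Finding> = HashMap::new();
    for f in findings {
        let key = (f.value.clone(), f.source_file.clone());
        match seen.get(&key) {
            // An existing non-generic finding wins; drop the duplicate.
            Some(existing) if existing.parser != "generic" => {}
            // Otherwise insert, replacing a generic finding if present.
            _ => {
                seen.insert(key, f);
            }
        }
    }
    seen.into_values().collect()
}
```

With this model, a line matched by both the `ssh` parser and the generic parser yields a single finding attributed to `ssh`, regardless of which parser ran first.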

Password Detection

A value is flagged as a potential password if it violates common username rules:

  • Contains spaces or special characters (e.g. @, !, #)
  • Starts with a digit or special character
  • Is longer than 32 characters

By default, uppercase letters are allowed since many applications (databases, web apps, LDAP) use mixed-case usernames. Use --strict-usernames to treat any uppercase as a potential password (matches Linux useradd rules).
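The rules above can be sketched as a single predicate. This is a simplification, not the tool's actual classifier: the function name, the exact set of allowed characters, and the `strict` parameter (standing in for --strict-usernames) are all assumptions for illustration:

```rust
// Sketch of the password-detection rules described above.
// Assumed allowed username characters: ASCII alphanumerics, '_', '-', '.'.
fn is_potential_password(value: &str, strict: bool) -> bool {
    let allowed = |c: char| c.is_ascii_alphanumeric() || matches!(c, '_' | '-' | '.');
    let first = match value.chars().next() {
        Some(c) => c,
        None => return false,
    };
    value.chars().count() > 32                  // longer than 32 characters
        || value.chars().any(|c| !allowed(c))   // spaces or special characters
        || first.is_ascii_digit()               // starts with a digit
        || !allowed(first)                      // starts with a special character
        || (strict && value.chars().any(|c| c.is_ascii_uppercase()))
}
```

Under this sketch, "P@ssw0rd!" is flagged (special characters), "root" is not, and "Admin" is flagged only when strict mode is enabled.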

Limitations

Most authentication services (SSH, PAM, sudo, etc.) log usernames as unquoted, whitespace-delimited tokens. If a user accidentally types a password that contains spaces, only the first word will be captured — the log format itself is ambiguous and there is no reliable way to recover the full value. Services that quote or bracket usernames in their logs (e.g. MySQL, Neo4j, ClickHouse, Elasticsearch, CouchDB) are not affected by this limitation.
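To see why only the first word survives, consider how a whitespace-delimited format is tokenized. This is a minimal sketch under an assumed sshd-style message layout ("… invalid user <token> from …"); `extract_user_token` is a hypothetical helper, not part of the tool:

```rust
// Hypothetical extractor for an unquoted, whitespace-delimited log line.
// Everything after the first whitespace in the typed value is lost.
fn extract_user_token(line: &str) -> Option<&str> {
    let mut words = line.split_whitespace();
    while let Some(w) = words.next() {
        if w == "user" {
            // The token after "user" is all the log format preserves.
            return words.next();
        }
    }
    None
}
```

Given the line "Failed password for invalid user correct horse battery from 10.0.0.1", only "correct" is recoverable; the remaining words are indistinguishable from the rest of the message.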

When using --stdin, all parsers run against every line because there is no filename to filter on. This may produce false positives from parsers matching generic authentication patterns in unrelated log lines. To reduce noise, use --parser to limit to specific parsers (e.g. --stdin --parser ssh,sudo) or --exclude-parser to suppress noisy ones.

Installation

You can download the latest pre-built binaries from the Releases page, or build the latest development version from source:

git clone https://github.com/bcoles/auth-log-scraper && \
cd auth-log-scraper && \
cargo build --release

Documentation

A man page and optional shell completions are provided.

# Install man page
sudo cp man/auth-log-scraper.1 /usr/local/share/man/man1/

# Bash completions
sudo cp completions/auth-log-scraper.bash /etc/bash_completion.d/auth-log-scraper

# Zsh completions
sudo cp completions/_auth-log-scraper /usr/local/share/zsh/site-functions/_auth-log-scraper

# Fish completions
cp completions/auth-log-scraper.fish ~/.config/fish/completions/

Usage

# Scan default directories with text output
sudo auth-log-scraper

# Scan a specific file
auth-log-scraper --path /var/log/auth.log

# Scan a directory
auth-log-scraper --path /var/log/

# Scan multiple paths
auth-log-scraper --path /var/log/auth.log --path /var/log/secure

# JSON output
auth-log-scraper --format json

# Write results to a file
auth-log-scraper --output results.json --format json

# Show only potential passwords
auth-log-scraper --passwords-only

# Show all occurrences (no deduplication)
auth-log-scraper --show-all

# Use only specific parsers
auth-log-scraper --parser ssh,sudo

# Exclude specific parsers
auth-log-scraper --exclude-parser generic

# Run all parsers against all files (bypass filename filtering)
# Useful for renamed/collected logs
auth-log-scraper --path /tmp/collected-logs/ --scan-all

# Exit with code 2 if potential passwords are found (useful in CI/scripts)
auth-log-scraper --strict

# Read from stdin (pipe from journalctl, zcat, etc.)
journalctl -u sshd --no-pager | auth-log-scraper --stdin
zcat /var/log/auth.log.*.gz | auth-log-scraper --stdin

# CSV output
auth-log-scraper --format csv

# Limit directory recursion depth
auth-log-scraper --path /var/log --max-depth 1

# Control color output
auth-log-scraper --color always
auth-log-scraper --color never

# Print a brief summary (counts and top passwords)
auth-log-scraper --summary

# Filter out short findings (e.g. only show values ≥ 4 characters)
auth-log-scraper --min-length 4

# Exclude specific paths from scanning
auth-log-scraper --exclude-path "*.gz,/var/log/journal/*"

# List available parsers
auth-log-scraper --list-parsers

Options

| Flag | Short | Description |
|------|-------|-------------|
| --path <PATH> | -p | File or directory to scan (repeatable). Defaults to standard log directories. |
| --format <FORMAT> | -f | Output format: text (default), json, or csv. |
| --output <FILE> | -o | Write output to a file instead of stdout. |
| --passwords-only | | Only show entries classified as potential passwords. |
| --show-all | | Show all occurrences. By default, results are deduplicated per value per file. |
| --parser <PARSERS> | | Only use specified parsers (comma-separated). Cannot combine with --exclude-parser. |
| --exclude-parser <PARSERS> | | Exclude specified parsers (comma-separated). Cannot combine with --parser. |
| --scan-all | | Run every parser against every file, ignoring filename-based filtering. |
| --strict | | Exit with code 2 if any potential passwords are found. |
| --stdin | | Read log lines from stdin instead of files. |
| --max-depth <DEPTH> | | Maximum directory recursion depth (0 = only the given directories). |
| --color <MODE> | | Color output: auto (default), always, never. Colors are disabled for file output. |
| --list-parsers | | List available parsers and exit. |
| --summary | | Print a brief summary with counts and top 10 potential passwords. |
| --min-length <LEN> | | Minimum character length for findings (default: 2). Filters out single-character noise. |
| --exclude-path <PATTERNS> | | Glob patterns to exclude files/directories (comma-separated). |
| --strict-usernames | | Treat uppercase letters as password indicators (strict Linux username rules). |

Output Examples

Text format

=== Potential Passwords (2) ===
  [ssh] /var/log/auth.log:42 -- "P@ssw0rd!"
  [ssh] /var/log/auth.log:108 -- "MySecret123"

=== Usernames (3) ===
  [ssh] /var/log/auth.log:15 -- root
  [sudo] /var/log/auth.log:23 -- admin
  [su] /var/log/auth.log:67 -- deploy

Summary: 2 unique potential passwords, 3 unique usernames

JSON format

[
  {
    "value": "P@ssw0rd!",
    "classification": "potential_password",
    "source_file": "/var/log/auth.log",
    "line_number": 42,
    "parser": "ssh",
    "timestamp": "2026-03-20T14:23:45+00:00"
  },
  {
    "value": "root",
    "classification": "username",
    "source_file": "/var/log/auth.log",
    "line_number": 15,
    "parser": "ssh"
  }
]

CSV format

value,classification,source_file,line_number,parser,timestamp
P@ssw0rd!,potential_password,/var/log/auth.log,42,ssh,2026-03-20T14:23:45+00:00
root,username,/var/log/auth.log,15,ssh,

Supported Parsers

The tool includes more than 100 service-specific parsers plus a generic catch-all:

| Category | Parsers |
|----------|---------|
| SSH / Auth | ssh, sudo, su, login, sssd |
| FTP | vsftpd, proftpd, pure-ftpd |
| Mail | dovecot, postfix, exim, sendmail, courier, horde, opensmtpd, cyrus, sogo, roundcube |
| XMPP | ejabberd, prosody |
| VPN | openvpn, strongswan, ocserv, libreswan, pptpd, softether |
| Web / Reverse Proxy | apache, nginx, lighttpd, caddy, haproxy, traefik, squid, tomcat |
| Database | postgresql, mysql, mongodb, redis, elasticsearch, cassandra, couchdb, influxdb, neo4j, clickhouse |
| Database Admin | phpmyadmin, pgadmin |
| Directory / Auth | kerberos, freeradius, winbind, openldap, samba |
| Identity / SSO | keycloak, authentik, freeipa |
| Remote Access | xrdp, cockpit, guacamole |
| Hosting Panels | cpanel, webmin, proxmox, plesk |
| NAS / Appliances | synology, pfsense, unifi |
| Media Servers | plex, jellyfin, emby |
| Monitoring | grafana, nagios, zabbix, icinga, cacti, librenms, checkmk |
| Log Management | graylog, kibana, wazuh |
| Surveillance | zoneminder, motioneye, shinobi |
| Container Mgmt | portainer, docker-registry, harbor |
| CI / CD | jenkins, gitea, gitlab, awx |
| Message Queue | rabbitmq, mosquitto |
| Wiki / CMS | mediawiki, dokuwiki, wordpress, drupal, joomla, discourse, moodle, bookstack, phpbb |
| Collaboration | nextcloud, mattermost, rocketchat, matrix_synapse, mastodon |
| Project Mgmt | redmine, odoo |
| Document Mgmt | paperless |
| Backup | bacula, proxmox-backup |
| Print | cups |
| Firewall UI | opnsense |
| VoIP / Telephony | asterisk, freeswitch, kamailio |
| Home Automation | homeassistant |
| Secrets / Passwords | vault, vaultwarden |
| Object Storage | minio |
| Network Analysis | ntopng |
| Code Quality | sonarqube, nexus |
| File Sync | seafile, zimbra |
| Generic | Catch-all parser matching broad PAM/syslog auth patterns |

Use --list-parsers to see all available parsers.

Extending with New Parsers

The tool is designed for easy extensibility. Each log parser is a separate file in src/parsers/.

Adding a new parser

  1. Create a new file src/parsers/myservice.rs:
use crate::finding::RawFinding;
use crate::parsers::LogParser;
use regex::{Regex, RegexSet};
use std::path::Path;

pub struct MyServiceParser {
    regex_set: RegexSet,
    patterns: Vec<Regex>,
}

impl MyServiceParser {
    pub fn new() -> Self {
        let raw = &[
            r"myservice.*user=(\S+)",
            r"myservice.*login\s+(\S+)",
        ];
        Self {
            regex_set: RegexSet::new(raw).expect("invalid regex set"),
            patterns: raw.iter().map(|r| Regex::new(r).expect("invalid regex")).collect(),
        }
    }
}

impl LogParser for MyServiceParser {
    fn name(&self) -> &str {
        "myservice"
    }

    fn can_parse(&self, path: &Path) -> bool {
        let name = path.file_name().and_then(|n| n.to_str()).unwrap_or_default();
        name.starts_with("myservice.log")
    }

    fn parse_line(&self, path: &Path, line_number: usize, line: &str) -> Vec<RawFinding> {
        if !self.regex_set.is_match(line) {
            return Vec::new();
        }
        let mut findings = Vec::new();
        for i in self.regex_set.matches(line) {
            if let Some(caps) = self.patterns[i].captures(line) {
                if let Some(m) = caps.get(1) {
                    findings.push(RawFinding::new(
                        m.as_str().to_string(), path, line_number, self.name(),
                    ));
                    break;
                }
            }
        }
        findings
    }
}
  2. Register it in src/parsers/mod.rs:
mod myservice;  // Add the module

pub fn all_parsers() -> Vec<Box<dyn LogParser>> {
    vec![
        // ... existing parsers ...
        Box::new(myservice::MyServiceParser::new()),  // Add before generic
        Box::new(generic::GenericParser::new()),      // Generic must be last
    ]
}

Disabling a parser

Comment out or remove its entry in the all_parsers() function, or use --exclude-parser at runtime.

Running Tests

cargo test

License

This project is licensed under the MIT License. See the LICENSE file for details.
