Bean is a lightweight framework for collecting, storing, and analyzing user behavioral metrics. It enables the detection and classification of automated behavior based on customizable rules and ML models.
On the client side, a JavaScript script collects user behavioral metrics (traces) and regularly sends them to the server. On the server side, metrics are aggregated and a score is calculated to classify user actions. Classification uses both rule-based processing and ML model inference provided by the inference service. The server provides a REST API that allows retrieving the score by user identifier.
The script collects the following metrics:
- timestamp (string) — event timestamp in ISO 8601 format, indicating when the trace was recorded.
- mouseMoves (int) — total number of recorded mouse movements during the session.
- clicks (int) — total number of clicks (left, right, middle) during the session.
- clickTimingMin (int) — minimum time (in milliseconds) between consecutive clicks.
- clickTimingMax (int) — maximum time (in milliseconds) between consecutive clicks.
- clickTimingAvg (int) — average time (in milliseconds) between clicks during the session.
- clickTimingCount (int) — number of measured intervals between clicks (useful for avg normalization).
- scrolls (int) — total number of scroll events (wheel, touch, keys) during the session.
- scrollTimingMin (int) — minimum time (in ms) between scroll events.
- scrollTimingMax (int) — maximum time (in ms) between scroll events.
- scrollTimingAvg (int) — average time (in ms) between scroll events.
- scrollTimingCount (int) — number of recorded scroll intervals.
- textInputEvents (int) — number of text input events (keydown, input, etc.).
- textInputTimingMin (int) — minimum time (in ms) between characters during input.
- textInputTimingMax (int) — maximum time (in ms) between characters during input.
- textInputTimingAvg (int) — average input speed (in ms per character).
- textInputTimingCount (int) — number of recorded input intervals (pairs of key presses).
- sessionDuration (int) — session duration in milliseconds from its start.
- userAgent (string) — browser User-Agent string containing client information.
- language (string) — browser's preferred language (e.g., "ru-RU", "en-US").
- platform (string) — device platform (e.g., "Win32", "Linux x86_64", "MacIntel").
- screenWidth (int) — device screen width in pixels.
- screenHeight (int) — device screen height in pixels.
- timezone (string) — client timezone in IANA format (e.g., "Europe/Moscow").
- cookiesEnabled (bool) — flag indicating whether cookies are enabled in the browser.
- onLine (bool) — network status flag: true if the browser considers itself connected to the internet.
- deviceMemory (int) — device RAM size in gigabytes (estimate, not available in all browsers).
- maxTouchPoints (int) — maximum number of simultaneous touch points (0 — no touch screen; 1 or more — touch device).
- browserName (string) — browser name (e.g., "Chrome", "Firefox", "Safari").
- browserVersion (string) — browser version (e.g., "125.0.0").
- osName (string) — operating system name (e.g., "Windows", "Android", "iOS").
- osVersion (string) — operating system version (e.g., "10", "14.5").
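The four timing fields of each group (Min, Max, Avg, Count) can all be derived from a list of event timestamps. The sketch below shows one way to compute them. `timingStats` is a hypothetical helper, not the collector's actual code, and rounding the average to an integer is an assumption based on the `int` type listed above.

```javascript
// Hypothetical helper (not Bean's actual code): derive min/max/avg/count
// of the intervals between consecutive event timestamps (in milliseconds).
function timingStats(timestamps) {
  const intervals = [];
  for (let i = 1; i < timestamps.length; i++) {
    intervals.push(timestamps[i] - timestamps[i - 1]);
  }
  if (intervals.length === 0) {
    // Fewer than two events: no interval could be measured.
    return { min: 0, max: 0, avg: 0, count: 0 };
  }
  const sum = intervals.reduce((a, b) => a + b, 0);
  return {
    min: Math.min(...intervals),
    max: Math.max(...intervals),
    avg: Math.round(sum / intervals.length), // assumption: rounded to an int
    count: intervals.length,
  };
}

// Three clicks at 0 ms, 100 ms and 300 ms give two intervals: 100 and 200.
console.log(timingStats([0, 100, 300]));
// prints { min: 100, max: 200, avg: 150, count: 2 }
```

The count field matters because an average over two intervals carries much less weight than one over two hundred.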
The client is uniquely identified by one of the cookies. The script sends metrics along with browser cookies; to properly process metrics on the server side, you need to configure the name of the cookie from which the client identifier is extracted.
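To illustrate what extracting the client identifier from a cookie involves, here is a minimal sketch that reads one named cookie from a raw `Cookie` header. `extractClientId` is a hypothetical name, and the code only illustrates the idea; it is not how the Bean server is implemented.

```javascript
// Hypothetical helper: pull one cookie value out of a raw Cookie header.
// `cookieName` plays the role of the configured cookie name.
function extractClientId(cookieHeader, cookieName) {
  for (const part of (cookieHeader || "").split(";")) {
    const eq = part.indexOf("=");
    if (eq === -1) continue; // malformed pair without "="
    const name = part.slice(0, eq).trim();
    if (name === cookieName) {
      return part.slice(eq + 1).trim();
    }
  }
  return null; // cookie missing: the trace cannot be tied to a client
}

console.log(extractClientId("theme=dark; token=abc123", "token")); // prints abc123
console.log(extractClientId("theme=dark", "token"));               // prints null
```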
The service provides the following REST API endpoints:
- POST /api/v1/traces — accept a new trace
- GET /api/v1/scores/{token} — retrieve score by token
- GET /static/... — serve static files (if enabled)
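A backend that wants to act on the score can call the scores endpoint directly. The sketch below builds the URL for `GET /api/v1/scores/{token}` and fetches it. The base URL, the function names, and the assumption that the response body is JSON are illustrative; the response format is not described in this document.

```javascript
// Build the URL for GET /api/v1/scores/{token}; the token is URL-encoded.
function scoreUrl(baseUrl, token) {
  const trimmed = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return `${trimmed}/api/v1/scores/${encodeURIComponent(token)}`;
}

// Fetch the score for one client.
// Assumption: the body is JSON; it is returned without further interpretation.
async function fetchScore(baseUrl, token) {
  const res = await fetch(scoreUrl(baseUrl, token));
  if (!res.ok) {
    throw new Error(`score request failed: HTTP ${res.status}`);
  }
  return res.json();
}

// Usage (assumes a Bean instance listening on localhost:8080):
// fetchScore("http://localhost:8080", "abc123").then(console.log);
```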
```bash
# Build
go build -o bean cmd/bean/main.go
# Run
./bean --config config.yaml
```

The script can be embedded on a page using the following tag:

```html
<script src="/static/collector.js"></script>
```

After that, create an instance of the collector:
```javascript
const collector = new BehavioralMetricsCollector({
  enableLogging: false,
  reportInterval: 5000,
  skipEmpty: true,
  address: "/api/v1/traces",
});
```

Bean is configured through a YAML configuration file. Below is a detailed description of all parameters, their purposes, and allowed values.
```yaml
logger:
  level: info
server:
  address: ":8080"
  static: "./public"
analysis:
  token: token
  traces_length: 10
  traces_ttl: 10m
  scorers:
    - type: ml
      model: default
      url: http://127.0.0.1:8000
    - type: rules
      rules: /etc/bean/rules.yaml
dataset:
  file: /var/log/bean/dataset.log
  size: 1024
  amount: 10
```

logger: settings for the logging component.
logger.level: log detail level. Supported values (case-insensitive):
- debug — detailed logs for development
- info — informational messages (default)
- warn or warning — warnings
- error — critical errors only
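The accepted values can be summarized as a small normalization step. This is a sketch under assumptions, not Bean's code: `parseLevel` is a hypothetical name, and treating an empty value as `info` is inferred from `info` being marked as the default.

```javascript
// Accepted spellings mapped to a canonical level; "warning" is an alias of "warn".
const LEVEL_ALIASES = {
  debug: "debug",
  info: "info",
  warn: "warn",
  warning: "warn",
  error: "error",
};

function parseLevel(value) {
  // Assumption: a missing or empty value falls back to the default, "info".
  if (value === undefined || value === null || value === "") return "info";
  const key = String(value).trim().toLowerCase(); // case-insensitive match
  if (!Object.prototype.hasOwnProperty.call(LEVEL_ALIASES, key)) {
    throw new Error(`invalid log level: ${value}`);
  }
  return LEVEL_ALIASES[key];
}

console.log(parseLevel("WARNING")); // prints warn
console.log(parseLevel("Debug"));   // prints debug
```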
server: HTTP server parameters.
server.address: address and port on which the server listens. Use :8080 to listen on all interfaces on port 8080.
server.static: path to the directory with static files (e.g., collector.js). If specified, the files are served under the /static/ route. Leave it empty if static file serving is not needed.
analysis: behavioral analysis settings.
analysis.token: name of the cookie used for session identification. Bean expects the client (browser) to send this cookie with each trace request. The name is not a secret, just a key for binding session data.
analysis.scorers: the list of scorers that perform the analysis. Scorers run in the order in which they are listed. Supported types: ml and rules. Example:

```yaml
- type: ml
  model: default
  url: http://127.0.0.1:8000
- type: rules
  rules: /etc/bean/rules.yaml
```

For an ml scorer, you must specify the URL of the inference service and the model name. For a rules scorer, you must specify the path to the rules file. The file must exist and contain valid rules written in CEL.
analysis.traces_length: maximum number of traces stored per session. When the limit is exceeded, the oldest traces are deleted (FIFO). Recommended value: 20–100, depending on sending frequency.
analysis.traces_ttl: time after which a session is considered inactive and removed from memory.
Supported units:
- s — seconds
- m — minutes
- h — hours
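The trace limit and the session TTL together describe a bounded, expiring per-session buffer. The sketch below models that behavior in memory. `parseDuration` and `TraceStore` are hypothetical names; accepting only a single number followed by one unit, and refreshing the TTL on every received trace, are assumptions rather than documented behavior.

```javascript
// Parse durations such as "30s", "10m" or "1h" into milliseconds.
// Assumption: a single integer followed by one of the units s, m, h.
function parseDuration(text) {
  const match = /^(\d+)(s|m|h)$/.exec(text);
  if (!match) throw new Error(`unsupported duration: ${text}`);
  const unitMs = { s: 1000, m: 60000, h: 3600000 };
  return Number(match[1]) * unitMs[match[2]];
}

// Hypothetical in-memory store: a bounded FIFO of traces per session,
// with sessions removed after `ttlMs` of inactivity.
class TraceStore {
  constructor(maxTraces, ttlMs) {
    this.maxTraces = maxTraces;
    this.ttlMs = ttlMs;
    this.sessions = new Map(); // token -> { traces: [], lastSeen: ms }
  }

  add(token, trace, now = Date.now()) {
    const session = this.sessions.get(token) ?? { traces: [], lastSeen: now };
    session.traces.push(trace);
    if (session.traces.length > this.maxTraces) {
      session.traces.shift(); // drop the oldest trace (FIFO)
    }
    session.lastSeen = now; // assumption: each trace counts as activity
    this.sessions.set(token, session);
  }

  // Remove sessions that have been inactive for longer than the TTL.
  evictExpired(now = Date.now()) {
    for (const [token, session] of this.sessions) {
      if (now - session.lastSeen > this.ttlMs) this.sessions.delete(token);
    }
  }

  traces(token) {
    return this.sessions.get(token)?.traces ?? [];
  }
}

const store = new TraceStore(2, parseDuration("10m"));
store.add("abc", { clicks: 1 }, 0);
store.add("abc", { clicks: 2 }, 1000);
store.add("abc", { clicks: 3 }, 2000); // the first trace is dropped
console.log(store.traces("abc").length); // prints 2
```

With a 5-second report interval, a limit of 20 traces covers roughly the last 100 seconds of activity, which is one way to reason about the recommended range.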
dataset: dataset collection settings. This section is optional. If it is defined, all received traces are written to the dataset file.
dataset.file: path to the dataset file.
dataset.size: maximum dataset file size.
dataset.amount: number of dataset files to keep.
Bean automatically supports overriding parameters through environment variables. Priority: environment variables > YAML values. Variable names are formed from the uppercased section and parameter names joined by an underscore, for example:

```bash
LOGGER_LEVEL=debug
SERVER_ADDRESS=:9090
ANALYSIS_TRACES_TTL=30m
```

Bean validates the configuration on startup:
- All required fields must be specified.
- The logging level must be valid.
- The rules file must exist.
- On error, startup is stopped with a problem description.
Use bean --config config.yaml to load the config from a file.
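The environment override rule described earlier can be illustrated with a short sketch. The naming scheme here (uppercased path segments joined by underscores) is inferred from the example variables, and `envName` and `resolve` are hypothetical helpers, not part of Bean.

```javascript
// Derive the environment variable name for a config path,
// e.g. ["analysis", "traces_ttl"] -> "ANALYSIS_TRACES_TTL".
function envName(path) {
  return path.map((part) => part.toUpperCase()).join("_");
}

// Return the effective value: the environment variable wins when it is set,
// otherwise the value from the parsed YAML config is used.
function resolve(config, path, env) {
  const override = env[envName(path)];
  if (override !== undefined) return override;
  return path.reduce((node, key) => (node == null ? undefined : node[key]), config);
}

const yamlConfig = { logger: { level: "info" }, analysis: { traces_ttl: "10m" } };
const environment = { ANALYSIS_TRACES_TTL: "30m" };

console.log(resolve(yamlConfig, ["analysis", "traces_ttl"], environment)); // prints 30m
console.log(resolve(yamlConfig, ["logger", "level"], environment));        // prints info
```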
Bean uses Common Expression Language (CEL) to describe behavioral analysis rules. Rules allow you to evaluate user actions and assign scores for behavioral patterns (e.g., automation, bots).
When a score is requested, all collected traces are analyzed against the rules. If a trace satisfies a rule, the scores are changed by the values that rule specifies.
Rules are defined in a YAML file, which is loaded when the server starts. The file contains a list of rules; each rule consists of a condition and score increments:
- when — condition in CEL language (should return true or false)
- then — object with scores that will be added to the final result
```yaml
- when: mouseMoves > 10 && clicks > 5
  then:
    human: 0.3
    automation: -0.1
- when: deviceMemory < 2
  then:
    automation: 0.2
```

The following variables can be used in when expressions:
| Metric | Type | Description |
|---|---|---|
| mouseMoves | int | Number of mouse movements |
| clicks | int | Number of clicks |
| clickTimingMin | int | Minimum delay between clicks (ms) |
| clickTimingMax | int | Maximum delay between clicks (ms) |
| clickTimingAvg | int | Average delay between clicks (ms) |
| clickTimingCount | int | Number of measured click intervals |
| scrolls | int | Number of scrolls |
| scrollTimingMin | int | Minimum delay between scrolls (ms) |
| scrollTimingMax | int | Maximum delay between scrolls (ms) |
| scrollTimingAvg | int | Average delay between scrolls (ms) |
| scrollTimingCount | int | Number of measured scroll intervals |
| textInputEvents | int | Number of text input events |
| textInputTimingMin | int | Minimum delay between characters (ms) |
| textInputTimingMax | int | Maximum delay between characters (ms) |
| textInputTimingAvg | int | Average delay between characters (ms) |
| textInputTimingCount | int | Number of measured input intervals |
| sessionDuration | int | Session duration (ms) |
| userAgent | string | Full User-Agent string |
| language | string | Browser language (e.g., ru-RU) |
| platform | string | Platform (e.g., Win32) |
| screenWidth | int | Screen width (px) |
| screenHeight | int | Screen height (px) |
| timezone | string | Timezone (e.g., Europe/Moscow) |
| cookiesEnabled | bool | Are cookies enabled |
| onLine | bool | Internet connection status |
| deviceMemory | int | Estimated RAM in GB |
| maxTouchPoints | int | Maximum number of touch points |
| browserName | string | Browser name (Chrome, Firefox, etc.) |
| browserVersion | string | Browser version |
| osName | string | Operating system name (Windows, Android, etc.) |
| osVersion | string | Operating system version |
If the when condition returns true, Bean adds the specified scores to the final result. Conditions use CEL syntax. CEL supports:
- Arithmetic operations: +, -, *, /, %
- Logical operations: &&, ||, !
- Comparisons: ==, !=, <, >, <=, >=
- String operations: browserName == "Chrome", language.startsWith("en"), etc.
Learn more about CEL: https://github.com/google/cel-spec
Each score is a key (string) with a final value from 0.0 to 1.0. The increments from all matching rules are summed by key, and each total is limited to the range [0.0, 1.0].
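The summing and limiting step can be sketched as follows. `combine` is a hypothetical helper; applying the limit once to the final total, rather than after every increment, is an assumption, since the two strategies can give different results when negative increments are involved.

```javascript
// Limit a number to the range [0.0, 1.0].
function clamp01(x) {
  return Math.min(1, Math.max(0, x));
}

// Sum score increments per key, then limit each total.
// Assumption: the limit is applied to the final total only.
function combine(increments) {
  const totals = {};
  for (const inc of increments) {
    for (const [key, value] of Object.entries(inc)) {
      totals[key] = (totals[key] ?? 0) + value;
    }
  }
  for (const key of Object.keys(totals)) {
    totals[key] = clamp01(totals[key]);
  }
  return totals;
}

// automation: 0.7 + 0.5 - 0.1 exceeds 1.0 and is limited to 1; human stays 0.3.
console.log(combine([
  { automation: 0.7 },
  { automation: 0.5 },
  { human: 0.3, automation: -0.1 },
]));
// prints { automation: 1, human: 0.3 }
```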
Suspicious lack of activity:

```yaml
- when: mouseMoves < 3 && sessionDuration > 30000
  then:
    inactive: 0.8
```

Fast text input:

```yaml
- when: textInputTimingAvg < 80 && textInputEvents > 5
  then:
    automation: 0.7
```

Device with little memory:

```yaml
- when: deviceMemory < 2
  then:
    device: 0.6
```

No scrolling:

```yaml
- when: scrolls == 0 && sessionDuration > 10000
  then:
    automation: 0.5
```

Headless Chrome:

```yaml
- when: browserName.contains("HeadlessChrome")
  then:
    automation: 1.0
```

- Rules are applied to each trace: if a user sent 10 traces, each rule is checked 10 times.
- Scores accumulate: if two rules fire, their then-values are summed.
- Maximum score value per key is 1.0: a score cannot exceed 1.0 (saturation).
- An expression error causes the rule to be skipped (it does not stop the analysis).
- Rule order is not important, but grouping rules by logic is recommended.
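The evaluation notes above can be put together in one small model. This is a sketch, not Bean's engine: plain JavaScript predicates stand in for CEL expressions, `score` is a hypothetical function, and limiting the totals once at the end is an assumption.

```javascript
// Rules use JavaScript predicates as stand-ins for CEL `when` expressions.
const exampleRules = [
  { when: (t) => t.mouseMoves < 3 && t.sessionDuration > 30000, then: { inactive: 0.8 } },
  { when: (t) => t.scrolls === 0 && t.sessionDuration > 10000, then: { automation: 0.5 } },
  // Throws when browserName is missing, which demonstrates the skip-on-error note.
  { when: (t) => t.browserName.includes("HeadlessChrome"), then: { automation: 1.0 } },
];

function score(traces, rules) {
  const totals = {};
  for (const trace of traces) { // every rule is checked for every trace
    for (const rule of rules) {
      let matched = false;
      try {
        matched = rule.when(trace) === true;
      } catch (err) {
        continue; // evaluation error: skip this rule, keep analyzing
      }
      if (!matched) continue;
      for (const [key, value] of Object.entries(rule.then)) {
        totals[key] = (totals[key] ?? 0) + value; // scores accumulate per key
      }
    }
  }
  for (const key of Object.keys(totals)) {
    totals[key] = Math.min(1, Math.max(0, totals[key])); // saturate at [0, 1]
  }
  return totals;
}

const exampleTraces = [
  { mouseMoves: 0, scrolls: 0, sessionDuration: 40000 }, // no browserName
  { mouseMoves: 1, scrolls: 0, sessionDuration: 45000, browserName: "Chrome" },
];

console.log(score(exampleTraces, exampleRules));
// prints { inactive: 1, automation: 1 }
```

Both traces match the first two rules, so `inactive` reaches 1.6 and `automation` reaches 1.0 before limiting; the third rule fails on the first trace and is simply skipped there.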