Commit fbce6b1

lib/logstorage: add pattern_match() and pattern_match_full() filters for searching for logs by the given patterns
The patterns may contain various placeholders such as <N>, <IP4>, <TIME>, etc. Such patterns with these placeholders are generated by `| collapse_nums prettify` pipe. See https://docs.victoriametrics.com/victorialogs/logsql/#pattern-match-filter for details. These filters are needed for VictoriaMetrics#518
1 parent 3a2b311 commit fbce6b1

10 files changed

+1744
-8
lines changed

docs/victorialogs/CHANGELOG.md

Lines changed: 2 additions & 0 deletions
@@ -18,6 +18,8 @@ according to [these docs](https://docs.victoriametrics.com/victorialogs/quicksta
1818

1919
## tip
2020

21+
* FEATURE: [LogsQL](https://docs.victoriametrics.com/victorialogs/logsql/): add [pattern match filter](https://docs.victoriametrics.com/victorialogs/logsql/#pattern-match-filter) for searching logs by the given patterns such as `<DATETIME>: user_id=<N>, ip=<IP4>, trace_id=<UUID>`. These filters are needed for [#518](https://github.com/VictoriaMetrics/VictoriaLogs/issues/518).
22+
2123
## [v1.32.0](https://github.com/VictoriaMetrics/VictoriaLogs/releases/tag/v1.32.0)
2224

2325
Released at 2025-09-03

docs/victorialogs/LogsQL.md

Lines changed: 45 additions & 2 deletions
@@ -264,15 +264,18 @@ The list of LogsQL filters:
264264
- [Phrase filter](#phrase-filter) - matches logs with the given phrase
265265
- [Prefix filter](#prefix-filter) - matches logs with the given word prefix or phrase prefix
266266
- [Substring filter](#substring-filter) - matches logs with the given substring
267+
- [Pattern match filter](#pattern-match-filter) - matches logs by the given pattern
267268
- [Range comparison filter](#range-comparison-filter) - matches logs with field values in the provided range
268269
- [Empty value filter](#empty-value-filter) - matches logs without the given [log field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model)
269270
- [Any value filter](#any-value-filter) - matches logs with the given non-empty [log field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model)
270271
- [Exact filter](#exact-filter) - matches logs with the exact value for the given [log field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model)
271272
- [Exact prefix filter](#exact-prefix-filter) - matches logs starting with the given prefix for the given [log field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model)
272273
- [Multi-exact filter](#multi-exact-filter) - matches logs with one of the specified exact values for the given [log field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model)
273274
- [Subquery filter](#subquery-filter) - matches logs with [log field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model) values matching the results of another query
274-
- [`contains_all` filter](#contains_any-filter) - matches logs with [log field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model) containing all the provided [words](#word) / phrases
275-
- [`contains_any` filter](#contains_any-filter) - matches logs with [log field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model) containing at least one of the provided [words](#word) / phrases
275+
- [`contains_all` filter](#contains_all-filter) - matches logs with [log field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model) containing
276+
all the provided [words](#word) / phrases
277+
- [`contains_any` filter](#contains_any-filter) - matches logs with [log field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model) containing
278+
at least one of the provided [words](#word) / phrases
276279
- [Case-insensitive filter](#case-insensitive-filter) - matches logs with the given case-insensitive word, phrase or prefix
277280
- [Sequence filter](#sequence-filter) - matches logs with the given sequence of words or phrases
278281
- [Regexp filter](#regexp-filter) - matches logs for the given regexp
@@ -720,6 +723,41 @@ See also:
720723
- [Exact-filter](#exact-filter)
721724
- [Logical filter](#logical-filter)
722725

726+
### Pattern match filter
727+
728+
VictoriaLogs supports filtering logs by patterns with the `pattern_match("pattern")` filter. This filter matches logs where the
729+
[`_msg` field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#message-field) contains the given `"pattern"`.
730+
731+
The filter can be applied to any given log field with the `log_field:pattern_match("pattern")` syntax.
732+
733+
The `"pattern"` must contain the text to match, plus an arbitrary number of the following placeholders:
734+
735+
- `<N>` - matches any integer number. It also matches hexadecimal numbers that are at least 4 characters long. For example, it matches `123` and `12abcdEF`.
736+
It doesn't match floating-point numbers such as `123.456`. Use the `<N>.<N>` pattern to match such numbers.
737+
- `<UUID>` - matches any UUID such as `2edfed59-3e98-4073-bbb2-28d321ca71a7`.
738+
- `<IP4>` - matches IPv4 addresses such as `123.45.67.89`. Use `<IP4>/<N>` for matching IPv4 masks.
739+
- `<TIME>` - matches time strings such as `10:20:30`. It also captures fractional seconds such as `10:20:30.123` and `10:20:30,123`.
740+
- `<DATE>` - matches date strings such as `2025-10-20` and `2025/10/20`.
741+
- `<DATETIME>` - matches datetime strings such as `2025-10-20T08:09:11` and `2025-10-20 08:09:11`. It also captures fractional seconds and timezones.
742+
743+
Such patterns are generated by the [`collapse_nums` pipe](#collapse_nums-pipe).
744+
745+
For example, the following filter matches the `_msg` field with the `<arbitrary_prefix>user_id=123, ip=45.67.89.12, time=2025-10-20T23:32:12Z<arbitrary_suffix>` contents:
746+
747+
```logsql
748+
pattern_match("user_id=<N>, ip=<IP4>, time=<DATETIME>")
749+
```
750+
751+
If you need to match the whole `_msg` field value, use the `pattern_match_full("pattern")` filter.
752+
753+
See also:
754+
755+
- [`collapse_nums` pipe](#collapse_nums-pipe)
756+
- [Sequence filter](#sequence-filter)
757+
- [Phrase filter](#phrase-filter)
758+
- [Substring filter](#substring-filter)
759+
- [Logical filter](#logical-filter)
760+
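The placeholder semantics documented in this hunk can be approximated with regular expressions. The following Go program is a minimal sketch, not the implementation added by this commit: the names `placeholderRegexps` and `compilePattern` are hypothetical, and every regular expression is an assumption derived only from the examples in the docs above. The `full` flag mimics the documented difference between `pattern_match()` (substring match) and `pattern_match_full()` (whole-value match).

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// placeholderRegexps maps each documented placeholder to an approximate regexp.
// These expressions are assumptions for illustration; they are not taken from VictoriaLogs.
var placeholderRegexps = map[string]string{
	"<N>":        `(?:[0-9]+|[0-9a-fA-F]{4,})`,
	"<UUID>":     `[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}`,
	"<IP4>":      `[0-9]{1,3}(?:\.[0-9]{1,3}){3}`,
	"<TIME>":     `[0-9]{2}:[0-9]{2}:[0-9]{2}(?:[.,][0-9]+)?`,
	"<DATE>":     `[0-9]{4}[-/][0-9]{2}[-/][0-9]{2}`,
	"<DATETIME>": `[0-9]{4}-[0-9]{2}-[0-9]{2}[T ][0-9]{2}:[0-9]{2}:[0-9]{2}(?:[.,][0-9]+)?(?:Z|[+-][0-9]{2}:[0-9]{2})?`,
}

// placeholderRe finds placeholder candidates such as <N> or <IP4> inside a pattern.
var placeholderRe = regexp.MustCompile(`<[A-Z0-9]+>`)

// compilePattern converts a pattern with placeholders into a regexp.
// Literal text is escaped, known placeholders are replaced by their regexp,
// and unknown <...> sequences are treated as literal text.
// If full is true, the regexp must match the whole value.
func compilePattern(pattern string, full bool) (*regexp.Regexp, error) {
	var sb strings.Builder
	if full {
		sb.WriteString("^")
	}
	last := 0
	for _, loc := range placeholderRe.FindAllStringIndex(pattern, -1) {
		sb.WriteString(regexp.QuoteMeta(pattern[last:loc[0]]))
		name := pattern[loc[0]:loc[1]]
		if re, ok := placeholderRegexps[name]; ok {
			sb.WriteString(re)
		} else {
			sb.WriteString(regexp.QuoteMeta(name))
		}
		last = loc[1]
	}
	sb.WriteString(regexp.QuoteMeta(pattern[last:]))
	if full {
		sb.WriteString("$")
	}
	return regexp.Compile(sb.String())
}

func main() {
	re, err := compilePattern("user_id=<N>, ip=<IP4>, time=<DATETIME>", false)
	if err != nil {
		panic(err)
	}
	msg := "prefix user_id=123, ip=45.67.89.12, time=2025-10-20T23:32:12Z suffix"
	fmt.Println(re.MatchString(msg))
}
```

A regexp translation is convenient for experimenting with patterns, but it ignores word boundaries and is likely slower than a purpose-built matcher, so it should not be read as a description of how the filter works internally.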
723761
### Substring filter
724762

725763
If it is needed to find logs with some substring, then `*substring*` filter can be used. The substring can be put in quotes according to [these docs](#string-literals) if needed.
@@ -741,6 +779,7 @@ Performance tip: prefer using [word filter](#word-filter) and [phrase filter](#p
741779

742780
See also:
743781

782+
- [Pattern match filter](#pattern-match-filter)
744783
- [Word filter](#word-filter)
745784
- [Phrase filter](#phrase-filter)
746785
- [Regexp filter](#regexp-filter)
@@ -1106,6 +1145,7 @@ For example, the following query matches `event:original` field containing `(err
11061145

11071146
See also:
11081147

1148+
- [Pattern match filter](#pattern-match-filter)
11091149
- [`contains_all` filter](#contains_all-filter)
11101150
- [Word filter](#word-filter)
11111151
- [Phrase filter](#phrase-filter)
@@ -1606,12 +1646,15 @@ when the following query is executed:
16061646
_time:1h | collapse_nums prettify
16071647
```
16081648

1649+
The patterns returned by the `collapse_nums prettify` pipe can be used in the [pattern match filter](#pattern-match-filter).
1650+
16091651
`collapse_nums` can miss some numbers or can collapse unexpected numbers. In this case [conditional `collapse_nums`](#conditional-collapse_nums) can be used
16101652
for skipping such values and pre-processing them separately with [`replace_regexp`](#replace_regexp-pipe).
16111653

16121654
See also:
16131655

16141656
- [conditional `collapse_nums`](#conditional-collapse_nums)
1657+
- [pattern match filter](#pattern-match-filter)
16151658
- [`replace`](#replace-pipe)
16161659
- [`replace_regexp`](#replace_regexp-pipe)
16171660
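Since a pattern returned by `collapse_nums prettify` can be fed back into `pattern_match()`, a client may want to build such a query programmatically. The Go sketch below is illustrative only: `patternMatchQuery` and `queryURL` are hypothetical helpers, the base URL is an assumed local VictoriaLogs address, and quoting the pattern with `strconv.Quote` assumes the pattern contains only printable characters.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// patternMatchQuery builds a LogsQL query that applies pattern_match()
// to the given field over the given time range.
// An empty field means the default _msg field.
// The field name is assumed to need no quoting.
func patternMatchQuery(timeRange, field, pattern string) string {
	filter := "pattern_match(" + strconv.Quote(pattern) + ")"
	if field != "" {
		filter = field + ":" + filter
	}
	return "_time:" + timeRange + " " + filter
}

// queryURL returns the URL for sending the query to the
// /select/logsql/query HTTP endpoint at baseURL.
func queryURL(baseURL, query string) string {
	v := url.Values{}
	v.Set("query", query)
	return baseURL + "/select/logsql/query?" + v.Encode()
}

func main() {
	// The pattern is copied by hand from the output of `_time:1h | collapse_nums prettify`.
	q := patternMatchQuery("1h", "", "user_id=<N>, ip=<IP4>")
	fmt.Println(q)
	// http://localhost:9428 is an assumed address of a local VictoriaLogs instance.
	fmt.Println(queryURL("http://localhost:9428", q))
}
```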

lib/logstorage/filter_and.go

Lines changed: 6 additions & 3 deletions
@@ -143,13 +143,13 @@ func getCommonTokensForAndFilters(filters []filter) []fieldTokens {
143143
case *filterExactPrefix:
144144
tokens := t.getTokens()
145145
mergeFieldTokens(t.fieldName, tokens)
146-
case *filterPhrase:
146+
case *filterPatternMatch:
147147
tokens := t.getTokens()
148148
mergeFieldTokens(t.fieldName, tokens)
149-
case *filterPrefix:
149+
case *filterPhrase:
150150
tokens := t.getTokens()
151151
mergeFieldTokens(t.fieldName, tokens)
152-
case *filterSubstring:
152+
case *filterPrefix:
153153
tokens := t.getTokens()
154154
mergeFieldTokens(t.fieldName, tokens)
155155
case *filterRegexp:
@@ -158,6 +158,9 @@ func getCommonTokensForAndFilters(filters []filter) []fieldTokens {
158158
case *filterSequence:
159159
tokens := t.getTokens()
160160
mergeFieldTokens(t.fieldName, tokens)
161+
case *filterSubstring:
162+
tokens := t.getTokens()
163+
mergeFieldTokens(t.fieldName, tokens)
161164
case *filterOr:
162165
bfts := t.getByFieldTokens()
163166
for _, bft := range bfts {

lib/logstorage/filter_or.go

Lines changed: 6 additions & 3 deletions
@@ -147,13 +147,13 @@ func getCommonTokensForOrFilters(filters []filter) []fieldTokens {
147147
case *filterExactPrefix:
148148
tokens := t.getTokens()
149149
mergeFieldTokens(t.fieldName, tokens)
150-
case *filterPhrase:
150+
case *filterPatternMatch:
151151
tokens := t.getTokens()
152152
mergeFieldTokens(t.fieldName, tokens)
153-
case *filterPrefix:
153+
case *filterPhrase:
154154
tokens := t.getTokens()
155155
mergeFieldTokens(t.fieldName, tokens)
156-
case *filterSubstring:
156+
case *filterPrefix:
157157
tokens := t.getTokens()
158158
mergeFieldTokens(t.fieldName, tokens)
159159
case *filterRegexp:
@@ -162,6 +162,9 @@ func getCommonTokensForOrFilters(filters []filter) []fieldTokens {
162162
case *filterSequence:
163163
tokens := t.getTokens()
164164
mergeFieldTokens(t.fieldName, tokens)
165+
case *filterSubstring:
166+
tokens := t.getTokens()
167+
mergeFieldTokens(t.fieldName, tokens)
165168
case *filterAnd:
166169
bfts := t.getByFieldTokens()
167170
for _, bft := range bfts {
