Commit 4948036

feat: add default OpenGraph tags to configuration file (#694)
* feat(config): opengraph passthrough configuration
* chore(ogtags): use config.OpenGraph for configuration
* chore: wire up ogtags config in most of the app
* feat(ogtags): return default tags if they are supplied
* chore: make OpenGraph legal so we have some sanity in reviewing
* chore: spelling
* fix(lib): use OpenGraph.Enabled
* test(lib): load default config file if one is not specified in spawnAnubis
* chore(config): fix ST1005
* docs: document open graph defaults and its new home in the policy file
* docs(installation): point to weight threshold new home
* chore: rename default to override
* chore(default-config): add off-by-default opengraph settings to bot policy file
* fix(anubis): make build
* test(lib): fix build

---------

Signed-off-by: Xe Iaso <[email protected]>
1 parent 7aa732c commit 4948036

25 files changed: +416 -78 lines changed

.github/actions/spelling/expect.txt

Lines changed: 2 additions & 0 deletions
@@ -183,9 +183,11 @@ NONINFRINGEMENT
 nosleep
 OCOB
 ogtags
+ogtitle
 omgili
 omgilibot
 openai
+opengraph
 openrc
 pag
 palemoon

.github/actions/spelling/line_forbidden.patterns

Lines changed: 0 additions & 8 deletions
@@ -273,14 +273,6 @@
 # Most people only have two hands. Reword.
 \b(?i)on the third hand\b
 
-# Should be `Open Graph`
-# unless talking about a specific Open Graph implementation:
-# - Java
-# - Node
-# - Py
-# - Ruby
-\bOpenGraph\b
-
 # Should be `OpenShift`
 \bOpenshift\b
 

cmd/anubis/main.go

Lines changed: 21 additions & 15 deletions
@@ -331,22 +331,28 @@ func main() {
 		slog.Warn("REDIRECT_DOMAINS is not set, Anubis will only redirect to the same domain a request is coming from, see https://anubis.techaro.lol/docs/admin/configuration/redirect-domains")
 	}
 
+	// If OpenGraph configuration values are not set in the config file, use the
+	// values from flags / envvars.
+	if !policy.OpenGraph.Enabled {
+		policy.OpenGraph.Enabled = *ogPassthrough
+		policy.OpenGraph.ConsiderHost = *ogCacheConsiderHost
+		policy.OpenGraph.TimeToLive = *ogTimeToLive
+		policy.OpenGraph.Override = map[string]string{}
+	}
+
 	s, err := libanubis.New(libanubis.Options{
-		BasePrefix:           *basePrefix,
-		StripBasePrefix:      *stripBasePrefix,
-		Next:                 rp,
-		Policy:               policy,
-		ServeRobotsTXT:       *robotsTxt,
-		PrivateKey:           priv,
-		CookieDomain:         *cookieDomain,
-		CookieExpiration:     *cookieExpiration,
-		CookiePartitioned:    *cookiePartitioned,
-		OGPassthrough:        *ogPassthrough,
-		OGTimeToLive:         *ogTimeToLive,
-		RedirectDomains:      redirectDomainsList,
-		Target:               *target,
-		WebmasterEmail:       *webmasterEmail,
-		OGCacheConsidersHost: *ogCacheConsiderHost,
+		BasePrefix:        *basePrefix,
+		StripBasePrefix:   *stripBasePrefix,
+		Next:              rp,
+		Policy:            policy,
+		ServeRobotsTXT:    *robotsTxt,
+		PrivateKey:        priv,
+		CookieDomain:      *cookieDomain,
+		CookieExpiration:  *cookieExpiration,
+		CookiePartitioned: *cookiePartitioned,
+		RedirectDomains:   redirectDomainsList,
+		Target:            *target,
+		WebmasterEmail:    *webmasterEmail,
 	})
 	if err != nil {
 		log.Fatalf("can't construct libanubis.Server: %v", err)

data/botPolicies.yaml

Lines changed: 24 additions & 0 deletions
@@ -84,6 +84,30 @@ bots:
 
 dnsbl: false
 
+# Open Graph passthrough configuration, see here for more information:
+# https://anubis.techaro.lol/docs/admin/configuration/open-graph/
+openGraph:
+  # Enables Open Graph passthrough
+  enabled: false
+  # Enables the use of the HTTP host in the cache key, this enables
+  # caching metadata for multiple http hosts at once.
+  considerHost: false
+  # How long cached OpenGraph metadata should last in memory
+  ttl: 24h
+  # # If set, return these opengraph values instead of looking them up with
+  # # the target service.
+  # #
+  # # Correlates to properties in https://ogp.me/
+  # override:
+  #   # og:title is required, it is the title of the website
+  #   "og:title": "Techaro Anubis"
+  #   "og:description": >-
+  #     Anubis is a Web AI Firewall Utility that helps you fight the bots
+  #     away so that you can maintain uptime at work!
+  #   "description": >-
+  #     Anubis is a Web AI Firewall Utility that helps you fight the bots
+  #     away so that you can maintain uptime at work!
+
 # By default, send HTTP 200 back to clients that either get issued a challenge
 # or a denial. This seems weird, but this is load-bearing due to the fact that
 # the most aggressive scraper bots seem to really, really, want an HTTP 200 and

docs/docs/CHANGELOG.md

Lines changed: 2 additions & 0 deletions
@@ -11,6 +11,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+- Move Open Graph configuration [to the policy file](./admin/configuration/open-graph.mdx)
+- Enable support for default Open Graph metadata
 - Replace cidranger with bart for IP range checking, improving IP matching performance by 3-20x with zero heap
   allocations
 - Remove the unused `/test-error` endpoint and update the testing endpoint `/make-challenge` to only be enabled in

docs/docs/admin/configuration/open-graph.mdx

Lines changed: 33 additions & 0 deletions
@@ -9,12 +9,45 @@ This page provides detailed information on how to configure [Open Graph tag](htt
 
 ## Configuration Options
 
+Open Graph settings are configured in the `openGraph` section of the [Policy File](../policies.mdx).
+
+```yaml
+openGraph:
+  # Enables Open Graph passthrough
+  enabled: true
+  # Enables the use of the HTTP host in the cache key, this enables
+  # caching metadata for multiple http hosts at once.
+  considerHost: true
+  # How long cached OpenGraph metadata should last in memory
+  ttl: 24h
+  # If set, return these opengraph values instead of looking them up with
+  # the target service.
+  #
+  # Correlates to properties in https://ogp.me/
+  override:
+    # og:title is required, it is the title of the website
+    "og:title": "Techaro Anubis"
+    "og:description": >-
+      Anubis is a Web AI Firewall Utility that helps you fight the bots
+      away so that you can maintain uptime at work!
+    "description": >-
+      Anubis is a Web AI Firewall Utility that helps you fight the bots
+      away so that you can maintain uptime at work!
+```
+
+<details>
+<summary>Configuration flags / envvars (old)</summary>
+
+Open Graph passthrough used to be configured with configuration flags / environment variables. Reference to these settings are maintained for backwards compatibility's sake.
+
 | Name | Description | Type | Default | Example |
 | ------------------------ | --------------------------------------------------------- | -------- | ------- | ----------------------------- |
 | `OG_PASSTHROUGH` | Enables or disables the Open Graph tag passthrough system | Boolean | `true` | `OG_PASSTHROUGH=true` |
 | `OG_EXPIRY_TIME` | Configurable cache expiration time for Open Graph tags | Duration | `24h` | `OG_EXPIRY_TIME=1h` |
 | `OG_CACHE_CONSIDER_HOST` | Enables or disables the use of the host in the cache key | Boolean | `false` | `OG_CACHE_CONSIDER_HOST=true` |
 
+</details>
+
 ## Usage
 
 To configure Open Graph tags, you can set the following environment variables, environment file or as flags in your Anubis configuration:

docs/docs/admin/installation.mdx

Lines changed: 19 additions & 7 deletions
@@ -4,7 +4,6 @@ title: Setting up Anubis
 
 import RandomKey from "@site/src/components/RandomKey";
 
-
 Anubis is meant to sit between your reverse proxy (such as Nginx or Caddy) and your target service. One instance of Anubis must be used per service you are protecting.
 
 <center>
@@ -30,7 +29,7 @@ TLS terminator)
 Anubis is shipped in the Docker repo [`ghcr.io/techarohq/anubis`](https://github.com/TecharoHQ/anubis/pkgs/container/anubis). The following tags exist for your convenience:
 
 | Tag | Meaning |
-|:--------------------|:-----------------------------------------------------------------------------------------------------------------------------------|
+| :------------------ | :--------------------------------------------------------------------------------------------------------------------------------- |
 | `latest` | The latest [tagged release](https://github.com/TecharoHQ/anubis/releases), if you are in doubt, start here. |
 | `v<version number>` | The Anubis image for [any given tagged release](https://github.com/TecharoHQ/anubis/tags) |
 | `main` | The current build on the `main` branch. Only use this if you need the latest and greatest features as they are merged into `main`. |
@@ -43,12 +42,24 @@ Anubis has very minimal system requirements. I suspect that 128Mi of ram may be
4342

4443
For more detailed information on installing Anubis with native packages, please read [the native install directions](./native-install.mdx).
4544

46-
## Environment variables
45+
## Configuration
46+
47+
Anubis is configurable via environment variables and [the policy file](./policies.mdx). Most settings are currently exposed with environment variables but they are being slowly moved over to the policy file.
48+
49+
### Configuration via the policy file
50+
51+
Currently the following settings are configurable via the policy file:
52+
53+
- [Bot policies](./policies.mdx)
54+
- [Open Graph passthrough](./configuration/open-graph.mdx)
55+
- [Weight thresholds](./configuration/thresholds.mdx)
56+
57+
### Environment variables
4758

4859
Anubis uses these environment variables for configuration:
4960

5061
| Environment Variable | Default value | Explanation |
51-
|:-------------------------------|:------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
62+
| :----------------------------- | :---------------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
5263
| `BASE_PREFIX` | unset | If set, adds a global prefix to all Anubis endpoints. For example, setting this to `/myapp` would make Anubis accessible at `/myapp/` instead of `/`. This is useful when running Anubis behind a reverse proxy that routes based on path prefixes. |
5364
| `BIND` | `:8923` | The network address that Anubis listens on. For `unix`, set this to a path: `/run/anubis/instance.sock` |
5465
| `BIND_NETWORK` | `tcp` | The address family that Anubis listens on. Accepts `tcp`, `unix` and anything Go's [`net.Listen`](https://pkg.go.dev/net#Listen) supports. |
@@ -60,9 +71,9 @@ Anubis uses these environment variables for configuration:
 | `ED25519_PRIVATE_KEY_HEX_FILE` | unset | Path to a file containing the hex-encoded ed25519 private key. Only one of this or its sister option may be set. |
 | `METRICS_BIND` | `:9090` | The network address that Anubis serves Prometheus metrics on. See `BIND` for more information. |
 | `METRICS_BIND_NETWORK` | `tcp` | The address family that the Anubis metrics server listens on. See `BIND_NETWORK` for more information. |
-| `OG_EXPIRY_TIME` | `24h` | The expiration time for the Open Graph tag cache. |
-| `OG_PASSTHROUGH` | `false` | If set to `true`, Anubis will enable Open Graph tag passthrough. |
-| `OG_CACHE_CONSIDER_HOST` | `false` | If set to `true`, Anubis will consider the host in the Open Graph tag cache key. |
+| `OG_EXPIRY_TIME` | `24h` | The expiration time for the Open Graph tag cache. Prefer using [the policy file](./configuration/open-graph.mdx) to configure the Open Graph subsystem. |
+| `OG_PASSTHROUGH` | `false` | If set to `true`, Anubis will enable Open Graph tag passthrough. Prefer using [the policy file](./configuration/open-graph.mdx) to configure the Open Graph subsystem. |
+| `OG_CACHE_CONSIDER_HOST` | `false` | If set to `true`, Anubis will consider the host in the Open Graph tag cache key. Prefer using [the policy file](./configuration/open-graph.mdx) to configure the Open Graph subsystem. |
 | `POLICY_FNAME` | unset | The file containing [bot policy configuration](./policies.mdx). See the bot policy documentation for more details. If unset, the default bot policy configuration is used. |
 | `REDIRECT_DOMAINS` | unset | If set, restrict the domains that Anubis can redirect to when passing a challenge.<br/><br/>If this is unset, Anubis may redirect to any domain which could cause security issues in the unlikely case that an attacker passes a challenge for your browser and then tricks you into clicking a link to your domain.<br/><br/>Note that if you are hosting Anubis on a non-standard port (`https://example:com:8443`, `http://www.example.net:8080`, etc.), you must also include the port number here. |
 | `SERVE_ROBOTS_TXT` | `false` | If set `true`, Anubis will serve a default `robots.txt` file that disallows all known AI scrapers by name and then additionally disallows every scraper. This is useful if facts and circumstances make it difficult to change the underlying service to serve such a `robots.txt` file. |
@@ -138,6 +149,7 @@ STRIP_BASE_PREFIX=true
138149
```
139150

140151
With this configuration:
152+
141153
- A request to `/myapp/api/users` would be forwarded to your target service as `/api/users`
142154
- A request to `/myapp/` would be forwarded as `/`
143155

internal/ogtags/cache.go

Lines changed: 4 additions & 0 deletions
@@ -13,6 +13,10 @@ func (c *OGTagCache) GetOGTags(url *url.URL, originalHost string) (map[string]st
 		return nil, errors.New("nil URL provided, cannot fetch OG tags")
 	}
 
+	if len(c.ogOverride) != 0 {
+		return c.ogOverride, nil
+	}
+
 	target := c.getTarget(url)
 	cacheKey := c.generateCacheKey(target, originalHost)
 