Does not actually block bot traffic (tested with curl) #809
-
I am currently deploying Anubis in front of our project's test logs. With a real browser I see the Anubis proof-of-work page, and then it goes through to the site. But what worries me is that it does not actually seem to block automated traffic: when I run `curl` against the same URL, I get the content back directly, without ever seeing a challenge.

I could probably have made an error in our configuration, but I use a fairly standard one:

```yaml
- name: anubis
  image: ghcr.io/techarohq/anubis:latest
  ports:
    - containerPort: 8080
      protocol: TCP
      name: anubis-port
  env:
    # https://anubis.techaro.lol/docs/admin/installation/
    - name: BIND
      value: ":8080"
    - name: METRICS_BIND
      value: ":9099"
    - name: SERVE_ROBOTS_TXT
      value: "true"
    - name: TARGET
      # app container listens on port 10080
      value: "http://localhost:10080/"
    - name: DIFFICULTY
      value: "6"
    - name: COOKIE_EXPIRATION_TIME
      value: "24h"
```

But this is equally true for other sites that have Anubis, e.g.:
Opening them with Firefox does the proof-of-work, but running `curl` against them returns the content directly. What am I missing? Can Anubis be configured to not allow this? Thank you!
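For concreteness, a minimal sketch of the check being described (the hostname is a placeholder, not the actual deployment):

```sh
# Placeholder URL -- substitute the real site behind Anubis.
# Firefox shows the Anubis proof-of-work interstitial here, but curl's
# default User-Agent is proxied straight through to the backend:
curl -s https://logs.example.org/ | head -n 20
```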
-
@martinpitt curl isn't actually a misbehaved / deceptive user agent on its own, and it definitely isn't an AI training bot by default. It is, however, a helpful debugging tool/library used by humans and infrastructure software alike. You might still want to do a bucketed rate-limit for curl users, but not really much more aggressively than any other user agent. TL;DR: Curl is friend shaped.
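If you do want such a bucketed rate limit, one place to do it is the reverse proxy in front of Anubis rather than Anubis itself. A rough sketch for nginx, with all zone names, rates, and ports made up (the `proxy_pass` target assumes the `:8080` Anubis bind from the config above); these directives belong in the `http` block:

```nginx
# Limit only curl-like user agents, per client IP; other agents get an
# empty key and are therefore not rate-limited by this zone.
map $http_user_agent $curl_limit_key {
    default     "";
    "~*^curl/"  $binary_remote_addr;
}

limit_req_zone $curl_limit_key zone=curl_bucket:10m rate=5r/s;

server {
    listen 80;
    location / {
        limit_req zone=curl_bucket burst=10 nodelay;
        proxy_pass http://127.0.0.1:8080;   # Anubis listens here
    }
}
```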
-
Hi, welcome to the land of tradeoffs.

Anubis is intended to have two "modes of operation": the default config, which attempts to break as little as possible in the process of being added in, and the non-default config, where administrators have customized it for their needs. Curl and the like are allowed through in that first mode so that package managers, monitoring tools, and as much sysadmin tooling as possible doesn't break.

If you want to show challenges across the board, you need to have a base weight rule like this (a sketch of the accompanying allow rules follows this reply):

```yaml
bots:
  - name: base-weight
    action: WEIGH
    expression: "true"
    weight:
      adjust: 5
  # other rules go here
```

Then you need to account for all the user agents of all the software that should be allowed to use the service without being a browser, such as the git client, curl, wget, etc. This is a pain, and I have yet to complete a set of rules or establish guidance on how to do this. Most of the time administrators don't have a complete list of everything that should be allowed to communicate with a web service without that tool being a browser.

The other reason vanilla curl is allowed out of the box is that Anubis is targeted at the patterns abusive scrapers use. They don't just use curl with its default user agent. I am collecting some data from honeypots to try and get better heuristics, but the main lesson I've learned working on Anubis is that shitty heuristics buy you time. The core of how Anubis works is an exceptionally shitty heuristic. This has backfired a little, but solving that is a much smaller problem space than what Anubis solves as a whole.
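To make that second step concrete, here is a rough sketch of what per-tool allow rules might look like, placed before the base weight rule. The `user_agent_regex` field and the specific regexes are assumptions to verify against the current Anubis policy docs, and the list is deliberately incomplete; whether to allow curl at all is a per-site policy choice:

```yaml
bots:
  # Hypothetical allow-list entries for non-browser clients; extend this
  # for every tool that legitimately needs to reach the service.
  - name: allow-git-client
    user_agent_regex: "^git/"
    action: ALLOW
  - name: allow-curl
    user_agent_regex: "^curl/"
    action: ALLOW
  - name: allow-wget
    user_agent_regex: "^Wget/"
    action: ALLOW
  # Everything else picks up the base weight and gets challenged.
  - name: base-weight
    action: WEIGH
    expression: "true"
    weight:
      adjust: 5
```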