feat(config): Add FCrDNS checker #682
base: main
Conversation
Hey, thanks for the contribution! The CHALLENGE rule is more meant for client-facing challenges. You probably want something like the checker.Impl interface here. This will let you add an …
I have finished converting the implementation to a checker.
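For context, here is a rough, self-contained sketch of what an FCrDNS checker can look like. The Impl interface below is an assumption for illustration; the actual checker.Impl in Anubis may declare a different signature and return type, and the domain list is just an example.

```go
// Sketch of an FCrDNS checker. The Impl interface is an assumption for
// illustration; Anubis's real checker interface may differ.
package fcrdns

import (
	"net"
	"net/http"
	"net/netip"
	"strings"
)

// Impl is an assumed stand-in for Anubis's checker interface.
type Impl interface {
	Check(r *http.Request) (bool, error)
}

// Checker passes a request only if its source IP forward-confirms into one
// of the allowed domains: a reverse (PTR) lookup names a host under one of
// the domains, and a forward lookup of that host returns the original IP.
type Checker struct {
	Domains []string // e.g. []string{"googlebot.com", "google.com"}
}

var _ Impl = (*Checker)(nil)

func (c *Checker) Check(r *http.Request) (bool, error) {
	ipStr, _, err := net.SplitHostPort(r.RemoteAddr)
	if err != nil {
		ipStr = r.RemoteAddr // no port present; use the address as-is
	}
	want, err := netip.ParseAddr(ipStr)
	if err != nil {
		return false, err
	}

	ctx := r.Context()
	names, err := net.DefaultResolver.LookupAddr(ctx, ipStr)
	if err != nil {
		return false, err
	}

	for _, name := range names {
		host := strings.TrimSuffix(name, ".") // PTR answers are fully qualified
		if !hasAllowedSuffix(host, c.Domains) {
			continue
		}
		// Forward-confirm: the claimed hostname must resolve back to the IP.
		addrs, err := net.DefaultResolver.LookupHost(ctx, host)
		if err != nil {
			continue
		}
		for _, a := range addrs {
			if got, err := netip.ParseAddr(a); err == nil && got == want {
				return true, nil
			}
		}
	}
	return false, nil
}

func hasAllowedSuffix(host string, domains []string) bool {
	for _, d := range domains {
		if host == d || strings.HasSuffix(host, "."+d) {
			return true
		}
	}
	return false
}
```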
Approved modulo the change to checker.List#Check
I have added CEL bindings and reverted to the previous checker behavior.
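For readers unfamiliar with how CEL bindings work: with cel-go, a custom function can be registered on the environment and then called from policy expressions. The fcrdns name and signature below are assumptions for illustration, not necessarily the bindings this PR adds.

```go
package main

import (
	"fmt"

	"github.com/google/cel-go/cel"
	"github.com/google/cel-go/common/types"
	"github.com/google/cel-go/common/types/ref"
)

func main() {
	// Hypothetical binding: fcrdns(ip, domain) -> bool. Name and signature
	// are illustrative, not the actual bindings in this PR.
	env, err := cel.NewEnv(
		cel.Function("fcrdns",
			cel.Overload("fcrdns_string_string",
				[]*cel.Type{cel.StringType, cel.StringType},
				cel.BoolType,
				cel.BinaryBinding(func(ip, domain ref.Val) ref.Val {
					// Real code would run the reverse/forward DNS walk here.
					ok := forwardConfirmed(ip.Value().(string), domain.Value().(string))
					return types.Bool(ok)
				}),
			),
		),
	)
	if err != nil {
		panic(err)
	}

	ast, iss := env.Compile(`fcrdns("192.0.2.1", "googlebot.com")`)
	if iss.Err() != nil {
		panic(iss.Err())
	}
	prg, err := env.Program(ast)
	if err != nil {
		panic(err)
	}
	out, _, err := prg.Eval(cel.NoVars())
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // false: 192.0.2.1 is a documentation IP
}

// forwardConfirmed is a stub standing in for the actual FCrDNS walk.
func forwardConfirmed(ip, domain string) bool { return false }
```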
If a client claims to be Googlebot but isn't from Google, that's kinda suspicious and should be treated as such.
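Google documents that genuine Googlebot traffic reverse-resolves to hostnames under googlebot.com or google.com, so a Googlebot User-Agent whose IP fails FCrDNS into those domains can be treated as suspicious. A minimal sketch of that decision, assuming verifiedHost holds the FCrDNS-confirmed hostname (empty when confirmation failed):

```go
package fcrdns

import "strings"

// claimsGooglebotButUnverified flags requests whose User-Agent claims to be
// Googlebot while the source IP failed FCrDNS into Google's crawler domains.
// verifiedHost is assumed to be the FCrDNS-confirmed hostname ("" on failure).
func claimsGooglebotButUnverified(userAgent, verifiedHost string) bool {
	if !strings.Contains(userAgent, "Googlebot") {
		return false // not claiming to be Googlebot; nothing to check
	}
	for _, domain := range []string{"googlebot.com", "google.com"} {
		if verifiedHost == domain || strings.HasSuffix(verifiedHost, "."+domain) {
			return false // verified Googlebot
		}
	}
	return true // claims Googlebot, but FCrDNS did not back it up
}
```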
Thanks much! This is gonna let us do a lot of fun things :)
Ahrefs is a large SEO company whose tools are used by everyone from single bloggers to large enterprises. It may be beneficial to allow (or deny) them in Anubis. They do publish rDNS entries, so once an Anubis version with TecharoHQ#682 is released, this policy would benefit from setting up that check. Crawler information: https://ahrefs.com/robot

Majestic is a UK-based specialist search engine and commercial SEO entity. They claim to "spider the Web for the purpose of building a search engine" with a distributed crawler. Defaults to allow, as it would be caught by the generic browser policy definition anyway. Crawler information: https://mj12bot.com

Screaming Frog is a smaller actor in the SEO space, and their crawler occasionally attempts to access content despite being explicitly excluded via robots.txt directives. As far as I could research, they neither publish their IP ranges nor provide an information page for their crawler. That's why this defaults to deny. Company website: https://www.screamingfrog.co.uk

Checkmark Network is a brand and intellectual property protection company. If you have no direct business with them, it is likely they shouldn't be crawling your content in the first place. Defaults to deny for this reason. Crawler information: https://www.checkmarknetwork.com/spider.html/

DomainsBot collects information on domains and website data for intellectual property disputes. Unless you have direct business with them, there's likely no reason for them to be accessing your content. Defaults to deny. Crawler information: https://domainsbot.com/pandalytics/

ZoomInfo is a data mining and sales platform for enterprise use, feeding the gathered information into a machine learning model. It is unlikely to be of value to anyone else; therefore, this defaults to deny. Company website: https://www.zoominfo.com
Closes #431. This PR implements dynamic verification of bot IPs using DNS records. For details regarding how it works, see the documentation I have added. I am not sure if the way I added a new `algorithm` is the best way to implement this; let me know if there is a better way. (A small test sketch for one matching edge case follows after the checklist.)

Checklist:
- Added a changelog entry to the [Unreleased] section of docs/docs/CHANGELOG.md
- Ran npm run test:integration (unsupported on Windows, please use WSL)
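One edge case worth covering in tests for any FCrDNS implementation: PTR lookups return fully qualified names with a trailing dot, and naive suffix matching would let a name like evil-googlebot.com impersonate googlebot.com. A hedged test sketch (matchesDomain is an illustrative helper, not a function from this PR):

```go
package fcrdns

import (
	"strings"
	"testing"
)

// matchesDomain reports whether host (as returned by a PTR lookup, possibly
// with a trailing dot) is the given domain itself or a subdomain of it.
func matchesDomain(host, domain string) bool {
	host = strings.TrimSuffix(host, ".")
	return host == domain || strings.HasSuffix(host, "."+domain)
}

func TestMatchesDomain(t *testing.T) {
	cases := []struct {
		host, domain string
		want         bool
	}{
		{"crawl-66-249-66-1.googlebot.com.", "googlebot.com", true}, // trailing dot from PTR
		{"googlebot.com", "googlebot.com", true},                    // exact match
		{"evil-googlebot.com", "googlebot.com", false},              // bare suffix isn't enough
		{"a.b.googlebot.com", "googlebot.com", true},                // nested subdomain
	}
	for _, c := range cases {
		if got := matchesDomain(c.host, c.domain); got != c.want {
			t.Errorf("matchesDomain(%q, %q) = %v, want %v", c.host, c.domain, got, c.want)
		}
	}
}
```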