Commit 7c19f68

feat: add more service policy definitions
Ahrefs is a large SEO company whose customers range from individual bloggers to large enterprises. It may be beneficial to allow (or deny) their crawler in Anubis. They publish rDNS entries, so once an Anubis version with TecharoHQ#682 is released, this policy would benefit from setting up that check.
Crawler information: https://ahrefs.com/robot

Majestic is a UK-based specialist search engine and commercial SEO entity. They claim to "spider the Web for the purpose of building a search engine" with a distributed crawler. Defaults to allow, as it would otherwise be caught by the generic browser policy definition.
Crawler information: https://mj12bot.com

Screaming Frog is a smaller actor in the SEO space, and their crawler occasionally attempts to access content despite being explicitly excluded via robots.txt directives. As far as I could determine, they neither publish their IP ranges nor provide an information page for their crawler. That's why this defaults to deny.
Company website: https://www.screamingfrog.co.uk

Checkmark Network is a brand and intellectual property protection company. If you have no direct business with them, they likely shouldn't be crawling your content in the first place. Defaults to deny for this reason.
Crawler information: https://www.checkmarknetwork.com/spider.html/

Domainsbot collects information on domains and website data for intellectual property disputes. Unless you have direct business with them, there's likely no reason for them to be accessing your content. Defaults to deny.
Crawler information: https://domainsbot.com/pandalytics/

ZoomInfo is a data mining and sales platform for enterprise use that feeds the gathered information into a machine learning model. It is unlikely to be of value to anyone else. Therefore, this defaults to deny.
Company website: https://www.zoominfo.com
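
The rDNS check mentioned for Ahrefs is commonly done as forward-confirmed reverse DNS: look up the PTR record for the client address, check the hostname suffix, then resolve that hostname forward and confirm it maps back to the same address. The sketch below illustrates the general technique only. It is not the TecharoHQ#682 implementation, the function names are hypothetical, and the hostname suffixes are an assumption that should be verified against https://ahrefs.com/robot.

```python
import ipaddress
import socket
from typing import Callable, Iterable, Tuple

# ASSUMPTION: illustrative suffixes; confirm the real crawler hostnames
# against the operator's documentation before relying on them.
TRUSTED_SUFFIXES = (".ahrefs.com", ".ahrefs.net")


def system_reverse(ip: str) -> str:
    """PTR lookup through the system resolver."""
    return socket.gethostbyaddr(ip)[0]


def system_forward(host: str) -> Iterable[str]:
    """A/AAAA lookup through the system resolver."""
    return {info[4][0] for info in socket.getaddrinfo(host, None)}


def is_verified_crawler(
    ip: str,
    reverse: Callable[[str], str] = system_reverse,
    forward: Callable[[str], Iterable[str]] = system_forward,
    suffixes: Tuple[str, ...] = TRUSTED_SUFFIXES,
) -> bool:
    """Return True if the PTR name has a trusted suffix and resolves back to ip."""
    try:
        host = reverse(ip).rstrip(".").lower()
    except OSError:  # no PTR record or resolver failure
        return False
    if not host.endswith(suffixes):
        return False
    try:
        addresses = forward(host)
    except OSError:  # hostname does not resolve
        return False
    target = ipaddress.ip_address(ip)
    return any(ipaddress.ip_address(a) == target for a in addresses)
```

The resolver functions are injectable so the logic can be exercised without network access; in production the defaults use the system resolver.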
1 parent 132b2ed commit 7c19f68

6 files changed: +72 -0 lines changed
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+- name: checkmark-network
+  user_agent_regex: ^CheckMarkNetwork/
+  action: DENY
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+- name: pandalytics
+  user_agent_regex: ^Pandalytics/
+  action: DENY
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+- name: zoominfo
+  user_agent_regex: ZoominfoBot
+  action: DENY
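
The three deny rules above differ in anchoring: `^CheckMarkNetwork/` and `^Pandalytics/` only match at the start of the User-Agent header, while `ZoominfoBot` matches anywhere in it. The following sketch shows the difference using Python's `re.search`. It assumes Anubis applies the patterns as unanchored searches, which this commit does not state, and the sample User-Agent strings in the usage note are made up.

```python
import re

# Patterns copied from the policy files in this commit.
PATTERNS = {
    "checkmark-network": r"^CheckMarkNetwork/",
    "pandalytics": r"^Pandalytics/",
    "zoominfo": r"ZoominfoBot",
}


def matching_rules(user_agent: str) -> list:
    """Return the names of all rules whose pattern is found in user_agent."""
    # re.search scans the whole string, so only patterns that start with ^
    # are tied to the beginning of the header.
    return [name for name, pat in PATTERNS.items() if re.search(pat, user_agent)]
```

With these patterns, a header such as `CheckMarkNetwork/1.0` matches the first rule, but `Mozilla/5.0 (compatible; CheckMarkNetwork/1.0)` does not, because the token is no longer at the start. `Mozilla/5.0 (compatible; ZoominfoBot)` still matches the unanchored zoominfo rule.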

data/services/seo/ahrefs.yaml

Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
+- name: ahrefs
+  user_agent_regex: (AhrefsBot|AhrefsSiteAudit)/
+  action: ALLOW
+  # https://api.ahrefs.com/v3/public/crawler-ip-ranges
+  remote_addresses: [
+    "5.39.1.224/27",
+    "5.39.109.160/27",
+    "15.235.27.0/24",
+    "15.235.96.0/24",
+    "15.235.98.0/24",
+    "37.59.204.128/27",
+    "51.68.247.192/27",
+    "51.75.236.128/27",
+    "51.89.129.0/24",
+    "51.161.37.0/24",
+    "51.161.65.0/24",
+    "51.195.183.0/24",
+    "51.195.215.0/24",
+    "51.195.244.0/24",
+    "51.222.95.0/24",
+    "51.222.168.0/24",
+    "51.222.253.0/26",
+    "54.36.148.0/23",
+    "54.37.118.64/27",
+    "54.38.147.0/24",
+    "54.39.0.0/24",
+    "54.39.6.0/24",
+    "54.39.89.0/24",
+    "54.39.136.0/24",
+    "54.39.203.0/24",
+    "54.39.210.0/24",
+    "92.222.104.192/27",
+    "92.222.108.96/27",
+    "94.23.188.192/27",
+    "142.44.220.0/24",
+    "142.44.225.0/24",
+    "142.44.228.0/24",
+    "142.44.233.0/24",
+    "148.113.128.0/24",
+    "148.113.130.0/24",
+    "167.114.139.0/24",
+    "168.100.149.0/24",
+    "176.31.139.0/27",
+    "198.244.168.0/24",
+    "198.244.183.0/24",
+    "198.244.186.193/32",
+    "198.244.186.194/31",
+    "198.244.186.196/30",
+    "198.244.186.200/31",
+    "198.244.186.202/32",
+    "198.244.226.0/24",
+    "198.244.240.0/24",
+    "198.244.242.0/24",
+    "202.8.40.0/22",
+    "202.94.84.110/31",
+    "202.94.84.112/31",
+  ]
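
The ahrefs rule pairs a User-Agent pattern with a list of CIDR ranges. A rough sketch of how such a rule could be evaluated is shown below, using Python's `ipaddress` module. It assumes the rule applies only when both the User-Agent and the remote address match; the commit does not spell out that semantics, and `ahrefs_rule_applies` is a hypothetical helper. Only an excerpt of the address list is included to keep the example short.

```python
import ipaddress
import re

# Pattern copied from ahrefs.yaml.
UA_PATTERN = re.compile(r"(AhrefsBot|AhrefsSiteAudit)/")

# Excerpt of remote_addresses from ahrefs.yaml; the full list is in the diff.
RANGES = [ipaddress.ip_network(cidr) for cidr in (
    "5.39.1.224/27",
    "54.36.148.0/23",
    "198.244.186.194/31",
    "202.8.40.0/22",
)]


def ahrefs_rule_applies(user_agent: str, remote_ip: str) -> bool:
    """True only if the UA matches AND the address is in a listed range.

    ASSUMPTION: both conditions must hold for the rule to apply.
    """
    if not UA_PATTERN.search(user_agent):
        return False
    addr = ipaddress.ip_address(remote_ip)
    return any(addr in net for net in RANGES)
```

Under that assumption, a request that claims to be AhrefsBot from an address outside the published ranges would not be allowed by this rule and would fall through to whatever rules follow it.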

data/services/seo/mj12bot.yaml

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+- name: mj12bot
+  user_agent_regex: MJ12bot/
+  action: ALLOW

data/services/seo/screaming-frog.yaml

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+- name: screaming-frog
+  user_agent_regex: ^Screaming Frog SEO Spider/
+  action: DENY
