cap.news
robots.txt

Robots Exclusion Standard data for cap.news

Resource Scan

Scan Details

Site Domain	cap.news
Base Domain	cap.news
Scan Status	Ok
Last Scan	2026-02-05T19:41:24+00:00
Next Scan	2026-03-07T19:41:24+00:00

Last Scan

Scanned	2026-02-05T19:41:24+00:00
URL	https://www.cap.news/robots.txt
Domain IPs	104.18.68.40, 104.18.69.40, 2606:4700::6812:4428, 2606:4700::6812:4528
Response IP	104.18.68.40
Found	Yes
Hash	46d87c76f66c60881138397f15b68205866c397bbce64906555bc561cf3f1d72
SimHash	6f1d9c20ab11

Groups

amazonbot

Rule	Path
Disallow	/

googlebot

Rule	Path
Disallow	/nogooglebot/

*

Rule	Path
Disallow	/login

adsbot-google

Rule	Path
Disallow	/login

nutch

Rule	Path
Disallow	/

ahrefsbot

Rule	Path
Disallow	/login

Other Records

Field	Value
crawl-delay	10

ahrefssiteaudit

Rule	Path
Disallow	/login

Other Records

Field	Value
crawl-delay	10

mj12bot

Rule	Path
Disallow	/login

Other Records

Field	Value
crawl-delay	10

Other Records

Field	Value
sitemap	https://www.cap.news/sitemap.xml
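
The groups and records listed above can be checked programmatically. The following is a minimal sketch, not the site's actual file: it reconstructs a robots.txt from the scan tables and queries it with Python's standard `urllib.robotparser`. The group order, the lowercase user-agent tokens, and the `ExampleBot` crawler name are assumptions made for illustration; the live file at https://www.cap.news/robots.txt may differ in layout.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan's Groups and Other Records tables.
# Assumptions: group order follows the scan listing, and user-agent
# tokens are written in the lowercase form the scan reports.
ROBOTS_TXT = """\
User-agent: amazonbot
Disallow: /

User-agent: googlebot
Disallow: /nogooglebot/

User-agent: *
Disallow: /login

User-agent: adsbot-google
Disallow: /login

User-agent: nutch
Disallow: /

User-agent: ahrefsbot
Disallow: /login
Crawl-delay: 10

User-agent: ahrefssiteaudit
Disallow: /login
Crawl-delay: 10

User-agent: mj12bot
Disallow: /login
Crawl-delay: 10

Sitemap: https://www.cap.news/sitemap.xml
"""

BASE = "https://www.cap.news"


def build_parser(text: str = ROBOTS_TXT) -> RobotFileParser:
    """Parse robots.txt text in memory, without any network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser()

if __name__ == "__main__":
    # "ExampleBot" is a hypothetical crawler that falls under the "*" group.
    checks = [
        ("amazonbot", "/"),
        ("googlebot", "/nogooglebot/page"),
        ("googlebot", "/login"),
        ("ExampleBot", "/login"),
        ("ExampleBot", "/"),
    ]
    for agent, path in checks:
        print(agent, path, parser.can_fetch(agent, BASE + path))
    print("mj12bot crawl-delay:", parser.crawl_delay("mj12bot"))
    print("sitemaps:", parser.site_maps())  # site_maps() needs Python 3.8+
```

Note that a crawler with its own group, such as googlebot, is matched against that group only, so the `/login` rule in the `*` group does not apply to it in this parser.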

Comments

beehiiv default robots.txt
This is automatically used when you leave custom content empty
Customize below or upload your own robots.txt file
