cambridgetoday.ca
robots.txt

Robots Exclusion Standard data for cambridgetoday.ca

Resource Scan

Scan Details

Site Domain cambridgetoday.ca
Base Domain cambridgetoday.ca
Scan Status Ok
Last Scan 2024-09-18T19:28:54+00:00
Next Scan 2024-09-25T19:28:54+00:00

Last Scan

Scanned 2024-09-18T19:28:54+00:00
URL https://cambridgetoday.ca/robots.txt
Redirect https://www.cambridgetoday.ca/robots.txt
Redirect Domain www.cambridgetoday.ca
Redirect Base cambridgetoday.ca
Domain IPs 104.18.4.236, 104.18.5.236, 2606:4700::6812:4ec, 2606:4700::6812:5ec
Redirect IPs 104.18.4.236, 104.18.5.236, 2606:4700::6812:4ec, 2606:4700::6812:5ec
Response IP 104.18.4.236
Found Yes
Hash 3a1e4d3a06d39b0a828ac93e34ef3f705fcacf9a079ae09bbc5aedf488a8e4cb
SimHash 4904cb90c510

Groups

*

Rule Path
Allow /

googlebot-news

Rule Path
Allow /rss/showcase

googlebot

Rule Path
Allow /rss/showcase

semrushbot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /
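
Example

The groups above can be checked programmatically. The following is a minimal sketch, not the scanner's own method: it rebuilds a robots.txt from the listed groups and queries it with Python's standard urllib.robotparser. The ROBOTS_TXT text is an assumed reconstruction (the real file may differ in group order, casing, comments, or extra directives), and the names BASE_URL, build_parser, and is_allowed are hypothetical helpers introduced only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Host taken from the redirect target reported in the scan.
BASE_URL = "https://www.cambridgetoday.ca"

# ASSUMPTION: reconstructed from the groups listed in the scan.
# The live file may contain other lines that the scan does not show.
ROBOTS_TXT = """\
User-agent: *
Allow: /

User-agent: googlebot-news
Allow: /rss/showcase

User-agent: googlebot
Allow: /rss/showcase

User-agent: semrushbot
Disallow: /

User-agent: gptbot
Disallow: /

User-agent: petalbot
Disallow: /

User-agent: perplexitybot
Disallow: /
"""


def build_parser(text=ROBOTS_TXT):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(agent, path, parser=None):
    """Return True if the given user agent may fetch the path under these rules."""
    if parser is None:
        parser = build_parser()
    return parser.can_fetch(agent, BASE_URL + path)


if __name__ == "__main__":
    for agent in ("Googlebot", "GPTBot", "SemrushBot", "ExampleBot"):
        print(agent, is_allowed(agent, "/rss/showcase"))
```

Note that urllib.robotparser matches user agents by case-insensitive substring and applies the first matching rule in a group, so its answers may differ from crawlers that use longest-match semantics.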