newcom.ca
robots.txt

Robots Exclusion Standard data for newcom.ca

Resource Scan

Scan Details

Site Domain newcom.ca
Base Domain newcom.ca
Scan Status Ok
Last Scan 2024-10-19T20:45:06+00:00
Next Scan 2024-11-18T20:45:06+00:00

Last Scan

Scanned 2024-10-19T20:45:06+00:00
URL https://newcom.ca/robots.txt
Redirect https://www.newcom.ca/robots.txt
Redirect Domain www.newcom.ca
Redirect Base newcom.ca
Domain IPs 2600:1f11:793:c400:dbff:7a6a:43bb:43f2, 3.97.123.209
Redirect IPs 2600:1f11:793:c400:dbff:7a6a:43bb:43f2, 3.97.123.209
Response IP 3.97.123.209
Found Yes
Hash f3423b9bf2b0aa522ebabbf05003e09528e23aa506a686afbad9699ca6ab8a36
SimHash 5c2c59d0a4b1
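
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the scan does not name the algorithm. As a minimal sketch under that assumption, a digest of the same shape could be computed from the raw robots.txt bytes as follows. The function name `content_digest` is hypothetical, and the result will only match the recorded Hash if the scanner hashed the identical bytes with SHA-256.

```python
import hashlib


def content_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body.

    Assumption: the scan's Hash field is SHA-256; the report does not say so.
    """
    return hashlib.sha256(body).hexdigest()


# Example with a placeholder body (not the real file contents).
digest = content_digest(b"User-agent: *\nDisallow:\n")
print(len(digest))  # length of the hex string
```

Comparing a freshly computed digest against a stored one is a cheap way to detect that the file changed between scans.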

Groups

*

Rule Path
Disallow (empty)

amazonbot
anthropic-ai
applebot
awariorssbot
awariosmartbot
bytespider
ccbot
chatgpt-user
claudebot
claude-web
cohere-ai
dataforseobot
diffbot
facebookbot
friendlycrawler
googleother
img2dataset
imagesiftbot
magpie-crawler
meltwater
omgili
omgilibot
peer39_crawler
peer39_crawler/1.0
perplexitybot
piplbot
scoop.it
seekr
youbot

Rule Path
Disallow /
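
The two groups above can be checked with Python's standard `urllib.robotparser`. The sketch below rebuilds a shortened robots.txt from the scan data, keeping only three of the listed agents for brevity; the exact layout of the real file is an assumption, and `is_allowed` is a hypothetical helper. It illustrates that an empty Disallow permits everything for `*`, while `Disallow: /` blocks the named crawlers entirely.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan's groups; the real file lists many more agents
# and its exact formatting is assumed here.
ROBOTS_TXT = """\
User-agent: *
Disallow:

User-agent: ccbot
User-agent: claudebot
User-agent: perplexitybot
Disallow: /

Sitemap: https://www.newcom.ca/sitemap_index.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(agent: str, url: str = "https://www.newcom.ca/") -> bool:
    """Return True if `agent` may fetch `url` under the rules parsed above."""
    return parser.can_fetch(agent, url)


for agent in ("Googlebot", "CCBot", "ClaudeBot", "PerplexityBot"):
    print(agent, is_allowed(agent))
```

Agent matching in `urllib.robotparser` is case-insensitive, so `CCBot` matches the lowercase `ccbot` entry. Crawlers not named in the second group fall through to the `*` group and are allowed.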

Other Records

Field Value
sitemap https://www.newcom.ca/sitemap_index.xml

Comments

  • START YOAST BLOCK
  • ---------------------------
  • ---------------------------
  • END YOAST BLOCK
  • Disallow Rules