info.destinationcanada.com
robots.txt

Robots Exclusion Standard data for info.destinationcanada.com

Resource Scan

Scan Details

Site Domain info.destinationcanada.com
Base Domain destinationcanada.com
Scan Status Ok
Last Scan 2024-05-10T16:04:24+00:00
Next Scan 2024-06-09T16:04:24+00:00

Last Scan

Scanned 2024-05-10T16:04:24+00:00
URL https://info.destinationcanada.com/robots.txt
Domain IPs 2600:9000:2003:2e00:3:b612:40c0:93a1, 2600:9000:2003:5200:3:b612:40c0:93a1, 2600:9000:2003:6a00:3:b612:40c0:93a1, 2600:9000:2003:9c00:3:b612:40c0:93a1, 2600:9000:2003:a00:3:b612:40c0:93a1, 2600:9000:2003:a600:3:b612:40c0:93a1, 2600:9000:2003:d600:3:b612:40c0:93a1, 2600:9000:2003:da00:3:b612:40c0:93a1, 52.84.229.11, 52.84.229.120, 52.84.229.123, 52.84.229.13
Response IP 52.84.229.11
Found Yes
Hash 4e5b2ebecbce24f61c0faa9bc065c80209c4d4bf080f1ea9205f7fa031d463be
SimHash e815d81365a0

Groups

*

Rule Path
Disallow

gsa-crawler

Rule Path
Disallow

akamai-sitesnapshot/*

Rule Path
Disallow

Comments

  • Prevent all user-agents from crawling the staging sites, with the exception of the CTC Google Search Appliance and Akamai user-agents.
  • Allow the CTC's Google Search Appliance to access the sites.
  • An empty value indicates that all URLs can be retrieved.
  • Allow the Akamai crawler to access the sites.
  • An empty value indicates that all URLs can be retrieved.
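
The comments above rely on the rule that a Disallow line with an empty value permits every URL for that group. A minimal sketch of that behaviour, using Python's standard urllib.robotparser, might look like the following. The robots.txt text here is a hypothetical reconstruction: the scan does not show any rule paths, so the "Disallow: /" for the * group is an assumption drawn from the first comment, and "SomeBot" and the allowed() helper are illustrative names only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical reconstruction: the scan lists the groups but not their
# rule paths. "Disallow: /" for "*" is an assumption based on the file's
# first comment; the empty Disallow for gsa-crawler follows the comments.
ROBOTS_TXT = """\
User-agent: *
Disallow: /

User-agent: gsa-crawler
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, url: str = "https://info.destinationcanada.com/page") -> bool:
    """Return True if the given user-agent may fetch the URL."""
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ("gsa-crawler", "SomeBot"):
        print(agent, allowed(agent))
```

Under these assumptions, the empty Disallow leaves gsa-crawler unrestricted while any agent that falls through to the * group is blocked. Other parsers may match group names such as akamai-sitesnapshot/* differently, so that group is left out of the sketch.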