caen-keepexploring.canada.travel
robots.txt

Robots Exclusion Standard data for caen-keepexploring.canada.travel

Resource Scan

Scan Details

Site Domain caen-keepexploring.canada.travel
Base Domain canada.travel
Scan Status Ok
Last Scan 2024-10-28T07:55:50+00:00
Next Scan 2024-11-27T07:55:50+00:00

Last Scan

Scanned 2024-10-28T07:55:50+00:00
URL https://caen-keepexploring.canada.travel/robots.txt
Redirect https://info.destinationcanada.com:443/robots.txt
Redirect Domain info.destinationcanada.com
Redirect Base destinationcanada.com
Domain IPs 13.33.30.14, 13.33.30.38, 13.33.30.50, 13.33.30.98
Redirect IPs 34.111.187.154
Response IP 34.111.187.154
Found Yes
Hash 4e5b2ebecbce24f61c0faa9bc065c80209c4d4bf080f1ea9205f7fa031d463be
SimHash e815d81365a0

Groups

*

Rule Path
Disallow

gsa-crawler

Rule Path
Disallow

akamai-sitesnapshot/*

Rule Path
Disallow

Comments

  • Prevent all user agents from crawling the staging sites, with the exception of the CTC's Google Search Appliance and Akamai.
  • Allow the CTC's Google Search Appliance to access the sites.
  • An empty value indicates that all URLs can be retrieved.
  • Allow the Akamai crawler to access the sites.
  • An empty value indicates that all URLs can be retrieved.
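
The empty-value rule in the comments above can be checked with Python's standard urllib.robotparser module. This is only a minimal sketch: the SAMPLE text, the "otherbot" agent name, the example.com URL and the allowed() helper are all hypothetical and are not taken from the scanned file. In particular, the "Disallow: /" line under "*" is included purely for contrast and is an assumption, since the scan lists no path for that group.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt text, NOT the scanned file.
# "Disallow: /" under "*" is an assumption added for contrast;
# the empty "Disallow:" under gsa-crawler is the case the comments describe.
SAMPLE = """\
User-agent: *
Disallow: /

User-agent: gsa-crawler
Disallow:
"""

# Placeholder URL used only for illustration.
URL = "https://example.com/some/page"


def allowed(agent, url, robots_txt=SAMPLE):
    """Return True if `agent` may fetch `url` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# An empty Disallow value places no restriction on the named agent.
print(allowed("gsa-crawler", URL))
# Any other agent falls through to the "*" group and is blocked by "Disallow: /".
print(allowed("otherbot", URL))
```

The named group takes precedence over the "*" group for a matching agent, which is why an empty Disallow there effectively grants that crawler full access.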