caen-keepexploring.canada.travel
robots.txt

Robots Exclusion Standard data for caen-keepexploring.canada.travel

Resource Scan

Scan Details

Site Domain	caen-keepexploring.canada.travel
Base Domain	canada.travel
Scan Status	Ok
Last Scan	2024-10-28T07:55:50+00:00
Next Scan	2024-11-27T07:55:50+00:00

Last Scan

Scanned	2024-10-28T07:55:50+00:00
URL	https://caen-keepexploring.canada.travel/robots.txt
Redirect	https://info.destinationcanada.com:443/robots.txt
Redirect Domain	info.destinationcanada.com
Redirect Base	destinationcanada.com
Domain IPs	13.33.30.14, 13.33.30.38, 13.33.30.50, 13.33.30.98
Redirect IPs	34.111.187.154
Response IP	34.111.187.154
Found	Yes
Hash	4e5b2ebecbce24f61c0faa9bc065c80209c4d4bf080f1ea9205f7fa031d463be
SimHash	e815d81365a0

Groups

*

Rule	Path
Disallow	(empty)

gsa-crawler

Rule	Path
Disallow	(empty)

akamai-sitesnapshot/*

Rule	Path
Disallow	(empty)


Comments

Block all user-agents from crawling the staging sites, with the exception of the user-agents of the CTC Google Search Appliance and Akamai.
Allow the CTC's Google Search Appliance to access the sites.
An empty value indicates that all URLs can be retrieved.
Allow the Akamai crawler to access the sites.
An empty value indicates that all URLs can be retrieved.
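
The empty-value rule stated in the comments can be checked with a short sketch. It uses Python's standard urllib.robotparser module. The two robots.txt snippets, the helper name can_fetch, and the example.com URL are illustrative assumptions, not the contents of the scanned file. An empty Disallow places no restriction on the named crawler, whereas "Disallow: /" blocks every path.

```python
from urllib.robotparser import RobotFileParser


def can_fetch(robots_txt: str, agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical group with an empty Disallow value: nothing is blocked.
EMPTY_DISALLOW = """\
User-agent: gsa-crawler
Disallow:
"""

# Hypothetical contrast: "Disallow: /" blocks every path for the same agent.
BLOCK_ALL = """\
User-agent: gsa-crawler
Disallow: /
"""

# Placeholder URL used only for illustration.
URL = "https://example.com/any/page"

if __name__ == "__main__":
    print(can_fetch(EMPTY_DISALLOW, "gsa-crawler", URL))
    print(can_fetch(BLOCK_ALL, "gsa-crawler", URL))
```

Other robots.txt parsers may match user-agent tokens differently, so this only illustrates how an empty Disallow value is interpreted.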
