dia.com
robots.txt

Robots Exclusion Standard data for dia.com

Resource Scan

Scan Details

Site Domain dia.com
Base Domain dia.com
Scan Status Ok
Last Scan 2024-10-16T16:07:09+00:00
Next Scan 2024-11-15T16:07:09+00:00
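
The two timestamps above are exactly 30 days apart. The report does not state the scanner's scheduling policy, so the 30-day interval in this sketch is an inference from those two values, and `next_scan` is a hypothetical helper name.

```python
from datetime import datetime, timedelta

# Assumption: the rescan interval is 30 days, inferred from the
# Last Scan and Next Scan values in this report.
RESCAN_INTERVAL = timedelta(days=30)

def next_scan(last_scan_iso: str) -> str:
    """Return the ISO 8601 timestamp one rescan interval after last_scan_iso."""
    last_scan = datetime.fromisoformat(last_scan_iso)
    return (last_scan + RESCAN_INTERVAL).isoformat()

computed = next_scan("2024-10-16T16:07:09+00:00")
```

With the Last Scan value from this report, `computed` equals the Next Scan value listed above.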

Last Scan

Scanned 2024-10-16T16:07:09+00:00
URL https://dia.com/robots.txt
Redirect https://www.dia.com:443/robots.txt
Redirect Domain www.dia.com
Redirect Base dia.com
Domain IPs 13.227.254.35, 13.227.254.7, 13.227.254.8, 13.227.254.89, 2600:9000:200a:1200:1a:607c:7ac0:93a1, 2600:9000:200a:3400:1a:607c:7ac0:93a1, 2600:9000:200a:4a00:1a:607c:7ac0:93a1, 2600:9000:200a:8000:1a:607c:7ac0:93a1, 2600:9000:200a:8c00:1a:607c:7ac0:93a1, 2600:9000:200a:da00:1a:607c:7ac0:93a1, 2600:9000:200a:dc00:1a:607c:7ac0:93a1, 2600:9000:200a:fa00:1a:607c:7ac0:93a1
Redirect IPs 34.196.173.217, 34.205.176.224, 35.169.16.230, 52.22.193.118
Response IP 35.169.16.230
Found Yes
Hash 9c13f0c4a19cbc57338395b44124ed830c47546ceb31c80c1111d0d1e7571863
SimHash 3a850d8d7450
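
The Hash value is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, but the report does not name the algorithm. The sketch below assumes SHA-256 over the raw response body; `digest_matches` is a hypothetical helper, and the comparison may fail if the scanner hashes a normalized form of the file instead.

```python
import hashlib

# Hash value copied from the scan record above.
REPORTED_HASH = "9c13f0c4a19cbc57338395b44124ed830c47546ceb31c80c1111d0d1e7571863"

def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of body as lowercase hex."""
    return hashlib.sha256(body).hexdigest()

def digest_matches(body: bytes, expected_hex: str = REPORTED_HASH) -> bool:
    """Assumption: the reported hash is SHA-256 of the raw robots.txt bytes."""
    return sha256_hex(body) == expected_hex.lower()
```

A fetched copy of the file could be passed to `digest_matches` to check whether it has changed since the scan, provided the SHA-256 assumption holds.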

Groups

*

Rule Path
Disallow /maintenance.html
Disallow /external/whoami
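
As a rough illustration of how a crawler might apply the two rules above, the following sketch feeds them to Python's standard `urllib.robotparser`. The rule lines are reconstructed from this report rather than copied from the live file, and the `ExampleBot` user agent is a made-up name.

```python
from urllib.robotparser import RobotFileParser

# Rules reconstructed from the "*" group in this report;
# the real file's layout may differ.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /maintenance.html",
    "Disallow: /external/whoami",
]

def build_parser(lines):
    """Parse robots.txt lines into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser

def is_allowed(parser, url, agent="ExampleBot"):
    """Return True if the (hypothetical) agent may fetch url."""
    return parser.can_fetch(agent, url)

parser = build_parser(ROBOTS_LINES)
```

Under these rules, `is_allowed(parser, "https://www.dia.com/maintenance.html")` is false, while the site root remains fetchable.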

Other Records

Field Value
sitemap https://www.dia.com/sitemaps/sitemap.xml.gz
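
The sitemap record can be read programmatically as well. This sketch assumes Python 3.8 or later, where `RobotFileParser.site_maps()` is available, and uses a file body reconstructed from this report; `list_sitemaps` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# File body reconstructed from this report, not copied from the live site.
ROBOTS_WITH_SITEMAP = [
    "User-agent: *",
    "Disallow: /maintenance.html",
    "Disallow: /external/whoami",
    "Sitemap: https://www.dia.com/sitemaps/sitemap.xml.gz",
]

def list_sitemaps(lines):
    """Return the Sitemap URLs declared in the given robots.txt lines."""
    parser = RobotFileParser()
    parser.parse(lines)
    # site_maps() returns None when no Sitemap record is present.
    return parser.site_maps() or []

sitemaps = list_sitemaps(ROBOTS_WITH_SITEMAP)
```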

Comments

  • See http://www.robotstxt.org/robotstxt.html for documentation on how to use the robots.txt file
  • To ban all spiders from the entire site uncomment the next two lines:
  • User-agent: *
  • Disallow: /
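
The commented-out pair of lines quoted above would, if enabled, block every compliant crawler from the whole site. A minimal sketch of that effect, using Python's standard `urllib.robotparser` with a made-up `ExampleBot` agent and illustrative URLs:

```python
from urllib.robotparser import RobotFileParser

# The two lines from the file's comment, with the comment markers removed.
BAN_ALL_LINES = [
    "User-agent: *",
    "Disallow: /",
]

def blocks_everything(lines, urls, agent="ExampleBot"):
    """Return True if none of the given URLs may be fetched by agent."""
    parser = RobotFileParser()
    parser.parse(lines)
    return not any(parser.can_fetch(agent, url) for url in urls)

# Illustrative URLs only; the second path is hypothetical.
SAMPLE_URLS = [
    "https://www.dia.com/",
    "https://www.dia.com/any/page",
]
all_blocked = blocks_everything(BAN_ALL_LINES, SAMPLE_URLS)
```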