dia.com
robots.txt

Robots Exclusion Standard data for dia.com

Resource Scan

Scan Details

Site Domain dia.com
Base Domain dia.com
Scan Status Ok
Last Scan 2024-06-18T16:06:22+00:00
Next Scan 2024-07-18T16:06:22+00:00

Last Scan

Scanned 2024-06-18T16:06:22+00:00
URL https://dia.com/robots.txt
Redirect https://www.dia.com:443/robots.txt
Redirect Domain www.dia.com
Redirect Base dia.com
Domain IPs 13.227.254.35, 13.227.254.7, 13.227.254.8, 13.227.254.89, 2600:9000:200a:6a00:1a:607c:7ac0:93a1, 2600:9000:200a:8c00:1a:607c:7ac0:93a1, 2600:9000:200a:8e00:1a:607c:7ac0:93a1, 2600:9000:200a:b000:1a:607c:7ac0:93a1, 2600:9000:200a:be00:1a:607c:7ac0:93a1, 2600:9000:200a:c800:1a:607c:7ac0:93a1, 2600:9000:200a:ce00:1a:607c:7ac0:93a1, 2600:9000:200a:e400:1a:607c:7ac0:93a1
Redirect IPs 3.230.153.162, 34.225.149.245, 44.209.132.62, 44.209.51.223
Response IP 44.209.51.223
Found Yes
Hash 9c13f0c4a19cbc57338395b44124ed830c47546ceb31c80c1111d0d1e7571863
SimHash 3a850d8d7450

Groups

*

Rule Path
Disallow /maintenance.html
Disallow /external/whoami

Other Records

Field Value
sitemap https://www.dia.com/sitemaps/sitemap.xml.gz
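
To illustrate how a crawler might interpret the group and sitemap records listed above, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt text is reconstructed from the records in this report and is an assumption (the live file may differ in layout or content); the user agent name "ExampleBot" and the helper build_parser are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the records in this report (assumption: the live
# file may be laid out differently or may have changed since the scan).
ROBOTS_TXT = """\
User-agent: *
Disallow: /maintenance.html
Disallow: /external/whoami

Sitemap: https://www.dia.com/sitemaps/sitemap.xml.gz
"""


def build_parser(text: str) -> RobotFileParser:
    """Hypothetical helper: parse robots.txt text without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "ExampleBot" is a made-up user agent; it falls under the "*" group.
blocked = parser.can_fetch("ExampleBot", "https://www.dia.com/maintenance.html")
allowed = parser.can_fetch("ExampleBot", "https://www.dia.com/")
sitemaps = parser.site_maps()  # available in Python 3.8+

print("maintenance page fetchable:", blocked)
print("home page fetchable:", allowed)
print("sitemaps:", sitemaps)
```

Because both rules sit in the "*" group, they apply to any crawler that has no more specific group of its own; every path not matched by a Disallow prefix stays fetchable.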

Comments

  • See http://www.robotstxt.org/robotstxt.html for documentation on how to use the robots.txt file
  • To ban all spiders from the entire site uncomment the next two lines:
  • User-agent: *
  • Disallow: /