die.net
robots.txt

Robots Exclusion Standard data for die.net

Resource Scan

Scan Details

Site Domain die.net
Base Domain die.net
Scan Status Ok
Last Scan 2024-09-26T22:50:30+00:00
Next Scan 2024-10-03T22:50:30+00:00

Last Scan

Scanned 2024-09-26T22:50:30+00:00
URL https://die.net/robots.txt
Redirect https://www.die.net/robots.txt
Redirect Domain www.die.net
Redirect Base die.net
Domain IPs 104.26.0.94, 104.26.1.94, 172.67.69.187, 2606:4700:20::681a:15e, 2606:4700:20::681a:5e, 2606:4700:20::ac43:45bb
Redirect IPs 104.26.0.94, 104.26.1.94, 172.67.69.187, 2606:4700:20::681a:15e, 2606:4700:20::681a:5e, 2606:4700:20::ac43:45bb
Response IP 172.67.69.187
Found Yes
Hash c2295c0f71bbfa737dea740f104c205ff29214c3c7519f978651f597ba02fbac
SimHash 24251d5ac6e1

Groups

mediapartners-google
adsbot-google

Rule Path
Disallow (empty)

*

Rule Path
Disallow /icons/
Disallow /private/
Disallow /search/
Disallow /this-is-a-bad-url/
Disallow /tmp/

Other Records

Field Value
sitemap http://www.die.net/sitemap.xml.gz
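The groups and the sitemap record above can be checked with Python's standard urllib.robotparser module. The sketch below is only an illustration: the robots.txt text is reconstructed from the scan data rather than copied from the live file (casing, ordering, and comments may differ), and "ExampleBot" is a hypothetical crawler name used to exercise the * group.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan data above; an assumption, not the live file.
ROBOTS_TXT = """\
User-agent: mediapartners-google
User-agent: adsbot-google
Disallow:

User-agent: *
Disallow: /icons/
Disallow: /private/
Disallow: /search/
Disallow: /this-is-a-bad-url/
Disallow: /tmp/

Sitemap: http://www.die.net/sitemap.xml.gz
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making any network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

if __name__ == "__main__":
    # "ExampleBot" is hypothetical; it matches no named group, so the
    # rules of the "*" group apply to it.
    checks = [
        ("adsbot-google", "https://www.die.net/search/"),
        ("ExampleBot", "https://www.die.net/search/"),
        ("ExampleBot", "https://www.die.net/"),
    ]
    for agent, url in checks:
        print(agent, url, parser.can_fetch(agent, url))
    print(parser.site_maps())
```

An empty Disallow value disallows nothing, so the two Google ad crawlers may fetch any path, while every other crawler is kept out of the five listed directories only.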

Comments

  • Serve relevant ads on any page:
  • There's not much here, so everyone is welcome to crawl most of the site:
  • And here's where to find everything: