thegradcafe.com
robots.txt

Robots Exclusion Standard data for thegradcafe.com

Resource Scan

Scan Details

Site Domain thegradcafe.com
Base Domain thegradcafe.com
Scan Status Ok
Last Scan 2024-11-17T00:14:12+00:00
Next Scan 2024-11-24T00:14:12+00:00

Last Scan

Scanned 2024-11-17T00:14:12+00:00
URL https://thegradcafe.com/robots.txt
Redirect https://www.thegradcafe.com/robots.txt
Redirect Domain www.thegradcafe.com
Redirect Base thegradcafe.com
Domain IPs 72.52.144.230
Redirect IPs 72.52.144.230
Response IP 72.52.144.230
Found Yes
Hash 5b3887d2aa9a3a9791e60af4edd3653a141019b503424692f5d9e9ad8b781f7b
SimHash 087ac363cf93
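
The Hash value above is 64 hexadecimal digits long, which is the length of a SHA-256 digest. The report does not say which algorithm was used or what exactly was hashed, so the following is only a minimal sketch, assuming the hash is SHA-256 over the raw response body. The helper names `body_digest` and `has_changed` are hypothetical and exist only for illustration.

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt response body.

    Assumption: the scanner hashes the raw bytes with SHA-256.
    The report itself does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(body: bytes, previous_hex: str) -> bool:
    """Compare a freshly fetched body against a previously recorded digest."""
    return body_digest(body) != previous_hex.lower()
```

Under that assumption, a weekly rescan could store the digest and reparse the rules only when `has_changed` returns `True`.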

Groups

*

Rule Path
Disallow /cgi-bin/
Disallow /index-ad-test.php

ia_archiver

Rule Path
Disallow /

ia_archiver/1.6

Rule Path
Disallow /

sitecheck.internetseer.com

Rule Path
Disallow /

computer_and_automation_research_institute_crawler

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

yandexbot

Rule Path
Disallow /

mediapartners-google

Rule Path
Disallow (empty path)

opebot-v (https://www.1plusx.com)

Rule Path
Allow /
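
To see how the groups above behave for a given crawler, the rules can be fed to Python's standard `urllib.robotparser`. This is a sketch under stated assumptions: the `ROBOTS_TXT` text is reconstructed from the groups listed in this report, so agent casing, group order, and exact formatting of the live file are assumptions, and the `allowed` helper, the agent name `ExampleBot`, and the path `/survey/` are hypothetical. An empty `Disallow` value places no restriction on the named agent.

```python
from urllib import robotparser

# Reconstructed from the groups listed above. Agent casing, group order,
# and exact formatting of the real file are assumptions.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /index-ad-test.php

User-agent: ia_archiver
Disallow: /

User-agent: ia_archiver/1.6
Disallow: /

User-agent: sitecheck.internetseer.com
Disallow: /

User-agent: computer_and_automation_research_institute_crawler
Disallow: /

User-agent: dotbot
Disallow: /

User-agent: yandexbot
Disallow: /

User-agent: mediapartners-google
Disallow:

User-agent: opebot-v (https://www.1plusx.com)
Allow: /
"""

# Parse the text directly instead of fetching it over the network.
PARSER = robotparser.RobotFileParser()
PARSER.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the rules above."""
    return PARSER.can_fetch(agent, "https://www.thegradcafe.com" + path)


if __name__ == "__main__":
    checks = [
        ("ExampleBot", "/survey/"),            # hypothetical agent and path
        ("ExampleBot", "/cgi-bin/test.cgi"),
        ("ia_archiver", "/"),
        ("YandexBot/3.0", "/"),
        ("Mediapartners-Google", "/cgi-bin/test.cgi"),
    ]
    for agent, path in checks:
        print(f"{agent:25} {path:22} {allowed(agent, path)}")
```

Agents without a group of their own fall back to the `*` group, so they are blocked only from `/cgi-bin/` and `/index-ad-test.php`. The standard-library parser matches agent names case-insensitively and ignores anything after a `/` in the queried agent string; other parsers may match differently.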