byscuit.com
robots.txt

Robots Exclusion Standard data for byscuit.com

Resource Scan

Scan Details

Site Domain byscuit.com
Base Domain byscuit.com
Scan Status Ok
Last Scan 2024-05-20T03:26:32+00:00
Next Scan 2024-06-19T03:26:32+00:00

Last Scan

Scanned 2024-05-20T03:26:32+00:00
URL https://byscuit.com/robots.txt
Redirect https://www.byscuit.com/robots.txt
Redirect Domain www.byscuit.com
Redirect Base byscuit.com
Domain IPs 174.142.247.8
Redirect IPs 104.21.23.101, 172.67.210.119, 2606:4700:3035::6815:1765, 2606:4700:3035::ac43:d277
Response IP 172.67.210.119
Found Yes
Hash b3fe41e57272c7ba854b1beb0eac75f29021256aa8cf737f263b940bc95ac3ee
SimHash 696253e24570

Groups

slurp

Rule Path
Disallow (empty value; all paths allowed)

Other Records

Field Value
crawl-delay 100

gsa-crawler-www

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 100

googlebot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 100

mediapartners-google

Rule Path
Disallow (empty value; all paths allowed)

yahoo-newscrawler

Rule Path
Disallow (empty value; all paths allowed)

msnbot

Rule Path
Disallow (empty value; all paths allowed)

Other Records

Field Value
crawl-delay 100

*

Rule Path
Disallow /config/
Disallow /handlers/
Disallow /includes/
Disallow /interceptors/
Disallow /layouts/
Disallow /logs/
Disallow /models/
Disallow /modules/
Disallow /modules_app/
Disallow /views/
Allow /
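
The groups above can be checked with a short script. What follows is a minimal sketch, not the site's actual file: the robots.txt text is an assumed reconstruction of three of the listed groups (slurp, googlebot and *), and the crawler name "SomeBot" and the path /config/app.cfm are hypothetical values chosen only for illustration. It uses Python's standard urllib.robotparser module.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of three groups from the scan report.
# The real file at https://www.byscuit.com/robots.txt may differ
# in ordering, casing and comments.
ROBOTS_TXT = """\
User-agent: slurp
Disallow:
Crawl-delay: 100

User-agent: googlebot
Crawl-delay: 100

User-agent: *
Disallow: /config/
Disallow: /handlers/
Disallow: /includes/
Disallow: /interceptors/
Disallow: /layouts/
Disallow: /logs/
Disallow: /models/
Disallow: /modules/
Disallow: /modules_app/
Disallow: /views/
Allow: /
"""


def build_parser(text=ROBOTS_TXT):
    """Parse robots.txt text from a string, without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


if __name__ == "__main__":
    rp = build_parser()
    base = "https://www.byscuit.com"
    # "SomeBot" is a hypothetical crawler that matches only the "*" group.
    for agent in ("slurp", "googlebot", "SomeBot"):
        for path in ("/", "/config/app.cfm"):
            allowed = rp.can_fetch(agent, base + path)
            print(f"{agent:10} {path:18} allowed={allowed}")
        print(f"{agent:10} crawl-delay={rp.crawl_delay(agent)}")
```

Python's parser applies the first matching rule in a group, while RFC 9309 specifies the longest match. For the * group listed here both approaches agree, because each Disallow path is longer than the trailing Allow / and also appears before it.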