grubstreet.com
robots.txt

Robots Exclusion Standard data for grubstreet.com

Resource Scan

Scan Details

Site Domain grubstreet.com
Base Domain grubstreet.com
Scan Status Ok
Last Scan 2024-11-09T17:48:28+00:00
Next Scan 2024-11-16T17:48:28+00:00

Last Scan

Scanned 2024-11-09T17:48:28+00:00
URL https://grubstreet.com/robots.txt
Redirect https://www.grubstreet.com/robots.txt
Redirect Domain www.grubstreet.com
Redirect Base grubstreet.com
Domain IPs 199.232.193.246, 199.232.197.246
Redirect IPs 199.232.193.246, 199.232.197.246
Response IP 146.75.93.246
Found Yes
Hash 4bde72a6ad3ace55e371938db7023b13bbf14df1ef777c52f4cae420f9f6f39e
SimHash d910c950cdb5
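
The Hash value above is 64 hexadecimal characters, which is consistent with a SHA-256 digest, although the scan does not name the algorithm. The sketch below assumes SHA-256 over the raw response body and shows how such a fingerprint could be used to detect changes between scans. The helper names `content_hash` and `has_changed` are hypothetical and are not part of the scan tool.

```python
import hashlib


def content_hash(body: bytes) -> str:
    """Return the hex SHA-256 digest of a robots.txt response body.

    Assumption: the scanner's "Hash" field is SHA-256; the report
    itself does not say which algorithm it uses.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_hash: str, body: bytes) -> bool:
    """Compare a stored digest against a freshly fetched body."""
    return content_hash(body) != previous_hash


if __name__ == "__main__":
    # Illustrative body only, not the real file contents.
    sample = b"User-agent: *\nDisallow: /search/\n"
    stored = content_hash(sample)
    print(len(stored))                       # digest length in hex chars
    print(has_changed(stored, sample))       # same bytes, so no change
    print(has_changed(stored, sample + b"\n"))
```

An exact digest changes on any byte difference, including whitespace, which is presumably why the report also carries a SimHash for similarity comparisons.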

Groups

mediapartners-google

Rule Path
Disallow

truliabot

Rule Path
Disallow /

*

Rule Path
Disallow /coral-talk/
Disallow /daily/grubstreet/
Disallow /daily/southflorida/
Disallow /fashion/lookbook/38917/
Disallow /fashion/search/models/
Disallow /jesse/
Disallow /marketplace/
Disallow /news/intelligencer/16951/
Disallow /newsletter/
Disallow /nymag/columns/intelligencer/features/16951/
Disallow /nymag/letters/14872/
Disallow /search.html
Disallow /search/
Disallow /srch
Disallow /srch/
Disallow /temp/
Disallow /test/
Disallow *?origSession=*

twitterbot

Rule Path
Disallow /amp/*

mediapartners-google*

Rule Path
Disallow

perplexitybot

Rule Path
Disallow /

google-extended

Rule Path
Disallow

amazonbot

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

applebot

Rule Path
Allow /

anthropic-ai

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

diffbot

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

imagesiftbot

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

omgili

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

youbot

Rule Path
Disallow /
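
The groups above can be checked programmatically. The sketch below uses Python's standard `urllib.robotparser` on a small robots.txt excerpt reconstructed from a few of the listed groups. The excerpt is an assumption about how the raw file is laid out, not a copy of it, and the `allowed` helper and the article path are hypothetical. Note that `urllib.robotparser` matches paths by prefix and does not interpret wildcard patterns such as `/amp/*` or `*?origSession=*`, so those rules are left out here.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed excerpt (assumption): a subset of the groups listed above.
ROBOTS_TXT = """\
User-agent: truliabot
Disallow: /

User-agent: *
Disallow: /search/
Disallow: /newsletter/

User-agent: google-extended
Disallow:

User-agent: applebot-extended
Disallow: /

User-agent: applebot
Allow: /

User-agent: claudebot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the excerpt above."""
    return parser.can_fetch(agent, "https://www.grubstreet.com" + path)


if __name__ == "__main__":
    for agent, path in [
        ("claudebot", "/"),               # blocked site-wide
        ("google-extended", "/search/"),  # empty Disallow permits everything
        ("applebot", "/search/"),         # explicit Allow /
        ("applebot-extended", "/"),       # blocked site-wide
        ("somebot", "/search/results"),   # falls back to the * group
        ("somebot", "/2024/article.html"),  # hypothetical article path
    ]:
        print(agent, path, allowed(agent, path))
```

An empty `Disallow` value, as in the mediapartners-google and google-extended groups, places no restriction on that agent. A crawler with its own group follows only that group and ignores the rules under `*`.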

Other Records

Field Value
sitemap https://www.grubstreet.com/sitemap.xml
sitemap https://www.grubstreet.com/_news.xml