find-org.com
robots.txt

Robots Exclusion Standard data for find-org.com

Resource Scan

Scan Details

Site Domain find-org.com
Base Domain find-org.com
Scan Status Ok
Last Scan 2024-10-04T15:08:49+00:00
Next Scan 2024-10-11T15:08:49+00:00

Last Scan

Scanned 2024-10-04T15:08:49+00:00
URL https://find-org.com/robots.txt
Redirect http://www.find-org.com/robots.txt
Redirect Domain www.find-org.com
Redirect Base find-org.com
Domain IPs 104.21.32.21, 172.67.182.73, 2606:4700:3031::ac43:b649, 2606:4700:3032::6815:2015
Redirect IPs 104.21.32.21, 172.67.182.73, 2606:4700:3031::ac43:b649, 2606:4700:3032::6815:2015
Response IP 172.67.182.73
Found Yes
Hash 5433d215f78514e56435289942f73169444c4b7ac3018f2d19489d291422e095
SimHash d464d9c2e612

Groups

ahrefsbot
slurp
aport
teoma
proximic
scooter
ia_archiver
ia_archiver-web.archive.org
domaincrawler
lycos
webalta
mj12bot
grapeshot
baiduspider
baiduspider-video
sogou spider
youdaobot
naverbot
yeti
moget
ichiro
hybridbot
semrushbot
semrushbot-sa
smtbot
blexbot
gptbot
dotbot
getintent
petalbot
weborama-fetcher
dataforseobot
amazonbot
claudebot

Rule Path
Disallow /

mediapartners-google

Rule Path
Disallow /go/

*

Rule Path
Disallow /js/
Disallow /search/*%26val%3D
Disallow /search/inn/
Disallow /search/name/
Disallow /search/similar/
Disallow /arbitrage/*/page/
Disallow /contract/*/page/
Disallow /status/*/page/
Disallow /licence/*/page/
Disallow /founders/*/page/
Disallow /go/
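
The groups above can be read as follows: a crawler first picks the group whose user-agent token matches its own, falling back to `*`, and then checks the request path against that group's Disallow patterns, where `*` in a pattern matches any run of characters. The Python sketch below illustrates this for the rules in this report. It is a simplified illustration, not the scanner's own logic: the helper names (`rules_for`, `is_disallowed`) are hypothetical, the blocked-crawler list is abbreviated to four entries, user-agent tokens are matched exactly rather than by prefix, and no percent-encoding normalization is applied.

```python
import re

# Groups taken from the report. The first group is abbreviated:
# the full list of blocked crawlers is longer.
GROUPS = {
    ("ahrefsbot", "semrushbot", "gptbot", "claudebot"): ["/"],
    ("mediapartners-google",): ["/go/"],
    ("*",): [
        "/js/",
        "/search/*%26val%3D",
        "/search/inn/",
        "/search/name/",
        "/search/similar/",
        "/arbitrage/*/page/",
        "/contract/*/page/",
        "/status/*/page/",
        "/licence/*/page/",
        "/founders/*/page/",
        "/go/",
    ],
}


def pattern_to_regex(pattern):
    """Turn a robots.txt path pattern into a regex anchored at the path start.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def rules_for(agent):
    """Return the Disallow patterns for a user-agent token (exact match, else '*')."""
    token = agent.lower()
    for agents, rules in GROUPS.items():
        if token in agents:
            return rules
    return GROUPS[("*",)]


def is_disallowed(agent, path):
    """True if any Disallow pattern in the agent's group matches the path.

    The report lists no Allow rules, so a single match is enough.
    """
    return any(pattern_to_regex(rule).match(path) for rule in rules_for(agent))


if __name__ == "__main__":
    for agent, path in [
        ("ClaudeBot", "/"),
        ("mediapartners-google", "/js/app.js"),
        ("googlebot", "/contract/123/page/2"),
        ("googlebot", "/contract/123"),
    ]:
        print(agent, path, is_disallowed(agent, path))
```

Under these assumptions, the listed crawlers are shut out of the whole site, mediapartners-google is kept only out of `/go/`, and every other crawler may fetch first-page listings such as `/contract/123` but not their paginated variants.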

Warnings

  • `host` is not a known field.
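
A warning like this typically means the file contains a line whose field name falls outside the set the scanner recognizes. `Host` is a non-standard directive historically associated with Yandex, used to name a site's preferred mirror. The sketch below shows one way such a check could work. The set of known fields, the function name `unknown_fields`, and the sample file are all assumptions made for illustration; the actual robots.txt body and the scanner's real field list are not shown in this report.

```python
# Assumed set of recognized fields; the scanner's real list is not known.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap", "crawl-delay"}


def unknown_fields(text):
    """Return field names (lowercased, in order of first appearance)
    that are not in KNOWN_FIELDS."""
    found = []
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field = line.split(":", 1)[0].strip().lower()
        if field not in KNOWN_FIELDS and field not in found:
            found.append(field)
    return found


# Hypothetical sample, not the actual file served by the site.
SAMPLE = """\
User-agent: *
Disallow: /go/
Host: https://example.com
"""

if __name__ == "__main__":
    for field in unknown_fields(SAMPLE):
        print(f"`{field}` is not a known field.")
```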