Robots Exclusion Standard data for find-org.com
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | find-org.com |
Base Domain | find-org.com |
Scan Status | Ok |
Last Scan | 2024-10-04T15:08:49+00:00 |
Next Scan | 2024-10-11T15:08:49+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2024-10-04T15:08:49+00:00 |
URL | https://find-org.com/robots.txt |
Redirect | http://www.find-org.com/robots.txt |
Redirect Domain | www.find-org.com |
Redirect Base | find-org.com |
Domain IPs | 104.21.32.21, 172.67.182.73, 2606:4700:3031::ac43:b649, 2606:4700:3032::6815:2015 |
Redirect IPs | 104.21.32.21, 172.67.182.73, 2606:4700:3031::ac43:b649, 2606:4700:3032::6815:2015 |
Response IP | 172.67.182.73 |
Found | Yes |
Hash | 5433d215f78514e56435289942f73169444c4b7ac3018f2d19489d291422e095 |
SimHash | d464d9c2e612 |
Groups
ahrefsbot
slurp
aport
teoma
proximic
scooter
ia_archiver
ia_archiver-web.archive.org
domaincrawler
lycos
webalta
mj12bot
grapeshot
baiduspider
baiduspider-video
sogou spider
youdaobot
naverbot
yeti
moget
ichiro
hybridbot
semrushbot
semrushbot-sa
smtbot
blexbot
gptbot
dotbot
getintent
petalbot
weborama-fetcher
dataforseobot
amazonbot
claudebot
Rule | Path |
---|---|
Disallow | / |
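
The group above gives every listed user agent a single `Disallow: /` rule, so those crawlers are shut out of the whole site while all other agents fall through to the `*` group. A minimal sketch of that behaviour, using Python's standard `urllib.robotparser`, might look like the following. The file text is an abbreviated reconstruction (three of the listed agents, two of the `*` rules), and `examplebot`, `/js/app.js` and `/go/x` are hypothetical names used only for illustration. The wildcard rules are left out because `urllib.robotparser` does plain prefix matching.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated reconstruction of the scanned file. Only three of the listed
# agents and two of the `*` rules are shown; the real layout is an assumption.
ROBOTS_TXT = """\
User-agent: ahrefsbot
User-agent: gptbot
User-agent: claudebot
Disallow: /

User-agent: *
Disallow: /js/
Disallow: /go/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` on find-org.com."""
    return parser.can_fetch(agent, "https://find-org.com" + path)


if __name__ == "__main__":
    for agent, path in [("gptbot", "/"), ("examplebot", "/"), ("examplebot", "/js/app.js")]:
        print(agent, path, allowed(agent, path))
```

An agent named in the first group never reaches the `*` rules, because a crawler follows only the most specific group that matches its name.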
*
Rule | Path |
---|---|
Disallow | /js/ |
Disallow | /search/*%26val%3D |
Disallow | /search/inn/ |
Disallow | /search/name/ |
Disallow | /search/similar/ |
Disallow | /arbitrage/*/page/ |
Disallow | /contract/*/page/ |
Disallow | /status/*/page/ |
Disallow | /licence/*/page/ |
Disallow | /founders/*/page/ |
Disallow | /go/ |
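
Several rules in the `*` group use the `*` wildcard, which stands for any run of characters in the path. The sketch below shows one way to test a path against these patterns by translating each into a regular expression. It is an illustration under stated assumptions, not the scanner's or any crawler's actual matcher: the helper names `pattern_to_regex` and `is_disallowed` are hypothetical, the example paths are invented, and percent-encoding is compared literally with no normalisation.

```python
import re

# Disallow patterns from the `*` group, copied from the table above.
DISALLOW = [
    "/js/",
    "/search/*%26val%3D",
    "/search/inn/",
    "/search/name/",
    "/search/similar/",
    "/arbitrage/*/page/",
    "/contract/*/page/",
    "/status/*/page/",
    "/licence/*/page/",
    "/founders/*/page/",
    "/go/",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    `*` matches any sequence of characters; a trailing `$` anchors the end.
    Without `$`, the pattern only has to match a prefix of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str) -> bool:
    """Return True if any Disallow pattern matches the start of `path`."""
    # re.match anchors at the beginning of the string, giving prefix semantics.
    return any(pattern_to_regex(p).match(path) for p in DISALLOW)


if __name__ == "__main__":
    for path in ["/contract/123/page/2", "/contract/123", "/search/inn/1", "/"]:
        print(path, is_disallowed(path))
```

Under these rules the paginated listing pages (`/contract/*/page/` and similar) are blocked while the first page of each listing stays crawlable.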
Warnings
- `host` is not a known field.
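
The warning indicates that the file contains a `host` line, which the scanner does not recognise as a standard field. A rough sketch of how such a check could work is shown below. The set of known fields is an assumption (the scanner's real list is not given), the `unknown_fields` helper is hypothetical, and the sample `Host:` value is invented for the example.

```python
# Assumed list of recognised fields; the scanner's actual list is not known.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap", "crawl-delay"}


def unknown_fields(robots_txt: str) -> list[str]:
    """Return one warning per unrecognised field name, in order of appearance."""
    warnings: list[str] = []
    seen: set[str] = set()
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        field = line.split(":", 1)[0].strip().lower()
        if field not in KNOWN_FIELDS and field not in seen:
            seen.add(field)
            warnings.append(f"`{field}` is not a known field.")
    return warnings


if __name__ == "__main__":
    # Hypothetical sample; the real Host value in the scanned file is not shown.
    sample = "User-agent: *\nDisallow: /go/\nHost: find-org.com\n"
    for warning in unknown_fields(sample):
        print("-", warning)
```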