Robots Exclusion Standard data for venelehti.fi

Resource Scan

Scan Details

Site Domain venelehti.fi
Base Domain venelehti.fi
Scan Status Ok
Last Scan 2024-11-16T12:24:05+00:00
Next Scan 2024-11-23T12:24:05+00:00

Last Scan

Scanned 2024-11-16T12:24:05+00:00
URL https://venelehti.fi/robots.txt
Domain IPs 65.9.112.12, 65.9.112.15, 65.9.112.9, 65.9.112.92
Response IP 108.156.22.58
Found Yes
Hash e6ba1df2f1360901fd3a0574c785ffd23d83db387abd0beca19010243c800a3d
SimHash 6018814dc7b3
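
A content hash like the one above is typically used to tell whether a robots.txt file changed between scans. The following is a minimal sketch of that idea, assuming the Hash field is a SHA-256 hex digest of the response body; the 64-character length is consistent with that, but the report does not say so. The function names are hypothetical, and the SimHash field is not reproduced here because its algorithm is not documented in the report.

```python
import hashlib


def content_fingerprint(body: bytes) -> str:
    """Return a SHA-256 hex digest of the raw robots.txt bytes.

    Assumption: the scanner's "Hash" field is computed this way.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_hash: str, body: bytes) -> bool:
    """Compare a stored digest with the digest of a freshly fetched body."""
    return content_fingerprint(body) != previous_hash.lower()


if __name__ == "__main__":
    # Illustrative bodies only, not the real file contents.
    old_body = b"User-agent: *\nAllow: /\n"
    new_body = b"User-agent: *\nDisallow: /\n"
    stored = content_fingerprint(old_body)
    print(has_changed(stored, old_body))
    print(has_changed(stored, new_body))
```

An exact digest only signals that some byte changed. A similarity hash such as SimHash is normally kept alongside it to estimate how much changed.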

Groups

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

imagesiftbot

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

*

Rule Path
Allow /

anthropic-ai

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

omgili

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

youbot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

diffbot

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

oai-searchbot

Rule Path
Disallow /
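
The groups above can be checked programmatically. The following is a minimal sketch using Python's standard urllib.robotparser. The ROBOTS_TXT text is an abbreviated reconstruction from the listing (only four of the groups, with lowercase agent names as reported), not the site's actual file, and the build_parser helper is a hypothetical name.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated reconstruction of a few groups from the listing above.
# Assumption: the real file's exact wording and ordering may differ.
ROBOTS_TXT = """\
User-agent: gptbot
Disallow: /

User-agent: ccbot
Disallow: /

User-agent: *
Allow: /

User-agent: anthropic-ai
Disallow: /
"""


def build_parser(text: str = ROBOTS_TXT) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser()

if __name__ == "__main__":
    for agent in ("GPTBot", "CCBot", "anthropic-ai", "Googlebot"):
        allowed = parser.can_fetch(agent, "https://venelehti.fi/")
        print(f"{agent}: {'allowed' if allowed else 'disallowed'}")
```

Under these rules, each named agent is refused for every path, while an agent with no group of its own, such as Googlebot, falls through to the * group and is allowed. The parser matches agent names case-insensitively, so GPTBot matches the gptbot group.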

Other Records

Field Value
sitemap https://venelehti.fi/asmagsitemapindex.xml
sitemap https://venelehti.fi/sitemap_index.xml

Comments

  • FacebookBot crawls to improve language models
  • Omgilibot/webz.io sells data for training LLMs