eralehti.fi
robots.txt

Robots Exclusion Standard data for eralehti.fi

Resource Scan

Scan Details

Site Domain eralehti.fi
Base Domain eralehti.fi
Scan Status Ok
Last Scan 2024-06-26T23:32:10+00:00
Next Scan 2024-07-03T23:32:10+00:00

Last Scan

Scanned 2024-06-26T23:32:10+00:00
URL https://eralehti.fi/robots.txt
Domain IPs 18.173.121.102, 18.173.121.60, 18.173.121.82, 18.173.121.97
Response IP 3.160.246.89
Found Yes
Hash 6dffc84c1f295cdebc6dad4833b815a541c90e8bbd68a2fc87d6fdba14cce61b
SimHash 222c81490f37

Groups

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

imagesiftbot

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

*

Rule Path
Allow /
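
The groups above can be checked with Python's standard urllib.robotparser. This is a minimal sketch, not the site's actual file: the robots.txt lines are rebuilt from the group listing, so the exact casing, ordering, and layout are assumptions, and the helper names build_robots_lines and is_allowed are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# User agents that the scan lists with "Disallow /".
BLOCKED_AGENTS = [
    "gptbot", "chatgpt-user", "ccbot", "google-extended",
    "imagesiftbot", "amazonbot", "facebookbot", "omgilibot",
]


def build_robots_lines(blocked):
    """Rebuild robots.txt lines from the group listing (assumed layout)."""
    lines = []
    for agent in blocked:
        lines += [f"User-agent: {agent}", "Disallow: /", ""]
    # Catch-all group: every other crawler may fetch everything.
    lines += ["User-agent: *", "Allow: /"]
    return lines


parser = RobotFileParser()
parser.parse(build_robots_lines(BLOCKED_AGENTS))


def is_allowed(agent, url="https://eralehti.fi/"):
    """Return True if the given user agent may fetch the URL."""
    return parser.can_fetch(agent, url)


for agent in ("GPTBot", "CCBot/2.0", "SomeBrowser"):
    print(agent, is_allowed(agent))
```

Under these assumptions the eight named crawlers are refused for every path, while any agent not named falls through to the `*` group and is allowed.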

Other Records

Field Value
sitemap https://wp.eralehti.fi/asmagsitemapindex.xml
sitemap https://wp.eralehti.fi/sitemap_index.xml
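
The sitemap records can be read back the same way. The sketch below assumes Python 3.8 or later (where RobotFileParser.site_maps() is available) and uses a reduced, reconstructed file containing only the catch-all group and the two sitemap lines reported above.

```python
from urllib.robotparser import RobotFileParser

# Reduced reconstruction: only the "*" group and the sitemap records.
robots_lines = [
    "User-agent: *",
    "Allow: /",
    "Sitemap: https://wp.eralehti.fi/asmagsitemapindex.xml",
    "Sitemap: https://wp.eralehti.fi/sitemap_index.xml",
]

sitemap_parser = RobotFileParser()
sitemap_parser.parse(robots_lines)

# site_maps() returns the Sitemap URLs in file order, or None if there are none.
sitemaps = sitemap_parser.site_maps()
print(sitemaps)
```

Sitemap records are independent of the user-agent groups, so they apply to every crawler regardless of the Disallow rules.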

Comments

  • FacebookBot crawls to improve language models
  • Omgilibot/webz.io sells data for training LLMs