thetoc.gr
robots.txt

Robots Exclusion Standard data for thetoc.gr

Resource Scan

Scan Details

Site Domain thetoc.gr
Base Domain thetoc.gr
Scan Status Ok
Last Scan 2024-09-21T16:45:40+00:00
Next Scan 2024-09-28T16:45:40+00:00

Last Scan

Scanned 2024-09-21T16:45:40+00:00
URL https://thetoc.gr/robots.txt
Redirect https://www.thetoc.gr/robots.txt
Redirect Domain www.thetoc.gr
Redirect Base thetoc.gr
Domain IPs 52.174.23.118
Redirect IPs 23.46.230.136, 23.46.230.137
Response IP 23.210.250.137
Found Yes
Hash 12291109113a7d424609211e7073f9b167ba15288b8f1155e1e145e91c9538b8
SimHash 2b2dc860c1f4

Groups

*

Rule Path
Disallow
Disallow /Newsletter*
Disallow /newsletter*
Disallow /webtv/videofeed*

msnbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 120
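
A rough sketch of what a crawl-delay of 120 could mean in practice, assuming the value is read as a minimum number of seconds between successive requests (crawl-delay is a non-standard field, and crawlers that honor it may interpret it differently). The names `schedule` and `max_requests_per_day` are hypothetical helpers, not part of any crawler's API.

```python
# Assumption: crawl-delay is interpreted as seconds between requests.
CRAWL_DELAY_SECONDS = 120  # value reported for the msnbot group


def schedule(paths, start=0.0, delay=CRAWL_DELAY_SECONDS):
    """Return (offset_in_seconds, path) pairs spaced `delay` seconds apart."""
    return [(start + i * delay, path) for i, path in enumerate(paths)]


def max_requests_per_day(delay=CRAWL_DELAY_SECONDS):
    """Upper bound on requests in 24 hours if every request waits `delay`."""
    return 86400 // delay


if __name__ == "__main__":
    for offset, path in schedule(["/a", "/b", "/c"]):
        print(f"t+{offset:>5.0f}s  {path}")
    print(max_requests_per_day(), "requests/day at most")
```

Under this reading, a crawler obeying the msnbot record would make at most 720 requests per day.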

gptbot

Rule Path
Disallow /

ccbot

Rule Path
Disallow /
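
The groups above can be evaluated with a small matcher. The following is a minimal sketch, not a full robots.txt implementation: it assumes `*` in a path matches any run of characters, a trailing `$` anchors the end, an empty Disallow blocks nothing, and a crawler uses the group named after it or falls back to `*`. The `GROUPS` table is transcribed from this report, and `is_allowed` is a hypothetical helper name.

```python
import re

# Disallow paths per user-agent group, transcribed from the report above.
GROUPS = {
    "*": ["", "/Newsletter*", "/newsletter*", "/webtv/videofeed*"],
    "msnbot": [],  # no rules defined: all paths allowed
    "gptbot": ["/"],
    "ccbot": ["/"],
}


def _pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal parts and let each '*' match any character sequence.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(user_agent, path):
    """Return True if `path` is not disallowed for `user_agent`.

    Simplification: the group is chosen by exact, case-insensitive token
    match, falling back to the '*' group. Only Disallow rules are handled,
    since the scanned file lists no Allow rules.
    """
    rules = GROUPS.get(user_agent.lower(), GROUPS["*"])
    for pattern in rules:
        # An empty Disallow value blocks nothing, so skip it.
        if pattern and _pattern_to_regex(pattern).match(path):
            return False
    return True


if __name__ == "__main__":
    print(is_allowed("GPTBot", "/politics/article"))        # blocked by "/"
    print(is_allowed("Googlebot", "/newsletter/subscribe"))  # blocked via "*"
    print(is_allowed("msnbot", "/Newsletter"))               # own group, no rules
```

Note that under this group-selection rule msnbot does not inherit the `*` group's Disallow lines, which matches the report's "All paths allowed" for that group.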

Other Records

Field Value
sitemap https://www.thetoc.gr/sitemap/allnews
sitemap https://www.thetoc.gr/sitemap/googlenews
sitemap https://www.thetoc.gr/sitemap/categories
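
Sitemap records like the ones above can be pulled out of a robots.txt body with a few lines of code. This is a sketch under assumptions: the field name is matched case-insensitively, `#` starts a comment, and the `SAMPLE` text is a hypothetical layout built from the URLs in this report, not the actual file contents. `extract_sitemaps` is an illustrative name.

```python
def extract_sitemaps(robots_txt):
    """Return the values of all 'sitemap' records in a robots.txt body."""
    sitemaps = []
    for raw_line in robots_txt.splitlines():
        # Drop comments, then split "field: value" on the first colon only,
        # so the colon inside "https://" stays part of the value.
        line = raw_line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap":
            sitemaps.append(value.strip())
    return sitemaps


# Hypothetical sample; the real file's ordering and spacing are unknown.
SAMPLE = """\
Sitemap: https://www.thetoc.gr/sitemap/allnews
sitemap: https://www.thetoc.gr/sitemap/googlenews
# Disallow: /api/*
Sitemap: https://www.thetoc.gr/sitemap/categories
"""

if __name__ == "__main__":
    for url in extract_sitemaps(SAMPLE):
        print(url)
```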

Comments

  • Disallow: /Api/*
  • Disallow: /api/*
  • Disallow: /Search*
  • Disallow: /search*
  • Block ChatGPT etc.