frontline.thehindu.com
robots.txt

Robots Exclusion Standard data for frontline.thehindu.com

Resource Scan

Scan Details

Site Domain frontline.thehindu.com
Base Domain thehindu.com
Scan Status Ok
Last Scan 2024-05-16T07:44:19+00:00
Next Scan 2024-05-30T07:44:19+00:00

Last Scan

Scanned 2024-05-16T07:44:19+00:00
URL https://frontline.thehindu.com/robots.txt
Domain IPs 104.18.39.235, 172.64.148.21, 2606:4700:4400::6812:27eb, 2606:4700:4400::ac40:9415
Response IP 172.64.148.21
Found Yes
Hash cfe987d732d208da7d99f19ea150c60914d3415f51fa1b5576b3d4786fb98717
SimHash 29367860f792

Groups

*

Rule Path
Disallow /?type=commentReceipt
Disallow /cgi-bin/
Disallow /cdn-cgi/*
Disallow /config/
Disallow /nic/
Disallow /search/*
Disallow /search/
Disallow /SEARCH/
Disallow /Search/
Disallow /?service=*
Disallow /newsletter/
Disallow /newsletter/*
Disallow */analysis-logger/*
Disallow */wf.fragment/*
Disallow *ref%3D*
Disallow *textsize%3D*
Disallow *test%3D*
Disallow *css%3D*
Disallow */?_ptid=*
Disallow */?*&page=*
Disallow */?page=*&*
Disallow */?*categoryId=*
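
The wildcard rules in the `*` group can be checked against a URL path with a small pattern matcher. This is a minimal sketch, assuming the common robots.txt convention (as described in RFC 9309) that `*` matches any run of characters and a trailing `$` anchors the end of the path. The names `rule_to_regex` and `is_disallowed` are hypothetical, only a subset of the rules above is listed, and the sketch ignores Allow rules, longest-match precedence, and percent-encoding normalization.

```python
import re

# A subset of the Disallow paths listed for the "*" group above.
DISALLOW = [
    "/cgi-bin/",
    "/search/*",
    "/?service=*",
    "*/analysis-logger/*",
    "*/?page=*&*",
    "*/?*categoryId=*",
]


def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate one robots.txt path pattern into a compiled regex.

    Assumption: '*' matches any run of characters and a trailing '$'
    anchors the end of the path; everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(path: str, rules=DISALLOW) -> bool:
    """Return True if any rule matches the path from its start."""
    return any(rule_to_regex(rule).match(path) for rule in rules)
```

Under these assumptions, `is_disallowed("/search/?q=x")` and `is_disallowed("/news/?page=2&x=1")` are true, while a plain path such as the hypothetical `/the-nation/example-article` is not matched by any listed rule.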

gptbot

Rule Path
Disallow /
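
A crawler applies the group that names it and falls back to the `*` group otherwise. The sketch below illustrates this with Python's standard `urllib.robotparser`, assuming a robots.txt reconstructed from the tables above (it is not the verbatim file, and only one `*` rule is included). Note that `urllib.robotparser` uses prefix matching and does not interpret `*` wildcards inside paths, so only plain-prefix rules are used here. The agent name `ExampleBot` is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the report tables; an assumption, not the original file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/

User-agent: gptbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# gptbot has its own group, which blocks every path.
gptbot_allowed = parser.can_fetch("GPTBot", "https://frontline.thehindu.com/the-nation/")

# Any other agent (hypothetical name) falls back to the "*" group.
other_allowed = parser.can_fetch("ExampleBot", "https://frontline.thehindu.com/the-nation/")
other_cgi_allowed = parser.can_fetch("ExampleBot", "https://frontline.thehindu.com/cgi-bin/x")
```

Here `gptbot_allowed` and `other_cgi_allowed` come out false, and `other_allowed` comes out true.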

Other Records

Field Value
sitemap https://frontline.thehindu.com/sitemap/archive.xml
sitemap https://frontline.thehindu.com/sitemap/googlenews.xml
sitemap https://frontline.thehindu.com/sitemap/update/all.xml
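
The sitemap records are independent of any user-agent group. As a minimal sketch, assuming Python 3.8 or later and a robots.txt fragment reconstructed from the records above, `RobotFileParser.site_maps()` returns them as a list:

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the "Other Records" table; not the verbatim file.
SITEMAP_LINES = [
    "Sitemap: https://frontline.thehindu.com/sitemap/archive.xml",
    "Sitemap: https://frontline.thehindu.com/sitemap/googlenews.xml",
    "Sitemap: https://frontline.thehindu.com/sitemap/update/all.xml",
]

sitemap_parser = RobotFileParser()
sitemap_parser.parse(SITEMAP_LINES)

# site_maps() returns the URLs in file order, or None if there are none.
sitemaps = sitemap_parser.site_maps()
```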

Comments

  • Block paginations with categoryID
  • Disallow: */?page=*
  • Disallow ChatGPT from extracting or interpreting our content