hindi.theprint.in
robots.txt

Robots Exclusion Standard data for hindi.theprint.in

Resource Scan

Scan Details

Site Domain hindi.theprint.in
Base Domain theprint.in
Scan Status Ok
Last Scan 2024-05-11T14:02:48+00:00
Next Scan 2024-06-10T14:02:48+00:00

Last Scan

Scanned 2024-05-11T14:02:48+00:00
URL https://hindi.theprint.in/robots.txt
Domain IPs 13.225.4.107, 13.225.4.22, 13.225.4.72, 13.225.4.87
Response IP 13.225.4.87
Found Yes
Hash fd03a3d8347b9ee8a408f7faa765feaa36a94349332376bed7713597b4eea843
SimHash 695c4242c513

Groups

*

Rule Path
Allow /
Disallow *LINK*
Disallow *newnewssitemap.xml?yyyy*
Disallow *?p=*
Disallow */search/*
Disallow *?s*

gptbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /
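
The groups listed above can be checked against sample paths with a small matcher. This is a minimal sketch, not the scanner's or any crawler's actual implementation. It assumes RFC 9309-style evaluation: `*` matches any run of characters, the longest matching pattern wins, and Allow wins a tie. The names `GROUPS`, `pattern_to_regex`, and `is_allowed` are hypothetical, and user-agent selection is simplified to an exact, case-insensitive lookup that falls back to the `*` group.

```python
import re

# Rules transcribed from the scan's groups: user-agent -> list of (rule, path pattern).
GROUPS = {
    "*": [
        ("allow", "/"),
        ("disallow", "*LINK*"),
        ("disallow", "*newnewssitemap.xml?yyyy*"),
        ("disallow", "*?p=*"),
        ("disallow", "*/search/*"),
        ("disallow", "*?s*"),
    ],
    "gptbot": [("disallow", "/")],
    "google-extended": [("disallow", "/")],
}


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end of the path.
    Everything else is matched literally, starting at the beginning of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_allowed(user_agent, path):
    """Return True if `path` may be fetched by `user_agent` under GROUPS.

    Assumption: exact lowercase lookup of the user-agent, else the '*' group.
    """
    rules = GROUPS.get(user_agent.lower(), GROUPS["*"])
    best = None  # (pattern length, is_allow); tuple order makes Allow win ties
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), kind == "allow")
            if best is None or candidate > best:
                best = candidate
    # No matching rule means the path is allowed by default.
    return True if best is None else best[1]
```

Under these assumptions, `is_allowed("Googlebot", "/?s=election")` falls back to the `*` group and is refused by `*?s*`, while `is_allowed("GPTBot", "/")` is refused by that agent's own `Disallow /`. Python's standard `urllib.robotparser` is not used here because it does not interpret `*` wildcards inside paths.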

Other Records

Field Value
sitemap https://hindi.theprint.in/sitemap_index.xml