sudarshannews.in
robots.txt

Robots Exclusion Standard data for sudarshannews.in

Resource Scan

Scan Details

Site Domain sudarshannews.in
Base Domain sudarshannews.in
Scan Status Ok
Last Scan 2024-11-13T08:16:46+00:00
Next Scan 2024-11-20T08:16:46+00:00

Last Scan

Scanned 2024-11-13T08:16:46+00:00
URL https://sudarshannews.in/robots.txt
Redirect https://www.sudarshannews.in/robots.txt
Redirect Domain www.sudarshannews.in
Redirect Base sudarshannews.in
Domain IPs 50.7.129.154
Redirect IPs 50.7.129.154
Response IP 50.7.129.154
Found Yes
Hash 1cca67865c3c940170ec9788707b511f362f514704ecaeb3388d9151844ce46b
SimHash 28141151c9f5
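
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest of the fetched file, although the report does not name the algorithm. As a rough sketch under that assumption, a content fingerprint for change detection could be computed as follows. The function name `fingerprint` and the sample body are hypothetical, and the digest produced here is not expected to match the value listed above.

```python
import hashlib


def fingerprint(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body.

    Assumption: the report's Hash field is SHA-256 over the raw bytes;
    the source does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


# Hypothetical sample body, not the real file contents.
sample = b"User-agent: *\nAllow: /\n"
digest = fingerprint(sample)
print(len(digest), digest)
```

Comparing the digest from one scan with the digest from the next is a simple way to tell whether the file changed between scans.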

Groups

*

Rule Path
Allow /

googlebot
google-adstxt

Rule Path
Disallow

twitterbot

Rule Path
Allow /resources

Other Records

Field Value
sitemap https://www.sudarshannews.in/sitemap.xml
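
The groups and sitemap record above can be checked with Python's standard `urllib.robotparser`. The sketch below is illustrative only: `ROBOTS_TXT` is a reconstruction from the parsed groups listed in this report, not the verbatim file, and the test URLs are hypothetical. Results reflect how Python's parser interprets the rules, and other crawlers may differ in edge cases.

```python
from urllib.robotparser import RobotFileParser

# Reconstruction (assumption) of the file from the groups listed above.
ROBOTS_TXT = """\
User-agent: *
Allow: /

User-agent: googlebot
User-agent: google-adstxt
Disallow:

User-agent: twitterbot
Allow: /resources

Sitemap: https://www.sudarshannews.in/sitemap.xml
"""


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# Hypothetical URLs used only to exercise the rules.
urls = [
    "https://www.sudarshannews.in/",
    "https://www.sudarshannews.in/resources/logo.png",
]
for agent in ("googlebot", "twitterbot", "examplebot"):
    for url in urls:
        print(agent, url, rules.can_fetch(agent, url))

# site_maps() is available in Python 3.8 and later.
print(rules.site_maps())
```

An empty `Disallow` value blocks nothing, so none of the three groups restricts crawling under this parser.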

Comments

  • Certain social media sites are whitelisted to allow crawlers to access page markup when links to /images are shared.