deccan.com
robots.txt

Robots Exclusion Standard data for deccan.com

Resource Scan

Scan Details

Site Domain deccan.com
Base Domain deccan.com
Scan Status Ok
Last Scan 2024-09-18T08:30:53+00:00
Next Scan 2024-09-25T08:30:53+00:00

Last Scan

Scanned 2024-09-18T08:30:53+00:00
URL http://deccan.com/robots.txt
Redirect https://www.deccanchronicle.com/robots.txt
Redirect Domain www.deccanchronicle.com
Redirect Base deccanchronicle.com
Domain IPs 209.17.116.163
Redirect IPs 13.33.88.42, 13.33.88.60, 13.33.88.67, 13.33.88.89, 2600:9000:223b:1800:16:59ed:f00:93a1, 2600:9000:223b:1a00:16:59ed:f00:93a1, 2600:9000:223b:5e00:16:59ed:f00:93a1, 2600:9000:223b:9a00:16:59ed:f00:93a1, 2600:9000:223b:9c00:16:59ed:f00:93a1, 2600:9000:223b:a00:16:59ed:f00:93a1, 2600:9000:223b:a600:16:59ed:f00:93a1, 2600:9000:223b:b600:16:59ed:f00:93a1
Response IP 13.33.88.67
Found Yes
Hash d376afeb5dd66d7538fd79799f4afe790eb20b625973d91ced87a35ddf1cb7e1
SimHash 800a1e47cdc7

Groups

*

Rule Path
Allow /
Disallow /admin/*
Disallow /search/*
Disallow /search?*
Disallow /xhr/*
Disallow /preview/story-*
Disallow /amp/preview/story-*
Disallow /staging/*
Disallow /alfoo
Disallow /sildoo
Disallow /dutas
Disallow /metsmall
Allow /xhr/getNewsMixin*
Allow /content/servlet/RDESController?*
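The rules above mix a blanket Allow with narrower Disallow and Allow entries, so the outcome for a given path depends on how a crawler resolves overlaps. The following is a minimal sketch, assuming the longest-match precedence described in RFC 9309 (the most specific matching pattern wins, and Allow wins a tie). The names RULES, pattern_to_regex, and is_allowed are hypothetical helpers for illustration and are not part of the scan; individual crawlers may resolve conflicts differently.

```python
import re

# Rules copied from the "*" group in this report, as (directive, pattern) pairs.
RULES = [
    ("allow", "/"),
    ("disallow", "/admin/*"),
    ("disallow", "/search/*"),
    ("disallow", "/search?*"),
    ("disallow", "/xhr/*"),
    ("disallow", "/preview/story-*"),
    ("disallow", "/amp/preview/story-*"),
    ("disallow", "/staging/*"),
    ("disallow", "/alfoo"),
    ("disallow", "/sildoo"),
    ("disallow", "/dutas"),
    ("disallow", "/metsmall"),
    ("allow", "/xhr/getNewsMixin*"),
    ("allow", "/content/servlet/RDESController?*"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Everything else is matched literally as a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be fetched under longest-match precedence."""
    best_len = -1
    best_allow = True  # no matching rule means the path is allowed
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            allow = directive == "allow"
            longer = len(pattern) > best_len
            tie_prefers_allow = len(pattern) == best_len and allow
            if longer or tie_prefers_allow:
                best_len = len(pattern)
                best_allow = allow
    return best_allow
```

Under this interpretation, is_allowed("/xhr/getNewsMixin?id=1") returns True because the longer Allow pattern overrides Disallow /xhr/*, while is_allowed("/xhr/other") returns False.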

Other Records

Field Value
sitemap https://www.deccanchronicle.com/sitemap/sitemap-index.xml
sitemap https://www.deccanchronicle.com/news-sitemap-daily.xml
sitemap https://www.deccanchronicle.com/feeds.xml
sitemap https://www.deccanchronicle.com/sitemap-daily.xml
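The sitemap records listed above sit outside any user-agent group, so they can be collected independently of the rules. The following is a rough sketch of that extraction, assuming the raw file uses the usual "Sitemap: <url>" line form. The function name extract_sitemaps and the sample text are hypothetical; the sample is reconstructed from this report, not copied from the live file.

```python
def extract_sitemaps(robots_txt):
    """Collect the values of all 'Sitemap:' records in a robots.txt body."""
    urls = []
    for line in robots_txt.splitlines():
        # Drop trailing comments; this assumes sitemap URLs carry no '#'.
        line = line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        # Field names are matched case-insensitively.
        if sep and field.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


# Sample input reconstructed from the records in this report (an assumption).
SAMPLE = """\
# robots.txt for https://www.deccanchronicle.com/
User-agent: *
Allow: /
Disallow: /admin/*
Sitemap: https://www.deccanchronicle.com/sitemap/sitemap-index.xml
Sitemap: https://www.deccanchronicle.com/news-sitemap-daily.xml
"""

sitemaps = extract_sitemaps(SAMPLE)
```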

Comments

  • robots.txt for https://www.deccanchronicle.com/