thenewsagency.in
robots.txt

Robots Exclusion Standard data for thenewsagency.in

Resource Scan

Scan Details

Site Domain thenewsagency.in
Base Domain thenewsagency.in
Scan Status Ok
Last Scan 2024-09-23T23:50:11+00:00
Next Scan 2024-10-07T23:50:11+00:00

Last Scan

Scanned 2024-09-23T23:50:11+00:00
URL https://thenewsagency.in/robots.txt
Redirect https://www.thenewsagency.in/robots.txt
Redirect Domain www.thenewsagency.in
Redirect Base thenewsagency.in
Domain IPs 23.20.179.164, 54.158.195.16
Redirect IPs 104.18.90.198, 104.18.91.198, 104.18.92.198, 104.18.93.198, 104.18.94.198, 2606:4700::6812:5ac6, 2606:4700::6812:5bc6, 2606:4700::6812:5cc6, 2606:4700::6812:5dc6, 2606:4700::6812:5ec6
Response IP 104.18.90.198
Found Yes
Hash f7e63e33a57aacdbbce06f5286bdee2ec43ecd43d6d02be8567bf36defb8a616
SimHash e04c90528e22
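
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest. The report does not state the algorithm or exactly which bytes were hashed, so the following is only a minimal sketch under the assumption that it is SHA-256 over the raw response body. The helper name matches_reported_hash is hypothetical.

```python
import hashlib

# Value copied from the "Hash" field of the scan record.
REPORTED_HASH = "f7e63e33a57aacdbbce06f5286bdee2ec43ecd43d6d02be8567bf36defb8a616"


def matches_reported_hash(body: bytes, reported: str = REPORTED_HASH) -> bool:
    """Return True if the SHA-256 hex digest of body equals the reported hash.

    Assumption: the scanner hashed the raw robots.txt bytes with SHA-256.
    The report itself does not confirm this.
    """
    return hashlib.sha256(body).hexdigest() == reported.lower()
```

A caller would pass the bytes fetched from the Redirect URL. A mismatch may simply mean the file changed after the scan, or that the assumption about the algorithm is wrong.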

Groups

*

Rule Path
Allow /

semrush

Rule Path
Disallow /

ahref

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

claude

Rule Path
Disallow /

open ai

Rule Path
Disallow /

Other Records

Field Value
sitemap https://www.thenewsagency.in/sitemap.xml
sitemap https://www.thenewsagency.in/news_sitemap.xml
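
The groups and sitemap records above can be checked with Python's standard urllib.robotparser module. The sketch below rebuilds a robots.txt from the listed groups. The exact text, ordering, and comments of the original file are not shown in this report, so ROBOTS_TXT is a reconstruction, and the crawler names passed to is_allowed (a hypothetical helper) are illustrative.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the Groups and Other Records listed in the report.
# Assumption: the real file may differ in order, spacing, and comments.
ROBOTS_TXT = """\
User-agent: *
Allow: /

User-agent: semrush
Disallow: /

User-agent: ahref
Disallow: /

User-agent: dotbot
Disallow: /

User-agent: claude
Disallow: /

User-agent: open ai
Disallow: /

Sitemap: https://www.thenewsagency.in/sitemap.xml
Sitemap: https://www.thenewsagency.in/news_sitemap.xml
"""

# Parse the text directly so no network request is needed.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(user_agent: str, url: str = "https://www.thenewsagency.in/") -> bool:
    """Return True if this parser would let user_agent fetch url."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "SemrushBot", "AhrefsBot", "DotBot", "ClaudeBot", "GPTBot"):
        print(agent, is_allowed(agent))
    print(parser.site_maps())
```

Python's parser matches a group when its name is a case-insensitive substring of the crawler's user-agent token, so "semrush" applies to SemrushBot and "ahref" to AhrefsBot. Under that same rule the group named "open ai" does not match GPTBot, which falls through to the * group. Other crawlers implement their own matching, so their behavior may differ.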