india.newswaker.com
robots.txt

Robots Exclusion Standard data for india.newswaker.com

Resource Scan

Scan Details

Site Domain india.newswaker.com
Base Domain newswaker.com
Scan Status Ok
Last Scan 2024-10-18T08:16:11+00:00
Next Scan 2024-11-17T08:16:11+00:00

Last Scan

Scanned 2024-10-18T08:16:11+00:00
URL https://india.newswaker.com/robots.txt
Domain IPs 104.26.0.236, 104.26.1.236, 172.67.75.91, 2606:4700:20::681a:1ec, 2606:4700:20::681a:ec, 2606:4700:20::ac43:4b5b
Response IP 104.26.1.236
Found Yes
Hash 9edce754c1f36f1ffbf4918887d4946b7497c846a86e3b26b4658dda29198d57
SimHash 0a10d6805703

Groups

*

Rule Path
Disallow /wp-admin/
Allow /

mediapartners-google

Rule Path
Allow /

googlebot-desktop

Rule Path
Allow /

googlebot

Rule Path
Allow /

yandex

Rule Path
Allow /

yandexbot

Rule Path
Allow /

yandeximages

Rule Path
Allow /

yandexnews

Rule Path
Allow /

googlebot-image

Rule Path
Allow /

googlebot-mobile

Rule Path
Allow /

googlebot-news

Rule Path
Allow /

msnbot

Rule Path
Allow /

slurp

Rule Path
Allow /

teoma

Rule Path
Allow /

gigabot

Rule Path
Allow /

robozilla

Rule Path
Allow /

nutch

Rule Path
Allow /

ia_archiver

Rule Path
Allow /

baiduspider

Rule Path
Allow /

naverbot

Rule Path
Allow /

yeti

Rule Path
Allow /

yahoo-mmcrawler

Rule Path
Allow /

psbot

Rule Path
Allow /

yahoo-blogs/v3.9

Rule Path
Allow /
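
The groups listed above can be checked with Python's standard urllib.robotparser. This is a minimal sketch, not the scanner's own method: the ROBOTS_TXT text is an assumed partial reconstruction from the table (the report does not show the original file's order or casing), and ExampleBot, build_parser and is_allowed are hypothetical names used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed partial reconstruction of the file from the scan's group table.
# Only the "*" and "googlebot" groups are reproduced here.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Allow: /

User-agent: googlebot
Allow: /
"""

BASE = "https://india.newswaker.com"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the given user agent may fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# A crawler with no named group falls back to the "*" group.
print(is_allowed(parser, "ExampleBot", BASE + "/wp-admin/"))
print(is_allowed(parser, "ExampleBot", BASE + "/"))

# A crawler with its own group uses only that group's rules.
print(is_allowed(parser, "Googlebot", BASE + "/wp-admin/"))
```

Under these assumptions, the hypothetical ExampleBot is blocked from /wp-admin/ but allowed elsewhere, while Googlebot matches its own group, which contains only Allow / and so does not carry the /wp-admin/ restriction. Python's parser applies the first matching rule in file order rather than the longest match, but both approaches give the same result for this rule set.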

Other Records

Field Value
sitemap https://india.newswaker.com/sitemap.xml
sitemap https://india.newswaker.com/sitemap-news.xml
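
The sitemap records above can be read with the same standard-library parser. This is a small sketch assuming Python 3.8 or later (where site_maps() is available); SITEMAP_LINES is an assumed rendering of the two records as robots.txt lines, and list_sitemaps is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt form of the two sitemap records in the report.
SITEMAP_LINES = [
    "Sitemap: https://india.newswaker.com/sitemap.xml",
    "Sitemap: https://india.newswaker.com/sitemap-news.xml",
]


def list_sitemaps(lines: list[str]) -> list[str]:
    """Return the sitemap URLs declared in the given robots.txt lines."""
    parser = RobotFileParser()
    parser.parse(lines)
    # site_maps() returns None when no Sitemap field is present.
    return parser.site_maps() or []


sitemaps = list_sitemaps(SITEMAP_LINES)
for url in sitemaps:
    print(url)
```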

Comments

  • robots.txt