newswaker.com
robots.txt

Robots Exclusion Standard data for newswaker.com

Resource Scan

Scan Details

Site Domain newswaker.com
Base Domain newswaker.com
Scan Status Ok
Last Scan 2024-11-11T09:40:11+00:00
Next Scan 2024-11-18T09:40:11+00:00

Last Scan

Scanned 2024-11-11T09:40:11+00:00
URL https://newswaker.com/robots.txt
Domain IPs 104.26.0.236, 104.26.1.236, 172.67.75.91, 2606:4700:20::681a:1ec, 2606:4700:20::681a:ec, 2606:4700:20::ac43:4b5b
Response IP 172.67.75.91
Found Yes
Hash eb52412ab9bcf7f51f8d32e024ee29fc4c01a7880823f6f7b9a5410250b83dce
SimHash 0a10c6824783
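
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the scan does not name the algorithm. As a minimal sketch under that assumption, a fetched robots.txt body could be fingerprinted as follows and compared between scans to detect changes. The function name `sha256_hex` is hypothetical, and which bytes the scanner actually hashes (raw body, normalized text) is unknown.

```python
import hashlib


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of a response body as lowercase hex.

    Assumption: the scanner hashes the raw bytes of the robots.txt body.
    The report does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_hex: str, body: bytes) -> bool:
    """Compare a stored digest with the digest of a newly fetched body."""
    return sha256_hex(body) != previous_hex
```

A digest changes completely on any single-byte edit, which is presumably why the report also lists a SimHash, a fingerprint intended to stay similar for similar content.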

Groups

*

Rule Path
Disallow /wp-admin/
Allow /

mediapartners-google

Rule Path
Allow /

googlebot-desktop

Rule Path
Allow /

googlebot

Rule Path
Allow /

yandex

Rule Path
Allow /

yandexbot

Rule Path
Allow /

yandeximages

Rule Path
Allow /

yandexnews

Rule Path
Allow /

googlebot-image

Rule Path
Allow /

googlebot-mobile

Rule Path
Allow /

googlebot-news

Rule Path
Allow /

msnbot

Rule Path
Allow /

slurp

Rule Path
Allow /

teoma

Rule Path
Allow /

gigabot

Rule Path
Allow /

robozilla

Rule Path
Allow /

nutch

Rule Path
Allow /

ia_archiver

Rule Path
Allow /

baiduspider

Rule Path
Allow /

naverbot

Rule Path
Allow /

yeti

Rule Path
Allow /

yahoo-mmcrawler

Rule Path
Allow /

psbot

Rule Path
Allow /

yahoo-blogs/v3.9

Rule Path
Allow /
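
To illustrate how the groups above behave, the sketch below feeds a partial reconstruction of the file to Python's standard `urllib.robotparser`. The reconstructed text is an assumption: the scan lists only the parsed rules, not the original file, so group order, casing, and spacing are guesses, and only two of the groups are included. `ExampleBot` and `/some-article/` are hypothetical names used for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of two groups from the scan (not the original file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Allow: /

User-agent: Googlebot
Allow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A crawler with no dedicated group falls back to the "*" group.
generic_admin = parser.can_fetch("ExampleBot", "https://newswaker.com/wp-admin/")
generic_article = parser.can_fetch("ExampleBot", "https://newswaker.com/some-article/")

# A crawler with its own group uses only that group, which has no Disallow.
googlebot_admin = parser.can_fetch("Googlebot", "https://newswaker.com/wp-admin/")
```

Under these rules a named crawler such as googlebot is not bound by the `/wp-admin/` restriction, because a crawler that matches a specific group ignores the `*` group. Note that `urllib.robotparser` applies rules in file order, while RFC 9309 prefers the longest matching path; for this rule set both approaches give the same answers.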

Other Records

Field Value
sitemap https://www.newswaker.com/sitemap.xml
sitemap https://www.newswaker.com/sitemap-news.xml
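
The sitemap records can be read with the same standard-library parser. The sketch below assumes the records appear in the file as ordinary `Sitemap:` lines (the scan shows only the parsed field and value) and also extracts each sitemap's host, since both sitemaps point at `www.newswaker.com` while the scanned URL uses `newswaker.com`. `site_maps()` requires Python 3.8 or later.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

# Assumed form of the sitemap records in the original file.
SITEMAP_LINES = [
    "Sitemap: https://www.newswaker.com/sitemap.xml",
    "Sitemap: https://www.newswaker.com/sitemap-news.xml",
]

parser = RobotFileParser()
parser.parse(SITEMAP_LINES)

# site_maps() returns the listed URLs, or None if the file has none.
sitemaps = parser.site_maps() or []

# Collect the distinct hosts the sitemaps are served from.
sitemap_hosts = {urlsplit(url).hostname for url in sitemaps}
```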

Comments

  • robots.txt