newsis.com
robots.txt

Robots Exclusion Standard data for newsis.com

Resource Scan

Scan Details

Site Domain newsis.com
Base Domain newsis.com
Scan Status Ok
Last Scan 2024-10-11T06:52:45+00:00
Next Scan 2024-10-18T06:52:45+00:00

Last Scan

Scanned 2024-10-11T06:52:45+00:00
URL https://newsis.com/robots.txt
Redirect https://www.newsis.com/robots.txt
Redirect Domain www.newsis.com
Redirect Base newsis.com
Domain IPs 104.18.12.204, 104.18.13.204
Redirect IPs 104.18.12.204, 104.18.13.204
Response IP 104.18.12.204
Found Yes
Hash 862ef9612b18c18a1df6510cb448eb34034f8b665a3dd3dc94772a8957f6bfad
SimHash 6960512345f5

Groups

googlebot
googlebot-news
googlebot-image
mediapartners-google
google search console
googlebot/2.1
googlebot-smartphone
google-inspectiontool/1.0
bingbot
msnbot
msnbot-media
bingpreview
feedfetcher-google
twitterbot
popin_agent
facebot
yeti
facebookexternalhit
facebookexternalhit/1.1
facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
grapeshot

Rule Path
Disallow /common/
Disallow /search/

*

Rule Path
Allow /ads.txt
Disallow /common/
Disallow /search/
Disallow /ar_detail/
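
The two rule groups above can be checked with a short Python sketch using the standard library's urllib.robotparser. The robots.txt text below is a hypothetical reconstruction from the scan data, not the file as served: it includes only three of the listed user-agents, and "examplebot" and the /ar_detail/example and /search/example paths are made-up names for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical reconstruction of the scanned rules. Only a few of the listed
# user-agents are included, and the real file's layout may differ.
ROBOTS_TXT = """\
User-agent: googlebot
User-agent: bingbot
User-agent: yeti
Disallow: /common/
Disallow: /search/

User-agent: *
Allow: /ads.txt
Disallow: /common/
Disallow: /search/
Disallow: /ar_detail/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making any network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
BASE = "https://www.newsis.com"

# A named crawler falls under the first group, which does not list /ar_detail/.
googlebot_article = parser.can_fetch("googlebot", BASE + "/ar_detail/example")
# Any other crawler ("examplebot" is a made-up name) falls under the "*" group.
other_article = parser.can_fetch("examplebot", BASE + "/ar_detail/example")
other_ads = parser.can_fetch("examplebot", BASE + "/ads.txt")
# /search/ is disallowed in both groups.
googlebot_search = parser.can_fetch("googlebot", BASE + "/search/example")

print(googlebot_article, other_article, other_ads, googlebot_search)
```

Under these assumptions the named crawlers may fetch /ar_detail/ paths while unlisted crawlers may not, which is the only difference between the two groups apart from the explicit Allow for /ads.txt.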

Other Records

Field Value
sitemap https://www.newsis.com/sitemap.xml
sitemap https://www.newsis.com/newsis_news_google.xml
sitemap https://www.newsis.com/sitemap/images/sitemap_index.xml
sitemap https://www.newsis.com/sitemap/videos/sitemap_video.xml
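
As a rough illustration of how a client might read these sitemap records, the sketch below feeds them to urllib.robotparser as "Sitemap:" lines and reads them back with site_maps() (available from Python 3.8). The exact spelling and order of the lines in the served file is an assumption.

```python
from urllib.robotparser import RobotFileParser

# Sitemap records from the scan, written as robots.txt lines.
# The "Sitemap:" spelling and line order are assumptions about the served file.
SITEMAP_LINES = [
    "Sitemap: https://www.newsis.com/sitemap.xml",
    "Sitemap: https://www.newsis.com/newsis_news_google.xml",
    "Sitemap: https://www.newsis.com/sitemap/images/sitemap_index.xml",
    "Sitemap: https://www.newsis.com/sitemap/videos/sitemap_video.xml",
]

sitemap_parser = RobotFileParser()
sitemap_parser.parse(SITEMAP_LINES)

# site_maps() returns the listed URLs in file order, or None when there are none.
sitemaps = sitemap_parser.site_maps() or []
for url in sitemaps:
    print(url)
```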

Comments

  • Robots for www.newsis.com
  • ETC
  • SiteMap