newsis.com
robots.txt
Robots Exclusion Standard data for newsis.com
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | newsis.com |
Base Domain | newsis.com |
Scan Status | Ok |
Last Scan | 2024-10-11T06:52:45+00:00 |
Next Scan | 2024-10-18T06:52:45+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2024-10-11T06:52:45+00:00 |
URL | https://newsis.com/robots.txt |
Redirect | https://www.newsis.com/robots.txt |
Redirect Domain | www.newsis.com |
Redirect Base | newsis.com |
Domain IPs | 104.18.12.204, 104.18.13.204 |
Redirect IPs | 104.18.12.204, 104.18.13.204 |
Response IP | 104.18.12.204 |
Found | Yes |
Hash | 862ef9612b18c18a1df6510cb448eb34034f8b665a3dd3dc94772a8957f6bfad |
SimHash | 6960512345f5 |
Groups
googlebot
googlebot-news
googlebot-image
mediapartners-google
google search console
googlebot/2.1
googlebot-smartphone
google-inspectiontool/1.0
bingbot
msnbot
msnbot-media
bingpreview
feedfetcher-google
twitterbot
popin_agent
facebot
yeti
facebookexternalhit
facebookexternalhit/1.1
facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
grapeshot
Rule | Path |
---|---|
Disallow | /common/ |
Disallow | /search/ |
*
Rule | Path |
---|---|
Allow | /ads.txt |
Disallow | /common/ |
Disallow | /search/ |
Disallow | /ar_detail/ |
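
The two rule groups above differ only for crawlers that fall through to `*`. As a rough sketch of how that plays out, the snippet below rebuilds a robots.txt from the tables and queries it with Python's standard `urllib.robotparser`. The file text is a reconstruction, not the verbatim file; only four of the listed user agents are included for brevity; and `ExampleBot` and the `is_allowed` helper are hypothetical names introduced for illustration.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan's rule tables; NOT the verbatim file.
# Only four of the listed user agents are included to keep the sketch short.
ROBOTS_TXT = """\
User-agent: googlebot
User-agent: bingbot
User-agent: yeti
User-agent: twitterbot
Disallow: /common/
Disallow: /search/

User-agent: *
Allow: /ads.txt
Disallow: /common/
Disallow: /search/
Disallow: /ar_detail/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the reconstructed rules."""
    return parser.can_fetch(agent, "https://www.newsis.com" + path)


# ExampleBot is a hypothetical crawler that matches no named group,
# so it is evaluated against the "*" group.
for agent in ("googlebot", "ExampleBot"):
    for path in ("/search/", "/ar_detail/", "/ads.txt"):
        print(f"{agent:10} {path:12} {is_allowed(agent, path)}")
```

Under these reconstructed rules, `/ar_detail/` is blocked only for agents evaluated against the `*` group, while `/common/` and `/search/` are blocked for both groups. Note that `urllib.robotparser` applies the first matching rule in file order, which may differ from the longest-match behavior of some crawlers.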
Other Records
Field | Value |
---|---|
sitemap | https://www.newsis.com/sitemap.xml |
sitemap | https://www.newsis.com/newsis_news_google.xml |
sitemap | https://www.newsis.com/sitemap/images/sitemap_index.xml |
sitemap | https://www.newsis.com/sitemap/videos/sitemap_video.xml |
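
The sitemap records above are independent of any user-agent group. A minimal sketch of reading them back with `urllib.robotparser` follows; it assumes Python 3.8 or later (where `RobotFileParser.site_maps()` is available) and that the records appear in the file as ordinary `Sitemap:` lines, which the scan does not show verbatim.

```python
from urllib.robotparser import RobotFileParser

# Sitemap records from the scan, written as "Sitemap:" lines (assumed form).
SITEMAP_LINES = [
    "Sitemap: https://www.newsis.com/sitemap.xml",
    "Sitemap: https://www.newsis.com/newsis_news_google.xml",
    "Sitemap: https://www.newsis.com/sitemap/images/sitemap_index.xml",
    "Sitemap: https://www.newsis.com/sitemap/videos/sitemap_video.xml",
]

parser = RobotFileParser()
parser.parse(SITEMAP_LINES)

# site_maps() returns None when no sitemap records were found.
sitemaps = parser.site_maps() or []
for url in sitemaps:
    print(url)
```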