newsroompost.com
robots.txt
Robots Exclusion Standard data for newsroompost.com
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | newsroompost.com |
Base Domain | newsroompost.com |
Scan Status | Ok |
Last Scan | 2024-11-11T02:48:06+00:00 |
Next Scan | 2024-11-18T02:48:06+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2024-11-11T02:48:06+00:00 |
URL | https://newsroompost.com/robots.txt |
Domain IPs | 104.26.14.119, 104.26.15.119, 172.67.68.21, 2606:4700:20::681a:e77, 2606:4700:20::681a:f77, 2606:4700:20::ac43:4415 |
Response IP | 104.26.14.119 |
Found | Yes |
Hash | 6c2cb965adcaecad021006d90fa99ffbb5652fc3db4bd9950e2b4068374f6e6f |
SimHash | 2d4d7c488b53 |
Groups
*
Rule | Path |
---|---|
Allow | / |
Disallow | */page/* |
Disallow | */attachment/* |
Disallow | /wp-admin/ |
Disallow | */cdn-cgi/* |
Disallow | /favicon.ico |
Disallow | *.html/1* |
Disallow | *.html/2* |
Disallow | *?redirect* |
Disallow | *?s* |
Disallow | */h/g/cv/* |
Disallow | *?p* |
Disallow | *?s* |
Disallow | *?_gl* |
Disallow | */tag/* |
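The rules in the `*` group rely on `*` wildcards, so their effect on a given URL is not always obvious from the table. The following is a minimal sketch of how a crawler might evaluate them, assuming the matching behaviour described in RFC 9309: `*` matches any run of characters, a trailing `$` anchors the end, the longest matching pattern wins, and Allow wins a tie. The names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical and chosen for illustration only; real crawlers may differ in details such as percent-encoding.

```python
import re

# Rules copied from the "*" group above; the repeated "*?s*" entry is listed once.
RULES = [
    ("allow", "/"),
    ("disallow", "*/page/*"),
    ("disallow", "*/attachment/*"),
    ("disallow", "/wp-admin/"),
    ("disallow", "*/cdn-cgi/*"),
    ("disallow", "/favicon.ico"),
    ("disallow", "*.html/1*"),
    ("disallow", "*.html/2*"),
    ("disallow", "*?redirect*"),
    ("disallow", "*?s*"),
    ("disallow", "*/h/g/cv/*"),
    ("disallow", "*?p*"),
    ("disallow", "*?_gl*"),
    ("disallow", "*/tag/*"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex (assumed semantics)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with ".*" in place of each "*".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` (path plus query string) may be crawled."""
    best_len = -1
    best_allow = True  # no matching rule means the path is allowed
    for kind, pattern in rules:
        # re.match anchors at the start of the path, as robots.txt matching does.
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            # Longest pattern wins; on equal length, prefer Allow.
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len = len(pattern)
                best_allow = allow
    return best_allow


if __name__ == "__main__":
    for p in ["/india/some-story.html", "/tag/politics/", "/?s=election",
              "/world/page/2/", "/story.html?page=2"]:
        print(p, "allowed" if is_allowed(p) else "blocked")
```

Under these assumptions, an ordinary article path matches only `Allow: /` and stays crawlable, while tag archives, paginated listings, and search URLs are blocked. Note that `*?p*` is broad: it blocks any URL whose query string begins with `p`, such as `?page=2`, not only `?p=` post IDs.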
Other Records
Field | Value |
---|---|
sitemap | https://newsroompost.com/sitemap.xml |