worldjournal.com
robots.txt
Robots Exclusion Standard data for worldjournal.com
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | worldjournal.com |
Base Domain | worldjournal.com |
Scan Status | Ok |
Last Scan | 2024-05-04T02:05:16+00:00 |
Next Scan | 2024-05-11T02:05:16+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2024-05-04T02:05:16+00:00 |
URL | https://worldjournal.com/robots.txt |
Redirect | https://www.worldjournal.com:443/robots.txt |
Redirect Domain | www.worldjournal.com |
Redirect Base | worldjournal.com |
Domain IPs | 34.206.84.221, 54.156.154.180 |
Redirect IPs | 104.26.14.115, 104.26.15.115, 172.67.71.21, 2606:4700:20::681a:e73, 2606:4700:20::681a:f73, 2606:4700:20::ac43:4715 |
Response IP | 104.26.15.115 |
Found | Yes |
Hash | 36ddf3af89f07b3ae6fa040a85ac1d8fc2fa675c7ac53003fa72ef9b26eb32b9 |
SimHash | 6305985402d5 |
Groups
User-agent: *
Rule | Path |
---|---|
Disallow | /EventCollect/* |
Disallow | /topic/page_eventposition |
Disallow | /topic/topic_daily |
Disallow | /.well-known/amphtml/apikey.pub |
Other Records
Field | Value |
---|---|
crawl-delay | 5 |
Other Records
Field | Value |
---|---|
sitemap | https://www.worldjournal.com/sitemapxml/wj/mapindex.xml |
sitemap | https://www.worldjournal.com/sitemap/gnews |
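
The records above can be read back as a robots.txt file and checked against candidate paths. The following is a minimal sketch, not the scanner's own logic: the `ROBOTS_TXT` text is a reconstruction from the tables (the original file's exact layout is not shown), the helper names `parse_robots`, `pattern_to_regex`, and `is_allowed` are hypothetical, and the parser ignores user-agent grouping because the scan lists only the `*` group. A hand-written matcher is used so that the trailing `*` in `/EventCollect/*` is treated as a wildcard.

```python
import re

# Reconstructed from the scan tables; the real file's layout is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /EventCollect/*
Disallow: /topic/page_eventposition
Disallow: /topic/topic_daily
Disallow: /.well-known/amphtml/apikey.pub
Crawl-delay: 5

Sitemap: https://www.worldjournal.com/sitemapxml/wj/mapindex.xml
Sitemap: https://www.worldjournal.com/sitemap/gnews
"""


def parse_robots(text):
    """Collect disallow patterns, sitemap URLs, and the crawl delay.

    Simplification: every record is treated as applying to all agents.
    """
    disallow, sitemaps, delay = [], [], None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = line.split(":", 1)  # split on the first colon only
        field, value = field.strip().lower(), value.strip()
        if field == "disallow" and value:
            disallow.append(value)
        elif field == "sitemap":
            sitemaps.append(value)
        elif field == "crawl-delay":
            delay = float(value)
    return disallow, sitemaps, delay


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern: '*' is a wildcard, '$' anchors the end."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, disallow):
    """A path is allowed when no disallow pattern matches from its start."""
    return not any(pattern_to_regex(p).match(path) for p in disallow)


DISALLOW, SITEMAPS, CRAWL_DELAY = parse_robots(ROBOTS_TXT)
```

With these rules, `is_allowed("/EventCollect/click", DISALLOW)` is false, while a path outside the listed prefixes (for example the made-up `/news/story/1`) is allowed. A crawler that honors the nonstandard `crawl-delay` field would wait `CRAWL_DELAY` seconds between requests.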