thejakartapost.com
robots.txt

Robots Exclusion Standard data for thejakartapost.com

Resource Scan

Scan Details

Site Domain thejakartapost.com
Base Domain thejakartapost.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2024-04-05T05:16:59+00:00
Next Scan 2024-07-04T05:16:59+00:00

Last Successful Scan

Scanned 2023-06-09T05:42:59+00:00
URL https://www.thejakartapost.com/robots.txt
Domain IPs 13.33.146.4, 13.33.146.46, 13.33.146.71, 13.33.146.94
Response IP 52.85.61.29
Found Yes
Hash 3de65f6f4a6a2d4af964a0d705f64ed9e30ab431f658dc6c2f360eef090201f2
SimHash 6ac49201d5b1

Groups

*

Rule Path
Disallow /
Disallow /redirect
Disallow /testing-showcase.html
Disallow /search

googlebot

Rule Path
Allow /

twitterbot

Rule Path
Allow /

petalbot

Rule Path
Allow /

echoboxbot

Rule Path
Allow /

facebookexternalhit

Rule Path
Allow /

bitlybot

Rule Path
Allow /

yandex

Rule Path
Allow /

bingbot

Rule Path
Allow /

slurp

Rule Path
Allow /

Other Records

Field Value
sitemap https://www.thejakartapost.com/sitemap.xml
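
The groups and sitemap record listed above can be checked against a parser to see how they resolve for a given crawler. The following is a minimal sketch using Python's standard urllib.robotparser. The robots.txt text is an assumed reconstruction from the groups in this report, not the original file (so it will not reproduce the hash shown above), and the helper names and the "examplebot" agent are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Agents that the report lists with a single "Allow /" rule.
ALLOWED_AGENTS = [
    "googlebot", "twitterbot", "petalbot", "echoboxbot",
    "facebookexternalhit", "bitlybot", "yandex", "bingbot", "slurp",
]


def build_robots_txt() -> str:
    """Rebuild an approximate robots.txt from the groups in the report.

    Assumption: field order, casing, and whitespace are guesses; only the
    rules themselves come from the scan data.
    """
    lines = [
        "User-agent: *",
        "Disallow: /",
        "Disallow: /redirect",
        "Disallow: /testing-showcase.html",
        "Disallow: /search",
        "",
    ]
    for agent in ALLOWED_AGENTS:
        lines += [f"User-agent: {agent}", "Allow: /", ""]
    lines.append("Sitemap: https://www.thejakartapost.com/sitemap.xml")
    return "\n".join(lines)


def make_parser() -> RobotFileParser:
    """Parse the reconstructed text without making any network request."""
    parser = RobotFileParser()
    parser.parse(build_robots_txt().splitlines())
    return parser


if __name__ == "__main__":
    parser = make_parser()
    url = "https://www.thejakartapost.com/search"
    # "examplebot" is a made-up agent that matches no named group,
    # so it falls back to the "*" group.
    for agent in ("googlebot", "bingbot", "examplebot"):
        print(agent, parser.can_fetch(agent, url))
    print(parser.site_maps())
```

Under these assumptions, the named crawlers match their own group and are allowed everywhere, while any other agent falls through to the `*` group, whose first rule (Disallow /) blocks the whole site. Note that site_maps() requires Python 3.8 or later.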