Robots Exclusion Standard data for jhlarson.com

Resource Scan

Scan Details

Site Domain jhlarson.com
Base Domain jhlarson.com
Scan Status Ok
Last Scan 2025-12-14T03:12:52+00:00
Next Scan 2026-01-13T03:12:52+00:00

Last Scan

Scanned 2025-12-14T03:12:52+00:00
URL https://www.jhlarson.com/robots.txt
Domain IPs 3.212.150.83, 52.20.119.15, 54.174.38.120
Response IP 3.212.150.83
Found Yes
Hash 5b07a96270e4a129b17c59a97f166248002cdae438a2e66ec2512b88150f790a
SimHash 6b1bb451af83
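
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the report does not name the algorithm. A minimal sketch of how such a content fingerprint could be computed, assuming SHA-256 over the raw response bytes (the function name `robots_fingerprint` and the sample body are hypothetical, and the SimHash value is not covered here):

```python
import hashlib


def robots_fingerprint(body: bytes) -> str:
    """Return a hex digest of the raw robots.txt bytes.

    Assumption: the scanner hashes the unmodified response body with
    SHA-256. The report itself does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


# Hypothetical body, NOT the real file served by the site.
sample_body = b"User-agent: *\nAllow: /\n"
digest = robots_fingerprint(sample_body)
print(len(digest), digest)
```

Comparing the digest from two scans is a cheap way to tell whether the file changed byte for byte between them.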

Groups

googlebot

Rule Path
Allow /

bingbot

Rule Path
Allow /

swish-e

Rule Path
Disallow /

tagoobot

Rule Path
Disallow /

sitebot

Rule Path
Disallow /

superpagesbot

Rule Path
Disallow /

superpagesurlverifybot

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

zoomspider

Rule Path
Disallow /

twiceler

Rule Path
Disallow /

yandexbot

Rule Path
Disallow /

*

Rule Path
Allow /

Other Records

Field Value
crawl-delay 600
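
The groups above can be checked programmatically. The sketch below uses Python's standard `urllib.robotparser` on a robots.txt text reconstructed from a subset of the reported groups. The exact wording of the real file is not shown in this report, so `ROBOTS_TXT` is an assumption, and `examplebot` is a hypothetical crawler name standing in for any agent without its own group.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the groups listed in the report (subset only).
# Assumption: the real file may differ in order, spacing, and comments.
ROBOTS_TXT = """\
User-agent: googlebot
Allow: /

User-agent: yandexbot
Disallow: /

User-agent: *
Allow: /
Crawl-delay: 600
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

home = "https://www.jhlarson.com/"

# googlebot has its own group that allows everything.
googlebot_ok = parser.can_fetch("googlebot", home)
# yandexbot has its own group that disallows everything.
yandexbot_ok = parser.can_fetch("yandexbot", home)
# An agent with no dedicated group falls back to the "*" group.
other_ok = parser.can_fetch("examplebot", home)
other_delay = parser.crawl_delay("examplebot")

print(googlebot_ok, yandexbot_ok, other_ok, other_delay)
```

A crawl-delay of 600 is conventionally read as 600 seconds between requests, but the field is not part of RFC 9309 and crawlers differ in whether and how they honor it.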

*

Rule Path
Disallow /silver/*.jsp

*

Rule Path
Disallow /custom/*.jsp

*

Rule Path
Disallow /api/*.jsp

*

Rule Path
Disallow /theme/*/*.jsp
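
The four Disallow paths above use `*` as a wildcard. RFC 9309 defines `*` as matching any run of characters and a trailing `$` as anchoring the end of the path. Python's standard `urllib.robotparser` does not appear to expand these wildcards, so the sketch below shows one way to test a path against such a pattern. The function name `rule_matches` and the sample paths are hypothetical.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Check whether a robots.txt path pattern matches a URL path.

    '*' matches any sequence of characters; a trailing '$' anchors the
    end of the path. Without '$', the pattern only has to match a
    prefix of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start of the path only.
    return re.match(regex, path) is not None


# Patterns taken from the report; the paths are made-up examples.
print(rule_matches("/silver/*.jsp", "/silver/page.jsp"))
print(rule_matches("/silver/*.jsp", "/silver/page.html"))
print(rule_matches("/theme/*/*.jsp", "/theme/dark/header.jsp"))
print(rule_matches("/theme/*/*.jsp", "/theme/header.jsp"))
```

Because these rules carry no trailing `$`, a path such as `/silver/page.jsp?x=1` would also be matched, since only a prefix has to fit the pattern.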

Other Records

Field Value
sitemap https://www.jhlarson.com/sitemap.xml

Warnings

  • 2 invalid lines.