cdn.journals.lww.com
robots.txt

Robots Exclusion Standard data for cdn.journals.lww.com

Resource Scan

Scan Details

Site Domain cdn.journals.lww.com
Base Domain lww.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2024-05-14T15:54:01+00:00
Next Scan 2024-07-13T15:54:01+00:00

Last Successful Scan

Scanned 2024-02-22T15:52:54+00:00
URL https://cdn.journals.lww.com/robots.txt
Redirect https://journals.lww.com/robots.txt
Redirect Domain journals.lww.com
Redirect Base lww.com
Domain IPs 104.18.0.248, 104.18.1.248
Redirect IPs 104.18.0.248, 104.18.1.248
Response IP 104.18.0.248
Found Yes
Hash 541d6cdb305ec6ed31ec89df71b3851a1287d8287d22b7565cd912d9ae74ec87
SimHash e54493554d77

Groups

*

Rule Path
Disallow /secure/
Disallow /thehearingjournal/Pages/blogs.aspx
Disallow /Pages/myetocsunsubscribe.aspx
Disallow /Pages/subprocessors.aspx
Disallow /jaanp/_layouts/15/1033/oaks.journals/informationforauthors.aspx

*

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 25

grapeshot

Rule Path
Disallow /

googlebot

Rule Path
Disallow /thehearingjournal/Pages/blogs.aspx
Disallow /Pages/myetocsunsubscribe.aspx
Disallow /jaanp/_layouts/15/1033/oaks.journals/informationforauthors.aspx

twitterbot

Rule Path
Disallow /thehearingjournal/Pages/blogs.aspx
Disallow /Pages/myetocsunsubscribe.aspx
Disallow /jaanp/_layouts/15/1033/oaks.journals/informationforauthors.aspx

mediapartners-google

Rule Path
Disallow /thehearingjournal/Pages/blogs.aspx
Disallow /Pages/myetocsunsubscribe.aspx
Disallow /jaanp/_layouts/15/1033/oaks.journals/informationforauthors.aspx

adsbot-google

Rule Path
Disallow /thehearingjournal/Pages/blogs.aspx
Disallow /Pages/myetocsunsubscribe.aspx
Disallow /jaanp/_layouts/15/1033/oaks.journals/informationforauthors.aspx

googlebot-image

Rule Path
Disallow /thehearingjournal/Pages/blogs.aspx
Disallow /Pages/myetocsunsubscribe.aspx
Disallow /jaanp/_layouts/15/1033/oaks.journals/informationforauthors.aspx

googlebot-mobile

Rule Path
Disallow /thehearingjournal/Pages/blogs.aspx
Disallow /Pages/myetocsunsubscribe.aspx
Disallow /jaanp/_layouts/15/1033/oaks.journals/informationforauthors.aspx

msnbot

Rule Path
Disallow /thehearingjournal/Pages/blogs.aspx
Disallow /Pages/myetocsunsubscribe.aspx
Disallow /jaanp/_layouts/15/1033/oaks.journals/informationforauthors.aspx

Other Records

Field Value
crawl-delay 5

slurp

Rule Path
Disallow /thehearingjournal/Pages/blogs.aspx
Disallow /Pages/myetocsunsubscribe.aspx
Disallow /jaanp/_layouts/15/1033/oaks.journals/informationforauthors.aspx

baiduspider

Rule Path
Disallow /thehearingjournal/Pages/blogs.aspx
Disallow /Pages/myetocsunsubscribe.aspx
Disallow /jaanp/_layouts/15/1033/oaks.journals/informationforauthors.aspx

picosearch/1.0

Rule Path
Disallow /thehearingjournal/Pages/blogs.aspx
Disallow /Pages/myetocsunsubscribe.aspx
Disallow /jaanp/_layouts/15/1033/oaks.journals/informationforauthors.aspx

teoma

Rule Path
Disallow /thehearingjournal/Pages/blogs.aspx
Disallow /Pages/myetocsunsubscribe.aspx
Disallow /jaanp/_layouts/15/1033/oaks.journals/informationforauthors.aspx

gigabot

Rule Path
Disallow /thehearingjournal/Pages/blogs.aspx
Disallow /Pages/myetocsunsubscribe.aspx
Disallow /jaanp/_layouts/15/1033/oaks.journals/informationforauthors.aspx

scrubby

Rule Path
Disallow /thehearingjournal/Pages/blogs.aspx
Disallow /Pages/myetocsunsubscribe.aspx
Disallow /jaanp/_layouts/15/1033/oaks.journals/informationforauthors.aspx

robozilla

Rule Path
Disallow /thehearingjournal/Pages/blogs.aspx
Disallow /Pages/myetocsunsubscribe.aspx
Disallow /jaanp/_layouts/15/1033/oaks.journals/informationforauthors.aspx

gsa-crawler

Rule Path
Disallow /thehearingjournal/Pages/blogs.aspx
Disallow /Pages/myetocsunsubscribe.aspx
Disallow /jaanp/_layouts/15/1033/oaks.journals/informationforauthors.aspx

portal-crawler

Rule Path
Disallow /thehearingjournal/Pages/blogs.aspx
Disallow /Pages/myetocsunsubscribe.aspx
Disallow /jaanp/_layouts/15/1033/oaks.journals/informationforauthors.aspx

googlebot-news

Rule Path
Allow /oncology-times
Disallow /thehearingjournal/Pages/blogs.aspx
Disallow /Pages/myetocsunsubscribe.aspx
Disallow /jaanp/_layouts/15/1033/oaks.journals/informationforauthors.aspx

gptbot

Rule Path
Disallow /

Other Records

Field Value
sitemap https://journals.lww.com/_layouts/15/oaks.journals/Sitemap_xml.aspx?format=xml
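The crawl-delay and sitemap records listed under "Other Records" can be read back programmatically. The following is a minimal sketch using Python's standard urllib.robotparser, not the scanner's own method. The RECORDS text is a hand-written fragment assembled from the fields in this report, not the original file, and "hypothetical-bot" is a made-up agent name used only to exercise the `*` group.

```python
from urllib.robotparser import RobotFileParser

# Hand-written fragment based on the records in this report.
# It is an assumption about layout, not a copy of the real robots.txt.
RECORDS = """\
User-agent: *
Crawl-delay: 25
Disallow:

User-agent: msnbot
Crawl-delay: 5
Disallow: /Pages/myetocsunsubscribe.aspx

Sitemap: https://journals.lww.com/_layouts/15/oaks.journals/Sitemap_xml.aspx?format=xml
"""

parser = RobotFileParser()
parser.parse(RECORDS.splitlines())

# An agent with no group of its own falls back to the "*" group.
default_delay = parser.crawl_delay("hypothetical-bot")

# msnbot has its own group, so its own crawl-delay applies.
msnbot_delay = parser.crawl_delay("msnbot")

# site_maps() requires Python 3.8 or later; it returns None if no
# sitemap record was found.
sitemaps = parser.site_maps()

print(default_delay, msnbot_delay, sitemaps)
```

Crawl-delay is not part of the core Robots Exclusion Protocol, so crawlers that do not support it will ignore these values.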

Comments

  • Disallow spiders by default
  • Add Crawl-delay parameter for those crawlers that support it
  • Oracle Data Cloud Crawler disallow
  • Allow friendly spiders
  • "Disallow:" means don't disallow anything, so all can be crawled. Same as "Allow: /" but better supported
  • Yahoo
  • Google China
  • ask.com
  • gigablast.com
  • scrub the web
  • DMOZ
  • PRS defined crawler
  • ASPS/PSEN defined crawler
  • GPTBot
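
The comment about an empty "Disallow:" can be checked with a short sketch. This assumes Python's standard urllib.robotparser as the evaluator; other parsers may differ in details such as rule precedence. The robots.txt text below is an illustrative fragment, not the original file, and "examplebot" is a hypothetical agent name.

```python
from urllib.robotparser import RobotFileParser

# Illustrative fragment only. "examplebot" is hypothetical; the gptbot and
# msnbot rules mirror groups shown in this report.
ROBOTS_TXT = """\
User-agent: examplebot
Disallow:

User-agent: gptbot
Disallow: /

User-agent: msnbot
Disallow: /Pages/myetocsunsubscribe.aspx
"""

BASE = "https://journals.lww.com"

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Empty "Disallow:" blocks nothing, so every path is crawlable.
empty_disallow_ok = parser.can_fetch("examplebot", BASE + "/secure/")

# "Disallow: /" blocks every path for that agent.
gptbot_ok = parser.can_fetch("gptbot", BASE + "/")

# A path-specific rule blocks only URLs that start with that path.
msnbot_blocked = parser.can_fetch("msnbot", BASE + "/Pages/myetocsunsubscribe.aspx")
msnbot_other = parser.can_fetch("msnbot", BASE + "/Pages/default.aspx")

print(empty_disallow_ok, gptbot_ok, msnbot_blocked, msnbot_other)
```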