
Robots Exclusion Standard data for medicalxpress.com

Resource Scan

Scan Details

Site Domain medicalxpress.com
Base Domain medicalxpress.com
Scan Status Ok
Last Scan 2024-10-31T21:09:31+00:00
Next Scan 2024-11-07T21:09:31+00:00

Last Scan

Scanned 2024-10-31T21:09:31+00:00
URL https://medicalxpress.com/robots.txt
Domain IPs 2001:48c8:13:5::53, 72.251.233.233
Response IP 72.251.233.233
Found Yes
Hash 24955d5747017ae1b783a6f4a2fdd768e6a4cc32cf58624d72d1499e103054e5
SimHash 701c4843e0a3
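
The Hash value above is 64 hexadecimal characters, which is consistent with a SHA-256 digest, but the report does not name the algorithm or say exactly which bytes were hashed. As a minimal sketch under that assumption, a digest of a fetched robots.txt body could be computed as follows and compared between scans to detect changes. The function name `body_digest` is hypothetical.

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Return the hex SHA-256 digest of a robots.txt body.

    Assumption: the scanner hashes the raw response bytes with SHA-256.
    The report does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


# Two scans of the same unchanged body produce the same digest.
previous = body_digest(b"User-agent: *\nAllow: /\n")
current = body_digest(b"User-agent: *\nAllow: /\n")
changed = previous != current
```

If the assumption holds, a digest computed from the live file would equal the Hash field only while the file is byte-for-byte unchanged since the scan.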

Groups

*

Rule Path
Allow /
Disallow /search/
Disallow /rss-feed/search/
Disallow /rss-feed/breaking/search/
Disallow /rss-feed/tags/
Disallow /*/sort/
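
The rules in the `*` group can be evaluated by hand, but a small sketch makes the precedence explicit. The code below follows the matching described in RFC 9309: the matching rule with the longest pattern applies, an Allow wins a tie, and a path with no matching rule is allowed. The helper names `pattern_to_regex` and `is_allowed` are hypothetical, and this is not necessarily how any particular crawler or this scanner evaluates the file.

```python
import re

# Rules copied from the "*" group in this report.
RULES = [
    ("allow", "/"),
    ("disallow", "/search/"),
    ("disallow", "/rss-feed/search/"),
    ("disallow", "/rss-feed/breaking/search/"),
    ("disallow", "/rss-feed/tags/"),
    ("disallow", "/*/sort/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex.

    "*" matches any run of characters; a trailing "$" anchors the end.
    """
    anchored_end = pattern.endswith("$")
    if anchored_end:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored_end else ""))


def is_allowed(path, rules=RULES):
    """Apply longest-match precedence; Allow wins when lengths are equal."""
    best_len = -1
    best_allowed = True  # no matching rule means the path is allowed
    for kind, pattern in rules:
        # re.match anchors at the start of the path, as robots.txt requires.
        if pattern_to_regex(pattern).match(path):
            allowed = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and allowed):
                best_len, best_allowed = len(pattern), allowed
    return best_allowed
```

Under these rules, `is_allowed("/news/2024-10-story.html")` is true because only `Allow /` matches, while `is_allowed("/search/?q=x")` and `is_allowed("/news/sort/date/")` are false because a longer Disallow pattern also matches.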

claudebot

Rule Path
Disallow /news/
Disallow /partners/
Disallow /journals/
Disallow /tags/

gptbot

Rule Path
Disallow /news/
Disallow /partners/
Disallow /journals/
Disallow /tags/

facebookbot

Rule Path
Disallow /news/
Disallow /partners/
Disallow /journals/
Disallow /tags/

ahrefsbot

Rule Path
Disallow /tags/

ccbot

Rule Path
Disallow /news/
Disallow /partners/
Disallow /journals/
Disallow /tags/

bytespider

Rule Path
Disallow /news/
Disallow /partners/
Disallow /journals/
Disallow /tags/

cohere-ai

Rule Path
Disallow /news/
Disallow /partners/
Disallow /journals/
Disallow /tags/

meta-externalagent

Rule Path
Disallow /news/
Disallow /partners/
Disallow /journals/
Disallow /tags/

google-extended

Rule Path
Disallow /news/
Disallow /tags/

applebot-extended

Rule Path
Disallow /news/
Disallow /tags/

scrapy

Rule Path
Disallow /

friendlycrawler

Rule Path
Disallow /

chatglm-spider

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /

imagesiftbot

Rule Path
Disallow /news/

proximic

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 2

ahrefsbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 2
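
ahrefsbot appears in two separate groups in this report: one with `Disallow /tags/` and one with only a crawl-delay record. RFC 9309 says that when more than one group matches a user agent, their rules are combined into a single group, so the "All paths allowed" note describes the second group in isolation. The sketch below merges the two entries; the data layout and the name `merge_groups` are assumptions made for illustration, and crawl-delay is a non-standard extension that RFC 9309 does not define.

```python
# (user-agent, rules, other records) as listed in this report for ahrefsbot.
GROUPS = [
    ("ahrefsbot", [("disallow", "/tags/")], {}),
    ("ahrefsbot", [], {"crawl-delay": "2"}),
]


def merge_groups(groups):
    """Combine all groups that share a user-agent token (case-insensitive)."""
    merged = {}
    for agent, rules, records in groups:
        entry = merged.setdefault(agent.lower(), {"rules": [], "records": {}})
        entry["rules"].extend(rules)
        entry["records"].update(records)
    return merged


effective = merge_groups(GROUPS)["ahrefsbot"]
```

After merging, the effective ahrefsbot group carries both the `/tags/` disallow rule and the crawl-delay value of 2.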

mj12bot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 2

Other Records

Field Value
sitemap https://medicalxpress.com/sitemap/indx/