biorxiv.org
robots.txt

Robots Exclusion Standard data for biorxiv.org

Resource Scan

Scan Details

Site Domain	biorxiv.org
Base Domain	biorxiv.org
Scan Status	Failed
Failure Stage	Fetching resource.
Failure Reason	Server returned a client error.
Last Scan	2025-08-10T06:17:21+00:00
Next Scan	2025-08-24T06:17:21+00:00

Last Successful Scan

Scanned	2025-07-03T06:17:08+00:00
URL	https://biorxiv.org/robots.txt
Redirect	https://www.biorxiv.org/robots.txt
Redirect Domain	www.biorxiv.org
Redirect Base	biorxiv.org
Domain IPs	104.21.112.1, 104.21.16.1, 104.21.32.1, 104.21.48.1, 104.21.64.1, 104.21.80.1, 104.21.96.1, 2606:4700:3030::6815:1001, 2606:4700:3030::6815:2001, 2606:4700:3030::6815:3001, 2606:4700:3030::6815:4001, 2606:4700:3030::6815:5001, 2606:4700:3030::6815:6001, 2606:4700:3030::6815:7001
Redirect IPs	104.18.34.83, 172.64.153.173, 2606:4700:4400::6812:2253, 2606:4700:4400::ac40:99ad
Response IP	172.64.153.173
Found	Yes
Hash	65b12f268c4cb2e981953fd192532bfcddae2b003a5dab389015b3ac464e5275
SimHash	7894175bce78

Groups

blexbot

Rule	Path
Disallow	/

bytespider

Rule	Path
Disallow	/

exabot

Rule	Path
Disallow	/

scanbot

Rule	Path
Disallow	/

semrush

Rule	Path
Disallow	/

semrushbot

Rule	Path
Disallow	/

youdaobot

Rule	Path
Disallow	/

baiduspider

Rule	Path
Disallow	/

yandexbot

Rule	Path
Disallow	/

openai

Rule	Path
Disallow	/

petalbot

Rule	Path
Disallow	/

bolt

Rule	Path
Disallow	/

bunnyslippers

Rule	Path
Disallow	/

chatgpt-user

Rule	Path
Disallow	/

gptbot

Rule	Path
Disallow	/

gigabot

Rule	Path
Disallow	/

hybridbot

Rule	Path
Disallow	/

jikespider

Rule	Path
Disallow	/

smtbot

Rule	Path
Disallow	/

screenerbot

Rule	Path
Disallow	/

sitelockspider

Rule	Path
Disallow	/

superbot

Rule	Path
Disallow	/

superhttp

Rule	Path
Disallow	/

amazonbot

Rule	Path
Disallow	/

ahrefsbot

Rule	Path
Disallow	/

applebot

Rule	Path
Disallow	/

semrush

Rule	Path
Disallow	/

dataforseobot

Rule	Path
Disallow	/

claudebot

Rule	Path
Disallow	/

dotbot

Rule	Path
Disallow	/

phxbot

Rule	Path
Disallow	/

awariobot

Rule	Path
Disallow	/

seekportbot

Rule	Path
Disallow	/

mj12bot

Rule	Path
Disallow	/

majestic12

Rule	Path
Disallow	/

facebookexternalhit/1.1

Rule	Path
Disallow	/

perplexitybot

Rule	Path
Disallow	/

qwantbot

Rule	Path
Disallow	/

ccbot

Rule	Path
Disallow	/

python-requests/2.32.3

Rule	Path
Disallow	/

wget

Rule	Path
Disallow	/

curl

Rule	Path
Disallow	/

*

Rule	Path
Allow	/misc/*.css$
Allow	/misc/*.css?
Allow	/misc/*.js$
Allow	/misc/*.js?
Allow	/misc/*.gif
Allow	/misc/*.jpg
Allow	/misc/*.jpeg
Allow	/misc/*.png
Allow	/modules/*.css$
Allow	/modules/*.css?
Allow	/modules/*.js$
Allow	/modules/*.js?
Allow	/modules/*.gif
Allow	/modules/*.jpg
Allow	/modules/*.jpeg
Allow	/modules/*.png
Allow	/profiles/*.css$
Allow	/profiles/*.css?
Allow	/profiles/*.js$
Allow	/profiles/*.js?
Allow	/profiles/*.gif
Allow	/profiles/*.jpg
Allow	/profiles/*.jpeg
Allow	/profiles/*.png
Allow	/themes/*.css$
Allow	/themes/*.css?
Allow	/themes/*.js$
Allow	/themes/*.js?
Allow	/themes/*.gif
Allow	/themes/*.jpg
Allow	/themes/*.jpeg
Allow	/themes/*.png
Disallow	/includes/
Disallow	/misc/
Disallow	/modules/
Disallow	/profiles/
Disallow	/scripts/
Disallow	/themes/
Disallow	/CHANGELOG.txt
Disallow	/cron.php
Disallow	/INSTALL.mysql.txt
Disallow	/INSTALL.pgsql.txt
Disallow	/INSTALL.sqlite.txt
Disallow	/install.php
Disallow	/INSTALL.txt
Disallow	/LICENSE.txt
Disallow	/MAINTAINERS.txt
Disallow	/update.php
Disallow	/UPGRADE.txt
Disallow	/xmlrpc.php
Disallow	/admin/
Disallow	/comment/reply/
Disallow	/filter/tips/
Disallow	/node/add/
Disallow	/search/
Disallow	/user/register/
Disallow	/user/password/
Disallow	/user/login/
Disallow	/user/logout/
Disallow	/?q=admin%2F
Disallow	/?q=comment%2Freply%2F
Disallow	/?q=filter%2Ftips%2F
Disallow	/?q=node%2Fadd%2F
Disallow	/?q=search%2F
Disallow	/?q=user%2Fpassword%2F
Disallow	/?q=user%2Fregister%2F
Disallow	/?q=user%2Flogin%2F
Disallow	/?q=user%2Flogout%2F
Disallow	/user
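
The `*` group mixes wildcard Allow rules with broader Disallow rules for the same directories. As a rough illustration of how a crawler might resolve them, the sketch below assumes the longest-match convention described in RFC 9309 (the most specific matching pattern wins, and Allow wins a tie). The helper names `pattern_to_regex` and `is_allowed` are hypothetical, only a subset of the rules is included, and real crawlers may behave differently.

```python
import re

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters and a trailing '$' anchors the end
    of the path; every other character (including '?') is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_allowed(rules, path: str) -> bool:
    """Return True if `path` may be fetched under `rules`.

    Assumption: the longest matching pattern wins, Allow wins ties,
    and a path that matches no rule is allowed.
    """
    best_len, best_allow = -1, True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len, best_allow = len(pattern), allow
    return best_allow

# A small subset of the rules listed for the '*' group above.
RULES = [
    ("allow", "/misc/*.css$"),
    ("allow", "/misc/*.css?"),
    ("allow", "/misc/*.png"),
    ("disallow", "/misc/"),
    ("disallow", "/search/"),
    ("disallow", "/user"),
]

if __name__ == "__main__":
    for candidate in ["/misc/print.css", "/misc/README.txt", "/search/abc", "/content/early"]:
        print(candidate, is_allowed(RULES, candidate))
```

Under these assumptions a stylesheet in `/misc/` stays fetchable while other files in that directory do not, which appears to be the intent of pairing each Disallow directory with asset-specific Allow rules.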

Other Records

Field	Value
crawl-delay	7
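
To show how a client might read the `crawl-delay` value alongside the per-agent groups, here is a minimal sketch using Python's standard `urllib.robotparser`. The excerpt is a hand-written reconstruction, not the real file: it assumes the `crawl-delay` record sits in the `*` group (the scan lists it only under "Other Records"), and `examplebot` is a made-up user agent. Note that `urllib.robotparser` does not interpret `*` or `$` wildcards in paths, so only plain prefix rules are used here.

```python
from urllib.robotparser import RobotFileParser

# Hand-written excerpt; the placement of Crawl-delay in the '*' group
# is an assumption for illustration.
ROBOTS_EXCERPT = """\
User-agent: gptbot
Disallow: /

User-agent: *
Crawl-delay: 7
Disallow: /search/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_EXCERPT.splitlines())

# 'examplebot' is hypothetical; it falls through to the '*' group.
delay = parser.crawl_delay("examplebot")
gptbot_ok = parser.can_fetch("gptbot", "/content/early")
example_ok = parser.can_fetch("examplebot", "/content/early")
example_search_ok = parser.can_fetch("examplebot", "/search/term")

print(delay, gptbot_ok, example_ok, example_search_ok)
```

A polite client would wait at least `delay` seconds between requests, for example with `time.sleep(delay)` when the value is not `None`.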

Comments

robots.txt
This file is to prevent the crawling and indexing of certain parts
of your site by web crawlers and spiders run by sites like Yahoo!
and Google. By telling these "robots" where not to go on your site,
you save bandwidth and server resources.
This file will be ignored unless it is at the root of your host:
Used: http://example.com/robots.txt
Ignored: http://example.com/site/robots.txt
For more information about the robots.txt standard, see:
http://www.robotstxt.org/robotstxt.html
CSS, JS, Images
Directories
Files
Paths (clean URLs)
Paths (no clean URLs)
