/.well-known/

opencitations.net
robots.txt

Robots Exclusion Standard data for opencitations.net

Resource Scan

Scan Details

Site Domain	opencitations.net
Base Domain	opencitations.net
Scan Status	Ok
Last Scan	2024-09-21T06:02:13+00:00
Next Scan	2024-10-21T06:02:13+00:00

Last Scan

Scanned	2024-09-21T06:02:13+00:00
URL	https://opencitations.net/robots.txt
Domain IPs	130.136.130.1
Response IP	130.136.130.1
Found	Yes
Hash	273a529ead164147ee67d32b031654eff8aef43c0ddde7701fbe3644347a4d75
SimHash	fc145583f2f9

Groups

crawler
spider
bot
yahoo! slurp
bubing
adsbot-google
adsbot-google-mobile-apps
adidxbot
applebot
applenewsbot
baiduspider
baiduspider-image
bingbot
bingpreview
ccbot
cliqzbot
coccoc
coccocbot-image
coccocbot-web
daumoa
dazoobot
deusu
duckduckbot
duckduckgo-favicons-bot
euripbot
exploratodo
facebot
feedly
findxbot
googlebot
googlebot-image
googlebot-mobile
googlebot-news
googlebot-video
haosouspider
ichiro
istellabot
jikespider
lycos
mail.ru
mediapartners-google
mojeekbot
msnbot
msnbot-media
orangebot
pinterest
plukkie
qwantify
rambler
seznambot
sosospider
slurp
sogou blog
sogou inst spider
sogou news spider
sogou orion spider
sogou spider2
sogou web spider
sputnikbot
teoma
twitterbot
wotbox
yacybot
yandex
yandexmobilebot
yeti
yioopbot
yoozbot
youdaobot
ahrefsbot
dotbot
semanticscholarbot
blexbot
mb2345browser
liebaofast
mqqbrowser
ucbrowser
aspiegelbot
petalbot

Rule	Path
Disallow	/corpus/
Disallow	/virtual/
Disallow	/index/coci/

semrushbot
semrushbot-sa

Rule	Path
Disallow	/
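
The two rule groups listed above can be checked programmatically. The following is a minimal sketch using Python's standard `urllib.robotparser`. The `ROBOTS_TXT` string is a hypothetical reconstruction from this scan, not the file as served: it lists only three user agents from the first group, and the original layout (including the 3 invalid lines reported under Warnings) is not known. The generic tokens `crawler`, `spider`, and `bot` are deliberately left out, because this parser matches user agents by substring and those tokens would capture almost any crawler name, including `semrushbot`.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical reconstruction of the scanned rules (an assumption, not the
# original file). Only a subset of the first group's user agents is listed.
ROBOTS_TXT = """\
User-agent: googlebot
User-agent: bingbot
User-agent: ccbot
Disallow: /corpus/
Disallow: /virtual/
Disallow: /index/coci/

User-agent: semrushbot
User-agent: semrushbot-sa
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
base = "https://opencitations.net"

# First group: three path prefixes are disallowed, everything else is allowed.
print(parser.can_fetch("googlebot", base + "/corpus/"))      # False
print(parser.can_fetch("googlebot", base + "/index/coci/"))  # False
print(parser.can_fetch("googlebot", base + "/about"))        # True

# Second group: the whole site is disallowed.
print(parser.can_fetch("semrushbot", base + "/about"))       # False

# A user agent that matches no group is allowed by default.
print(parser.can_fetch("someotherclient", base + "/corpus/"))  # True
```

To check the live file instead of the reconstruction, `RobotFileParser` also offers `set_url()` followed by `read()`, which fetches the file over the network.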

Warnings

3 invalid lines.