inaturalist.lu
robots.txt

Robots Exclusion Standard data for inaturalist.lu

Resource Scan

Scan Details

Site Domain inaturalist.lu
Base Domain inaturalist.lu
Scan Status Ok
Last Scan 2025-06-14T22:46:47+00:00
Next Scan 2025-06-28T22:46:47+00:00

Last Scan

Scanned 2025-06-14T22:46:47+00:00
URL https://inaturalist.lu/robots.txt
Domain IPs 40.65.101.206
Response IP 40.65.101.206
Found Yes
Hash 7d0fb4cbda6688b9463cae36d9f40036206c3da987f377261ee8c2a2fbbe1dd3
SimHash f81f6919c487

Groups

*

Rule Path
Disallow /calendar/
Disallow /observations?
Disallow /observations/?
Disallow /observations.csv
Disallow /observations.csv?
Disallow /places/wikipedia/*
Disallow /taxa/search
Disallow /taxa/search?
Disallow /*?
Disallow /taxa/*/description$
Disallow /taxa/*/map_layers$
Disallow /listed_taxa/*
Disallow /lifelists/*
Disallow *.atom*
Disallow *.csv*
Disallow *.json*
Disallow *page%3D*
Disallow /*?
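
The paths above use two wildcards: `*` stands for any run of characters, and a trailing `$` anchors the pattern to the end of the URL path. Without `$`, a rule matches as a prefix. The following is a minimal sketch of that matching in Python, assuming the wildcard semantics described in RFC 9309. The names `rule_to_regex`, `is_disallowed`, and `RULES` are hypothetical and only for illustration, and `RULES` holds a subset of the listed paths. Because this group contains only Disallow rules, the sketch does not implement Allow/Disallow precedence.

```python
import re

def rule_to_regex(path_pattern: str) -> re.Pattern:
    """Translate one robots.txt path pattern into a regex (illustrative helper)."""
    # A trailing '$' means the pattern must match through the end of the path.
    anchored = path_pattern.endswith("$")
    body = path_pattern[:-1] if anchored else path_pattern
    # Escape literal text, then join the pieces with '.*' wherever '*' appeared.
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))

def is_disallowed(url_path: str, patterns: list[str]) -> bool:
    """Return True if any Disallow pattern matches from the start of url_path."""
    return any(rule_to_regex(p).match(url_path) for p in patterns)

# A subset of the Disallow paths listed for the '*' group.
RULES = ["/calendar/", "/observations?", "/taxa/*/description$", "*.json*", "/*?"]

print(is_disallowed("/taxa/123/description", RULES))       # matched by the '$' rule
print(is_disallowed("/taxa/123/description/edit", RULES))  # '$' rule no longer applies
print(is_disallowed("/observations?page=2", RULES))        # query strings are blocked
```

Under these assumptions, `/*?` blocks any path that carries a query string, which makes the more specific `?` rules redundant for matching purposes.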

ia_archiver
twitterbot

Rule Path
Disallow

ai2bot
anthropic-ai
applebot-extended
bytespider
ccbot
claude-web
claudebot
cohere-ai
diffbot
dotbot
facebookbot
google-extended
gptbot
kangaroo bot
meta-externalagent
omgili
timpibot
webzio-extended
chatglm-spider

Rule Path
Disallow /
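
The three groups behave differently per crawler: an empty Disallow permits everything, `Disallow /` blocks the whole site, and any agent not named falls back to the `*` group. A rough sketch of checking this with Python's standard `urllib.robotparser` follows. The `SAMPLE` text is a reconstructed, abbreviated illustration of the groups listed here, not the file's actual contents, and the agent name `SomeOtherBot` is made up. Note that `urllib.robotparser` matches by plain prefix and does not interpret `*` or `$` in paths, so only a non-wildcard rule is included for the `*` group.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated reconstruction of the groups in this report (an assumption,
# not the verbatim robots.txt).
SAMPLE = """\
User-agent: *
Disallow: /calendar/

User-agent: ia_archiver
User-agent: twitterbot
Disallow:

User-agent: gptbot
User-agent: ccbot
User-agent: claudebot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(SAMPLE.splitlines())

BASE = "https://inaturalist.lu"

# Agents in the blocked group may not fetch anything.
print(parser.can_fetch("GPTBot", BASE + "/taxa/1"))
# An empty Disallow grants full access, even to paths the '*' group blocks.
print(parser.can_fetch("Twitterbot", BASE + "/calendar/2025"))
# Unlisted agents fall back to the '*' group.
print(parser.can_fetch("SomeOtherBot", BASE + "/calendar/2025"))
print(parser.can_fetch("SomeOtherBot", BASE + "/taxa/1"))
```

Agent names are compared case-insensitively, so `GPTBot` in a request matches the `gptbot` entry in the group.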