eur-lex.europa.eu
robots.txt

Robots Exclusion Standard data for eur-lex.europa.eu

Resource Scan

Scan Details

Site Domain eur-lex.europa.eu
Base Domain europa.eu
Scan Status Ok
Last Scan 2025-03-03T08:23:27+00:00
Next Scan 2025-04-02T08:23:27+00:00

Last Scan

Scanned 2025-03-03T08:23:27+00:00
URL https://eur-lex.europa.eu/robots.txt
Domain IPs 13.226.2.102, 13.226.2.117, 13.226.2.129, 13.226.2.68
Response IP 108.156.22.72
Found Yes
Hash 213b1fc53a1e36a127dc653d29456a1012e7a4d5ac2419ae33acf73d5b3b8dfb
SimHash 6e3adb36c4a7

Groups

*

Rule Path
Disallow /legal-content/*/TXT/DOC/
Disallow /legal-content/*/TXT/SIG/
Disallow /legal-content/*/TXT/FMX/
Disallow /autocomplete
Disallow /change-displayed-metadata
Disallow /download-notice
Disallow /export-documents
Disallow /modal-message
Disallow /print-pdf
Disallow /save-document
Disallow /save-query
Disallow /prelex
Disallow /smartapi
Disallow /search
Disallow /eli-search
Disallow /advanced-search-form
Disallow /expert-search-form
Disallow /error/authentication-required.html
Disallow /logout.html
Disallow /protected
Disallow /my-eurlex
Disallow /TodayOJ
Disallow /fallback
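
Several of the paths above contain a `*` wildcard, which Python's standard `urllib.robotparser` treats as a literal character. The following is a minimal sketch of wildcard-aware matching, assuming the usual robots.txt semantics (`*` matches any run of characters, a trailing `$` anchors the end, and every other pattern is a prefix match). The helper names `pattern_to_regex` and `is_disallowed` and the sample paths in the usage note are hypothetical, chosen only for illustration.

```python
import re

# Disallow paths copied from the "*" group listed above.
DISALLOW = [
    "/legal-content/*/TXT/DOC/",
    "/legal-content/*/TXT/SIG/",
    "/legal-content/*/TXT/FMX/",
    "/autocomplete",
    "/change-displayed-metadata",
    "/download-notice",
    "/export-documents",
    "/modal-message",
    "/print-pdf",
    "/save-document",
    "/save-query",
    "/prelex",
    "/smartapi",
    "/search",
    "/eli-search",
    "/advanced-search-form",
    "/expert-search-form",
    "/error/authentication-required.html",
    "/logout.html",
    "/protected",
    "/my-eurlex",
    "/TodayOJ",
    "/fallback",
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumed semantics: '*' matches any run of characters, a trailing '$'
    anchors the end of the path, everything else is matched literally
    as a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


COMPILED = [pattern_to_regex(p) for p in DISALLOW]


def is_disallowed(path):
    """Return True if any Disallow pattern matches the start of the path."""
    return any(rx.match(path) for rx in COMPILED)
```

Under these assumptions, `is_disallowed("/legal-content/EN/TXT/DOC/")` is true while `is_disallowed("/legal-content/EN/TXT/")` is not, because only the `DOC`, `SIG`, and `FMX` variants are listed. The group contains no Allow rules, so the sketch does not need longest-match precedence.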

Other Records

Field Value
crawl-delay 5
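
The `crawl-delay` record is not part of the core exclusion standard, but some crawlers read it as a minimum number of seconds between requests. A rough sketch of reading it with Python's `urllib.robotparser` follows. The robots.txt fragment is reconstructed from the scan above and is not the full file, and `ExampleBot` and `request_schedule` are hypothetical names used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Fragment reconstructed from the scan above; the real file has more rules.
ROBOTS_FRAGMENT = """\
User-agent: *
Disallow: /search
Crawl-delay: 5
"""

parser = RobotFileParser()
parser.parse(ROBOTS_FRAGMENT.splitlines())

# "ExampleBot" is a hypothetical crawler name; it falls under the "*" group.
delay = parser.crawl_delay("ExampleBot")


def request_schedule(n_requests, delay_seconds, start=0.0):
    """Return the earliest start time (in seconds) for each of n requests."""
    return [start + i * delay_seconds for i in range(n_requests)]
```

With the value reported here, three consecutive requests would start no earlier than 0, 5, and 10 seconds after the first one.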

sogou web spider

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

baiduspider-image

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

slurp

Rule Path
Disallow /
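
The five agent-specific groups above each carry a single `Disallow /`, which excludes those crawlers from the whole site. As a sketch of how that plays out, the snippet below rebuilds those groups as robots.txt lines and queries them with `urllib.robotparser`. The ordering of groups, the single `/search` rule standing in for the `*` group, the test path, and the `ExampleBot` name are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens taken from the groups listed above.
BLOCKED_AGENTS = [
    "sogou web spider",
    "baiduspider",
    "baiduspider-image",
    "ccbot",
    "slurp",
]

# Rebuild a simplified robots.txt: one "Disallow: /" group per blocked agent,
# then a catch-all group with a single representative rule.
lines = []
for agent in BLOCKED_AGENTS:
    lines += [f"User-agent: {agent}", "Disallow: /", ""]
lines += ["User-agent: *", "Disallow: /search", ""]

parser = RobotFileParser()
parser.parse(lines)


def may_fetch(agent, url="https://eur-lex.europa.eu/some-page.html"):
    """Return True if the given user agent may fetch the (illustrative) URL."""
    return parser.can_fetch(agent, url)
```

Here `may_fetch("CCBot/2.0")` is false because the `ccbot` group applies, whereas an agent with no group of its own, such as the hypothetical `ExampleBot`, falls through to the `*` rules.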

Other Records

Field Value
sitemap https://eur-lex.europa.eu/sitemap.xml

Comments

  • Sitemap
  • Catch-all rules
  • Specific content formats
  • Application endpoints
  • Search forms & result lists
  • Pages
  • Other site parts
  • User-agent specific rules