ted.europa.eu
robots.txt

Robots Exclusion Standard data for ted.europa.eu

Resource Scan

Scan Details

Site Domain ted.europa.eu
Base Domain europa.eu
Scan Status Ok
Last Scan 2024-08-29T14:31:46+00:00
Next Scan 2024-09-28T14:31:46+00:00

Last Scan

Scanned 2024-08-29T14:31:46+00:00
URL https://ted.europa.eu/robots.txt
Domain IPs 13.226.2.100, 13.226.2.125, 13.226.2.74, 13.226.2.93, 2600:9000:2175:2e00:18:bc6c:b540:93a1, 2600:9000:2175:3c00:18:bc6c:b540:93a1, 2600:9000:2175:4c00:18:bc6c:b540:93a1, 2600:9000:2175:6e00:18:bc6c:b540:93a1, 2600:9000:2175:8e00:18:bc6c:b540:93a1, 2600:9000:2175:9400:18:bc6c:b540:93a1, 2600:9000:2175:a600:18:bc6c:b540:93a1, 2600:9000:2175:c400:18:bc6c:b540:93a1
Response IP 18.165.171.18
Found Yes
Hash 1901adc275dbb4b96e0a5377dc3c8a0bc2123c4cab0424c088eb7cc5c025ab02
SimHash eb059abae196
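
The Hash value above is 64 hexadecimal characters, which is the length of a SHA-256 digest. The report does not say which algorithm or which bytes were hashed, so the following is only a minimal sketch, assuming the hash is SHA-256 over the raw response body. The names `RECORDED_HASH`, `body_sha256`, and `matches_recorded_hash` are hypothetical.

```python
import hashlib

# Value copied from the scan record above.
RECORDED_HASH = "1901adc275dbb4b96e0a5377dc3c8a0bc2123c4cab0424c088eb7cc5c025ab02"


def body_sha256(body: bytes) -> str:
    """Return the hex SHA-256 digest of a robots.txt response body."""
    return hashlib.sha256(body).hexdigest()


def matches_recorded_hash(body: bytes, recorded: str = RECORDED_HASH) -> bool:
    """Compare a fetched body against the recorded hash.

    Assumption: the scanner hashed the raw bytes with SHA-256.
    A mismatch may also mean the file changed since the scan.
    """
    return body_sha256(body) == recorded.lower()
```

A caller would pass the bytes fetched from https://ted.europa.eu/robots.txt; a `False` result means either that the file changed after 2024-08-29 or that the assumption about the algorithm is wrong.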

Groups

*

Rule Path
Disallow /*/my-dashboard
Disallow /*/preferences
Disallow /c/
Disallow /combo/
Disallow /o/
Disallow /webdav/
Disallow /control_panel/
Disallow /group/
Disallow /user/
Disallow /web/
Disallow /documents/
Disallow /image/
Disallow /documents_and_media/
Disallow /portlet/
Disallow /api/jsonws/
Disallow /api/axis/
Disallow /api/liferay/
Disallow /api/secure/

sogou web spider

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

msnbot/bingbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 1
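
The groups above can be read as a lookup from user agent to disallowed path patterns. The following is a minimal sketch of how a crawler might evaluate them, not the site's or any library's implementation. It assumes that `*` in a path matches any run of characters, that patterns match from the start of the path, that "msnbot/bingbot" denotes two user-agent tokens sharing one empty group, and that a crawler picks its group by case-insensitive substring match on the token. `GROUPS`, `pattern_to_regex`, `select_group`, and `is_allowed` are hypothetical names. Python's standard `urllib.robotparser` is not used here because it does not expand `*` inside paths.

```python
import re

# Disallow rules copied from the scan above; the dict layout is an assumption.
GROUPS = {
    "*": [
        "/*/my-dashboard", "/*/preferences", "/c/", "/combo/", "/o/",
        "/webdav/", "/control_panel/", "/group/", "/user/", "/web/",
        "/documents/", "/image/", "/documents_and_media/", "/portlet/",
        "/api/jsonws/", "/api/axis/", "/api/liferay/", "/api/secure/",
    ],
    "sogou web spider": ["/"],
    "petalbot": ["/"],
    "msnbot": [],   # no rules defined: all paths allowed
    "bingbot": [],  # assumed to share the msnbot group
}


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def select_group(user_agent):
    """Return the rules of the first named group whose token occurs in the UA."""
    ua = user_agent.lower()
    for token, rules in GROUPS.items():
        if token != "*" and token in ua:
            return rules
    return GROUPS["*"]


def is_allowed(user_agent, path):
    """True if no Disallow pattern of the selected group matches the path."""
    rules = select_group(user_agent)
    # Only Disallow rules exist in this file, so any match means "blocked".
    return not any(pattern_to_regex(p).match(path) for p in rules)
```

For example, `is_allowed("ExampleBot", "/en/my-dashboard")` is `False` because `/*/my-dashboard` matches, while `is_allowed("ExampleBot", "/en/sitemap.xml")` is `True`. The crawl-delay record is not modelled here; a polite crawler would additionally wait between requests.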

Other Records

Field Value
sitemap https://ted.europa.eu/en/sitemap.xml
sitemap https://ted.europa.eu/sitemap/notices/sitemap.xml

Comments

  • Liferay sitemap url
  • Notices sitemap url
  • Disallow private pages
  • Liferay internal use patterns
  • copied bot rules from ted_v1