pearson.eu
robots.txt

Robots Exclusion Standard data for pearson.eu

Resource Scan

Scan Details

Site Domain pearson.eu
Base Domain pearson.eu
Scan Status Failed
Failure Reason Scan timed out.
Last Scan 2025-10-29T11:33:01+00:00
Next Scan 2026-01-27T11:33:01+00:00

Last Successful Scan

Scanned 2024-09-12T02:06:10+00:00
URL https://pearson.eu/robots.txt
Redirect https://www.pearson.eu/robots.txt
Redirect Domain www.pearson.eu
Redirect Base pearson.eu
Domain IPs 185.23.23.20
Redirect IPs 185.23.23.20
Response IP 185.23.23.20
Found Yes
Hash 8eb3a3cd5e26c5991d7112e49e9aa019d79f32fed006dfe2ab489184402cd1f3
SimHash 6a9ffbd7ca5b

Groups

*

Rule Path
Disallow /typo3/*
Disallow /typo3_src/*
Disallow /*?id=*
Disallow /*%26id%3D*
Disallow /*?L=0*
Disallow /*%26L%3D0*
Disallow /*?type=98*
Disallow /*%26type%3D98*
Disallow /*/Private/*
Disallow /*/Configuration/*
Disallow /typo3temp/*
Disallow /fileadmin/*
Allow /fileadmin/*.css$
Allow /fileadmin/*.css.*.gzip$
Allow /fileadmin/*.js$
Allow /fileadmin/*.js.*.gzip$
Allow /fileadmin/*.jpg$
Allow /fileadmin/*.gif$
Allow /fileadmin/*.png$
Allow /fileadmin/*.pdf$
Allow /fileadmin/*.mp4$
Allow /fileadmin/*.webm$
Allow /fileadmin/*.ogv$
Allow /
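
The way the Allow and Disallow paths in this group interact can be sketched in a few lines of Python. This is a minimal illustration, assuming the crawler follows the RFC 9309 convention that the longest matching pattern wins and that Allow wins a tie; individual crawlers may behave differently. The names `rule_to_regex`, `RULES`, and `is_allowed` are hypothetical, and `RULES` holds only a subset of the paths listed above.

```python
import re

def rule_to_regex(path: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    '*' matches any run of characters; a trailing '$' pins the end of the URL.
    """
    anchored = path.endswith("$")
    if anchored:
        path = path[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in path.split("*"))
    return re.compile(body + ("$" if anchored else ""))

# Subset of the '*' group above, as (rule, path) pairs (illustrative only).
RULES = [
    ("disallow", "/typo3/*"),
    ("disallow", "/*?id=*"),
    ("disallow", "/fileadmin/*"),
    ("allow", "/fileadmin/*.css$"),
    ("allow", "/fileadmin/*.pdf$"),
    ("allow", "/"),
]

def is_allowed(url_path: str, rules=RULES) -> bool:
    """Longest matching pattern wins; on a tie, allow wins (assumed RFC 9309 behaviour)."""
    best_len, best_allow = -1, True  # no matching rule means allowed
    for kind, pattern in rules:
        # re.match anchors at the start of the path, as robots.txt rules do.
        if rule_to_regex(pattern).match(url_path):
            allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len, best_allow = len(pattern), allow
    return best_allow

if __name__ == "__main__":
    for path in ["/fileadmin/css/site.css", "/fileadmin/data/export.txt",
                 "/index.php?id=5", "/kontakt/"]:
        print(path, is_allowed(path))
```

Under these assumptions a stylesheet under /fileadmin/ stays crawlable because the longer Allow pattern outranks the shorter Disallow /fileadmin/*, while other files in that directory remain blocked.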

mediapartners-google

Rule Path
Disallow /

yahoo pipes 1.0

Rule Path
Disallow /

voltron

Rule Path
Disallow /

googlebot-image

Rule Path
Disallow /*/ivc/*
Disallow /users/flair/

Other Records

Field Value
sitemap http://pearson.pl/sitemap.xml

Comments

  • Only allow URLs generated with RealURL
  • L=0 is the default language
  • typeNum = 98 is usually the print version.
  • Should always be protected (.htaccess)
  • beware, the sections below WILL NOT INHERIT from the above!
  • http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=40360
  • disallow adsense bot, as we no longer do adsense.
  • Yahoo Pipes is for feeds not web pages.
  • 80legs
  • This isn't really an image
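
The comment that later sections will not inherit from the first one can be illustrated with a short sketch of group selection. It assumes a crawler picks exactly one group by case-insensitive match on its product token and falls back to `*` only when no named group matches. The substring test is a simplification, and `GROUPS` and `select_group` are hypothetical names holding a subset of the data above.

```python
# Subset of the groups listed above: user-agent token -> disallowed paths.
GROUPS = {
    "*": ["/typo3/*", "/fileadmin/*"],
    "mediapartners-google": ["/"],
    "googlebot-image": ["/*/ivc/*", "/users/flair/"],
}

def select_group(user_agent: str, groups=GROUPS) -> str:
    """Return the name of the single group that applies to this crawler."""
    product = user_agent.lower()
    # Simplified matching: a named group applies if its token occurs in the UA string.
    matches = [name for name in groups if name != "*" and name in product]
    if matches:
        return max(matches, key=len)  # prefer the most specific token
    return "*"  # fallback group, used only when nothing else matches

if __name__ == "__main__":
    for ua in ["Googlebot-Image/1.0", "Mediapartners-Google", "SomeBot/2.0"]:
        name = select_group(ua)
        print(ua, "->", name, GROUPS[name])
```

In this sketch a crawler identifying as Googlebot-Image is governed only by the two paths in its own group, so none of the `*` group's rules (for example /typo3/*) apply to it.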

Warnings

  • 2 invalid lines.