cancerresearchuk.org
robots.txt

Robots Exclusion Standard data for cancerresearchuk.org

Resource Scan

Scan Details

Site Domain cancerresearchuk.org
Base Domain cancerresearchuk.org
Scan Status Ok
Last Scan 2024-10-18T01:10:41+00:00
Next Scan 2024-11-17T01:10:41+00:00

Last Scan

Scanned 2024-10-18T01:10:41+00:00
URL https://cancerresearchuk.org/robots.txt
Redirect https://www.cancerresearchuk.org:443/robots.txt
Redirect Domain www.cancerresearchuk.org
Redirect Base cancerresearchuk.org
Domain IPs 15.197.204.208, 3.33.221.122
Redirect IPs 18.155.68.113, 18.155.68.29, 18.155.68.69, 18.155.68.76
Response IP 18.155.68.113
Found Yes
Hash b09a9f668665280b72d130de006e0eaa829ac217b865c76126cb251a46b7f465
SimHash 18a4c71cee98

Groups

*

Rule Path
Disallow /utilities/glossary/
Disallow /*PrinterFriendly
Disallow */prod_consump/
Disallow /support-us/
Disallow /cancer-subject/
Disallow /career-level/
Disallow /client/
Disallow /content/
Disallow /container-type/
Disallow /content-department/
Disallow /country/
Disallow /event-type/
Disallow /file/
Disallow /file.html
Disallow /gift-calculator-ranges/
Disallow /gift-calculator-tags/
Disallow /glossary/
Disallow /phase-of-trial/
Disallow /research-area/
Disallow /research-type/
Disallow /restriction-type/
Disallow /site/
Disallow /sites/default/files/
Allow /sites/default/files/about_cancer_deindex_sitemap.xml
Disallow /standard-content/
Disallow /time-commitment/
Disallow /treatment-type/
Disallow /trial-status/
Disallow /trial-type/
Disallow /volunteer-role-environment/
Disallow /volunteer-role-type/
Disallow /includes/
Disallow /misc/
Disallow /modules/
Disallow /profiles/
Disallow /scripts/
Disallow /themes/
Disallow /uat/
Allow /utilities/glossary/index.htm
Allow /sites/default/files/styles/*
Disallow /CHANGELOG.txt
Disallow /cron.php
Disallow /INSTALL.mysql.txt
Disallow /INSTALL.pgsql.txt
Disallow /INSTALL.sqlite.txt
Disallow /install.php
Disallow /INSTALL.txt
Disallow /LICENSE.txt
Disallow /MAINTAINERS.txt
Disallow /update.php
Disallow /UPGRADE.txt
Disallow /xmlrpc.php
Disallow /file/apple-pay-imagejpg
Disallow /aggregator/sources/21
Disallow /admin/
Disallow /comment/reply/
Disallow /filter/tips/
Disallow /node/add/
Disallow /populate/
Disallow /querytext/
Disallow /search/
Disallow /tstart/
Disallow /file/
Disallow /user/register/
Disallow /user/password/
Disallow /user/login/
Disallow /user/logout/
Disallow /?q=admin%2F
Disallow /?q=comment%2Freply%2F
Disallow /?q=filter%2Ftips%2F
Disallow /?q=node%2Fadd%2F
Disallow /?q=search%2F
Disallow /?q=user%2Fpassword%2F
Disallow /?q=user%2Fregister%2F
Disallow /?q=user%2Flogin%2F
Disallow /?q=user%2Flogout%2F
Disallow /*?field_shop_geocode_latlon=*&items_per_page=*
Disallow /*?f%5B0%5D=*
Disallow /funding-for-researchers/our-funding-schemes?*
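
Several Allow rules in this group overlap broader Disallow rules (for example /sites/default/files/styles/* inside /sites/default/files/). The sketch below shows one way such overlaps could be resolved, assuming the longest-match rule described in RFC 9309, where the most specific matching pattern wins and Allow wins a tie. The names RULES, pattern_to_regex and is_allowed are hypothetical helpers, only a subset of the rules is included, and individual crawlers may resolve conflicts differently.

```python
import re

# A small subset of the "*" group rules listed above, as (directive, pattern) pairs.
RULES = [
    ("disallow", "/sites/default/files/"),
    ("allow", "/sites/default/files/about_cancer_deindex_sitemap.xml"),
    ("allow", "/sites/default/files/styles/*"),
    ("disallow", "/*PrinterFriendly"),
    ("disallow", "/utilities/glossary/"),
    ("allow", "/utilities/glossary/index.htm"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt pattern: '*' matches any run of characters,
    and a trailing '$' anchors the end of the path."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if the path may be crawled under the given rules.
    Assumption: longest matching pattern wins; Allow wins ties;
    a path that matches no rule is allowed."""
    best_len = -1
    allowed = True
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and directive == "allow"):
                best_len = length
                allowed = directive == "allow"
    return allowed


if __name__ == "__main__":
    for p in ["/sites/default/files/report.pdf",
              "/sites/default/files/styles/large/x.jpg",
              "/utilities/glossary/index.htm"]:
        print(p, is_allowed(p))
```

Under these assumptions, a file directly under /sites/default/files/ is blocked, while anything under /sites/default/files/styles/ and the glossary index page remain crawlable because their Allow patterns are longer than the Disallow patterns that also match.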

twitterbot

Rule Path
Allow /sites/default/files/*

*

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10
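
A crawl-delay of 10 is commonly read as a request to wait 10 seconds between requests, although the directive is not part of RFC 9309 and not every crawler honours it. The sketch below shows how a client might read the value with Python's standard urllib.robotparser and throttle itself. ROBOTS_SNIPPET is a simplified single-group stand-in for the real file, and polite_fetch and its fetch argument are hypothetical. The scanned file has two "*" groups and parsers differ in how they combine them, which the snippet sidesteps.

```python
import time
from urllib.robotparser import RobotFileParser

# Simplified stand-in for the real file: one "*" group carrying the delay.
ROBOTS_SNIPPET = """\
User-agent: *
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(ROBOTS_SNIPPET.splitlines())

# crawl_delay() returns None when no delay applies, so fall back to 0.
delay = parser.crawl_delay("*") or 0


def polite_fetch(urls, fetch, delay_seconds=delay, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing delay_seconds between requests.
    'fetch' and 'sleep' are injected so the loop can be exercised offline."""
    results = []
    for i, url in enumerate(urls):
        if i > 0:
            sleep(delay_seconds)  # no pause before the first request
        results.append(fetch(url))
    return results
```

Passing sleep as a parameter is a design choice that keeps the loop testable without actually waiting.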

Other Records

Field Value
sitemap https://www.cancerresearchuk.org/sitemap.xml
sitemap https://about-cancer.cancerresearchuk.org/sitemap.xml
sitemap https://raceforlife.cancerresearchuk.org/sitemap.xml
sitemap https://shop.cancerresearchuk.org/sitemap.xml
sitemap https://news.cancerresearchuk.org/sitemap.xml
sitemap https://fundraise.cancerresearchuk.org/sitemap.xml
sitemap https://www.cancerresearchuk.org/sitemap.xml
sitemap https://cancerchat.cancerresearchuk.org/sitemapindex.ashx
sitemap https://www.cancerresearchuk.org/sitemap.xml
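
The list above repeats https://www.cancerresearchuk.org/sitemap.xml three times. A consumer of these records would probably want to fetch each sitemap only once; a minimal order-preserving de-duplication, with SITEMAPS copied from the records above and unique_sitemaps as a hypothetical name, could look like this:

```python
# Sitemap records exactly as listed in the scan, duplicates included.
SITEMAPS = [
    "https://www.cancerresearchuk.org/sitemap.xml",
    "https://about-cancer.cancerresearchuk.org/sitemap.xml",
    "https://raceforlife.cancerresearchuk.org/sitemap.xml",
    "https://shop.cancerresearchuk.org/sitemap.xml",
    "https://news.cancerresearchuk.org/sitemap.xml",
    "https://fundraise.cancerresearchuk.org/sitemap.xml",
    "https://www.cancerresearchuk.org/sitemap.xml",
    "https://cancerchat.cancerresearchuk.org/sitemapindex.ashx",
    "https://www.cancerresearchuk.org/sitemap.xml",
]

# dict preserves insertion order, so this keeps the first occurrence of each URL.
unique_sitemaps = list(dict.fromkeys(SITEMAPS))

if __name__ == "__main__":
    for url in unique_sitemaps:
        print(url)
```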

Comments

  • Files
  • Paths (clean URLs)
  • Paths (no clean URLs)
  • Twitterbot
  • crawl delay
  • Sitemap files