mydentist.co.uk
robots.txt

Robots Exclusion Standard data for mydentist.co.uk

Resource Scan

Scan Details

Site Domain mydentist.co.uk
Base Domain mydentist.co.uk
Scan Status Ok
Last Scan 2024-09-19T20:26:46+00:00
Next Scan 2024-10-19T20:26:46+00:00

Last Scan

Scanned 2024-09-19T20:26:46+00:00
URL https://mydentist.co.uk/robots.txt
Redirect https://www.mydentist.co.uk/robots.txt
Redirect Domain www.mydentist.co.uk
Redirect Base mydentist.co.uk
Domain IPs 104.16.4.14
Redirect IPs 104.16.4.14
Response IP 104.16.4.14
Found Yes
Hash be976ca12a59b963f48b98edcbfbf64d2c73defc962bab190bb18e1fc6ee1a74
SimHash 2245d1038993

Groups

*

Rule Path
Disallow /Sitefinity/Authenticate
Disallow /book-a-dental-appointment/
Disallow /book-a-dental-appointment/new-patient/
Disallow /book-a-dental-appointment/existing-patient/
Disallow /book-a-dental-appointment/activate-account/
Disallow /book-a-dental-appointment/login/
Disallow /dentists/practices/quick-contact-form/
Disallow /dentists/practices/dental-referral/
Disallow /*?ReturnUrl=*
Disallow /*/news/*/news/*
Disallow /*/Search/*
Disallow /*/ortho-referral/*
Disallow /*/-in-tags/tags/*
Disallow /find-a-dental-appointment/part-1
Disallow /*.axd$
Disallow /*.axd
Disallow /ScriptResource.axd
Disallow /WebResource.axd
Disallow /scriptresource.axd
Disallow /webresource.axd
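
The wildcard rules in this group follow the common Google-style convention: '*' matches any run of characters and a trailing '$' anchors the end of the URL. The sketch below is a rough, hypothetical illustration of how such Disallow patterns can be tested against sample paths; it is not the implementation used by any particular crawler, and it ignores Allow/longest-match precedence since this group contains only Disallow rules.

import re

def rule_to_regex(rule_path):
    # Translate a robots.txt rule path into a regex under the common
    # convention: '*' matches any run of characters and a trailing '$'
    # anchors the end of the URL; everything else is a literal prefix.
    anchored = rule_path.endswith("$")
    body = rule_path[:-1] if anchored else rule_path
    pattern = "".join(".*" if ch == "*" else re.escape(ch) for ch in body)
    return re.compile(pattern + ("$" if anchored else ""))

# A few of the Disallow paths from the '*' group above.
disallows = [
    "/book-a-dental-appointment/",
    "/dentists/practices/quick-contact-form/",
    "/*?ReturnUrl=*",
    "/*/Search/*",
]
rules = [rule_to_regex(p) for p in disallows]

def is_blocked(path):
    # A path is disallowed if any rule matches at the start of the path.
    return any(r.match(path) for r in rules)

print(is_blocked("/book-a-dental-appointment/new-patient/"))  # True
print(is_blocked("/login?ReturnUrl=%2Faccount"))               # True
print(is_blocked("/en-gb/Search/results"))                     # True
print(is_blocked("/dentists/treatments/implants"))             # False

The sample paths here are invented for illustration; only the Disallow patterns themselves come from the group above.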

baiduspider
baiduspider-video
baiduspider-image
baiduspider+

Rule Path
Disallow /

sogou spider

Rule Path
Disallow /

yandex

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

youdaobot

Rule Path
Disallow /

naverbot
yeti

Rule Path
Disallow /

moget
ichiro

Rule Path
Disallow /

gptbot

Rule Path
Disallow /
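
Each named group above ends in a blanket Disallow: /, which blocks that crawler from the whole site while other agents fall back to the '*' group. Below is a minimal sketch of how this is evaluated with Python's standard urllib.robotparser; only a few representative groups are reproduced, because the standard-library parser does plain prefix matching and does not interpret the '*'/'$' wildcard syntax used in the '*' group.

from urllib import robotparser

# Representative groups only: two blanket-disallow crawlers plus a
# simplified '*' group with a single plain prefix rule.
robots_lines = [
    "User-agent: gptbot",
    "Disallow: /",
    "",
    "User-agent: yandex",
    "Disallow: /",
    "",
    "User-agent: *",
    "Disallow: /book-a-dental-appointment/",
]

parser = robotparser.RobotFileParser()
parser.parse(robots_lines)

# The named crawlers are blocked from everything...
print(parser.can_fetch("GPTBot", "https://www.mydentist.co.uk/"))           # False
print(parser.can_fetch("Yandex", "https://www.mydentist.co.uk/dentists/"))  # False
# ...while other agents fall through to the '*' group.
print(parser.can_fetch("SomeOtherBot", "https://www.mydentist.co.uk/"))     # True
print(parser.can_fetch("SomeOtherBot",
                       "https://www.mydentist.co.uk/book-a-dental-appointment/"))  # False

The "SomeOtherBot" agent name is invented for the example; any agent string not matching a named group is handled by the '*' group.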

Other Records

Field Value
sitemap https://www.mydentist.co.uk/sitemap.xml

Comments

  • www.robotstxt.org/
  • www.google.com/support/webmasters/bin/answer.py?hl=en&answer=156449
  • Disallows for Mydentist
  • Additional disallows 26/02/2016
  • Disallow for content experiments
  • Disallow for WebResource.axd caching issues. Several instances below to cover all search engines.
  • To specify matching the end of a URL, use $
  • However, WebResource.axd and ScriptResource.axd always include a query string parameter, so the URL does not end with .axd; thus, the correct robots.txt record for Google would be:
  • Not all crawlers recognize the wildcard '*' syntax, so to comply with the robots.txt draft RFC the plain records are included as well
  • Note that the records are case-sensitive, and the error page shows the requests in lower case, so let's include both cases below:
  • Disallows for bad search bots
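
The comments above explain why the '*' group carries both a '$'-anchored .axd rule and an unanchored one, plus exact entries in both letter cases. The short Python sketch below illustrates that reasoning under the usual wildcard convention; it is an assumption about common crawler behaviour, not the behaviour of any specific engine.

import re

def matches(rule, path):
    # Sketch of Google-style matching: '*' is a wildcard, a trailing '$'
    # anchors the end of the URL, and matching is case-sensitive.
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    pattern = "".join(".*" if ch == "*" else re.escape(ch) for ch in body)
    return re.match(pattern + ("$" if anchored else ""), path) is not None

# WebResource.axd is always requested with a query string, so the URL does
# not end in ".axd" and the '$'-anchored rule alone would never fire:
print(matches("/*.axd$", "/WebResource.axd?d=abc123"))   # False
print(matches("/*.axd",  "/WebResource.axd?d=abc123"))   # True

# Records are case-sensitive, hence the extra lower-case entries:
print(matches("/WebResource.axd", "/webresource.axd?d=abc123"))  # False
print(matches("/webresource.axd", "/webresource.axd?d=abc123"))  # True

The query string "?d=abc123" is an invented placeholder; the point is only that any query string prevents the URL from ending in .axd.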