remedyconnect.com
robots.txt

Robots Exclusion Standard data for remedyconnect.com

Resource Scan

Scan Details

Site Domain remedyconnect.com
Base Domain remedyconnect.com
Scan Status Failed
Failure Reason Scan timed out.
Last Scan 2024-09-12T03:45:06+00:00
Next Scan 2024-12-11T03:45:06+00:00

Last Successful Scan

Scanned 2022-07-18T20:53:49+00:00
URL https://remedyconnect.com/robots.txt
Response IP 13.89.41.168
Found Yes
Hash 8079bdbc27714f7904b424811e3fa348efd5a2ce51515dcc49297e00b968ee2b
SimHash 3895d90bc4d5

Groups

*

Rule Path
Disallow /

Other Records

Field Value
crawl-delay 600

googlebot
googlebot-image
mediapartners-google
msnbot
msnbot-media
slurp
yahoo-blogs
yahoo-mmcrawler
brokenlinkcheck
twitterbot

Rule Path
Disallow /app_code/
Disallow /css/
Disallow /js/
Disallow /CMS_Modules/
Disallow /app_data/
Disallow /bin/
Disallow /img/
Disallow /wp/
Disallow /CHANGELOG.txt
Disallow /cron.php
Disallow /INSTALL.mysql.txt
Disallow /INSTALL.pgsql.txt
Disallow /install.php
Disallow /INSTALL.txt
Disallow /LICENSE.txt
Disallow /MAINTAINERS.txt
Disallow /update.php
Disallow /UPGRADE.txt
Disallow /xmlrpc.php
Disallow /hitechavs/concrete/controllers/profile/utf.php
Disallow /CMSHelp/%3CHELP_LOCATION%3Edevguide/syndication_transformations.htm
Disallow /wp-login.php
Disallow /review/
Disallow /review
Disallow /admin/
Disallow /comment/reply/
Disallow /contact/
Disallow /logout/
Disallow /node/add/
Disallow /search/
Disallow /opensearch/
Disallow /user/register/
Disallow /user/password/
Disallow /user/login/
Disallow /wp-admin
Disallow /Resources/Kid-Site-Videos/
Disallow /Medical-Comprehensive/Kid-Site-Videos/
Disallow /Kid-Site-Videos/
Disallow /?q=admin%2F
Disallow /?q=comment%2Freply%2F
Disallow /?q=contact%2F
Disallow /?q=logout%2F
Disallow /?q=node%2Fadd%2F
Disallow /?q=search%2F
Disallow /?q=user%2Fpassword%2F
Disallow /?q=user%2Fregister%2F
Disallow /?q=user%2Flogin%2F

Other Records

Field Value
crawl-delay 600

Other Records

Field Value
sitemap https://remedyconnect.com/sitemap.xml
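As a rough illustration of how a crawler might interpret the groups listed above, the sketch below feeds a hand-written robots.txt to Python's standard `urllib.robotparser`. The file text is a hypothetical reconstruction from this report (a small subset of the agents and rules, in an assumed order), not the actual file served by the site, and `ExampleBot` is a made-up user agent standing in for any crawler that is not named in the second group.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical reconstruction: a subset of the groups shown in this report.
# The real file's layout, ordering, and full rule list are not reproduced here.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
Crawl-delay: 600

User-agent: googlebot
User-agent: msnbot
User-agent: slurp
User-agent: twitterbot
Disallow: /admin/
Disallow: /search/
Disallow: /review
Disallow: /wp-admin
Crawl-delay: 600

Sitemap: https://remedyconnect.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

BASE = "https://remedyconnect.com"

# An unnamed crawler falls back to the "*" group, which disallows everything.
print(parser.can_fetch("ExampleBot", BASE + "/"))

# A named crawler uses its own group: the home page is allowed,
# while paths that start with a listed prefix are blocked.
print(parser.can_fetch("Googlebot", BASE + "/"))
print(parser.can_fetch("Googlebot", BASE + "/admin/"))

# Both groups carry the same crawl-delay value.
print(parser.crawl_delay("Googlebot"))

# Sitemap records apply to the whole file, not to a single group.
print(parser.site_maps())
```

Matching here is by path prefix, which may be why the report lists both `/review/` and `/review`: the shorter rule also covers paths such as `/reviews`. Other parsers can differ in details such as agent matching and rule precedence, so this reflects only how the Python standard library reads such a file.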

Comments

  • robots.txt Updated by Vickie 7/3/2020
  • - Added enabled code so that only allows crawling from site domain name
  • This file is to prevent the crawling and indexing of certain parts
  • of your site by web crawlers and spiders run by sites like Yahoo!
  • and Google. By telling these "robots" where not to go on your site,
  • you save bandwidth and server resources.
  • This file will be ignored unless it is at the root of your host:
  • Used: http://example.com/robots.txt
  • Ignored: http://example.com/site/robots.txt
  • For more information about the robots.txt standard, see:
  • http://www.robotstxt.org/wc/robots.html
  • For syntax checking, see:
  • http://www.sxw.org.uk/computing/robots/check.html
  • disallow all
  • but allow only important bots
  • Directories
  • Files
  • Paths (clean URLs)
  • Paths (no clean URLs)
  • Should be Restricting Development or secondary domain names
  • These 2 must match
  • remedyconnect.com
  • https://remedyconnect.com/robotstxt
  • True
  • and
  • True
  • -1 cannot include .remedyconnect.com
  • true
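The comments above note that the file is only honored at the root of the host (`http://example.com/robots.txt` is used, `http://example.com/site/robots.txt` is ignored). A minimal sketch of how a client might derive that location from any page URL follows; the helper name `robots_url` is hypothetical and chosen only for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the host serving page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, and drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("http://example.com/site/page.html"))
```

Because the host is part of the result, each subdomain gets its own robots.txt location under this scheme.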