ctisus.com
robots.txt

Robots Exclusion Standard data for ctisus.com

Resource Scan

Scan Details

Site Domain ctisus.com
Base Domain ctisus.com
Scan Status Ok
Last Scan 2024-05-24T21:30:06+00:00
Next Scan 2024-06-23T21:30:06+00:00

Last Scan

Scanned 2024-05-24T21:30:06+00:00
URL https://ctisus.com/robots.txt
Domain IPs 104.21.234.150, 104.21.234.151, 2606:4700:3038::6815:ea96, 2606:4700:3038::6815:ea97
Response IP 104.21.234.150
Found Yes
Hash 11fef17171583b26baee21bbda693560fb8f04ec319ae3da0da758996b312a1a
SimHash bb441d414ee8
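
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the report does not name the algorithm. As a minimal sketch under that assumption, a scanner could detect changes between scans by comparing digests of the fetched file body. The function names `content_digest` and `has_changed` are hypothetical and are not part of the scanning service.

```python
import hashlib


def content_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a fetched robots.txt body.

    Assumption: the report's Hash field is SHA-256 over the raw bytes;
    the report itself does not state which algorithm it uses.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(body: bytes, previous_digest: str) -> bool:
    """Compare a fresh body against the digest stored from an earlier scan."""
    return content_digest(body) != previous_digest.lower()
```

A SimHash, by contrast, is a locality-sensitive fingerprint, so near-identical files yield similar values rather than unrelated ones; it is not reproduced here because the report gives no detail on how it is computed.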

Groups

*

Rule      Path                         Comment
Disallow  /siteErrors                  error documents
Disallow  /temp                        -
Disallow  /cgi-bin                     -
Disallow  /web_sitemap_test_000.xml    -
Disallow  /*.html$                     -
Disallow  /*.htm$                      -
Disallow  /*.pdf$                      -
Disallow  /*.swf$                      -
Disallow  /askthefish?open%3A*         -
Disallow  /__media__                   -
Disallow  /ultrabb1172                 -
Disallow  /syllabus%20main$            -
Disallow  /syllabus%20main             -
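
Several of the rules above use the wildcard `*` (any sequence of characters) and the end anchor `$`, while the rest are plain prefix matches. The following is a rough sketch of how a crawler might evaluate them, not the matcher used by any particular crawler. It assumes the path is compared as given, without percent-encoding normalization, and it ignores Allow precedence because this group has no Allow rules. The helper names `rule_to_regex` and `is_disallowed` are hypothetical.

```python
import re

# Disallow paths copied from the "*" group above.
DISALLOW = [
    "/siteErrors",
    "/temp",
    "/cgi-bin",
    "/web_sitemap_test_000.xml",
    "/*.html$",
    "/*.htm$",
    "/*.pdf$",
    "/*.swf$",
    "/askthefish?open%3A*",
    "/__media__",
    "/ultrabb1172",
    "/syllabus%20main$",
    "/syllabus%20main",
]


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate one robots.txt path pattern into a compiled regex."""
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    # Escape each literal piece, then rejoin the pieces with ".*" for "*".
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile("^" + pattern + ("$" if anchored else ""))


def is_disallowed(path: str) -> bool:
    """True if any Disallow rule matches the URL path (plus query, if any)."""
    return any(rule_to_regex(rule).match(path) for rule in DISALLOW)
```

Under these assumptions, `is_disallowed("/index.html")` is true because of `/*.html$`, while `is_disallowed("/index.html?x=1")` is false because the `$` anchor requires the path to end in `.html`. The sketch also suggests that `/syllabus%20main$` is redundant here, since the unanchored prefix rule `/syllabus%20main` already matches that exact path.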

Other Records

Field Value
sitemap http://www.ctisus.com/sitemap.xml

Comments

  • No robots allowed in the following directories
  • Last Updated: December 9, 2013
  • Disallow: /mobile #mobile production site
  • Disallow: /resources #images and css

Warnings

  • 2 invalid lines.