triple-c.com
robots.txt

Robots Exclusion Standard data for triple-c.com

Resource Scan

Scan Details

Site Domain triple-c.com
Base Domain triple-c.com
Scan Status Ok
Last Scan 2025-10-19T23:33:19+00:00
Next Scan 2025-11-18T23:33:19+00:00

Last Scan

Scanned 2025-10-19T23:33:19+00:00
URL https://triple-c.com/robots.txt
Domain IPs 104.21.66.159, 172.67.205.108, 2606:4700:3032::6815:429f, 2606:4700:3037::ac43:cd6c
Response IP 104.21.66.159
Found Yes
Hash e3fa635896c84e4782f82c98005d0733a048ccc3f7d51788c2cfa1f978e7c6ea
SimHash 23381977c7d1

Groups

googlebot*

Rule Path
Allow /

*

Rule Path
Allow /
Disallow /CFIDE/scripts/ajax
Disallow /js
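
The "*" group mixes a blanket Allow with two narrower Disallow rules, so the outcome for a given path depends on how a crawler resolves overlapping matches. The sketch below assumes the longest-match precedence described in RFC 9309 (the most specific matching path wins, and Allow wins a tie). The names RULES and is_allowed are hypothetical and chosen for illustration; the sketch does not handle the "*" and "$" wildcards, which these rules do not use. Parsers that apply the first matching rule instead would treat every path as allowed here, because "Allow /" is listed first.

```python
# Rules of the "*" group as listed in the scan: (directive, path prefix).
RULES = [
    ("allow", "/"),
    ("disallow", "/CFIDE/scripts/ajax"),
    ("disallow", "/js"),
]


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under longest-match precedence.

    Assumptions: plain prefix matching (no wildcards), case-sensitive paths,
    ties between equally long matches go to Allow, and a path that matches
    no rule is allowed.
    """
    best_len = -1
    allowed = True
    for directive, prefix in rules:
        if not path.startswith(prefix):
            continue
        is_allow = directive == "allow"
        # A longer prefix is more specific; on equal length, Allow wins.
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


if __name__ == "__main__":
    for sample in ("/index.html", "/js/app.js", "/CFIDE/scripts/ajax/x.js"):
        print(sample, is_allowed(sample))
```

Under these assumptions, paths beginning with /js or /CFIDE/scripts/ajax are blocked and all other paths remain crawlable. Because matching is by prefix, "Disallow /js" also covers any path that merely starts with those characters, such as /jsfoo.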

Other Records

Field Value
sitemap http://www.triple-c.com/sitemap_index.xml
sitemap http://www.triple-c.com/sitemap_index.xml

Comments

  • robots.txt file for www.triple-c.com
  • addresses all robots by using wild card *
  • list folders robots are not allowed to index
  • Disallow: /tutorials/meta/
  • list specific files robots are not allowed to index
  • End of robots.txt file