occi-wg.org
robots.txt

Robots Exclusion Standard data for occi-wg.org

Resource Scan

Scan Details

Site Domain occi-wg.org
Base Domain occi-wg.org
Scan Status Ok
Last Scan 2025-08-08T03:30:29+00:00
Next Scan 2025-09-07T03:30:29+00:00

Last Scan

Scanned 2025-08-08T03:30:29+00:00
URL https://occi-wg.org/robots.txt
Redirect https://www.carlyscafe.com/robots.txt
Redirect Domain www.carlyscafe.com
Redirect Base carlyscafe.com
Domain IPs 104.18.18.15, 104.18.19.15, 2606:4700::6812:120f, 2606:4700::6812:130f
Redirect IPs 104.21.59.63, 172.67.216.159, 2606:4700:3034::6815:3b3f, 2606:4700:3036::ac43:d89f
Response IP 172.67.216.159
Found Yes
Hash 9d54c43357613b3a0af9ba114c48ff631918752dfe2a534e2d33d5a6a970ca3c
SimHash be1b78526b80
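
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the report does not name the algorithm. Assuming SHA-256 over the raw response body, a scanner could detect changes between scans roughly as sketched below. The function names content_digest and has_changed are hypothetical, chosen for illustration only.

```python
import hashlib


def content_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body.

    Assumption: the report's Hash field is SHA-256 over the raw bytes;
    the report itself does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_digest: str, body: bytes) -> bool:
    """Compare a stored digest against a freshly fetched body."""
    return content_digest(body) != previous_digest


if __name__ == "__main__":
    old_body = b"User-agent: *\nDisallow: /wp-admin/\n"
    new_body = b"User-agent: *\nDisallow: /wp-admin/\nDisallow: /wp-includes/\n"
    stored = content_digest(old_body)
    print(has_changed(stored, old_body))  # same bytes, same digest
    print(has_changed(stored, new_body))  # any byte change alters the digest
```

An exact digest only indicates whether the file changed at all. The SimHash field is presumably a similarity fingerprint meant to indicate how much it changed, which this sketch does not attempt to reproduce.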

Groups

amazonbot

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

meta-externalagent

Rule Path
Disallow /

teleport

Rule Path
Disallow /

teleportpro

Rule Path
Disallow /

emailcollector

Rule Path
Disallow /

emailsiphon

Rule Path
Disallow /

webbandit

Rule Path
Disallow /

webzip

Rule Path
Disallow /

webreaper

Rule Path
Disallow /

webstripper

Rule Path
Disallow /

web downloader

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

webcopier

Rule Path
Disallow /

offline explorer pro

Rule Path
Disallow /

offline explorer

Rule Path
Disallow /

httrack website copier

Rule Path
Disallow /

offline commander

Rule Path
Disallow /

leech

Rule Path
Disallow /

websnake

Rule Path
Disallow /

blackwidow

Rule Path
Disallow /

http weazel

Rule Path
Disallow /

*

Rule Path
Disallow /wp-admin/
Disallow /wp-includes/
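
To see how the groups above would apply to a given crawler, the following sketch feeds an abbreviated reconstruction of the rules to Python's standard urllib.robotparser. It assumes the original file uses ordinary "User-agent" and "Disallow" lines. Only two of the named groups are reproduced, and the agent name ExampleBrowser and the helper names build_parser and is_allowed are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated reconstruction of the groups listed in this report.
# Assumption: the real file contains equivalent standard directives.
ROBOTS_TXT = """\
User-agent: gptbot
Disallow: /

User-agent: ccbot
Disallow: /

User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the parsed rules permit user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    checks = [
        ("GPTBot", "https://occi-wg.org/"),
        ("ExampleBrowser", "https://occi-wg.org/wp-admin/"),
        ("ExampleBrowser", "https://occi-wg.org/about"),
    ]
    for agent, url in checks:
        print(agent, url, is_allowed(parser, agent, url))
```

A crawler named in its own group is blocked from the whole site, while any agent without a dedicated group falls through to the * group and is only kept out of /wp-admin/ and /wp-includes/.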

Other Records

Field Value
sitemap /sitemap_index.php
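
The sitemap value is recorded as a path rather than a full URL. One plausible reading, assumed here and not confirmed by the report, is that a client resolves it against the URL the robots.txt was finally served from (the redirect target listed under Last Scan). The helper name resolve_sitemap is hypothetical.

```python
from urllib.parse import urljoin


def resolve_sitemap(robots_url: str, sitemap_value: str) -> str:
    """Resolve a sitemap field against the URL of the robots.txt file.

    Assumption: relative sitemap values are resolved against the final
    (post-redirect) robots.txt URL. Clients may differ on this.
    """
    return urljoin(robots_url, sitemap_value)


if __name__ == "__main__":
    final_url = "https://www.carlyscafe.com/robots.txt"
    print(resolve_sitemap(final_url, "/sitemap_index.php"))
```

Resolving against the originally requested URL (https://occi-wg.org/robots.txt) would give a different host, so a client has to pick one base and apply it consistently.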

Comments

  • NOTICE: The collection of content and other data on this
  • site through automated means, including any device, tool,
  • or process designed to data mine or scrape content, is
  • prohibited except (1) for the purpose of search engine indexing or
  • artificial intelligence retrieval augmented generation or (2) with express
  • written permission from this site’s operator.
  • To request permission to license our intellectual
  • property and/or other materials, please contact this
  • site’s operator directly.
  • BEGIN Cloudflare Managed content
  • END Cloudflare Managed Content