Robots Exclusion Standard data for co-shaoghal.net

Resource Scan

Scan Details

Site Domain co-shaoghal.net
Base Domain co-shaoghal.net
Scan Status Ok
Last Scan 2024-10-29T21:46:46+00:00
Next Scan 2024-11-28T21:46:46+00:00

Last Scan

Scanned 2024-10-29T21:46:46+00:00
URL https://co-shaoghal.net/robots.txt
Domain IPs 88.198.220.232
Response IP 88.198.220.232
Found Yes
Hash b69290a317134937f0da14a306395c84051dac7b5837d29afcf2e79b101a03c3
SimHash 029c9fc05661
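
The 64-character hexadecimal Hash value is consistent with a SHA-256 digest, although the scan does not state which algorithm or input it uses. As a rough sketch under that assumption, a digest of the fetched robots.txt body could be computed and compared between scans as follows. The function names are hypothetical and not part of the scanner.

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_digest: str, body: bytes) -> bool:
    """Report whether the body differs from the one behind a stored digest."""
    return body_digest(body) != previous_digest.lower()


if __name__ == "__main__":
    # Illustrative body only; this is not the site's actual file.
    sample = b"User-agent: *\nDisallow: /interact/\n"
    stored = body_digest(sample)
    print(has_changed(stored, sample))          # same bytes as the stored digest
    print(has_changed(stored, sample + b"\n"))  # one extra byte appended
```

An exact digest like this only detects that something changed. The separate SimHash field presumably serves to measure how similar two versions are, which this sketch does not attempt.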

Groups

ccbot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

fedicrawl/1.0

Rule Path
Disallow /

fedilist agent/2

Rule Path
Disallow /

facebookexternalhit/1.1

Rule Path
Disallow /

facebookexternalhit/1.0

Rule Path
Disallow /

facebookcatalog/1.0

Rule Path
Disallow /

facebookexternalua

Rule Path
Disallow /

cortex/1.0

Rule Path
Disallow /

adreview/1.0

Rule Path
Disallow /

facebookplatform/1.0

Rule Path
Disallow /

visionutils/0.2

Rule Path
Disallow /

facebot/1.0

Rule Path
Disallow /

adsbot-google-mobile

Rule Path
Disallow /

adsbot-google

Rule Path
Disallow /

mediapartners-google

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

storebot-google

Rule Path
Disallow /

*

Rule Path
Disallow /media_proxy/
Disallow /interact/
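
The groups above can be exercised with Python's standard urllib.robotparser module. The sketch below rebuilds only a small excerpt of the rules (ccbot, gptbot, and the default group) rather than the site's actual file. The agent name ExampleBrowser and the request paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Excerpt reconstructed from the groups listed above; not the full file.
ROBOTS_EXCERPT = """\
User-agent: ccbot
Disallow: /

User-agent: gptbot
Disallow: /

User-agent: *
Disallow: /media_proxy/
Disallow: /interact/
"""

BASE_URL = "https://co-shaoghal.net"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, path: str) -> bool:
    """Check whether user_agent may fetch the given path on the scanned site."""
    return parser.can_fetch(user_agent, BASE_URL + path)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_EXCERPT)
    checks = [
        ("GPTBot", "/"),                     # named group, everything disallowed
        ("ExampleBrowser", "/about"),        # falls through to the default group
        ("ExampleBrowser", "/interact/42"),  # blocked by the default group
    ]
    for agent, path in checks:
        print(agent, path, is_allowed(parser, agent, path))
```

One caveat with this parser: it compares agent names against the part of the requesting agent string before any "/", so groups whose names carry a version suffix, such as fedicrawl/1.0, may not match the way other robots.txt implementations treat them.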

Comments

  • See http://www.robotstxt.org/robotstxt.html for documentation on how to use the robots.txt file
  • AI & marketing bots
  • Fediverse crawlers
  • Facebook
  • Google
  • Default