selfidc.com
robots.txt

Robots Exclusion Standard data for selfidc.com

Resource Scan

Scan Details

Site Domain selfidc.com
Base Domain selfidc.com
Scan Status Ok
Last Scan 2025-06-02T10:48:08+00:00
Next Scan 2025-07-02T10:48:08+00:00

Last Scan

Scanned 2025-06-02T10:48:08+00:00
URL http://selfidc.com/robots.txt
Response IP 133.18.238.208
Found Yes
Hash 6dab9c87a5c57a2ca79079dbcd0e2bbe15e98a0eabce3ac59944121e1d686ea0
SimHash 583dd250e5ab
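
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, but the report does not name the algorithm. The sketch below assumes SHA-256 over the raw response bytes; both the algorithm and the exact bytes hashed are assumptions, and the helper names are hypothetical. It shows how a locally saved copy of the file could be compared against the reported value.

```python
import hashlib

# Hash value copied from the scan report above.
REPORTED_HASH = "6dab9c87a5c57a2ca79079dbcd0e2bbe15e98a0eabce3ac59944121e1d686ea0"


def robots_digest(body: bytes) -> str:
    """Return the hex SHA-256 digest of a robots.txt body.

    Assumption: the scanner hashes the raw response bytes with SHA-256.
    The report itself does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


def matches_report(body: bytes) -> bool:
    """Check whether a fetched body has the same digest as the report."""
    return robots_digest(body) == REPORTED_HASH
```

A mismatch does not necessarily mean the file changed; it may only mean the scanner uses a different algorithm or normalizes the content before hashing.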

Groups

googlebot

Rule Path
Allow /

bingbot

Rule Path
Allow /

duckduckbot

Rule Path
Allow /

applebot

Rule Path
Allow /

yahoo

Rule Path
Allow /

yandex

Rule Path
Allow /

baiduspider

Rule Path
Allow /

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

oai-searchbot

Rule Path
Disallow /

meta-externalagent

Rule Path
Disallow /

facebookexternalhit

Rule Path
Disallow /

meta-externalfetcher

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

*

Rule Path
Disallow /cgi-bin/
Disallow /wp-admin/
Disallow /wp-includes/
Disallow /tmp/
Disallow /private/
Disallow /config/

Comments

  • Allow well-known search engines and reputable crawlers
  • Block known unwanted bots and scrapers
  • General restrictions for all other bots
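
The groups listed above can be checked with Python's standard `urllib.robotparser`. The sketch below rebuilds an equivalent robots.txt from the parsed groups and queries it. The original file is not reproduced in this report, so the exact layout (one group per user agent, blank lines between groups) is an assumption, and the agent name `examplecrawler` is a hypothetical stand-in for any bot without its own group.

```python
from urllib.robotparser import RobotFileParser

# Groups as listed in the scan report above.
ALLOWED = ["googlebot", "bingbot", "duckduckbot", "applebot",
           "yahoo", "yandex", "baiduspider"]
BLOCKED = ["gptbot", "chatgpt-user", "oai-searchbot", "meta-externalagent",
           "facebookexternalhit", "meta-externalfetcher", "ahrefsbot",
           "dataforseobot", "mj12bot", "dotbot", "semrushbot", "blexbot"]
DEFAULT_DISALLOW = ["/cgi-bin/", "/wp-admin/", "/wp-includes/",
                    "/tmp/", "/private/", "/config/"]


def build_robots_lines():
    """Rebuild robots.txt lines from the groups (layout is assumed)."""
    lines = []
    for agent in ALLOWED:
        lines += [f"User-agent: {agent}", "Allow: /", ""]
    for agent in BLOCKED:
        lines += [f"User-agent: {agent}", "Disallow: /", ""]
    lines.append("User-agent: *")
    lines += [f"Disallow: {path}" for path in DEFAULT_DISALLOW]
    return lines


def make_parser():
    """Return a RobotFileParser loaded with the rebuilt rules."""
    parser = RobotFileParser()
    parser.parse(build_robots_lines())
    return parser


if __name__ == "__main__":
    parser = make_parser()
    checks = [
        ("Googlebot", "http://selfidc.com/wp-admin/"),
        ("GPTBot", "http://selfidc.com/"),
        ("examplecrawler", "http://selfidc.com/wp-admin/"),
        ("examplecrawler", "http://selfidc.com/index.html"),
    ]
    for agent, url in checks:
        print(agent, url, parser.can_fetch(agent, url))
```

Note that a crawler with its own group follows only that group, so the path restrictions under `*` do not apply to the explicitly allowed search engines.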