dalessandris.it
robots.txt

Robots Exclusion Standard data for dalessandris.it

Resource Scan

Scan Details

Site Domain dalessandris.it
Base Domain dalessandris.it
Scan Status Ok
Last Scan 2025-07-14T04:54:45+00:00
Next Scan 2025-08-13T04:54:45+00:00

Last Scan

Scanned 2025-07-14T04:54:45+00:00
URL https://dalessandris.it/robots.txt
Domain IPs 62.149.128.151, 62.149.128.154, 62.149.128.157, 93.95.216.62
Response IP 93.95.216.62
Found Yes
Hash 588855e58bd11f82e2fc095c62ec6c40ef7d4e2363b63920499bb1b708542ae7
SimHash 71d74250c5d3
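
The report does not say which algorithm produced the Hash value. Its 64 hexadecimal characters are consistent with a SHA-256 digest, so the following is a minimal sketch under that assumption. The function name robots_digest and the sample body are hypothetical; the real file's bytes are not part of this report, so the sketch will not reproduce the value listed above.

```python
import hashlib


def robots_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt response body.

    Assumption: the scanner hashes the raw response bytes with SHA-256.
    The report itself does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


# Hypothetical body used only for illustration.
sample = b"User-agent: *\nDisallow: /\n"
digest = robots_digest(sample)
print(len(digest), digest)
```

Comparing the digest from one scan with the digest from the next is one simple way to detect that the file changed between scans, whatever algorithm the scanner actually uses.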

Groups

googlebot

Rule Path
Disallow

googlebot-image

Rule Path
Disallow

googlebot-mobile

Rule Path
Disallow

googlebot

Rule Path
Disallow

googlebot-image

Rule Path
Disallow

googlebot-mobile

Rule Path
Disallow

oai-searchbot

Rule Path
Allow /

chatgpt-user
chatgpt-user/2.0

Rule Path
Allow /

gptbot

Rule Path Comment
Allow / everything else

anthropic-ai

Product Comment
anthropic-ai bulk model training
Rule Path
Allow /

claudebot
claude-web

Product Comment
claudebot chat citation fetch
claude-web web-focused crawl
Rule Path
Allow /

perplexitybot

Product Comment
perplexitybot index builder
Rule Path
Allow /

perplexity-user

Product Comment
perplexity-user human-triggered visit
Rule Path
Allow /

google-extended

Rule Path
Allow /

bingbot

Rule Path
Allow /

amazonbot

Rule Path
Allow /

applebot
applebot-extended

Rule Path
Allow /

facebookbot
meta-externalagent

Rule Path
Allow /

linkedinbot

Rule Path
Allow /

bytespider

Rule Path
Allow /

duckassistbot

Rule Path
Allow /

cohere-ai

Rule Path
Allow /

ai2bot
ccbot
diffbot
omgili

Rule Path
Allow /

timpibot
youbot

Rule Path
Allow /

facebookexternalhit

Rule Path
Disallow

*

Rule Path
Disallow /
Disallow /cgi-bin/

*

Rule Path
Disallow /
Disallow /cgi-bin/

Other Records

Field Value
sitemap https://www.tuttoprofessionale.it/sitemap_index.xml
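
The groups and the sitemap record listed above can be checked with Python's standard urllib.robotparser module. The sketch below is an illustration only: ROBOTS_TXT is a hypothetical reconstruction of three of the listed groups (the raw file text is not reproduced in this report), and build_parser and SomeOtherBot are names invented for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical reconstruction of a few groups from the listing above;
# the site's actual robots.txt text is not reproduced in this report.
ROBOTS_TXT = """\
User-agent: googlebot
Disallow:

User-agent: gptbot
Allow: /

User-agent: *
Disallow: /
Disallow: /cgi-bin/

Sitemap: https://www.tuttoprofessionale.it/sitemap_index.xml
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
page = "https://dalessandris.it/"

# An empty Disallow value places no restriction on the group,
# "Allow: /" permits everything, and the "*" group applies to
# any agent that matches no named group.
for agent in ("Googlebot", "GPTBot", "SomeOtherBot"):
    print(agent, parser.can_fetch(agent, page))

# site_maps() requires Python 3.8 or later.
print(parser.site_maps())
```

Under these assumptions the named crawlers are permitted and an unlisted agent falls through to the catch-all group, which blocks the whole site. Note that urllib.robotparser applies the first matching rule in a group, which can differ from the longest-match behaviour some crawlers implement, so treat the result as an approximation of how any particular bot will read the file.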

Comments

  • ——— OPENAI ———
  • Search (shows my webpages as links inside ChatGPT search). NOT used for model training.
  • User-driven browsing from ChatGPT and Custom GPTs. Acts after a human click.
  • Model-training crawler. Opt-out here if I don’t want content in GPT-4o or GPT-5.
  • ——— ANTHROPIC (Claude) ———
  • ——— PERPLEXITY ———
  • ——— GOOGLE (Gemini) ———
  • ——— MICROSOFT (Bing / Copilot) ———
  • ——— AMAZON ———
  • ——— APPLE ———
  • ——— META ———
  • ——— LINKEDIN ———
  • ——— BYTEDANCE ———
  • ——— DUCKDUCKGO ———
  • ——— COHERE ———
  • ——— ALLEN INSTITUTE / COMMON CRAWL / OTHER RESEARCH ———
  • ——— EMERGING SEARCH START-UPS ———