doc.studenti.it
robots.txt

Robots Exclusion Standard data for doc.studenti.it

Resource Scan

Scan Details

Site Domain doc.studenti.it
Base Domain studenti.it
Scan Status Ok
Last Scan 2024-11-03T09:16:20+00:00
Next Scan 2024-12-03T09:16:20+00:00

Last Scan

Scanned 2024-11-03T09:16:20+00:00
URL https://doc.studenti.it/robots.txt
Domain IPs 23.50.90.17, 2600:1413:b000:68a::3198, 2600:1413:b000:68d::3198
Response IP 104.69.46.165
Found Yes
Hash 49f72f139f47474edeca2c9414487b0a0190cf68aef332838661ecd5ae0eb21d
SimHash 4140ff76c413

Groups

mediapartners-google

Rule Path
Disallow (empty path)

*

Rule Path
Disallow /*?
Allow /sitemap
Disallow /download
Disallow /tag
Disallow /appunti-audio
Disallow /appunti-video
Disallow /appunti-trovati
Disallow /definizioni
Disallow *.php
Disallow /r-appunti
Disallow /appunti-trovati
Disallow /classific
Disallow /search
Disallow /vedi_tutto
Disallow /download*
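
To make the effect of the `*` group concrete, the following is a minimal sketch of how a crawler following RFC 9309 might evaluate these rules: `*` matches any run of characters, a trailing `$` anchors the end of the path, the longest matching pattern wins, and Allow wins a tie. The names `RULES`, `pattern_to_regex`, and `is_allowed` are illustrative and not part of the scan. Treating `*.php` (which has no leading `/`) as a leading-wildcard pattern is an assumption; individual crawlers may handle it differently.

```python
import re

# Rules copied from the "*" group above, in listed order.
RULES = [
    ("disallow", "/*?"),
    ("allow", "/sitemap"),
    ("disallow", "/download"),
    ("disallow", "/tag"),
    ("disallow", "/appunti-audio"),
    ("disallow", "/appunti-video"),
    ("disallow", "/appunti-trovati"),
    ("disallow", "/definizioni"),
    ("disallow", "*.php"),
    ("disallow", "/r-appunti"),
    ("disallow", "/appunti-trovati"),
    ("disallow", "/classific"),
    ("disallow", "/search"),
    ("disallow", "/vedi_tutto"),
    ("disallow", "/download*"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex anchored at the start."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with ".*" for each "*" wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under `rules` (longest match wins)."""
    best_len = -1
    allowed = True  # no matching rule means the path is allowed
    for kind, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            # Longer pattern wins; on equal length, Allow beats Disallow.
            if length > best_len or (length == best_len and kind == "allow"):
                best_len = length
                allowed = kind == "allow"
    return allowed
```

Under this sketch, `/sitemap.xml` is allowed by `Allow /sitemap`, any URL with a query string is blocked by `/*?` unless a longer Allow pattern also matches, and `/download/...` is blocked by both `/download` and `/download*`.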

gptbot

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

seekr

Rule Path
Disallow /

meltwater

Rule Path
Disallow /
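
The groups above can be summarized by how a crawler would pick one. The sketch below assumes the RFC 9309 behavior that a crawler uses the group whose user-agent name matches its product token case-insensitively and otherwise falls back to `*`. The names `FULLY_BLOCKED`, `UNRESTRICTED`, `select_group`, and `blanket_policy` are hypothetical, and exact-token matching is a simplification of what real crawlers do.

```python
# Groups whose only rule is "Disallow /" in the scan above.
FULLY_BLOCKED = {
    "gptbot",
    "claudebot",
    "claude-web",
    "anthropic-ai",
    "cohere-ai",
    "perplexitybot",
    "seekr",
    "meltwater",
}

# Group whose only rule is an empty Disallow, which blocks nothing.
UNRESTRICTED = {"mediapartners-google"}


def select_group(product_token):
    """Return the name of the group that applies to a crawler's product token."""
    token = product_token.strip().lower()
    if token in FULLY_BLOCKED or token in UNRESTRICTED:
        return token
    return "*"  # no specific group: fall back to the catch-all group


def blanket_policy(product_token):
    """Classify a crawler as 'deny-all', 'allow-all', or 'path-rules'."""
    group = select_group(product_token)
    if group in FULLY_BLOCKED:
        return "deny-all"
    if group in UNRESTRICTED:
        return "allow-all"
    return "path-rules"  # subject to the path rules of the "*" group
```

For example, `blanket_policy("GPTBot")` yields `"deny-all"`, `blanket_policy("Mediapartners-Google")` yields `"allow-all"`, and any crawler not listed by name, such as one identifying as `Googlebot`, falls under the `*` group.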

Other Records

Field Value
sitemap https://doc.studenti.it/sitemap.xml

Comments

  • Enable crawler for AdSense
  • --- Settings for the old appunti (notes). ---
  • Disallow: /cerca
  • Disallow: /archivio
  • Disallow: /archivio-appunti
  • Disallow: /materie
  • Disallow: /top
  • --- Settings for the new appunti (notes). ---