archive.comune.verona.it
robots.txt

Robots Exclusion Standard data for archive.comune.verona.it

Resource Scan

Scan Details

Site Domain archive.comune.verona.it
Base Domain comune.verona.it
Scan Status Ok
Last Scan 2025-12-09T01:20:36+00:00
Next Scan 2026-01-08T01:20:36+00:00

Last Scan

Scanned 2025-12-09T01:20:36+00:00
URL https://archive.comune.verona.it/robots.txt
Domain IPs 213.171.100.121
Response IP 213.171.100.121
Found Yes
Hash 5bada44e1d95830bf0ed6759f8be8501a87a9d50c08ae28098ee0f8281f5f04e
SimHash b95cf053669d

Groups

googlebot
google
googlebot-mobile
googlebot-image
adsbot-google
notebooklm
bingbot
bing

Rule Path
Disallow /temp/
Disallow /cache/
Disallow /versioningmedia/
Disallow /stagingmedia/
Disallow /public_security/
Disallow /include/
Disallow /images/
Disallow /nqcontent/
Disallow /admin/
Disallow /quickedit/
Disallow /media/
Disallow /bannertrack/
Disallow /utils/
Disallow /templates/
Disallow /poll/

Other Records

Field Value
crawl-delay 1

*

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

hypercrawl

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

*

Rule Path
Disallow /

powermapper

Rule Path
Allow /

facebookexternalhit

Rule Path
Disallow /temp/
Disallow /cache/
Disallow /versioningmedia/
Disallow /stagingmedia/
Disallow /public_security/
Disallow /include/
Disallow /images/
Disallow /nqcontent/
Disallow /admin/
Disallow /quickedit/
Disallow /bannertrack/
Disallow /utils/
Disallow /templates/
Disallow /poll/

Comments

  • added by Alessia at the request of Zago Nicola 09/01/2025
  • end of addition
  • I do not want search engines to index these directories
  • block all crawlers except google and bing, which are listed at the top of the file
  • some do not read the asterisk
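
The way the groups above apply to different crawlers can be checked with a short script. The following is a minimal sketch using Python's standard urllib.robotparser. The ROBOTS_TXT string is an abbreviated reconstruction from the groups listed in this report, not the actual file, and the names build_parser, allowed, and the "examplebot" agent are hypothetical. Python's parser applies the first matching rule within a group, which may differ from how individual crawlers resolve rules.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated reconstruction (an assumption) of the groups in this report.
# The real file names more user agents and more Disallow paths per group.
ROBOTS_TXT = """\
User-agent: googlebot
User-agent: bingbot
Disallow: /temp/
Disallow: /admin/
Crawl-delay: 1

User-agent: *
Disallow: /

User-agent: ahrefsbot
Disallow: /

User-agent: powermapper
Allow: /
"""

BASE = "https://archive.comune.verona.it"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the parsed rules."""
    return parser.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    for agent, path in [
        ("googlebot", "/admin/login"),  # named group, disallowed directory
        ("googlebot", "/news/"),        # named group, no matching rule
        ("examplebot", "/news/"),       # falls through to the "*" group
        ("ahrefsbot", "/"),             # explicitly blocked by name
        ("powermapper", "/admin/"),     # explicitly allowed everywhere
    ]:
        print(f"{agent:12} {path:14} allowed={allowed(agent, path)}")
    print("bingbot crawl-delay:", parser.crawl_delay("bingbot"))
```

Listing ahrefsbot, hypercrawl, and mj12bot by name in addition to the * group matches the comment that some crawlers do not read the asterisk: a crawler that ignores the wildcard group is still covered by its own named group.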