globogis.eu
robots.txt

Robots Exclusion Standard data for globogis.eu

Resource Scan

Scan Details

Site Domain globogis.eu
Base Domain globogis.eu
Scan Status Failed
Failure Reason Scan timed out.
Last Scan 2025-12-23T20:55:07+00:00
Next Scan 2026-03-23T20:55:07+00:00

Last Successful Scan

Scanned 2023-08-13T10:48:15+00:00
URL http://globogis.eu/robots.txt
Redirect http://www.globogis.eu/robots.txt
Redirect Domain www.globogis.eu
Redirect Base globogis.eu
Domain IPs 62.149.128.151, 62.149.128.154, 62.149.128.157, 62.149.128.160, 62.149.128.163, 62.149.128.166
Redirect IPs 82.115.171.53
Response IP 82.115.171.53
Found Yes
Hash dc0f7d9d62adb675646c6c35e2925b61558db79c2c8b8707732cb1661fa4c178
SimHash 38961d1bcf74

Groups

*

Rule Path Comment
Disallow /includes/ -
Disallow /misc/ -
Disallow /modules/ -
Disallow /profiles/ -
Disallow /scripts/ -
Disallow /themes/ -
Disallow /CHANGELOG.txt -
Disallow /cron.php -
Disallow /INSTALL.mysql.txt -
Disallow /INSTALL.pgsql.txt -
Disallow /INSTALL.sqlite.txt -
Disallow /install.php -
Disallow /INSTALL.txt -
Disallow /LICENSE.txt -
Disallow /MAINTAINERS.txt -
Disallow /update.php -
Disallow /UPGRADE.txt -
Disallow /xmlrpc.php -
Disallow /admin/ -
Disallow /comment/reply/ -
Disallow /filter/tips/ -
Disallow /node/add/ -
Disallow /search/ -
Disallow /user/register/ -
Disallow /user/password/ -
Disallow /user/login/ -
Disallow /user/logout/ -
Disallow /?q=admin%2F -
Disallow /?q=comment%2Freply%2F -
Disallow /?q=filter%2Ftips%2F -
Disallow /?q=node%2Fadd%2F -
Disallow /?q=search%2F -
Disallow /?q=user%2Fpassword%2F -
Disallow /?q=user%2Fregister%2F -
Disallow /?q=user%2Flogin%2F -
Disallow /?q=user%2Flogout%2F -
Disallow /sportello_telematico/ -
Disallow /backoffice_to_frontoffice/ -
Disallow /modulistica/ -
Disallow /printpdf/ -
Disallow /reversale_web/ -
Disallow /AttivitaEconomiche/ -
Disallow /image_captcha/ -
Disallow /validazione_dati/ -
Disallow /lista_procedimenti/ -
Disallow /norme/ -
Disallow /lista_istanze_utente/ -
Disallow /globogis_sync/ -
Disallow *.pdf Block pdf files. Non-standard but works for major search engines.
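
The following is a minimal Python sketch of how a crawler that supports wildcards might evaluate the Disallow rules above. It is an illustration, not the matcher used by any particular search engine. The names `rule_matches`, `is_allowed` and `DISALLOW` are hypothetical, and only a handful of the listed rules are included. It assumes `*` matches any run of characters, a trailing `$` anchors the end of the path, and everything else is a prefix match.

```python
import re

def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path.

    Assumed semantics: '*' matches any run of characters, a trailing '$'
    anchors the end of the path, and otherwise the pattern is a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, path) is not None

# Subset of the Disallow rules from the table above (chosen for illustration).
DISALLOW = ["/admin/", "/search/", "/?q=admin%2F", "/printpdf/", "*.pdf"]

def is_allowed(path: str) -> bool:
    """A path is allowed when no Disallow rule matches it.

    The group contains no Allow rules, so no precedence logic is needed.
    """
    return not any(rule_matches(rule, path) for rule in DISALLOW)
```

With these assumptions, `is_allowed("/admin/config")` and `is_allowed("/files/report.pdf")` are both false, while `is_allowed("/node/12")` is true. A parser that treats `*.pdf` as a literal prefix would not block PDF files, which is why the file's own comment calls the rule non-standard.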

Other Records

Field Value
crawl-delay 20
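
As a rough sketch of how a client could read this value, Python's standard `urllib.robotparser` exposes the crawl delay for a user agent. The excerpt below is hypothetical: it assumes the `Crawl-delay` line sits inside the `*` group, which this report does not confirm, and it includes only two of the Disallow rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical excerpt. Placing Crawl-delay inside the '*' group is an
# assumption; the report lists it separately under "Other Records".
ROBOTS_EXCERPT = """\
User-agent: *
Crawl-delay: 20
Disallow: /admin/
Disallow: /search/
"""

def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser

parser = load_rules(ROBOTS_EXCERPT)

# Seconds a polite crawler would wait between requests, or None if unset.
delay = parser.crawl_delay("*")

# Prefix rules are honoured by the standard-library parser.
admin_ok = parser.can_fetch("*", "http://www.globogis.eu/admin/")
node_ok = parser.can_fetch("*", "http://www.globogis.eu/node/1")
```

A crawler honouring this record would sleep for `delay` seconds between consecutive requests to the host.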

Comments

  • robots.txt
  • This file is to prevent the crawling and indexing of certain parts
  • of your site by web crawlers and spiders run by sites like Yahoo!
  • and Google. By telling these "robots" where not to go on your site,
  • you save bandwidth and server resources.
  • This file will be ignored unless it is at the root of your host:
  • Used: http://example.com/robots.txt
  • Ignored: http://example.com/site/robots.txt
  • For more information about the robots.txt standard, see:
  • http://www.robotstxt.org/wc/robots.html
  • For syntax checking, see:
  • http://www.sxw.org.uk/computing/robots/check.html
  • Directories
  • Files
  • Paths (clean URLs)
  • Paths (no clean URLs)
  • STU