gazzanet.gazzetta.it
robots.txt

Robots Exclusion Standard data for gazzanet.gazzetta.it

Resource Scan

Scan Details

Site Domain gazzanet.gazzetta.it
Base Domain gazzetta.it
Scan Status Ok
Last Scan 2024-11-15T12:42:28+00:00
Next Scan 2024-11-16T12:42:28+00:00

Last Scan

Scanned 2024-11-15T12:42:28+00:00
URL https://gazzanet.gazzetta.it/robots.txt
Domain IPs 34.90.132.17
Response IP 34.90.132.17
Found Yes
Hash fe6669f197f631d43adb55fa634fa82146405a1cb00755fe55aa2ec50308727a
SimHash 211f52e043f1
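
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the report does not name the algorithm or say which bytes were hashed. As a minimal sketch, assuming the digest is SHA-256 over the raw response body, a change detector could compare digests between scans as follows. The function name `robots_body_digest` and the sample body are hypothetical and are not taken from the scanned file.

```python
import hashlib


def robots_body_digest(body: bytes) -> str:
    """Return the hex SHA-256 digest of a robots.txt body.

    Assumption: the report's Hash field is SHA-256 over the raw bytes.
    """
    return hashlib.sha256(body).hexdigest()


# Hypothetical sample body, used only to illustrate the comparison.
sample = b"User-agent: *\nDisallow: /archivio/page/\n"
changed = b"User-agent: *\nDisallow: /archivio/video/\n"

digest = robots_body_digest(sample)
has_changed = robots_body_digest(changed) != digest
```

Comparing digests only detects that the file changed at all; the SimHash field presumably serves to estimate how similar two versions are, which this sketch does not attempt.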

Groups

turnitinbot

Rule Path
Disallow /

npbot-1/2.0

Rule Path
Disallow /

npbot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

seznambot

Rule Path
Disallow /

sogou

Rule Path
Disallow /

spinn3r

Rule Path
Disallow /

coccoc

Rule Path
Disallow /

exabot

Rule Path
Disallow /

ezooms

Rule Path
Disallow /

fatbot

Rule Path
Disallow /

jikespider

Rule Path
Disallow /

psbot

Rule Path
Disallow /

sitebot

Rule Path
Disallow /

sosospider

Rule Path
Disallow /

voilabot

Rule Path
Disallow /

willybot

Rule Path
Disallow /

yodaobot

Rule Path
Disallow /

germcrawler

Rule Path
Disallow /

huaweisymantecspider

Rule Path
Disallow /

offline explorer

Rule Path
Disallow /

sitesnagger

Rule Path
Disallow /

teleportpro

Rule Path
Disallow /

webcopier

Rule Path
Disallow /

webreaper

Rule Path
Disallow /

webstripper

Rule Path
Disallow /

webzip

Rule Path
Disallow /

xaldon_webspider

Rule Path
Disallow /

bingbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 5
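
The groups listed so far are keyed by crawler name, and the bingbot group carries no path rules, only a crawl-delay record. As a rough sketch of how a crawler might pick its group, assuming a case-insensitive match on the product token with a fallback to the `*` group, the lookup could look like this. Only a subset of the group names from this report is reproduced, and `select_group` is a hypothetical helper; extracting the product token from a full user-agent string is assumed to have happened already.

```python
# Subset of the group names shown in this report, plus the catch-all group.
GROUP_NAMES = {"turnitinbot", "mj12bot", "seznambot", "sogou", "bingbot", "*"}

# Crawl-delay records per group, as listed in the report (value in seconds
# is an assumption; the file does not state a unit).
CRAWL_DELAY = {"bingbot": 5}


def select_group(product_token: str, group_names: set[str]) -> str:
    """Return the name of the group that applies to a crawler.

    Matching is case-insensitive; unknown crawlers fall back to "*".
    """
    token = product_token.strip().lower()
    return token if token in group_names else "*"


bing_group = select_group("Bingbot", GROUP_NAMES)
other_group = select_group("Googlebot", GROUP_NAMES)
bing_delay = CRAWL_DELAY.get(bing_group)
```

Because a crawler that matches a named group uses that group instead of `*`, bingbot is not bound by the Disallow rules of the catch-all group in this file.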

*

Rule Path
Disallow */commenti/$
Disallow /rcs-community-comments-rest-api/
Disallow /archivio/pagina-*/pagina-
Disallow /archivio/page/
Disallow /archivio/categoria/
Disallow /archivio/gallery/
Disallow /archivio/video/
Disallow /*commenti/
Disallow /*?app_v2
Disallow /*?app_v1
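
Several of the rules above use the `*` wildcard and the `$` end-of-path anchor. As a hedged sketch of how such patterns can be evaluated, the snippet below translates each pattern into a regular expression, treating `*` as "any sequence of characters" and a trailing `$` as "end of the URL path". The helper names `pattern_to_regex` and `is_disallowed` are hypothetical, the test paths are invented for illustration, and Allow-rule precedence is not handled because this group contains only Disallow rules.

```python
import re

# Disallow patterns of the "*" group, copied from the report.
RULES = [
    "*/commenti/$",
    "/rcs-community-comments-rest-api/",
    "/archivio/pagina-*/pagina-",
    "/archivio/page/",
    "/archivio/categoria/",
    "/archivio/gallery/",
    "/archivio/video/",
    "/*commenti/",
    "/*?app_v2",
    "/*?app_v1",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with ".*" for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_disallowed(path: str, patterns: list[str]) -> bool:
    """Return True if any pattern matches from the start of the path."""
    return any(pattern_to_regex(p).match(path) for p in patterns)


blocked_comments = is_disallowed("/calcio/articolo/commenti/", RULES)
blocked_app_view = is_disallowed("/news/x?app_v2=1", RULES)
allowed_article = is_disallowed("/calcio/articolo/", RULES)
```

Note that Python's standard `urllib.robotparser` does not interpret these wildcards, which is why the sketch uses its own matcher.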

Other Records

Field Value
sitemap https://www.gazzetta.it/sitemaps/sitemap.xml
sitemap https://www.gazzetta.it/sitemaps/sitemap-news.xml