calabriadirettanews.com
robots.txt

Robots Exclusion Standard data for calabriadirettanews.com

Resource Scan

Scan Details

Site Domain calabriadirettanews.com
Base Domain calabriadirettanews.com
Scan Status Ok
Last Scan 2024-11-05T18:26:21+00:00
Next Scan 2024-11-12T18:26:21+00:00

Last Scan

Scanned 2024-11-05T18:26:21+00:00
URL https://calabriadirettanews.com/robots.txt
Domain IPs 194.76.118.100
Response IP 194.76.118.100
Found Yes
Hash a3eddcaed4eca6a5b01eafbcfa6fb6ab9afad677f5f1c13b84981253dd1d008a
SimHash eeca9e33c553

Groups

*

Rule Path
Disallow /

googlebot

Rule Path
Disallow

Other Records

Field Value
crawl-delay 10

googlebot-image

Rule Path
Disallow

Other Records

Field Value
crawl-delay 10

mediapartners-google

Rule Path
Disallow

Other Records

Field Value
crawl-delay 10

bingbot

Rule Path
Disallow

Other Records

Field Value
crawl-delay 10

facebookexternalhit

Rule Path
Disallow

Other Records

Field Value
crawl-delay 10

twitterbot

Rule Path
Disallow

Other Records

Field Value
crawl-delay 10

slurp

Rule Path
Disallow

Other Records

Field Value
crawl-delay 10

duckduckbot

Rule Path
Disallow

Other Records

Field Value
crawl-delay 10

yandex

Rule Path
Disallow

Other Records

Field Value
crawl-delay 10

whatsapp

Rule Path
Disallow

Other Records

Field Value
crawl-delay 10

telegrambot

Rule Path
Disallow

Other Records

Field Value
crawl-delay 10

instagram

Rule Path
Disallow

Other Records

Field Value
crawl-delay 10
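
The groups above follow a default-deny pattern: the first `*` group disallows everything, and each named crawler gets an empty `Disallow` (which permits everything) plus a `crawl-delay` of 10. The sketch below shows how one parser, Python's standard `urllib.robotparser`, reads that pattern. The directive lines in `ROBOTS_EXCERPT` are reconstructed from the listing, not copied from the file itself, and `examplebot` and `load_rules` are hypothetical names used only for illustration. Other crawlers may interpret the same file differently.

```python
from urllib.robotparser import RobotFileParser

# Assumption: directive lines rebuilt from the scan listing above;
# the exact layout of the real file may differ.
ROBOTS_EXCERPT = """\
User-agent: *
Disallow: /

User-agent: googlebot
Disallow:
Crawl-delay: 10

User-agent: bingbot
Disallow:
Crawl-delay: 10
"""


def load_rules(text):
    """Parse robots.txt text (no network access) and return the parser."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_EXCERPT)
HOME = "https://calabriadirettanews.com/"

# A named crawler matches its own group, whose empty Disallow allows all.
print(rules.can_fetch("googlebot", HOME))
# An unlisted (hypothetical) crawler falls back to the "*" group.
print(rules.can_fetch("examplebot", HOME))
# The crawl-delay recorded for a named group, in seconds.
print(rules.crawl_delay("bingbot"))
```

Parsing from a string keeps the example offline; `RobotFileParser.set_url` followed by `read` would fetch the live file instead.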

*

Rule Path
Allow /ads/preferences/
Allow /gpt/
Allow /pagead/show_ads.js
Allow /pagead/js/adsbygoogle.js
Allow /pagead/js/*/show_ads_impl.js
Allow /static/glade.js
Allow /static/glade/

*

Rule Path
Disallow /calendar/action~posterboard/
Disallow /calendar/action~agenda/
Disallow /calendar/action~oneday/
Disallow /calendar/action~month/
Disallow /calendar/action~week/
Disallow /calendar/action~stream/
Disallow /calendar/action~undefined/
Disallow /calendar/action~http%3A/
Disallow /calendar/action~default/
Disallow /calendar/action~poster/
Disallow /calendar/action~*/
Disallow /*controller%3Dai1ec_exporter_controller*
Disallow /*/action~*/

Other Records

Field Value
sitemap https://www.calabriadirettanews.com/sitemap_index.xml
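
Several of the calendar rules above use `*` wildcards. As a rough illustration of how such patterns are commonly matched (`*` matches any run of characters, a trailing `$` anchors the end of the path, and everything else is a literal prefix match), here is a minimal sketch. The helper names `robots_pattern_to_regex` and `is_blocked` and the sample paths are hypothetical, and the sketch ignores percent-encoding normalization and `Allow` precedence, which real crawlers also take into account.

```python
import re


def robots_pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' becomes '.*'; a trailing '$' anchors the end of the path;
    all other characters are matched literally from the start.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(path, disallow_patterns):
    """Return True if any Disallow pattern matches the start of the path."""
    return any(
        robots_pattern_to_regex(p).match(path) is not None
        for p in disallow_patterns
    )


# The three wildcard rules taken from the listing above.
CALENDAR_RULES = [
    "/calendar/action~*/",
    "/*controller%3Dai1ec_exporter_controller*",
    "/*/action~*/",
]

print(is_blocked("/calendar/action~month/", CALENDAR_RULES))
print(is_blocked("/events/action~week/page/2", CALENDAR_RULES))
print(is_blocked("/calendar/", CALENDAR_RULES))
```

Under these matching rules, the explicit entries such as `/calendar/action~month/` are already covered by `/calendar/action~*/`.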

Comments

  • Block all bots by default
  • Allow access for essential bots, with a delay between requests
  • Googlebot
  • Googlebot-Image
  • Mediapartners-Google (AdSense)
  • Bingbot
  • Facebot (Facebook)
  • Twitterbot
  • Yahoo Slurp
  • DuckDuckBot
  • YandexBot
  • WhatsApp
  • TelegramBot
  • Instagram
  • Allow access to specific Google Ads files and scripts
  • Calendar-specific directives
  • Sitemap

Warnings

  • `request-rate` is not a known field.