carrefour.gazetki-promocyjne.net.pl
robots.txt

Robots Exclusion Standard data for carrefour.gazetki-promocyjne.net.pl

Resource Scan

Scan Details

Site Domain carrefour.gazetki-promocyjne.net.pl
Base Domain gazetki-promocyjne.net.pl
Scan Status Ok
Last Scan 2024-11-11T21:00:33+00:00
Next Scan 2024-11-18T21:00:33+00:00

Last Scan

Scanned 2024-11-11T21:00:33+00:00
URL https://carrefour.gazetki-promocyjne.net.pl/robots.txt
Domain IPs 149.202.70.178
Response IP 149.202.70.178
Found Yes
Hash 25766cd76749da77364c4f64b3a63dbd033e3ada86be4d665b7b55c6944ff4c1
SimHash 70dcc102e773

Groups

mj12bot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

semrushbot-sa

Rule Path
Disallow /

rogerbot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

alexibot

Rule Path
Disallow /

surveybot

Rule Path
Disallow /

xenu's

Rule Path
Disallow /

xenu's link sleuth 1.1c

Rule Path
Disallow /

googlebot

Rule Path
Allow /

httrack

Rule Path
Disallow /

moget

Rule Path
Disallow /

ichiro

Rule Path
Disallow /

naverbot

Rule Path
Disallow /

yeti

Rule Path
Disallow /

sogou spider

Rule Path
Disallow /

youdaobot

Rule Path
Disallow /

cliqzbot

Rule Path
Disallow /

bombora

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

archive.org_bot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

bomborabot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

smtbot

Rule Path
Disallow /

Other Records

Field Value
sitemap https://carrefour.gazetki-promocyjne.net.pl/sitemap.xml

Comments

  • User-agent: AhrefsBot
  • Disallow: /
  • User-agent: Bingbot
  • Disallow: /
  • user-agent: AhrefsBot
  • Disallow: /
  • User-agent: Yandex
  • Disallow: /
  • User-agent: *
  • Disallow: /
  • User-agent: DuckDuckGo
  • Disallow: /
  • User-agent: GrapeshotCrawler
  • Disallow: /
  • User-agent: Getintent Crawler
  • Disallow: /
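As a rough illustration of how a crawler might interpret the groups and commented-out lines listed above, the sketch below feeds a small, hypothetical reconstruction of the file to Python's standard `urllib.robotparser`. The `ROBOTS_TXT` text is an assumption: it contains only a few of the reported groups, and the original capitalization and ordering are not known from this report. The helper name `is_allowed` and the agent `ExampleCrawler` are also made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical, partial reconstruction of the scanned file (assumption:
# the real file's casing, order, and remaining groups may differ).
ROBOTS_TXT = """\
User-agent: MJ12bot
Disallow: /

User-agent: SemrushBot
Disallow: /

User-agent: Googlebot
Allow: /

# User-agent: *
# Disallow: /

Sitemap: https://carrefour.gazetki-promocyjne.net.pl/sitemap.xml
"""

SITE = "https://carrefour.gazetki-promocyjne.net.pl/"


def is_allowed(user_agent: str, url: str = SITE) -> bool:
    """Return True if the sample rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines; lines starting with '#' are ignored.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "SemrushBot", "MJ12bot", "ExampleCrawler"):
        print(agent, is_allowed(agent))
```

Because the `User-agent: *` group appears only as a comment, this parser finds no matching group for an unlisted agent such as `ExampleCrawler` and allows it by default, while the explicitly listed agents are blocked. Other parsers may treat edge cases differently.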

Warnings

  • 2 invalid lines.