thepizza.co
robots.txt

Robots Exclusion Standard data for thepizza.co

Resource Scan

Scan Details

Site Domain thepizza.co
Base Domain thepizza.co
Scan Status Ok
Last Scan 2024-09-27T23:37:23+00:00
Next Scan 2024-10-04T23:37:23+00:00

Last Scan

Scanned 2024-09-27T23:37:23+00:00
URL https://thepizza.co/robots.txt
Domain IPs 104.21.23.70, 172.67.209.125, 2606:4700:3030::6815:1746, 2606:4700:3035::ac43:d17d
Response IP 172.67.209.125
Found Yes
Hash 8907b99028823f721aebbaa3313182bcc5a3bad897302fe26cbe583fed686552
SimHash ae6fcc00bcb8
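
One way to check whether a freshly downloaded robots.txt still matches the scanned copy is to compare digests. The sketch below assumes the Hash field is a SHA-256 hex digest of the raw response body. The 64-character length suggests this, but the report does not state the algorithm. The function names are hypothetical, and the SimHash field is not covered.

```python
import hashlib

# Hash value copied from the scan details above.
REPORTED_HASH = "8907b99028823f721aebbaa3313182bcc5a3bad897302fe26cbe583fed686552"


def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a response body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


def matches_report(body: bytes, reported: str = REPORTED_HASH) -> bool:
    """True if the body's digest equals the reported hash, ignoring hex case."""
    return body_digest(body) == reported.lower()
```

If the digests differ, the file may have changed since the scan, or the report may use a different algorithm or normalize the body before hashing.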

Groups

*

Rule Path
Disallow /wp-json/
Disallow /cdn-cgi/bm/cv/
Disallow /cdn-cgi/challenge-platform/

nuclei
wikido
riddler
petalbot
zoominfobot
go-http-client
node/simplecrawler
cazoodlebot
dotbot/1.0
gigabot
barkrowler
blexbot
magpie-crawler

Rule Path
Disallow /
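
The two groups above can be exercised with Python's standard urllib.robotparser. This is a minimal sketch: the robots.txt text is a reconstruction from the listed rules, not the original file, so its casing, ordering, and comments are assumptions. The helper names build_parser and is_allowed and the /menu/ path are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the groups in this report; the real file may differ.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-json/
Disallow: /cdn-cgi/bm/cv/
Disallow: /cdn-cgi/challenge-platform/

User-agent: nuclei
User-agent: wikido
User-agent: riddler
User-agent: petalbot
User-agent: zoominfobot
User-agent: go-http-client
User-agent: node/simplecrawler
User-agent: cazoodlebot
User-agent: dotbot/1.0
User-agent: gigabot
User-agent: barkrowler
User-agent: blexbot
User-agent: magpie-crawler
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, agent: str, path: str) -> bool:
    """Ask whether the given user agent may fetch a path on thepizza.co."""
    return parser.can_fetch(agent, "https://thepizza.co" + path)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    for agent, path in [
        ("Googlebot", "/menu/"),
        ("Googlebot", "/wp-json/wp/v2/posts"),
        ("PetalBot", "/menu/"),
    ]:
        print(agent, path, is_allowed(parser, agent, path))
```

A crawler not named in the second group falls under the * group and is blocked only from the three listed path prefixes, while each named crawler is blocked from the whole site.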

Other Records

Field Value
sitemap https://thepizza.co/sitemap_index.xml

Comments

  • START YOAST BLOCK
  • Global rules
  • -----------------
  • Prevent crawling CF challenge URLs
  • ---------------------------
  • Ban bots that don't benefit us.
  • --------------------------------
  • END YOAST BLOCK