printpal.io
robots.txt

Robots Exclusion Standard data for printpal.io

Resource Scan

Scan Details

Site Domain printpal.io
Base Domain printpal.io
Scan Status Ok
Last Scan 2025-12-16T09:42:15+00:00
Next Scan 2026-01-15T09:42:15+00:00
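
The two timestamps above imply a fixed rescan interval. A minimal sketch for checking that arithmetic, assuming both values are ISO 8601 strings exactly as shown in the report (the variable names are illustrative only):

```python
from datetime import datetime

# Timestamps copied from the "Last Scan" and "Next Scan" rows.
last_scan = datetime.fromisoformat("2025-12-16T09:42:15+00:00")
next_scan = datetime.fromisoformat("2026-01-15T09:42:15+00:00")

# Difference between the two scans as a timedelta.
interval = next_scan - last_scan
print(interval.days)  # 30
```

The result suggests a 30-day rescan cycle, although the report itself does not state the scheduling policy.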

Last Scan

Scanned 2025-12-16T09:42:15+00:00
URL https://printpal.io/robots.txt
Domain IPs 104.26.14.244, 104.26.15.244, 172.67.74.99, 2606:4700:20::681a:ef4, 2606:4700:20::681a:ff4, 2606:4700:20::ac43:4a63
Response IP 104.26.15.244
Found Yes
Hash 0cd4915465aab9d28a38e601ee7408cf7b21076a318bea1c107d51a676709e34
SimHash 44188aa1e683
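
The Hash value is 64 hexadecimal characters, which is consistent with a SHA-256 digest, though the report does not name the algorithm. A minimal sketch for comparing a freshly fetched robots.txt body against the reported value, assuming SHA-256 over the raw response bytes (the helper names `sha256_hex` and `matches_report` are hypothetical):

```python
import hashlib

# Value copied from the "Hash" row of the scan report.
REPORTED_HASH = "0cd4915465aab9d28a38e601ee7408cf7b21076a318bea1c107d51a676709e34"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the raw bytes as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_report(body: bytes, reported: str = REPORTED_HASH) -> bool:
    """True if the body hashes to the reported value (assumes SHA-256)."""
    return sha256_hex(body) == reported.lower()
```

If the digest of a new fetch differs, either the file has changed since the scan or the scanner hashes something other than the raw bytes (for example, a normalized form).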

Groups

*

Rule Path
Disallow /

Other Records

Field Value
crawl-delay 120

ahrefsbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

prowl

Rule Path
Disallow /

nimbostratus-bot

Rule Path
Disallow /

spinn3r

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

scrapy

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

googlebot

Rule Path
Disallow
Allow /sitemap.xml
Allow /sitemap-part*.xml

Other Records

Field Value
crawl-delay 60

bingbot

Rule Path
Disallow
Allow /sitemap.xml
Allow /sitemap-part*.xml

Other Records

Field Value
crawl-delay 60
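
The `Allow /sitemap-part*.xml` rules in the googlebot and bingbot groups use a `*` wildcard. A minimal sketch of how such a pattern can be matched against a URL path, following the common convention that `*` matches any run of characters, a trailing `$` anchors the end, and patterns otherwise match as prefixes. The function names and the sample paths are illustrative assumptions, not taken from the report:

```python
import re


def robots_pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and rejoin them with ".*" for each "*".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def path_matches(pattern: str, path: str) -> bool:
    """True if the path matches the pattern, starting at the first character."""
    return robots_pattern_to_regex(pattern).match(path) is not None


print(path_matches("/sitemap-part*.xml", "/sitemap-part1.xml"))  # True
print(path_matches("/sitemap-part*.xml", "/sitemap.xml"))        # False
```

Because the Disallow value in both groups is empty, which imposes no restriction, these Allow rules appear to be redundant: every path is already permitted for those two crawlers.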

Other Records

Field Value
sitemap https://platform.printpal.io/sitemap.xml

Comments

  • Specific bots to block completely
  • Allow major search engines but with strict rate limits
  • Sitemap location
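
The groups and records listed above can be checked with Python's standard-library parser. This is a sketch under stated assumptions: the robots.txt text is reconstructed from this report rather than copied from the live file, the `request-rate` lines mentioned under Warnings are omitted because their values are not shown, and `examplebot` is a made-up user agent standing in for any crawler without its own group. `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Group names as listed in the report.
BLOCKED = ["ahrefsbot", "semrushbot", "mj12bot", "dotbot", "prowl",
           "nimbostratus-bot", "spinn3r", "blexbot", "scrapy", "petalbot"]
SEARCH_ENGINES = ["googlebot", "bingbot"]


def build_robots_txt() -> str:
    """Rebuild an approximation of the file from the report's groups."""
    lines = ["User-agent: *", "Disallow: /", "Crawl-delay: 120", ""]
    for bot in BLOCKED:
        lines += [f"User-agent: {bot}", "Disallow: /", ""]
    for bot in SEARCH_ENGINES:
        lines += [f"User-agent: {bot}", "Disallow:",
                  "Allow: /sitemap.xml", "Allow: /sitemap-part*.xml",
                  "Crawl-delay: 60", ""]
    lines.append("Sitemap: https://platform.printpal.io/sitemap.xml")
    return "\n".join(lines)


parser = RobotFileParser()
parser.parse(build_robots_txt().splitlines())

for agent in ["googlebot", "ahrefsbot", "examplebot"]:
    allowed = parser.can_fetch(agent, "https://printpal.io/")
    print(agent, allowed, parser.crawl_delay(agent))

print(parser.site_maps())
```

Under this reconstruction only googlebot and bingbot may fetch pages, each with a 60-second crawl delay, while every other agent falls through to either its own blocking group or the `*` group. Note that the standard-library parser matches paths by prefix and does not expand `*` wildcards, which has no effect here because the empty Disallow already permits everything for the two search engines.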

Warnings

  • `request-rate` is not a known field.