pipilprint.com
robots.txt

Robots Exclusion Standard data for pipilprint.com

Resource Scan

Scan Details

Site Domain pipilprint.com
Base Domain pipilprint.com
Scan Status Ok
Last Scan 2026-01-28T04:32:35+00:00
Next Scan 2026-02-27T04:32:35+00:00

Last Scan

Scanned 2026-01-28T04:32:35+00:00
URL https://pipilprint.com/robots.txt
Domain IPs 50.31.176.166
Response IP 50.31.176.166
Found Yes
Hash 1ffcb5e76dd234511f4b4620845e22f3e4cd9e296def95d330779c936ca92fe4
SimHash 7e2ff930ecf3

Groups

*

Rule Path
Disallow /wp-admin/
Allow /wp-admin/admin-ajax.php
Disallow /*?q=*
Disallow /*%26q%3D*
Disallow /?p=*
Disallow /?s=*
Disallow /?r=*
Disallow /search?q=*
Disallow /.well-known/
Disallow /blog/tag/
Disallow /blog/author/
Disallow /external-scripts
Disallow /external-header
Disallow /external-footer
Disallow /api/
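
As a rough illustration of how a crawler might evaluate the rules in the * group, the sketch below applies longest-match precedence with * wildcards, in the style of RFC 9309. The names RULES, pattern_to_regex, and is_allowed are hypothetical, and percent-encoding normalization is left out, so real crawlers may reach different verdicts on encoded URLs.

```python
import re

# Rules copied from the "*" group above, as (kind, path pattern) pairs.
RULES = [
    ("disallow", "/wp-admin/"),
    ("allow", "/wp-admin/admin-ajax.php"),
    ("disallow", "/*?q=*"),
    ("disallow", "/*%26q%3D*"),
    ("disallow", "/?p=*"),
    ("disallow", "/?s=*"),
    ("disallow", "/?r=*"),
    ("disallow", "/search?q=*"),
    ("disallow", "/.well-known/"),
    ("disallow", "/blog/tag/"),
    ("disallow", "/blog/author/"),
    ("disallow", "/external-scripts"),
    ("disallow", "/external-header"),
    ("disallow", "/external-footer"),
    ("disallow", "/api/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    "*" matches any run of characters; a trailing "$" anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` (path plus query string) may be crawled.

    The longest matching pattern wins; on a tie, "allow" wins.
    A path that matches no rule is allowed.
    """
    best_len, best_allow = -1, True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len, best_allow = len(pattern), allow
    return best_allow


for candidate in ("/wp-admin/admin-ajax.php", "/wp-admin/options.php",
                  "/?s=shoes", "/blog/post"):
    print(candidate, is_allowed(candidate))
```

Under this reading, the Allow line for /wp-admin/admin-ajax.php overrides the shorter Disallow for /wp-admin/ because it is the more specific match.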

cazoodlebot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

dotbot/1.0

Rule Path
Disallow /

gigabot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

gptbot

Rule Path
Disallow /
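
The per-bot groups above can be checked with Python's standard urllib.robotparser module. The following is a minimal sketch: ROBOTS_TXT is a partial reconstruction from this report (the original file's exact layout is not shown here), and ExampleBot is a made-up user agent used only to show the fallback to the * group.

```python
from urllib.robotparser import RobotFileParser

# Partial reconstruction of the file from the groups listed in this report.
# The real file's formatting and casing are assumptions.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/

User-agent: mj12bot
Disallow: /

User-agent: chatgpt-user
Disallow: /

User-agent: gptbot
Disallow: /
"""


def build_parser(text=ROBOTS_TXT):
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser()
for agent in ("GPTBot", "ChatGPT-User", "MJ12bot", "ExampleBot"):
    # Agents with their own "Disallow: /" group are refused everywhere;
    # any other agent falls back to the "*" group.
    print(agent, parser.can_fetch(agent, "https://pipilprint.com/"))
```

Note that urllib.robotparser uses simple prefix matching and does not interpret * wildcards inside paths, so it is suitable here only for the blanket Disallow / groups.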

Other Records

Field Value
sitemap https://pipilprint.com/sitemap_index.xml
sitemap https://pipilprint.com/post-sitemap.xml
sitemap https://pipilprint.com/page-sitemap.xml
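
For reference, sitemap records like these can be read back with urllib.robotparser (the site_maps() method requires Python 3.8 or later). This is only a sketch: SITEMAP_LINES assumes the usual "Sitemap: <url>" line form, which this report does not show verbatim.

```python
from urllib.robotparser import RobotFileParser

# Sitemap records from the table above, written in the assumed
# "Sitemap: <url>" form.
SITEMAP_LINES = [
    "Sitemap: https://pipilprint.com/sitemap_index.xml",
    "Sitemap: https://pipilprint.com/post-sitemap.xml",
    "Sitemap: https://pipilprint.com/page-sitemap.xml",
]

sitemap_parser = RobotFileParser()
sitemap_parser.parse(SITEMAP_LINES)

# site_maps() returns None when no sitemap records were found.
sitemaps = sitemap_parser.site_maps() or []
for url in sitemaps:
    print(url)
```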

Comments

  • Sitemap files
  • Block access to specific groups of pages
  • Block CazoodleBot as it does not present correct accept content headers
  • Block MJ12bot as it is just noise
  • Block dotbot as it cannot parse base urls properly
  • Block Gigabot
  • Block GPT