guys.nl
robots.txt

Robots Exclusion Standard data for guys.nl

Resource Scan

Scan Details

Site Domain guys.nl
Base Domain guys.nl
Scan Status Ok
Last Scan 2024-11-12T18:10:47+00:00
Next Scan 2024-11-19T18:10:47+00:00

Last Scan

Scanned 2024-11-12T18:10:47+00:00
URL https://guys.nl/robots.txt
Domain IPs 104.21.77.9, 172.67.203.19, 2606:4700:3030::ac43:cb13, 2606:4700:3031::6815:4d09
Response IP 172.67.203.19
Found Yes
Hash 86a978e9615a0bd59b51c0bbd3e02b0730e89e4bc3cd28ecce2073da9956aa81
SimHash 4929d8e2a423

Groups

*

Rule Path
Disallow /wp-admin/
Allow /wp-admin/admin-ajax.php
Disallow /wp-content/cache/
Disallow /*?*infinite_scroll=
Disallow /?s=
Disallow /page/*/?s=
Disallow /search/
Disallow *?attachment_id*
Disallow /*.pdf

gptbot
chatgpt-user
ccbot

Rule Path
Disallow /
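
The two groups above can be checked against sample paths with a small matcher. The following is a minimal sketch, not the scanner's or any crawler's actual implementation. It assumes the common longest-match convention (the longest matching pattern wins, and Allow wins a tie), treats `*` as "any run of characters" and a trailing `$` as an end anchor, and simplifies user-agent selection to an exact, case-insensitive name comparison. The names `STAR_GROUP`, `AI_GROUP`, `AI_AGENTS`, `pattern_to_regex`, and `is_allowed` are hypothetical and chosen for illustration only.

```python
import re

# Rules copied from the "*" group listed above, in file order.
STAR_GROUP = [
    ("disallow", "/wp-admin/"),
    ("allow", "/wp-admin/admin-ajax.php"),
    ("disallow", "/wp-content/cache/"),
    ("disallow", "/*?*infinite_scroll="),
    ("disallow", "/?s="),
    ("disallow", "/page/*/?s="),
    ("disallow", "/search/"),
    ("disallow", "*?attachment_id*"),
    ("disallow", "/*.pdf"),
]

# Rules for the gptbot / chatgpt-user / ccbot group listed above.
AI_GROUP = [("disallow", "/")]
AI_AGENTS = {"gptbot", "chatgpt-user", "ccbot"}


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: '*' matches any run of characters and a trailing '$'
    anchors the end of the path; everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(user_agent, path):
    """Return True if `path` may be fetched by `user_agent`.

    Simplification: the agent name is compared exactly (case-insensitive)
    against the named group; any other agent falls back to the "*" group.
    """
    rules = AI_GROUP if user_agent.lower() in AI_AGENTS else STAR_GROUP
    best_len, best_allow = -1, True  # no matching rule means allowed
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            # Longest pattern wins; on equal length, Allow wins.
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len, best_allow = len(pattern), allow
    return best_allow


if __name__ == "__main__":
    for agent, path in [
        ("Googlebot", "/wp-admin/"),
        ("Googlebot", "/wp-admin/admin-ajax.php"),
        ("Googlebot", "/files/report.pdf"),
        ("GPTBot", "/about/"),
    ]:
        print(agent, path, is_allowed(agent, path))
```

Under these assumptions, the Allow rule for `/wp-admin/admin-ajax.php` overrides the shorter `/wp-admin/` Disallow, while the three named agents are blocked from every path. Parsers that apply rules in file order instead of by longest match may evaluate the admin-ajax.php exception differently.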

Other Records

Field Value
sitemap https://guys.nl/sitemap_index.xml

Comments

  • Block WP endpoints
  • ---------------------
  • Block params
  • ---------------------
  • Block internal search
  • ---------------------
  • Block others
  • ---------------------
  • Block AI/Scrapers
  • ---------------------