manly.nl
robots.txt

Robots Exclusion Standard data for manly.nl

Resource Scan

Scan Details

Site Domain manly.nl
Base Domain manly.nl
Scan Status Ok
Last Scan 2024-11-16T04:11:27+00:00
Next Scan 2024-11-23T04:11:27+00:00

Last Scan

Scanned 2024-11-16T04:11:27+00:00
URL https://manly.nl/robots.txt
Domain IPs 104.21.5.36, 172.67.132.223, 2606:4700:3034::ac43:84df, 2606:4700:3035::6815:524
Response IP 172.67.132.223
Found Yes
Hash 1e30e55e86f9125639f006a9a6b341663748a7cfd638c8c0cb77c8a21c6165aa
SimHash 6169d8f2a532

Groups

*

Rule Path
Disallow /wp-admin/
Allow /wp-admin/admin-ajax.php
Disallow /wp-content/cache/
Disallow /*?*infinite_scroll=
Disallow /*?*p=
Disallow /?s=
Disallow /page/*/?s=
Disallow /search/
Disallow *?attachment_id*
Disallow /*.pdf
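
The rules in the `*` group mix plain prefixes with `*` wildcards, and the Allow line for admin-ajax.php only has an effect if the longer match wins over the shorter Disallow /wp-admin/. A minimal sketch of that evaluation, assuming RFC 9309-style matching (`*` matches any run of characters, a trailing `$` anchors the end, the longest matching pattern wins, and Allow wins a tie). The names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical and not part of the scan data; a custom matcher is used because Python's standard `urllib.robotparser` does plain prefix matching.

```python
import re

# Rules of the "*" group, copied from the listing above.
RULES = [
    ("disallow", "/wp-admin/"),
    ("allow", "/wp-admin/admin-ajax.php"),
    ("disallow", "/wp-content/cache/"),
    ("disallow", "/*?*infinite_scroll="),
    ("disallow", "/*?*p="),
    ("disallow", "/?s="),
    ("disallow", "/page/*/?s="),
    ("disallow", "/search/"),
    ("disallow", "*?attachment_id*"),
    ("disallow", "/*.pdf"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: '*' matches any sequence of characters and a trailing
    '$' anchors the end of the path; everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` (path plus query string) may be crawled.

    The longest matching pattern decides; on equal length, Allow wins.
    A path that matches no rule is allowed.
    """
    best_len = -1
    allowed = True
    for kind, pattern in rules:
        # re.match anchors at the start of the path, like a prefix rule.
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len = length
                allowed = kind == "allow"
    return allowed


print(is_allowed("/wp-admin/"))                # False
print(is_allowed("/wp-admin/admin-ajax.php"))  # True
print(is_allowed("/files/guide.pdf"))          # False
print(is_allowed("/blog/post/"))               # True
```

Under these assumptions, `/*?*p=` matches any query string containing `p=`, so a URL such as `/shop/?step=2` would also be blocked, not only `?p=` permalinks.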

gptbot
chatgpt-user
ccbot

Rule Path
Disallow /
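
This group contains no wildcards, so it can be checked with Python's standard `urllib.robotparser`. The sketch below rebuilds only this group as robots.txt text (the `AI_GROUP` string and the `build_parser` helper are illustrative assumptions, not the site's actual file) and asks whether a given crawler may fetch a URL.

```python
from urllib.robotparser import RobotFileParser

# Reconstruction of the group above; the real file's exact text
# and capitalization are not shown in the scan, so this is assumed.
AI_GROUP = """\
User-agent: gptbot
User-agent: chatgpt-user
User-agent: ccbot
Disallow: /
"""


def build_parser(text):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(AI_GROUP)

# User-agent comparison is case-insensitive and ignores a "/version" suffix.
print(parser.can_fetch("GPTBot", "https://manly.nl/"))          # False
print(parser.can_fetch("CCBot/2.0", "https://manly.nl/blog/"))  # False
print(parser.can_fetch("Googlebot", "https://manly.nl/blog/"))  # True
```

Because only this one group is parsed here, any other crawler falls through to "allowed"; in the full file it would be evaluated against the `*` group instead.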

Other Records

Field Value
sitemap https://manly.nl/sitemap_index.xml
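
The sitemap record can be read back out of robots.txt text with the standard library. A small sketch, assuming Python 3.8 or later (where `RobotFileParser.site_maps()` is available); `list_sitemaps` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser


def list_sitemaps(robots_text):
    """Return the Sitemap URLs declared in robots.txt text (empty list if none)."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


print(list_sitemaps("Sitemap: https://manly.nl/sitemap_index.xml\n"))
```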

Comments

  • Block WP endpoints
  • Block params
  • Block internal search
  • Block others
  • Block AI/Scrapers