moorsmagazine.com
robots.txt

Robots Exclusion Standard data for moorsmagazine.com

Resource Scan

Scan Details

Site Domain moorsmagazine.com
Base Domain moorsmagazine.com
Scan Status Ok
Last Scan 2024-08-31T21:21:44+00:00
Next Scan 2024-09-30T21:21:44+00:00

Last Scan

Scanned 2024-08-31T21:21:44+00:00
URL https://moorsmagazine.com/robots.txt
Domain IPs 104.21.23.188, 172.67.212.192, 2606:4700:3031::ac43:d4c0, 2606:4700:3032::6815:17bc
Response IP 104.21.23.188
Found Yes
Hash aaf4cb6604b922dba0c1ddde1ed48f1bea5281bf4b2cf4e1518fcb64fa9a175f
SimHash aeef4e58f4a2

Groups

*
adsbot-google

Rule Path
Disallow /wp-json/
Disallow /?rest_route=
Disallow /wp-admin/
Disallow /wp-content/cache/
Disallow /wp-content/plugins/
Disallow /wp-login.php
Disallow /xmlrpc.php

*
adsbot-google

Rule Path
Disallow /wp-includes/
Allow /wp-includes/css/
Allow /wp-includes/js/
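
The group above mixes a broad Disallow with two narrower Allow rules. Under the longest-match precedence described in RFC 9309, the most specific matching rule wins, so the CSS and JS subdirectories stay crawlable while the rest of /wp-includes/ is blocked. A minimal Python sketch of that resolution follows; the function name `is_allowed` and the sample paths are illustrative assumptions, and wildcard handling is deliberately left out.

```python
def is_allowed(path, rules):
    """Return True if `path` may be crawled under `rules`.

    `rules` is a list of (directive, prefix) pairs. The longest matching
    prefix wins; on a tie, Allow wins; if nothing matches, the path is
    allowed. Plain prefixes only, no `*` or `$` support in this sketch.
    """
    best_len = -1
    allowed = True
    for directive, prefix in rules:
        if path.startswith(prefix):
            is_allow = directive.lower() == "allow"
            if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
                best_len = len(prefix)
                allowed = is_allow
    return allowed


# Rules copied from the /wp-includes/ group in this report.
WP_INCLUDES_RULES = [
    ("Disallow", "/wp-includes/"),
    ("Allow", "/wp-includes/css/"),
    ("Allow", "/wp-includes/js/"),
]

# Hypothetical paths, chosen only to exercise each rule.
for sample in ("/wp-includes/js/jquery/jquery.min.js",
               "/wp-includes/version.php",
               "/about/"):
    print(sample, is_allowed(sample, WP_INCLUDES_RULES))
```

Note that Python's standard `urllib.robotparser` applies rules in file order rather than by longest match, so it would not necessarily reproduce this result for the same group.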

*
adsbot-google

Rule Path
Disallow /?s=
Disallow /page/*/?s=
Disallow /search/

adsbot-google

Rule Path
Disallow /*.woff2

*
adsbot-google

Rule Path
Disallow /cdn-cgi/bm/cv/
Disallow /cdn-cgi/challenge-platform/
Disallow /cdn-cgi/images/trace/
Disallow /cdn-cgi/rum
Disallow /cdn-cgi/scripts/
Disallow /cdn-cgi/styles/
Disallow /cdn-cgi/zaraz/

*
adsbot-google

Rule Path
Disallow /api/

nuclei
wikido
riddler
petalbot
zoominfobot
go-http-client
node/simplecrawler
cazoodlebot
dotbot/1.0
gigabot
barkrowler
blexbot
magpie-crawler
mj12bot
ahrefsbot

Rule Path
Disallow /
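
The group above applies a blanket `Disallow /` to every listed crawler. User-agent matching in robots.txt is case-insensitive, so a crawler identifying as `AhrefsBot` falls into this group even though the report lists `ahrefsbot`. The sketch below shows that lookup in Python under a simplifying assumption: the crawler's token is compared whole against the listed entries. Real parsers differ in how they treat entries that carry a version or a slash (such as `dotbot/1.0` or `node/simplecrawler`), so the helper `is_banned` is illustrative only.

```python
# User-agent entries copied from the ban group in this report.
BANNED_TOKENS = {
    "nuclei", "wikido", "riddler", "petalbot", "zoominfobot",
    "go-http-client", "node/simplecrawler", "cazoodlebot", "dotbot/1.0",
    "gigabot", "barkrowler", "blexbot", "magpie-crawler", "mj12bot",
    "ahrefsbot",
}


def is_banned(user_agent_token):
    """Return True if the token matches a banned entry, ignoring case.

    Assumption: exact comparison of the whole token. Substring and
    version handling vary between robots.txt parsers.
    """
    return user_agent_token.strip().lower() in BANNED_TOKENS


for token in ("AhrefsBot", "MJ12bot", "Googlebot"):
    group = "ban group (Disallow /)" if is_banned(token) else "default groups"
    print(token, "->", group)
```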

*

Rule Path
Disallow /*/mp3s.xml$
Disallow /*.swf$
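
Several rules in this report use the two robots.txt special characters: `*` matches any run of characters, and a trailing `$` anchors the pattern to the end of the URL. A rough Python sketch of how such patterns can be translated into regular expressions follows. The helper names `pattern_to_regex` and `matches` and all sample paths are assumptions for illustration, not part of the scanned file.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    `*` becomes `.*`; a trailing `$` anchors the end of the path.
    Without `$`, the pattern only has to match a prefix of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces, then rejoin them with the wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + (r"\Z" if anchored else ""))


def matches(pattern, path):
    """Return True if `path` is covered by the robots.txt `pattern`."""
    return pattern_to_regex(pattern).match(path) is not None


# Patterns from the report, checked against hypothetical paths.
print(matches("/*/mp3s.xml$", "/audio/mp3s.xml"))      # True
print(matches("/*/mp3s.xml$", "/audio/mp3s.xml?x=1"))  # False
print(matches("/*.swf$", "/flash/intro.swf"))          # True
print(matches("/*.woff2", "/fonts/a.woff2?v=2"))       # True
print(matches("/page/*/?s=", "/page/2/?s=jazz"))       # True
```

The fourth call shows the effect of leaving out `$`: the `/*.woff2` rule also covers URLs that carry a query string after the extension.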

Other Records

Field Value
sitemap https://www.moorsmagazine.com/sitemap_index.xml

Comments

  • Block some general WP endpoints
  • Special handling for /wp-includes/
  • Block internal search
  • Adsbot doesn't ever need to crawl fonts
  • Block leaky Cloudflare endpoints
  • Block crawling of our API endpoints
  • Ban noisy bots
  • Sitemap