prosettings.net
robots.txt

Robots Exclusion Standard data for prosettings.net

Resource Scan

Scan Details

Site Domain prosettings.net
Base Domain prosettings.net
Scan Status Ok
Last Scan 2024-09-26T08:14:07+00:00
Next Scan 2024-10-03T08:14:07+00:00
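
The two timestamps above are exactly seven days apart, which suggests (but does not prove) a weekly rescan schedule. A minimal sketch of that arithmetic, using only the Python standard library:

```python
from datetime import datetime

# Timestamps copied from the Scan Details fields above.
last_scan = datetime.fromisoformat("2024-09-26T08:14:07+00:00")
next_scan = datetime.fromisoformat("2024-10-03T08:14:07+00:00")

# Subtracting two timezone-aware datetimes yields a timedelta.
interval = next_scan - last_scan
print(interval.days)  # 7
```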

Last Scan

Scanned 2024-09-26T08:14:07+00:00
URL https://prosettings.net/robots.txt
Domain IPs 104.26.6.55, 104.26.7.55, 172.67.74.216, 2606:4700:20::681a:637, 2606:4700:20::681a:737, 2606:4700:20::ac43:4ad8
Response IP 104.26.7.55
Found Yes
Hash d6b5479d7381fcebfccb4c64a61d95f6de7d242c1189cb9181a44e0595723e40
SimHash 6eebce58f4a0
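
The Hash value is 64 hexadecimal characters long, which is consistent with a SHA-256 digest of the fetched file. The report does not name the algorithm, so that is an assumption. Under that assumption, a digest of the same shape could be computed as in the sketch below. The `sha256_hex` helper and the sample bytes are hypothetical, and the sample is not the real file, so its digest will not match the value above.

```python
import hashlib

def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the raw response body as lowercase hex."""
    return hashlib.sha256(body).hexdigest()

# Hypothetical stand-in for the fetched robots.txt bytes.
sample = b"User-agent: *\nDisallow: /wp-admin/\n"
digest = sha256_hex(sample)
print(len(digest))  # 64
```

Comparing the digest between scans is a cheap way to detect that the file changed at all. The much shorter SimHash field is presumably a similarity fingerprint, which this sketch does not attempt to reproduce.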

Groups

*
adsbot-google

Rule Path
Disallow /wp-json/
Disallow /?rest_route=
Disallow /wp-admin/
Disallow /wp-content/cache/
Disallow /wp-content/plugins/
Disallow /wp-login.php
Disallow /xmlrpc.php

*
adsbot-google

Rule Path
Disallow /wp-includes/
Allow /wp-includes/css/
Allow /wp-includes/js/
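
The group above only works as intended because of rule precedence: under RFC 9309 the most specific (longest) matching path wins, and Allow wins a tie. The sketch below illustrates that precedence for plain prefix rules only. The `is_allowed` helper is hypothetical, ignores wildcards, and is not a full robots.txt parser; the example paths are made up.

```python
def is_allowed(path, rules):
    """Apply longest-match precedence to (kind, prefix) rules.

    kind is "allow" or "disallow". A path that matches no rule is allowed.
    """
    best_len, allowed = -1, True
    for kind, prefix in rules:
        if path.startswith(prefix):
            n = len(prefix)
            # A longer prefix is more specific; on a tie, Allow wins.
            if n > best_len or (n == best_len and kind == "allow"):
                best_len, allowed = n, (kind == "allow")
    return allowed

# Rules copied from the /wp-includes/ group above.
rules = [
    ("disallow", "/wp-includes/"),
    ("allow", "/wp-includes/css/"),
    ("allow", "/wp-includes/js/"),
]

print(is_allowed("/wp-includes/js/jquery.js", rules))  # True
print(is_allowed("/wp-includes/version.php", rules))   # False
```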

*
adsbot-google

Rule Path
Disallow /?s=
Disallow /page/*/?s=
Disallow /search/

adsbot-google

Rule Path
Disallow /*.woff2

*
adsbot-google

Rule Path
Disallow /wp-content/uploads/*.zip
Disallow /configs/*.zip
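
The `*` in these paths is a wildcard that matches any run of characters, and a pattern without a trailing `$` still matches as a prefix. One way to test paths against such patterns is to translate them to regular expressions, as sketched below. The `pattern_to_regex` and `matches` helpers and the example paths are hypothetical; real crawlers may differ in details such as percent-encoding.

```python
import re

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any sequence of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' where '*' stood.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def matches(pattern, path):
    # re.match anchors at the start of the path, giving prefix semantics.
    return pattern_to_regex(pattern).match(path) is not None

print(matches("/configs/*.zip", "/configs/player/settings.zip"))  # True
print(matches("/configs/*.zip", "/configs/readme.txt"))           # False
```

The same helper covers the `/*.woff2` rule in the adsbot-google-only group and the `/page/*/?s=` rule in the search group.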

*
adsbot-google

Rule Path
Disallow /cdn-cgi/bm/cv/
Disallow /cdn-cgi/challenge-platform/
Disallow /cdn-cgi/images/trace/
Disallow /cdn-cgi/rum
Disallow /cdn-cgi/scripts/
Disallow /cdn-cgi/styles/
Disallow /cdn-cgi/zaraz/

nuclei
wikido
riddler
petalbot
zoominfobot
go-http-client
node/simplecrawler
cazoodlebot
dotbot/1.0
gigabot
barkrowler
blexbot
magpie-crawler
mj12bot
siteauditbot

Rule Path
Disallow /
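
A crawler decides which group applies by comparing its own user-agent against the group names, case-insensitively. The sketch below approximates that with a simple substring test, which is an assumption: standards-following crawlers compare product tokens instead, and a name that carries a version such as `dotbot/1.0` would not match a crawler announcing a different version. The `is_blocked` helper and the sample user-agent strings are hypothetical.

```python
# Group names copied from the "Disallow /" group above.
BLOCKED = [
    "nuclei", "wikido", "riddler", "petalbot", "zoominfobot",
    "go-http-client", "node/simplecrawler", "cazoodlebot", "dotbot/1.0",
    "gigabot", "barkrowler", "blexbot", "magpie-crawler", "mj12bot",
    "siteauditbot",
]

def is_blocked(user_agent: str) -> bool:
    """Return True if any blocked name occurs in the user-agent string."""
    ua = user_agent.lower()
    return any(name in ua for name in BLOCKED)

print(is_blocked("Mozilla/5.0 (compatible; MJ12bot/v1.4.8)"))  # True
print(is_blocked("Mozilla/5.0 (compatible; Googlebot/2.1)"))   # False
```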

Other Records

Field Value
sitemap https://prosettings.net/sitemap_index.xml
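
The sitemap record is independent of any user-agent group. Python's standard `urllib.robotparser` exposes it through `site_maps()` (available since Python 3.8). The sketch below parses a short, hand-written excerpt rather than the real file, so the input lines are an abbreviated stand-in.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated stand-in for the real robots.txt contents.
lines = [
    "User-agent: *",
    "Disallow: /wp-admin/",
    "Sitemap: https://prosettings.net/sitemap_index.xml",
]

parser = RobotFileParser()
parser.parse(lines)  # parse() accepts an iterable of lines

sitemaps = parser.site_maps()
print(sitemaps)  # ['https://prosettings.net/sitemap_index.xml']
```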

Comments

  • Block some general WP endpoints
  • -----------------
  • Special handling for /wp-includes/
  • ----------------------------------
  • Block internal search
  • ---------------------
  • Adsbot doesn't ever need to crawl fonts
  • ---------------------------------------
  • Nobody needs to crawl our settings zip files.
  • --------------------------------
  • Block leaky Cloudflare endpoints
  • --------------------------------
  • Disallow noisy bots
  • -----------------
  • Sitemap
  • -----------------