practicalbusinessideas.com
robots.txt

Robots Exclusion Standard data for practicalbusinessideas.com

Resource Scan

Scan Details

Site Domain practicalbusinessideas.com
Base Domain practicalbusinessideas.com
Scan Status Ok
Last Scan 2024-09-26T22:08:46+00:00
Next Scan 2024-10-03T22:08:46+00:00

Last Scan

Scanned 2024-09-26T22:08:46+00:00
URL https://practicalbusinessideas.com/robots.txt
Domain IPs 140.82.4.225
Response IP 140.82.4.225
Found Yes
Hash 707d2f6127ca89064e5caa97a3e2c657f07be6fbc6b97c8cafa41ffa0292a343
SimHash e6ed4e31f4b0

Groups

*
adsbot-google

Rule Path
Disallow /wp-json/
Disallow /?rest_route=
Disallow /wp-admin/
Disallow /wp-content/cache/
Disallow /wp-content/plugins/
Disallow /xmlrpc.php

*
adsbot-google

Rule Path
Disallow /wp-includes/
Allow /wp-includes/css/
Allow /wp-includes/js/
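
The Allow rows above only have an effect if the crawler resolves conflicts by specificity, as RFC 9309 describes: the longest matching path wins, and Allow wins a tie. A minimal sketch of that precedence, assuming plain prefix rules with no wildcards or percent-encoding handling (the function name `is_allowed` and the sample paths are illustrative, not taken from the scan):

```python
# Rules copied from the /wp-includes/ group above, as (directive, path) pairs.
WP_INCLUDES_RULES = [
    ("Disallow", "/wp-includes/"),
    ("Allow", "/wp-includes/css/"),
    ("Allow", "/wp-includes/js/"),
]


def is_allowed(path, rules):
    """Return True if `path` may be crawled under longest-match precedence.

    Simplified: treats every rule as a literal prefix (no `*` or `$`).
    """
    best_length = -1
    allowed = True  # no matching rule means the path is allowed
    for directive, prefix in rules:
        if not path.startswith(prefix):
            continue
        is_allow = directive.lower() == "allow"
        longer = len(prefix) > best_length
        tie_won_by_allow = len(prefix) == best_length and is_allow
        if longer or tie_won_by_allow:
            best_length = len(prefix)
            allowed = is_allow
    return allowed


css_ok = is_allowed("/wp-includes/css/dashicons.min.css", WP_INCLUDES_RULES)
php_ok = is_allowed("/wp-includes/version.php", WP_INCLUDES_RULES)
```

Under these assumptions the stylesheet path is crawlable while other files under /wp-includes/ are not. Parsers that apply rules in file order instead (first match wins) would block the CSS and JS paths too, since the Disallow row comes first.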

*
adsbot-google

Rule Path
Disallow /?s=
Disallow /page/*/?s=
Disallow /search/

adsbot-google

Rule Path
Disallow /*.woff2
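
Paths such as /page/*/?s= and /*.woff2 use the `*` wildcard, which matches any run of characters; a trailing `$` (not used in this file) would anchor the end of the URL. A rough sketch of how such a pattern could be turned into a regular expression, assuming no percent-encoding normalisation (the helper name `pattern_to_regex` and the sample URLs are hypothetical):

```python
import re


def pattern_to_regex(pattern):
    """Compile a robots.txt path pattern into a regex for use with .match().

    `*` becomes `.*`; a trailing `$` anchors the end; everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


font_rule = pattern_to_regex("/*.woff2")
search_rule = pattern_to_regex("/page/*/?s=")

font_hit = bool(font_rule.match("/wp-content/themes/example/inter.woff2"))
search_hit = bool(search_rule.match("/page/2/?s=loans"))
search_miss = bool(search_rule.match("/page/2/"))
```

Because `.match()` anchors only at the start, a pattern without `$` behaves as a prefix match, so /*.woff2 would also cover a font URL that carries a query string.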

*
adsbot-google

Rule Path
Disallow /porpoiseant/
Disallow /detroitchicago/
Disallow /beardeddragon/
Disallow /tardisrocinante/
Disallow /ezoic/

*
adsbot-google

Rule Path
Disallow /workers/

*
adsbot-google

Rule Path
Disallow /~partytown

*
adsbot-google

Rule Path
Disallow /wp-content/uploads/complianz/
Disallow /?wp-ajax=

*
adsbot-google

Rule Path
Disallow /cdn-cgi/bm/cv/
Disallow /cdn-cgi/challenge-platform/
Disallow /cdn-cgi/images/trace/
Disallow /cdn-cgi/rum
Disallow /cdn-cgi/scripts/
Disallow /cdn-cgi/styles/
Disallow /cdn-fpw/sxg/

nuclei
wikido
riddler
petalbot
zoominfobot
go-http-client
node/simplecrawler
cazoodlebot
dotbot/1.0
gigabot
barkrowler
blexbot
magpie-crawler
mj12bot
ahrefsbot

Rule Path
Disallow /
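
The effect of this group can be checked with Python's standard `urllib.robotparser`. The sketch below feeds it a shortened, hand-written excerpt rather than the live file: the layout of the excerpt, the `AhrefsBot/7.0` and `ExampleBrowser/1.0` user-agent strings, and the variable names are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hand-written excerpt modelled on the groups listed above (not the real file).
ROBOTS_EXCERPT = """\
User-agent: mj12bot
User-agent: ahrefsbot
Disallow: /

User-agent: *
Disallow: /wp-admin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_EXCERPT.splitlines())

BASE = "https://practicalbusinessideas.com"

# A bot named in the ban group is refused everywhere.
banned_home = parser.can_fetch("AhrefsBot/7.0", BASE + "/")

# Any other agent falls through to the `*` group.
other_home = parser.can_fetch("ExampleBrowser/1.0", BASE + "/")
other_admin = parser.can_fetch("ExampleBrowser/1.0", BASE + "/wp-admin/")
```

Note that `urllib.robotparser` does plain prefix matching in file order and does not interpret `*` inside paths, so it is only a reasonable check for simple groups like this one.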

Other Records

Field Value
sitemap https://www.practicalbusinessideas.com/sitemap_index.xml

Comments

  • Block some general WP endpoints
  • Special handling for /wp-includes/
  • Block internal search
  • Adsbot doesn't ever need to crawl fonts
  • Block legacy Ezoic URLs
  • Block workers
  • Block partytown
  • Block leaky plugins
  • Block leaky Cloudflare endpoints
  • Ban noisy bots
  • Sitemap