practicalwebtools.com
robots.txt

Robots Exclusion Standard data for practicalwebtools.com

Resource Scan

Scan Details

Site Domain practicalwebtools.com
Base Domain practicalwebtools.com
Scan Status Ok
Last Scan 2025-10-30T20:48:09+00:00
Next Scan 2025-11-06T20:48:09+00:00
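
The two scan timestamps above are exactly seven days apart, which suggests a weekly rescan. The report does not state the schedule, so the seven-day interval in this small check is an inference, not a documented setting.

```python
from datetime import datetime, timedelta

# Timestamps copied from the Scan Details above.
last_scan = datetime.fromisoformat("2025-10-30T20:48:09+00:00")
next_scan = datetime.fromisoformat("2025-11-06T20:48:09+00:00")

# Assumed rescan interval: inferred from the two values, not documented.
interval = timedelta(days=7)

# True when the scheduled scan is exactly one interval after the last one.
print(next_scan - last_scan == interval)
```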

Last Scan

Scanned 2025-10-30T20:48:09+00:00
URL https://practicalwebtools.com/robots.txt
Domain IPs 104.21.59.169, 172.67.181.91, 2606:4700:3032::ac43:b55b, 2606:4700:3035::6815:3ba9
Response IP 104.21.59.169
Found Yes
Hash 42d00d279a3432ae7c849eba43e6e5de78c44b05e2cf6301cae550e880f96b6a
SimHash 6955f957ef26
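
The Hash value is 64 hexadecimal characters, which is the length of a SHA-256 digest. The report does not name the algorithm or say which bytes are hashed, so the sketch below assumes SHA-256 over the raw response body. `body_digest` and `unchanged` are hypothetical helper names.

```python
import hashlib

# Hash reported for the 2025-10-30 scan, copied from above.
REPORTED = "42d00d279a3432ae7c849eba43e6e5de78c44b05e2cf6301cae550e880f96b6a"

def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a response body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()

def unchanged(body: bytes, reported: str = REPORTED) -> bool:
    """True if a freshly fetched body still produces the reported hash."""
    return body_digest(body) == reported
```

If the assumption holds, passing the bytes of a freshly fetched robots.txt to `unchanged` would indicate whether the file has been edited since the scan.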

Groups

*

Rule Path
Allow /
Disallow /api/
Disallow /_next/static/
Disallow /_next/image/
Disallow /_next/data/
Disallow /_next/on-demand-entries-ping
Disallow /_next/webpack-hmr
Disallow /*/embed$
Disallow /*?*
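
The last two rules in this group use wildcards: `*` matches any run of characters and a trailing `$` anchors the end of the URL. The following is a minimal sketch of how such rules can be evaluated, assuming the longest-match precedence described in RFC 9309 (the most specific pattern wins, and Allow wins a tie). Individual crawlers may differ in detail; `pattern_to_regex` and `is_allowed` are hypothetical helper names.

```python
import re

# Rules copied from the "*" group above, as (rule, path pattern) pairs.
STAR_GROUP = [
    ("allow", "/"),
    ("disallow", "/api/"),
    ("disallow", "/_next/static/"),
    ("disallow", "/_next/image/"),
    ("disallow", "/_next/data/"),
    ("disallow", "/_next/on-demand-entries-ping"),
    ("disallow", "/_next/webpack-hmr"),
    ("disallow", "/*/embed$"),
    ("disallow", "/*?*"),
]

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt pattern: '*' is a wildcard, trailing '$' anchors the end."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_allowed(rules, path: str) -> bool:
    """Longest matching pattern wins; Allow wins a tie; no match means allowed."""
    best_len, allowed = -1, True
    for rule, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and rule == "allow"):
                best_len, allowed = length, rule == "allow"
    return allowed

for path in ["/", "/api/users", "/tools/pdf/embed", "/tools?ref=x"]:
    print(path, is_allowed(STAR_GROUP, path))
```

Under these assumptions only `/` is allowed among the four sample paths. The other three are hypothetical URLs chosen to hit `/api/`, `/*/embed$` and `/*?*` respectively.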

googlebot

Rule Path
Allow /
Disallow /api/

bingbot

Rule Path
Allow /
Disallow /api/

yandex

Rule Path
Allow /
Disallow /api/
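
Under RFC 9309 a crawler follows only the group that matches its product token and falls back to `*` when no group matches. Read that way, the three named crawlers are barred only from `/api/`, and the `/_next/`, embed and query-string rules in the `*` group do not apply to them. The sketch below illustrates that selection, assuming a case-insensitive exact match on the token. `select_group` and `disallowed_patterns` are hypothetical helper names, and `DuckDuckBot` is only an example of a crawler without its own group.

```python
# Groups copied from the report; keys are lowercase product tokens.
NAMED_RULES = [("allow", "/"), ("disallow", "/api/")]
GROUPS = {
    "*": [
        ("allow", "/"),
        ("disallow", "/api/"),
        ("disallow", "/_next/static/"),
        ("disallow", "/_next/image/"),
        ("disallow", "/_next/data/"),
        ("disallow", "/_next/on-demand-entries-ping"),
        ("disallow", "/_next/webpack-hmr"),
        ("disallow", "/*/embed$"),
        ("disallow", "/*?*"),
    ],
    "googlebot": NAMED_RULES,
    "bingbot": NAMED_RULES,
    "yandex": NAMED_RULES,
}

def select_group(product_token: str):
    """Return the crawler's own group if one exists, otherwise the '*' group."""
    return GROUPS.get(product_token.lower(), GROUPS["*"])

def disallowed_patterns(product_token: str):
    """List the Disallow patterns that apply to the given crawler."""
    return [path for rule, path in select_group(product_token) if rule == "disallow"]

print(disallowed_patterns("Googlebot"))    # only the named group's rules
print(disallowed_patterns("DuckDuckBot"))  # falls back to the "*" group
```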

Other Records

Field Value
sitemap https://practicalwebtools.com/sitemap-index.xml

Comments

  • Global rules for all search engines
  • Disallow server-side rendering paths and internal APIs
  • Prevent indexing of duplicate content with query strings
  • Specific rules for Googlebot
  • Specific rules for Bingbot
  • Specific rules for Yandex
  • Crawl-delay: 10 # Uncomment and adjust if needed to prevent server overload
  • Sitemaps - XML format for all search engines
  • Host directive - for Yandex
  • IndexNow - for Bing and Yandex
  • Verify indexnow.txt is available at root

Warnings

  • `host` is not a known field.
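
The warning is consistent with RFC 9309, which defines only the user-agent, allow and disallow fields and mentions sitemap as an example of other records; `host` is a non-standard extension. The following is a minimal sketch of how a scanner might flag such lines. The known-field set and the sample input are assumptions for illustration, since the report shows neither the scanner's field list nor the raw file.

```python
# Assumed set of recognized fields; the scanner's real list is not documented here.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap"}

def unknown_field_warnings(text: str) -> list[str]:
    """Return one warning per distinct unrecognized field name."""
    warnings = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field = line.split(":", 1)[0].strip().lower()
        message = f"`{field}` is not a known field."
        if field not in KNOWN_FIELDS and message not in warnings:
            warnings.append(message)
    return warnings

# Hypothetical input, not the actual robots.txt of the scanned site.
SAMPLE = """\
User-agent: *
Allow: /
# Crawl-delay: 10
Host: practicalwebtools.com
Sitemap: https://practicalwebtools.com/sitemap-index.xml
"""

print(unknown_field_warnings(SAMPLE))
```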