type-test.site
robots.txt

Robots Exclusion Standard data for type-test.site

Resource Scan

Scan Details

Site Domain type-test.site
Base Domain type-test.site
Scan Status Ok
Last Scan 2025-10-14T16:45:52+00:00
Next Scan 2025-10-21T16:45:52+00:00

Last Scan

Scanned 2025-10-14T16:45:52+00:00
URL https://type-test.site/robots.txt
Domain IPs 34.111.179.208
Response IP 34.111.179.208
Found Yes
Hash 650ad31b3aff7dbc96d355df6f542250c2ce010cebdc2c94a03b30c8b2cba606
SimHash 450898522500

Groups

*

Rule Path
Allow /
Allow /attachment-style
Allow /narcissism
Allow /teto-egen
Allow /favicon.ico
Allow /favicon.svg
Allow /sitemap.xml
Allow /ads.txt
Allow /manifest.json
Allow /browserconfig.xml
Allow /humans.txt
Allow /.well-known/
Disallow /src/
Disallow /node_modules/
Disallow /dist/
Disallow /.git/
Disallow /server/
Disallow /shared/
Disallow /_next/
Disallow /api/internal/

Other Records

Field Value
crawl-delay 1
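
The Allow and Disallow paths in the `*` group overlap (`Allow /` matches every path), so the outcome depends on how a crawler resolves conflicts. The following is a minimal sketch, assuming the longest-match precedence described in RFC 9309 and plain prefix matching (none of the listed paths use `*` or `$` wildcards). The names `RULES` and `is_allowed` are hypothetical and not part of the scan data.

```python
# Rules of the "*" group, copied from the scan listing above.
RULES = [
    ("allow", "/"),
    ("allow", "/attachment-style"),
    ("allow", "/narcissism"),
    ("allow", "/teto-egen"),
    ("allow", "/favicon.ico"),
    ("allow", "/favicon.svg"),
    ("allow", "/sitemap.xml"),
    ("allow", "/ads.txt"),
    ("allow", "/manifest.json"),
    ("allow", "/browserconfig.xml"),
    ("allow", "/humans.txt"),
    ("allow", "/.well-known/"),
    ("disallow", "/src/"),
    ("disallow", "/node_modules/"),
    ("disallow", "/dist/"),
    ("disallow", "/.git/"),
    ("disallow", "/server/"),
    ("disallow", "/shared/"),
    ("disallow", "/_next/"),
    ("disallow", "/api/internal/"),
]


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under longest-match precedence.

    Assumption: simple prefix matching only; wildcard handling is omitted.
    The longest matching prefix wins, Allow wins a tie, and a path that
    matches no rule is allowed.
    """
    best_len = -1
    allowed = True
    for kind, prefix in rules:
        if path.startswith(prefix):
            n = len(prefix)
            if n > best_len or (n == best_len and kind == "allow"):
                best_len = n
                allowed = kind == "allow"
    return allowed
```

Under this reading, `/narcissism` is allowed while `/src/app.js` is blocked, because `/src/` is a longer match than `/`. Evaluators that instead apply the first matching rule in listed order would let the leading `Allow /` win for every path, so results can differ between parsers.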

googlebot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 0

bingbot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 1

yandex

Rule Path
Allow /

Other Records

Field Value
crawl-delay 2

ahrefsbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

mj12bot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10
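
The per-group crawl-delay values above can be summarized as a simple lookup. This is a minimal sketch, assuming a crawler identifies itself by a single product token, matches it case-insensitively against the group names, and falls back to the `*` group otherwise. The names `CRAWL_DELAY` and `crawl_delay_for` are hypothetical. Note that `crawl-delay` is a non-standard field, and not every crawler honors it.

```python
# Crawl-delay values per group, copied from the scan listing above.
CRAWL_DELAY = {
    "*": 1,
    "googlebot": 0,
    "bingbot": 1,
    "yandex": 2,
    "ahrefsbot": 10,
    "mj12bot": 10,
}


def crawl_delay_for(product_token):
    """Return the listed crawl-delay for a crawler's product token.

    Assumption: exact, case-insensitive match on the group name;
    unknown crawlers fall back to the "*" group.
    """
    return CRAWL_DELAY.get(product_token.lower(), CRAWL_DELAY["*"])
```

For example, `crawl_delay_for("AhrefsBot")` picks up the value from the `ahrefsbot` group, while a token with no group of its own, such as `"DuckDuckBot"`, gets the `*` value.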

Other Records

Field Value
sitemap https://type-test.site/sitemap.xml

Comments

  • Important pages to crawl
  • Allow important files
  • Disallow unnecessary paths
  • Crawl-delay for general crawlers
  • Specific rules for major search engines
  • Disallow aggressive crawlers
  • Sitemap