alashan99.com
robots.txt

Robots Exclusion Standard data for alashan99.com

Resource Scan

Scan Details

Site Domain alashan99.com
Base Domain alashan99.com
Scan Status Failed
Failure Reason Scan timed out.
Last Scan 2025-08-03T07:24:39+00:00
Next Scan 2025-11-01T07:24:39+00:00

Last Successful Scan

Scanned 2024-09-15T07:20:38+00:00
URL https://alashan99.com/robots.txt
Redirect https://www.alashan99.com/robots.txt
Redirect Domain www.alashan99.com
Redirect Base alashan99.com
Domain IPs 43.255.29.1
Redirect IPs 43.255.29.1
Response IP 43.255.29.1
Found Yes
Hash 6e39e5dd929827b175ba081c34ccb97e366dd78fa7f73bc4827e7b77270939e4
SimHash 2d0cb9c34793

Groups

*

Rule Path
Disallow /private/
Disallow /temp/
Disallow /errors/
Disallow /redirects/
Disallow /noindex/
Allow /public/
Allow /content/
Allow /blog/
Disallow /*?*
Allow /public/page-with-noindex.html
Allow /public/page-with-canonical.html
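
The rules in the `*` group overlap (for example, `Allow /blog/` and `Disallow /*?*` can both match the same URL), so the outcome depends on how a crawler resolves conflicts. The following is a minimal sketch, assuming the longest-match convention from RFC 9309: the matching rule with the longest path wins, and Allow wins a tie. The names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical and only illustrate the idea; real crawlers may behave differently.

```python
import re

# Rules for the "*" group, copied from the table above.
RULES = [
    ("disallow", "/private/"),
    ("disallow", "/temp/"),
    ("disallow", "/errors/"),
    ("disallow", "/redirects/"),
    ("disallow", "/noindex/"),
    ("allow", "/public/"),
    ("allow", "/content/"),
    ("allow", "/blog/"),
    ("disallow", "/*?*"),
    ("allow", "/public/page-with-noindex.html"),
    ("allow", "/public/page-with-canonical.html"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex anchored at the start.

    "*" matches any run of characters; a trailing "$" anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under longest-match semantics."""
    best_len = -1
    best_allow = True  # no matching rule means the path is allowed
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            longer = len(pattern) > best_len
            tie_for_allow = len(pattern) == best_len and allow
            if longer or tie_for_allow:
                best_len, best_allow = len(pattern), allow
    return best_allow
```

Under this reading, `is_allowed("/search?q=x")` is False because only `/*?*` matches, while `is_allowed("/blog/post?id=1")` is True because `/blog/` (6 characters) is longer than `/*?*` (4 characters). A crawler that applies rules in file order instead would block the second URL.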

badbot

Rule Path
Disallow /

evilcrawler

Rule Path
Disallow /
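
To illustrate how a crawler picks one of the three groups above, here is a small sketch using Python's standard `urllib.robotparser`. The `ROBOTS_TXT` string is a reconstruction from the tables in this report, not the byte-exact file (so it will not reproduce the hash listed earlier), and `ExampleBot` and `may_fetch` are hypothetical names. Note that `urllib.robotparser` applies rules in file order with plain prefix matching and, as far as I know, does not interpret the `*` wildcard inside paths, so the `/*?*` rule is not exercised here.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the rule tables in this report (ordering assumed).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /temp/
Disallow: /errors/
Disallow: /redirects/
Disallow: /noindex/
Allow: /public/
Allow: /content/
Allow: /blog/
Disallow: /*?*
Allow: /public/page-with-noindex.html
Allow: /public/page-with-canonical.html

User-agent: badbot
Disallow: /

User-agent: evilcrawler
Disallow: /

Sitemap: https://www.alashan99.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_fetch(agent, path):
    """Check a path on the redirect host against the parsed rules."""
    return parser.can_fetch(agent, "https://www.alashan99.com" + path)


# Named bots get their own group, which blocks everything.
print(may_fetch("badbot", "/public/"))
print(may_fetch("evilcrawler", "/blog/"))

# Any other agent falls back to the "*" group.
print(may_fetch("ExampleBot", "/public/"))
print(may_fetch("ExampleBot", "/private/x"))
```

The two named bots are refused even for paths that the `*` group allows, because a crawler uses only the group that matches its user-agent and falls back to `*` when none does.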

Other Records

Field Value
sitemap https://www.alashan99.com/sitemap.xml

Comments

  • robots.txt file to follow server-side rules and project options
  • Allow all user-agents
  • Disallow specific directories and pages
  • Allow crawling of specific patterns
  • Sitemap location
  • Disallow query parameters
  • Allow specific pages with canonical and noindex tags
  • Block specific bots