aartisangrah.in
robots.txt

Robots Exclusion Standard data for aartisangrah.in

Resource Scan

Scan Details

Site Domain aartisangrah.in
Base Domain aartisangrah.in
Scan Status Ok
Last Scan 2025-03-12T06:42:44+00:00
Next Scan 2025-04-11T06:42:44+00:00
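
The Last Scan and Next Scan timestamps above are ISO 8601 values in UTC, so the gap between them can be checked directly. The sketch below only does the arithmetic on the two recorded values; the variable names are illustrative, and reading the result as a fixed rescan interval is an assumption, since the report does not state its schedule.

```python
from datetime import datetime

# Timestamps copied from the Scan Details table.
last_scan = datetime.fromisoformat("2025-03-12T06:42:44+00:00")
next_scan = datetime.fromisoformat("2025-04-11T06:42:44+00:00")

# Subtracting two timezone-aware datetimes yields a timedelta.
interval = next_scan - last_scan
print(interval.days)
```

For these two values the difference is exactly 30 days.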

Last Scan

Scanned 2025-03-12T06:42:44+00:00
URL https://aartisangrah.in/robots.txt
Domain IPs 103.160.144.217
Response IP 103.160.144.217
Found Yes
Hash 1a775ec4ed94a24d465316b7575c836a5bf528b03a6e0581a63856cb35145a75
SimHash 24024111abe1
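
The Hash value above is 64 hexadecimal characters long, which is the length of a SHA-256 digest. The report does not name the algorithm or say exactly which bytes were hashed, so the following is only a sketch under the assumption that it is SHA-256 over the raw response body. `content_hash` is a hypothetical helper, and the sample body is made up for illustration; it is not the real file.

```python
import hashlib

# Hash value copied from the Last Scan table.
RECORDED_HASH = "1a775ec4ed94a24d465316b7575c836a5bf528b03a6e0581a63856cb35145a75"


def content_hash(body: bytes) -> str:
    """Return the SHA-256 hex digest of a response body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


# Illustrative body only; the real robots.txt bytes are not part of this report.
sample_body = b"User-agent: *\nDisallow: /wp-admin/\n"
digest = content_hash(sample_body)

# A digest of the same length as the recorded value is consistent with SHA-256,
# but does not prove which algorithm the scanner used.
print(len(digest) == len(RECORDED_HASH))
```

Comparing a freshly computed digest with the recorded one would show whether the file has changed since the scan, provided the assumption about the algorithm holds.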

Groups

*

Rule Path
Disallow /wp-admin/
Disallow /wp-login.php
Disallow /wp-json/
Disallow /404/
Allow /wp-admin/admin-ajax.php

meta-externalagent

Rule Path
Allow /
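
The two groups above interact in a way that is easy to misread: a crawler uses only the group that names it, falling back to `*` otherwise, and within a group the longest matching path decides, with Allow winning a tie. The sketch below transcribes the listed rules and applies that logic. It is a simplified illustration, not a full parser: it uses plain prefix matching (no `*` or `$` wildcards, no percent-encoding normalization) and matches the user-agent token by case-insensitive substring. `GROUPS`, `select_group`, and `is_allowed` are hypothetical names.

```python
# Groups transcribed from the scan: user-agent token -> list of (rule, path).
GROUPS = {
    "*": [
        ("disallow", "/wp-admin/"),
        ("disallow", "/wp-login.php"),
        ("disallow", "/wp-json/"),
        ("disallow", "/404/"),
        ("allow", "/wp-admin/admin-ajax.php"),
    ],
    "meta-externalagent": [("allow", "/")],
}


def select_group(user_agent: str) -> list:
    """Return the rules of the group naming this crawler, else the '*' group."""
    ua = user_agent.lower()
    for token, rules in GROUPS.items():
        if token != "*" and token in ua:
            return rules
    return GROUPS["*"]


def is_allowed(user_agent: str, path: str) -> bool:
    """Longest matching prefix wins; Allow wins a tie; no match means allowed."""
    best_len, allowed = -1, True
    for rule, prefix in select_group(user_agent):
        if not path.startswith(prefix):
            continue
        is_allow = rule == "allow"
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len, allowed = len(prefix), is_allow
    return allowed


for agent, path in [
    ("ExampleBot", "/wp-admin/"),
    ("ExampleBot", "/wp-admin/admin-ajax.php"),
    ("meta-externalagent", "/wp-admin/"),
]:
    print(agent, path, is_allowed(agent, path))
```

Because `meta-externalagent` has its own group containing only `Allow /`, none of the `*` group's Disallow rules apply to it. Note that parsers which evaluate rules in file order rather than by longest match may treat `/wp-admin/admin-ajax.php` differently, since the Disallow for `/wp-admin/` is listed first.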

Other Records

Field Value
sitemap https://www.aartisangrah.in/sitemap_index.xml

Comments

  • Disallow general crawlers from accessing admin areas and sensitive URLs
  • Allow Ajax requests from the admin panel
  • Allow the specific user-agent 'meta-externalagent' to crawl the entire website
  • Sitemap location for search engines