a4pt.org
robots.txt

Robots Exclusion Standard data for a4pt.org

Resource Scan

Scan Details

Site Domain a4pt.org
Base Domain a4pt.org
Scan Status Ok
Last Scan 2025-07-25T03:23:20+00:00
Next Scan 2025-08-24T03:23:20+00:00

Last Scan

Scanned 2025-07-25T03:23:20+00:00
URL https://www.a4pt.org/robots.txt
Domain IPs 35.169.50.49, 35.173.82.140, 35.174.132.21
Response IP 35.173.82.140
Found Yes
Hash 0a1db03f6dd5523095a7b1c0030d97dd3e270af34ca9a6bf199700a902011d1e
SimHash ed9c9d42c3c9

Groups

*

Rule Path
Disallow /global_inc/

*

Rule Path
Disallow /global_engine/ajax/
Disallow /members/

Other Records

Field Value
sitemap http://www.a4pt.org/autositemapindex.xml
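To see how the rules and the sitemap record above behave in practice, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt text is reconstructed from this report rather than copied from the site: the two "*" groups are merged into one (an assumption, since Python's parser honors only the first "*" group it sees), and the agent name ExampleBot and the test paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumption: reconstructed from the scan report, not the live file.
# The two "*" groups are merged into a single group here.
ROBOTS_TXT = """\
User-agent: *
Disallow: /global_inc/
Disallow: /global_engine/ajax/
Disallow: /members/
Sitemap: http://www.a4pt.org/autositemapindex.xml
"""

# Parse the reconstructed text instead of fetching it over the network.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path, agent="ExampleBot"):
    """Return True if the (hypothetical) agent may fetch the given path."""
    return parser.can_fetch(agent, "https://www.a4pt.org" + path)


if __name__ == "__main__":
    for path in ("/about", "/members/login", "/global_engine/ajax/x"):
        print(path, "allowed" if is_allowed(path) else "disallowed")
    # site_maps() requires Python 3.8 or later.
    print(parser.site_maps())
```

Matching is by path prefix, so only /global_engine/ajax/ is blocked while the rest of /global_engine/ stays crawlable, which is consistent with the note in the comments about not wanting a blanket exclusion.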

Comments

  • When crawlers hit the engine directory, they sometimes publish confusing links to site content in their search results, so we exclude these specific engines from crawling it.
  • Note: Certain crawlers do need access to this directory, so we do not want a blanket exclude statement here.

Warnings

  • 18 invalid lines.