apaonline.org
robots.txt

Robots Exclusion Standard data for apaonline.org

Resource Scan

Scan Details

Site Domain apaonline.org
Base Domain apaonline.org
Scan Status Ok
Last Scan 5/25/2025, 7:37:02 AM
Next Scan 6/24/2025, 7:37:02 AM

Last Scan

Scanned 5/25/2025, 7:37:02 AM
URL https://www.apaonline.org/robots.txt
Domain IPs 35.169.50.49, 35.173.82.140, 35.174.132.21
Response IP 35.174.132.21
Found Yes
Hash eaa059ebdddc2daacdf4bab31ef96c97be509c470a7badeeffa5487e76788833
SimHash ec941d42c1d8

Groups

*

Rule Path
Disallow /global_inc/
Allow /global_inc/*.css
Allow /global_inc/*.js

*

Rule Path
Disallow /global_engine/ajax/
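
The interaction between the Disallow and Allow rules above can be sketched in code. The following is a minimal illustration, not the scanner's or any crawler's actual implementation. It assumes the matching behaviour described in RFC 9309: groups naming the same user agent are merged, `*` matches any run of characters, the longest matching pattern wins, and Allow wins a tie. The names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical.

```python
import re

# Rules from the two "*" groups listed above, merged into one list
# (assumption: groups for the same user agent are combined).
RULES = [
    ("disallow", "/global_inc/"),
    ("allow", "/global_inc/*.css"),
    ("allow", "/global_inc/*.js"),
    ("disallow", "/global_engine/ajax/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under `rules`.

    The longest matching pattern decides; on equal length, allow wins.
    A path that matches no rule is allowed.
    """
    best_len = -1
    allowed = True
    for kind, pattern in rules:
        # re.match anchors at the start of the path, so patterns act as prefixes.
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len = length
                allowed = kind == "allow"
    return allowed


if __name__ == "__main__":
    for p in ("/global_inc/site.css", "/global_inc/config.php",
              "/global_engine/ajax/search", "/about"):
        print(p, is_allowed(p))
```

Under these assumptions, stylesheets and scripts inside /global_inc/ stay crawlable because the Allow patterns are longer than the Disallow prefix, while everything else in that directory and in /global_engine/ajax/ is blocked.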

Other Records

Field Value
sitemap https://www.apaonline.org/autositemapindex.xml

Comments

  • When crawlers hit the engine dir they sometimes publish confusing links to site content in their search results so we exclude these specific engines from crawling it.
  • Note: Certain crawlers do need access to this directory so we do not want a blanket exclude statement here.

Warnings

  • 27 invalid lines.