stories.tamu.edu
robots.txt

Robots Exclusion Standard data for stories.tamu.edu

Resource Scan

Scan Details

Site Domain stories.tamu.edu
Base Domain tamu.edu
Scan Status Ok
Last Scan 2025-07-29T06:21:28+00:00
Next Scan 2025-08-12T06:21:28+00:00

Last Scan

Scanned 2025-07-29T06:21:28+00:00
URL https://stories.tamu.edu/robots.txt
Domain IPs 141.193.213.10, 141.193.213.11
Response IP 141.193.213.10
Found Yes
Hash 9cc8072dbefee1bc7f4c0fef7cc6210472538fa122c193c2dfac5196ec5287b3
SimHash bc795d49e6f3

Groups

elastic-crawler

Rule Path
Disallow /?q=
Disallow /?s=
Disallow /search/*
Disallow /wp-admin/*
Disallow /wp-login.php
Disallow /story-author/
Disallow /news/page/*
Disallow /stories/page/*
Disallow /tag/*
Disallow /category/*
Allow /

siteimprovebot

Rule Path
Disallow /?q=
Disallow /?s=
Disallow /search/
Disallow /wp-admin/
Disallow /wp-login.php
Allow /

*

Rule Path
Disallow /wp-admin/*
Disallow /author/*
Allow /

Other Records

Field Value
sitemap https://stories.tamu.edu/wp-sitemap.xml

Comments

  • Note: robots.txt is read top to bottom. The order of rules matters:
    the more specific Disallow rule needs to come before the general Allow
    rule for each user agent.
  • Allow Elastic Search but restrict some routes
  • Allow SiteImprove but restrict some routes
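
The ordering note in the comments above can be illustrated with a small sketch. It assumes two evaluation strategies: a first-match reader, for which rule order matters as the comment describes, and the longest-match rule defined in RFC 9309, for which it does not. The rules are copied from the "*" group listed earlier; the function names and the sample paths are hypothetical and chosen only for illustration. This is not the logic of any particular crawler.

```python
import re

# Rules copied from the "*" group in this report, in file order.
RULES = [
    ("disallow", "/wp-admin/*"),
    ("disallow", "/author/*"),
    ("allow", "/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Without '$', the pattern only has to match a prefix of the path.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def first_match_allowed(rules, path):
    """First-match evaluation: the first rule that matches decides."""
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            return kind == "allow"
    return True  # no rule matched, so the path is allowed


def longest_match_allowed(rules, path):
    """Longest-match evaluation (RFC 9309): the longest matching pattern
    decides, and Allow wins when two matching patterns have equal length."""
    best_key, best_kind = None, "allow"
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            key = (len(pattern), kind == "allow")
            if best_key is None or key > best_key:
                best_key, best_kind = key, kind
    return best_kind == "allow"


if __name__ == "__main__":
    # Hypothetical sample paths, not taken from the scanned site.
    reordered = list(reversed(RULES))  # puts "Allow /" first
    for path in ("/wp-admin/options.php", "/author/jane/", "/news/a-story/"):
        print(
            path,
            first_match_allowed(RULES, path),
            first_match_allowed(reordered, path),
            longest_match_allowed(reordered, path),
        )
```

With the rules in file order, both strategies block /wp-admin/ and /author/ paths. If "Allow /" were moved to the top, only the first-match reader would start allowing those paths, which is the situation the ordering note guards against.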