Robots Exclusion Standard data for tpta.org

Resource Scan

Scan Details

Site Domain	tpta.org
Base Domain	tpta.org
Scan Status	Failed
Failure Stage	Fetching resource.
Failure Reason	Server returned a client error.
Last Scan	2025-07-06T20:58:47+00:00
Next Scan	2025-09-04T20:58:47+00:00

Last Successful Scan

Scanned	2025-04-15T20:57:32+00:00
URL	https://tpta.org/robots.txt
Domain IPs	35.169.50.49, 35.173.82.140, 35.174.132.21
Response IP	35.173.82.140
Found	Yes
Hash	e27bd0433bb89079c465ccf65c9c37195978f1da0dd45656b5e790ff86faec1b
SimHash	ec941d42c1d9

Groups

User-agent	*

Rule	Path
Disallow	/global_inc/
Allow	/global_inc/*.css
Allow	/global_inc/*.js

User-agent	*

Rule	Path
Disallow	/global_engine/ajax/
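
The rules listed above mix a broad Disallow with narrower Allow patterns, so the outcome for a given path depends on how a crawler resolves overlapping matches. The following is a minimal sketch, not the scanner's or any particular crawler's implementation. It assumes the longest-match precedence described in RFC 9309 (the most specific matching pattern wins, and Allow wins a tie) and assumes both "*" groups are merged into one rule set. The names RULES, pattern_to_regex, and is_allowed are hypothetical and chosen for illustration only.

```python
import re

# Rules copied from the two "*" groups in the scan, merged into one list
# (assumption: groups for the same user agent are combined).
RULES = [
    ("disallow", "/global_inc/"),
    ("allow", "/global_inc/*.css"),
    ("allow", "/global_inc/*.js"),
    ("disallow", "/global_engine/ajax/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Everything else is matched literally as a prefix of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under longest-match precedence."""
    best_len = -1
    allowed = True  # a path that matches no rule is allowed
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            is_allow = directive == "allow"
            # Longer pattern wins; on equal length, Allow wins.
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len = len(pattern)
                allowed = is_allow
    return allowed


if __name__ == "__main__":
    for p in ["/global_inc/site.css", "/global_inc/config.php",
              "/global_engine/ajax/search", "/index.html"]:
        print(p, "allowed" if is_allowed(p) else "blocked")
```

Under these assumptions, stylesheets and scripts inside /global_inc/ remain crawlable because the Allow patterns are longer than the Disallow prefix, while everything else in that directory and under /global_engine/ajax/ is blocked. The example paths in the loop are made up for illustration.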

Other Records

Field	Value
sitemap	https://tpta.org/autositemapindex.xml

Comments

When crawlers hit the engine directory, they sometimes publish confusing links to site content
in their search results, so we exclude these specific engines from crawling it.
Note: Certain crawlers do need access to this directory, so we do not want a blanket
exclude statement here.

Warnings

18 invalid lines.
