/.well-known/

triangletribune.com
robots.txt

Robots Exclusion Standard data for triangletribune.com

Resource Scan

Scan Details

Site Domain	triangletribune.com
Base Domain	triangletribune.com
Scan Status	Failed
Failure Stage	Fetching resource.
Failure Reason	Server returned a client error.
Last Scan	2024-11-02T23:48:25+00:00
Next Scan	2025-01-31T23:48:25+00:00

Last Successful Scan

Scanned	2023-07-12T15:56:31+00:00
URL	https://triangletribune.com/robots.txt
Redirect	http://www.triangletribune.com/robots.txt
Redirect Domain	www.triangletribune.com
Redirect Base	triangletribune.com
Domain IPs	104.21.22.224, 172.67.207.138, 2606:4700:3037::6815:16e0, 2606:4700:3037::ac43:cf8a
Redirect IPs	104.21.22.224, 172.67.207.138, 2606:4700:3037::6815:16e0, 2606:4700:3037::ac43:cf8a
Response IP	104.21.22.224
Found	Yes
Hash	eda8ba5ee9781486935d6570ae97827322b99fb298d2c8e603501d4151b4eda8
SimHash	adc89ae0e5b2

Groups

*

Rule	Path
Disallow	/*print%3Dpdf*

Other Records

Field	Value
crawl-delay	5
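
The Disallow rule recorded above uses `*` wildcards, so it may help to see how such a pattern could be evaluated against URL paths. The following is a minimal sketch, not the scanner's or any crawler's actual implementation: the function name `rule_matches` and the sample paths are hypothetical, `*` is treated as "any run of characters" and a trailing `$` as an end anchor (the conventions described in RFC 9309), and no percent-encoding normalization is attempted, so `%3D` is compared literally. The `crawl-delay` field is not part of RFC 9309; the constant below only records the value from the scan, read here as seconds by assumption.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path.

    Hypothetical helper: '*' matches any run of characters and a
    trailing '$' anchors the match at the end of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start of the path, as robots.txt rules do.
    return re.match(regex, path) is not None


# Values taken from the scan record above.
DISALLOW = "/*print%3Dpdf*"
CRAWL_DELAY_SECONDS = 5  # assumption: the unit is seconds

# Sample paths are illustrative only, not real URLs from the site.
for candidate in ("/news/story?print%3Dpdf", "/news/story?print=pdf", "/news/story"):
    verdict = "blocked" if rule_matches(DISALLOW, candidate) else "allowed"
    print(f"{candidate}: {verdict}")
```

Under this literal comparison, only paths containing the encoded form `print%3Dpdf` match the rule; a path containing `print=pdf` would not, unless the crawler normalizes percent-encoding before comparing.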

Comments

ROBOTS.TXT
www.triangletribune.com
Google
User-agent: Googlebot
Disallow:
Yahoo
User-agent: Slurp
Disallow:
Alta-Vista
User-agent: Scooter
Disallow:
Excite
User-agent: ArchitextSpider
Disallow:
InfoSeek
User-agent: UltraSeek
Disallow:
Lycos
User-agent: Lycos_Spider_(T-Rex)
Disallow:
LookSmart
User-agent: MantraAgent
Disallow:
Alltheweb
User-agent: FAST-WebCrawler
Disallow:
