/.well-known/

Robots Exclusion Standard data for first.edu

Resource Scan

Scan Details

Site Domain	first.edu
Base Domain	first.edu
Scan Status	Failed
Failure Stage	Fetching resource.
Failure Reason	Server returned a client error.
Last Scan	2025-08-14T13:38:32+00:00
Next Scan	2025-11-12T13:38:32+00:00

Last Successful Scan

Scanned	2023-01-01T10:16:27+00:00
URL	https://first.edu/robots.txt
Domain IPs	104.26.2.226, 104.26.3.226, 172.67.70.157, 2606:4700:20::681a:2e2, 2606:4700:20::681a:3e2, 2606:4700:20::ac43:469d
Response IP	172.67.70.157
Found	Yes
Hash	542cbfa05d6566d422f254a457dedc4c038eac070792252d1b473ccc48cfa630
SimHash	29b45d216555

Groups

User-agent	*

Rule	Path
Disallow	/dev/
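
As a rough illustration of what this group means for a crawler, the sketch below feeds the same two directives to Python's standard-library `urllib.robotparser` and checks a few URLs. The crawler name `ExampleBot` and the sample paths are made up for the example, and the rule text is reconstructed from the scan data above rather than copied from the live file.

```python
from urllib.robotparser import RobotFileParser

# Rule reconstructed from the scan: one group for all agents, one Disallow.
RULES = [
    "User-agent: *",
    "Disallow: /dev/",
]

parser = RobotFileParser()
parser.parse(RULES)


def allowed(path, agent="ExampleBot"):
    """Return True if `agent` (a hypothetical crawler) may fetch `path`."""
    return parser.can_fetch(agent, "https://first.edu" + path)


# Sample paths are illustrative only; they are not taken from the site.
for path in ("/", "/dev/", "/dev/build.log", "/developers/"):
    print(path, "allowed" if allowed(path) else "blocked")
```

Matching here is a plain path-prefix test, so `/developers/` is not caught by `Disallow: /dev/`; only paths that begin with `/dev/` are.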


Comments

****************************************************************************
robots.txt
: Robots, spiders, and search engines use this file to determine which
content they should *not* crawl while indexing your website.
: This system is called "The Robots Exclusion Standard."
: It is strongly encouraged to use a robots.txt validator to check
for valid syntax before any robots read it!
Examples:
Instruct all robots to stay out of the admin area.
: User-agent: *
: Disallow: /admin/
Restrict Google and MSN from indexing your images.
: User-agent: Googlebot
: Disallow: /images/
: User-agent: MSNBot
: Disallow: /images/
****************************************************************************
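
The second example in the comment block relies on agent-specific groups. A minimal sketch, again using Python's `urllib.robotparser`, suggests how such groups restrict only the named crawlers. The host `example.com`, the image path, and the crawler name `OtherBot` are placeholders, not values from the scan.

```python
from urllib.robotparser import RobotFileParser

# The "Restrict Google and MSN" example from the comment block, as file text.
EXAMPLE = """\
User-agent: Googlebot
Disallow: /images/

User-agent: MSNBot
Disallow: /images/
"""

rp = RobotFileParser()
rp.parse(EXAMPLE.splitlines())

# Placeholder URL; any path under /images/ would behave the same way.
url = "https://example.com/images/logo.png"

# Map each crawler name to whether it may fetch the URL.
results = {
    agent: rp.can_fetch(agent, url)
    for agent in ("Googlebot", "MSNBot", "OtherBot")
}
print(results)
```

Because the file has no `User-agent: *` group, a crawler that matches neither named group is left unrestricted by this parser.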
