web.eecs.umich.edu
robots.txt

Robots Exclusion Standard data for web.eecs.umich.edu

Resource Scan

Scan Details

Site Domain web.eecs.umich.edu
Base Domain umich.edu
Scan Status Ok
Last Scan 2025-07-31T23:20:37+00:00
Next Scan 2025-08-30T23:20:37+00:00

Last Scan

Scanned 2025-07-31T23:20:37+00:00
URL https://web.eecs.umich.edu/robots.txt
Domain IPs 141.212.113.214
Response IP 141.212.113.214
Found Yes
Hash 0a2429062938c0f36e85f3fc6b76f1a5639e1f7db11e52b502234a42c7de96ad
SimHash a80859574cd9

Groups

*

Rule Path
Disallow /~smbrain
Disallow /robots.txt
Disallow /courses/eecs484
Disallow /courses
Disallow /etc/
Disallow /~imarkov/5
Disallow /etc/calendar
Disallow /eecs/etc/calendar
Disallow /techday
Disallow /vlsipool
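
Taken together, the rules above imply a rule section like the following. This is a reconstruction from the scan data, not the verbatim file: the scan lists every rule under the user-agent "*", but it does not show whether the file repeats the User-agent line before each rule or lists all rules under a single record.

User-agent: *
Disallow: /~smbrain
Disallow: /robots.txt
Disallow: /courses/eecs484
Disallow: /courses
Disallow: /etc/
Disallow: /~imarkov/5
Disallow: /etc/calendar
Disallow: /eecs/etc/calendar
Disallow: /techday
Disallow: /vlsipool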

Comments

  • Description: Search engine exclusion file for http://www.eecs.umich.edu/
  • Author(s): DCO staff
  • Organization: University of Michigan EECS DCO
  • Created: 1996-12-06 22:10 EDT
  • Version: $Revision$
  • RCS id: $Id$
  • -----------------------------------------------------------------------------
  • Summary of the file format, drawn from the documentation available at http://info.webcrawler.com/mak/projects/robots/.
  • Each record contains lines of the form <field>:<optionalspace><value><optionalspace>. The field name is case-insensitive.
  • Comments can be included in a file using UNIX Bourne shell conventions: the '#' character indicates that the preceding space (if any) and the remainder of the line up to the line termination are discarded. Lines containing only a comment are discarded completely, and therefore do not indicate a record boundary.
  • A record starts with one or more User-agent lines, followed by one or more Disallow lines, as detailed below; unrecognised headers are ignored. (A parsing sketch in this spirit follows the list.)
  • User-agent: the value of this field is the name of the robot whose access policy the record describes. If more than one User-agent field is present, the record describes an identical access policy for more than one robot; at least one field needs to be present per record. The robot should be liberal in interpreting this field: a case-insensitive substring match of the name, without version information, is recommended. If the value is '*', the record describes the default access policy for any robot that has not matched any of the other records. It is not allowed to have multiple such records in the "/robots.txt" file.
  • Disallow: the value of this field specifies a partial URL that is not to be visited. This can be a full path or a partial path; any URL that starts with this value will not be retrieved. For example, Disallow: /help disallows both /help.html and /help/index.html, whereas Disallow: /help/ would disallow /help/index.html but allow /help.html. An empty value indicates that all URLs can be retrieved; at least one Disallow field needs to be present in a record.
  • The presence of an empty "/robots.txt" file has no explicit associated semantics; it will be treated as if it were not present, i.e. all robots will consider themselves welcome.
  • Examples: the following example "/robots.txt" file specifies that no robots should visit any URL starting with "/cyberworld/map/" or "/tmp/" (checked with Python's standard-library parser after the list):
  • User-agent: *
  • Disallow: /cyberworld/map/ # This is an infinite virtual URL space
  • Disallow: /tmp/ # these will soon disappear
  • -----------------------------------------------------------------------------
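
The parsing and matching rules summarized above can be illustrated with a minimal Python sketch. This is only an illustration of the format summary, not the scanner's implementation or a complete robots.txt parser; the names parse_robots and is_allowed are invented for the example.

def parse_robots(text):
    # Split robots.txt text into records of ([user-agents], [disallow paths]).
    records, agents, disallows = [], [], []
    for raw in text.splitlines():
        if raw.strip().startswith("#"):
            continue                          # comment-only line: no record boundary
        line = raw.split("#", 1)[0].strip()   # strip a trailing comment
        if not line:
            if agents or disallows:           # blank line ends the current record
                records.append((agents, disallows))
                agents, disallows = [], []
            continue
        field, _, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":             # field names are case-insensitive
            agents.append(value)
        elif field == "disallow":
            disallows.append(value)
        # unrecognised headers are ignored
    if agents or disallows:
        records.append((agents, disallows))
    return records

def is_allowed(records, robot_name, path):
    # Case-insensitive substring match on User-agent; '*' is the default record.
    robot = robot_name.lower()
    default = None
    for agents, disallows in records:
        if any(a == "*" for a in agents):
            default = disallows
        elif any(a.lower() in robot for a in agents):
            # Disallow is a prefix match; an empty value disallows nothing.
            return not any(path.startswith(d) for d in disallows if d)
    if default is not None:
        return not any(path.startswith(d) for d in default if d)
    return True                               # no matching record: allowed

Against the rule section reconstructed above, is_allowed(records, "ExampleBot/1.0", "/courses/eecs484/index.html") would return False, while a path such as "/research/" would be allowed.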
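
The example file from the comments can also be checked with Python's standard library, whose urllib.robotparser module implements the same exclusion rules:

from urllib.robotparser import RobotFileParser

lines = [
    "User-agent: *",
    "Disallow: /cyberworld/map/  # This is an infinite virtual URL space",
    "Disallow: /tmp/  # these will soon disappear",
]
rp = RobotFileParser()
rp.parse(lines)

print(rp.can_fetch("AnyBot", "/cyberworld/map/index.html"))  # False
print(rp.can_fetch("AnyBot", "/index.html"))                 # True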