newmanu.edu
robots.txt

Robots Exclusion Standard data for newmanu.edu

Resource Scan

Scan Details

Site Domain	newmanu.edu
Base Domain	newmanu.edu
Scan Status	Ok
Last Scan	2024-04-29T14:17:37+00:00
Next Scan	2024-05-29T14:17:37+00:00

Last Scan

Scanned	2024-04-29T14:17:37+00:00
URL	https://newmanu.edu/robots.txt
Domain IPs	104.26.2.138, 104.26.3.138, 172.67.69.152, 2606:4700:20::681a:28a, 2606:4700:20::681a:38a, 2606:4700:20::ac43:4598
Response IP	104.26.3.138
Found	Yes
Hash	1597f3f0b6b48b2a6f7f35d7ea2c2ba54f3fb1220e783f19b132a3ae9d518a23
SimHash	a31f1559436d

Groups

*

Rule	Path
Disallow	*.pdf$
Disallow	*.doc$
Disallow	*.docx$
Disallow	/ncate/
Disallow	/administrator/
Disallow	/bin/
Disallow	/cache/
Disallow	/cli/
Disallow	/components/
Disallow	/includes/
Disallow	/installation/
Disallow	/language/
Disallow	/layouts/
Disallow	/libraries/
Disallow	/logs/
Disallow	/modules/
Disallow	/plugins/
Disallow	/tmp/
Disallow	/secure/
Disallow	/blogs/
Disallow	/404-page-not-found
Disallow	/*?view=category*
Disallow	/*.php
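
The Disallow rules in this group use two wildcard conventions: `*` matches any run of characters, and a trailing `$` anchors the match to the end of the URL. As a rough illustration, the sketch below translates a few of the listed patterns into regular expressions and tests some paths against them. The function names and sample paths are hypothetical, and the sketch ignores Allow rules and longest-match precedence. It also treats patterns without a leading slash (such as `*.pdf$`) as plain wildcards, which real crawlers may handle differently.

```python
import re

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Everything else is literal and is matched from the start of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' where '*' appeared.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_disallowed(path, patterns):
    """Return True if any pattern matches the path (prefix match)."""
    return any(pattern_to_regex(p).match(path) for p in patterns)

# A subset of the rules listed for the '*' group.
RULES = ["*.pdf$", "/administrator/", "/*?view=category*", "/*.php"]

# Hypothetical sample paths, for illustration only.
for sample in ["/docs/catalog.pdf", "/administrator/index.html",
               "/news?view=category&id=3", "/about"]:
    print(sample, is_disallowed(sample, RULES))
```

Under these assumptions, only `/about` among the sample paths is left crawlable.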

Other Records

Field	Value
sitemap	https://newmanu.edu/sitemap

Comments

If the Joomla site is installed within a folder such as at
e.g. www.example.com/joomla/ the robots.txt file MUST be
moved to the site root at e.g. www.example.com/robots.txt
AND the joomla folder name MUST be prefixed to the disallowed
path, e.g. the Disallow rule for the /administrator/ folder
MUST be changed to read Disallow: /joomla/administrator/
For more information about the robots.txt standard, see:
http://www.robotstxt.org/orig.html
For syntax checking, see:
http://tool.motoricerca.info/robots-checker.phtml
if SEF URLS ARE NOT ACTIVE THEN REMOVE THE BELOW LINE!!!
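
The subfolder advice in the comment above (prefix the Joomla folder name to each disallowed path) could be sketched as follows. This is a minimal illustration, not part of the scanned file: `prefix_disallow_paths` is a hypothetical helper, and it only rewrites Disallow paths that begin with `/`, leaving wildcard-only patterns and other records untouched.

```python
def prefix_disallow_paths(lines, folder):
    """Prefix the folder name to every Disallow path that starts with '/'."""
    prefix = "/" + folder.strip("/")
    out = []
    for line in lines:
        # Split "Field: value" at the first colon only.
        field, sep, value = line.partition(":")
        path = value.strip()
        if sep and field.strip().lower() == "disallow" and path.startswith("/"):
            out.append("Disallow: " + prefix + path)
        else:
            # Leave User-agent, Sitemap, and non-slash patterns unchanged.
            out.append(line)
    return out

original = ["User-agent: *", "Disallow: /administrator/", "Disallow: /cache/"]
for line in prefix_disallow_paths(original, "joomla"):
    print(line)
```

With the folder name `joomla`, the `/administrator/` rule becomes `Disallow: /joomla/administrator/`, matching the example given in the comment.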
