sjf.edu
robots.txt

Robots Exclusion Standard data for sjf.edu

Resource Scan

Scan Details

Site Domain sjf.edu
Base Domain sjf.edu
Scan Status Ok
Last Scan 2024-10-29T04:27:06+00:00
Next Scan 2024-11-28T04:27:06+00:00

Last Scan

Scanned 2024-10-29T04:27:06+00:00
URL https://sjf.edu/robots.txt
Redirect https://www.sjf.edu/robots.txt
Redirect Domain www.sjf.edu
Redirect Base sjf.edu
Domain IPs 15.197.216.187, 99.83.216.214
Redirect IPs 3.222.17.243, 54.152.172.176, 54.157.167.50
Response IP 54.157.167.50
Found Yes
Hash 238ee543f89bf60fb4a76bd233c61fc79f1030aea95a9a662a46cc7ce4993d82
SimHash cc408045cb16
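
The Hash value above is 64 hexadecimal digits, which is the length of a SHA-256 digest. The report does not say how it is computed, so the following is only a sketch under the assumption that it is a SHA-256 over the raw response body; `content_hash` and `has_changed` are hypothetical helper names, not part of any scanner API. Comparing digests between scans is one simple way to detect that a robots.txt file has changed.

```python
import hashlib


def content_hash(body: bytes) -> str:
    """Return the SHA-256 hex digest of a response body (assumed hash scheme)."""
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_hash: str, body: bytes) -> bool:
    """Report whether a newly fetched body differs from the previously stored hash."""
    return content_hash(body) != previous_hash


# Illustrative body only; this is not the real sjf.edu file.
sample_body = b"User-agent: *\nAllow: /\n"
stored = content_hash(sample_body)
print(len(stored))                           # digest length in hex characters
print(has_changed(stored, sample_body))      # same bytes, so no change
print(has_changed(stored, sample_body + b"# edited\n"))
```

Note that an exact digest changes on any byte difference, including whitespace, which is presumably why the report also lists a separate SimHash for approximate comparison.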

Groups

terminalfour-nutch-spider

Rule Path
Allow /

*
funnelback crawler

No rules defined. All paths allowed.
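
As an illustration of how the groups above would be evaluated, here is a minimal sketch using Python's standard `urllib.robotparser`. The `ROBOTS_TXT` text is a hypothetical reconstruction from this report, not the actual file: it leaves out the 8 invalid lines, whose content the report does not show, and the `/admissions/` path and `ExampleBot` agent are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical reconstruction of the groups and sitemap record listed in this
# report. The real file also contains 8 invalid lines that are not shown here.
ROBOTS_TXT = """\
User-agent: terminalfour-nutch-spider
Allow: /

User-agent: *
User-agent: funnelback crawler

Sitemap: https://www.sjf.edu/sitemap.xml
"""


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# The named group has an explicit "Allow: /" rule.
print(rules.can_fetch("terminalfour-nutch-spider", "https://www.sjf.edu/admissions/"))

# Any other agent falls into a group with no rules, so nothing is disallowed.
print(rules.can_fetch("ExampleBot", "https://www.sjf.edu/admissions/"))

# Sitemap records are independent of groups (site_maps() needs Python 3.8+).
print(rules.site_maps())
```

Both checks come out as allowed, which matches the report: one group allows everything explicitly, and the other defines no rules at all.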

Other Records

Field Value
sitemap https://www.sjf.edu/sitemap.xml

Warnings

  • 8 invalid lines.