johnlarkin.me
robots.txt

Robots Exclusion Standard data for johnlarkin.me

Resource Scan

Scan Details

Site Domain johnlarkin.me
Base Domain johnlarkin.me
Scan Status Ok
Last Scan 2026-02-12T09:51:41+00:00
Next Scan 2026-03-14T09:51:41+00:00

Last Scan

Scanned 2026-02-12T09:51:41+00:00
URL http://johnlarkin.me/robots.txt
Redirect http://www.johnlarkin.me/robots.txt
Redirect Domain www.johnlarkin.me
Redirect Base johnlarkin.me
Domain IPs 199.34.228.58
Redirect IPs 199.34.228.58
Response IP 199.34.228.58
Found Yes
Hash 2f61f51fbaad9004b2418f729ee81af655773f8e41643dd1bdd86b5f10e38b7d
SimHash 187191dbc5d3

Groups

nerdybot

Rule Path
Disallow /

*

Rule Path
Disallow /ajax/
Disallow /apps/
Disallow /blog-template.html
Disallow /http%3A//johnlarkin.me/projects
Disallow /http%3A//twinteract.com
Disallow /facebook-advertising-for-ecommerce-merchants.html
Disallow /http%3A//www.recaply.com
Disallow /using-data-to-gain-an-unfair-advantage-in-advertising.html
Disallow /free-dropbox-space.html
Disallow /http%3A//twitquest.com
Disallow /http%3A//photoapp.johnlarkin.me
Disallow /http%3A//tatch.me
Disallow /https%3A//www.facebook.com/OfficeRageApocalypse
Disallow /https%3A//github.com/jlarkin353
Disallow /image-dump.html
Disallow /resizable.html
Disallow /authorshipform.html
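
To illustrate how a crawler might interpret the groups above, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt text in it is a hypothetical reconstruction of only a few of the listed rules (the real file's exact layout is not shown in this scan), and "examplebot" and the helper name allowed are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical excerpt rebuilt from the groups listed above; the actual
# file may order or format its records differently.
ROBOTS_TXT = """\
User-agent: nerdybot
Disallow: /

User-agent: *
Disallow: /ajax/
Disallow: /apps/
Disallow: /blog-template.html
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

BASE = "http://www.johnlarkin.me"


def allowed(agent, path):
    """Return True if `agent` may fetch `path` under the rules above."""
    return parser.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    # "examplebot" is an assumed name for any crawler not listed by name.
    for agent, path in [
        ("nerdybot", "/"),
        ("examplebot", "/"),
        ("examplebot", "/ajax/data"),
        ("examplebot", "/blog-template.html"),
    ]:
        print(agent, path, allowed(agent, path))
```

Under these assumptions, nerdybot is blocked from every path, while other crawlers are blocked only from the listed prefixes. Matching here is by simple path prefix, which is how this standard-library parser treats Disallow rules.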

Other Records

Field Value
sitemap http://www.johnlarkin.me/sitemap.xml
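
The sitemap record can be read with the same standard-library parser. The sketch below assumes the record appears in the file as a conventional "Sitemap:" line, which this scan does not show verbatim, and uses RobotFileParser.site_maps(), available in Python 3.8 and later.

```python
from urllib.robotparser import RobotFileParser

# Assumed form of the record as it might appear in robots.txt.
lines = ["Sitemap: http://www.johnlarkin.me/sitemap.xml"]

sitemap_parser = RobotFileParser()
sitemap_parser.parse(lines)

# site_maps() returns a list of sitemap URLs, or None if there are none.
sitemaps = sitemap_parser.site_maps()

if __name__ == "__main__":
    print(sitemaps)
```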