mattwalshblog.com
robots.txt

Robots Exclusion Standard data for mattwalshblog.com

Resource Scan

Scan Details

Site Domain mattwalshblog.com
Base Domain mattwalshblog.com
Scan Status Failed
Failure Reason Scan timed out.
Last Scan 2024-10-25T19:18:36+00:00
Next Scan 2024-12-24T19:18:36+00:00
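
The gap between the failed scan and the next scheduled one can be checked directly from the two timestamps. The sketch below only does the date arithmetic; whether the scanner actually uses a fixed rescan interval is an assumption, not something the report states. The variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the scan details in this report.
last_scan = datetime.fromisoformat("2024-10-25T19:18:36+00:00")
next_scan = datetime.fromisoformat("2024-12-24T19:18:36+00:00")
last_success = datetime.fromisoformat("2021-10-12T07:10:53+00:00")

# Time between the failed scan and the next scheduled attempt.
interval = next_scan - last_scan
print(interval.days)  # 60

# Age of the last successful scan at the time of the failed one.
staleness = last_scan - last_success
print(staleness.days > 3 * 365)  # True
```

The second figure matters when reading the groups further down: they reflect the robots.txt as fetched in 2021, more than three years before the most recent scan attempt.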

Last Successful Scan

Scanned 2021-10-12T07:10:53+00:00
URL http://mattwalshblog.com/robots.txt
Redirect https://themattwalshblog.com/robots.txt
Redirect Domain themattwalshblog.com
Redirect Base themattwalshblog.com
Found Yes
Hash c09ad468349ead6b87da30ddb5b28ba31d11d7fae6c954c43f0a77ff741f04be
SimHash 6e025a85c2b7

Groups

bingbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 120

msnbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 120

msnbot-media

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 120
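
As a rough illustration of what a crawl-delay of 120 implies, the sketch below converts a delay into an upper bound on requests per day. Crawl-delay is not part of the Robots Exclusion Protocol (RFC 9309) and crawlers differ in how, or whether, they honor it; the code assumes the common reading of "seconds to wait between successive requests". The function name is hypothetical.

```python
def max_fetches_per_day(crawl_delay_seconds: int) -> int:
    """Upper bound on daily requests if a crawler waits the full delay
    between successive requests (an assumed interpretation of crawl-delay)."""
    seconds_per_day = 24 * 60 * 60
    return seconds_per_day // crawl_delay_seconds

print(max_fetches_per_day(120))  # 720
```

Under the same assumption, the crawl-delay of 30 in the `*` group works out to at most 2880 requests per day.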

*

Rule Path
Disallow /wp-content/uploads/*
Disallow /?s=
Disallow /search/
Disallow /wp-login.php
Allow /wp-admin/admin-ajax.php

Other Records

Field Value
crawl-delay 30
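
To show how the five rules in this group combine, here is a minimal sketch of path matching in the style of RFC 9309: `*` matches any run of characters, the longest matching pattern wins, and Allow wins a tie. It is not the scanner's own logic, it skips percent-encoding normalization, and `RULES`, `pattern_to_regex`, and `is_allowed` are names made up for this example.

```python
import re

# Rules of the "*" group as listed above: (rule type, path pattern).
RULES = [
    ("disallow", "/wp-content/uploads/*"),
    ("disallow", "/?s="),
    ("disallow", "/search/"),
    ("disallow", "/wp-login.php"),
    ("allow", "/wp-admin/admin-ajax.php"),
]

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern: '*' matches any run of
    characters, a trailing '$' anchors the end, the rest is literal."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_allowed(path: str, rules=RULES) -> bool:
    """Longest matching pattern wins; Allow wins a tie; no match means allowed."""
    best_len, allowed = -1, True
    for kind, pattern in rules:
        # re.match anchors at the start, so patterns act as path prefixes.
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len, allowed = length, kind == "allow"
    return allowed

print(is_allowed("/wp-content/uploads/2021/10/photo.jpg"))  # False
print(is_allowed("/?s=news"))                               # False
print(is_allowed("/wp-admin/admin-ajax.php"))               # True
print(is_allowed("/about/"))                                # True
```

Because no rule in this group disallows /wp-admin/, the Allow line for admin-ajax.php does not change the outcome here; it would only matter if a broader Disallow were added.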

twitterbot

Rule Path
Allow *

facebookexternalhit

Rule Path
Allow *

facebot

Rule Path
Allow *

baiduspider
baiduspider-image
baiduspider-video
baiduspider-news
baiduspider-favo
baiduspider-ads
baiduspider-cpro
genieo
hoaxybot
laserlikebot
semrushbot
seoscanners.net
seznambot
spbot
storygizebot
yandex
yandexbot
yandeximages
yandexmobilebot

Rule Path
Disallow /
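
The long list of agents above shares a single rule set. The sketch below illustrates, under simplifying assumptions, how a crawler might pick its group: an exact, case-insensitive match on the product token, falling back to `*` when nothing matches. Real crawlers vary (some use prefix or substring matching), only a subset of the report's groups is included, and `GROUPS` and `select_group` are hypothetical names.

```python
# Subset of the groups in this report: user-agent tokens -> list of rules.
GROUPS = {
    ("bingbot",): [],  # no rules defined: all paths allowed
    ("twitterbot",): [("allow", "*")],
    ("semrushbot", "seznambot", "yandex", "yandexbot", "baiduspider"): [
        ("disallow", "/"),
    ],
    ("*",): [
        ("disallow", "/wp-content/uploads/*"),
        ("disallow", "/?s="),
        ("disallow", "/search/"),
        ("disallow", "/wp-login.php"),
        ("allow", "/wp-admin/admin-ajax.php"),
    ],
}

def select_group(product_token: str):
    """Return the rules for the group naming this token, else the '*' group."""
    token = product_token.lower()
    fallback = None
    for agents, rules in GROUPS.items():
        if token in agents:
            return rules
        if "*" in agents:
            fallback = rules
    return fallback

print(select_group("SemrushBot"))  # [('disallow', '/')]
print(select_group("bingbot"))     # []
```

A token that appears in no group, such as a hypothetical "ExampleBot", receives the `*` rules rather than the blanket Disallow.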

googlebot-image

Rule Path
Disallow /

Other Records

Field Value
sitemap https://themattwalshblog.com/sitemap_index.xml