websitetoolbox.com
robots.txt

Robots Exclusion Standard data for websitetoolbox.com

Resource Scan

Scan Details

Site Domain websitetoolbox.com
Base Domain websitetoolbox.com
Scan Status Ok
Last Scan 2024-06-08T06:36:37+00:00
Next Scan 2024-06-15T06:36:37+00:00

Last Scan

Scanned 2024-06-08T06:36:37+00:00
URL https://websitetoolbox.com/robots.txt
Redirect https://www.websitetoolbox.com/robots.txt
Redirect Domain www.websitetoolbox.com
Redirect Base websitetoolbox.com
Domain IPs 107.21.35.214, 18.213.166.18
Redirect IPs 107.21.35.214, 18.213.166.18
Response IP 107.21.35.214
Found Yes
Hash c5112a3e0eb12aad0aaa07c4096d9c01c69ccfca70a5def01b861d629f3d00ef
SimHash 6842d2d0f5c0

Groups

bubing
alphaseobot
ltx71
companybook-crawler
bdcbot
spbot
semrushbot
siteauditbot
ahrefsbot
mj12bot
dotbot
omgili
blexbot
magpie-crawler
extlinksbot
netseer
weborama-fetcher
linkfluence
sentibot
seokicks
barkrowler
ccbot
trendictionbot
amazonbot
serpstatbot
petalbot
dataforseobot
censysinspect
awariorssbot
awariosmartbot
awariobot
phxbot
bytespider
bl.uk_lddc_bot
ecoresearchcrawler
turnitinbot
zoominfobot
dataprovider
velenpublicwebcrawler
domainstatsbot
hypestat
panscient.com
yak
lcc
makemerrybot
ioncrawl
googleother
anderspinkbot
checkmarknetwork/1.0 (+https://www.checkmarknetwork.com/spider.html)

Rule Path
Disallow /

googlebot-image
googleother-image
googleother-video

Rule Path
Disallow /
Allow /site/images/support/
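
The group above pairs a blanket `Disallow /` with a narrower `Allow /site/images/support/`. Under the longest-match convention described in RFC 9309, the more specific rule wins, so these crawlers may fetch only the support images. The following is a minimal sketch of that precedence for plain prefix rules; the function name `longest_match_verdict` and the `IMAGE_BOT_RULES` list are hypothetical, introduced only for illustration, and individual crawlers may resolve conflicts differently.

```python
def longest_match_verdict(path, rules):
    """Return True if `path` may be fetched under plain prefix rules.

    `rules` is a list of (kind, prefix) pairs, where kind is "allow" or
    "disallow". The longest matching prefix decides; on a tie in length,
    "allow" wins. With no matching rule, the path is allowed.
    """
    best_len = -1
    verdict = True
    for kind, prefix in rules:
        if not path.startswith(prefix):
            continue
        longer = len(prefix) > best_len
        tie_for_allow = len(prefix) == best_len and kind == "allow"
        if longer or tie_for_allow:
            best_len = len(prefix)
            verdict = (kind == "allow")
    return verdict


# Hypothetical encoding of the googlebot-image group listed above.
IMAGE_BOT_RULES = [
    ("disallow", "/"),
    ("allow", "/site/images/support/"),
]

support_ok = longest_match_verdict("/site/images/support/logo.png", IMAGE_BOT_RULES)
forum_ok = longest_match_verdict("/forum/topic", IMAGE_BOT_RULES)
```

Here `support_ok` is true because the 21-character Allow prefix outranks the 1-character Disallow prefix, while `forum_ok` is false because only `Disallow /` matches.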

facebookexternalhit

Rule Path
Disallow /images/
Disallow /pm
Disallow /album
Disallow /mbactions
Disallow /register
Disallow /email
Disallow /search
Disallow /upload
Disallow /printthread
Disallow /post/show_single_post
Disallow /post/printadd
Disallow /post/upost
Disallow /post/hpt
Disallow /subscribe
Disallow /calendar
Disallow /calendar/newevent
Disallow /calendar/daydetail?*&nav=
Disallow /calendar/display*view%3Dweekly
Disallow /calendar/display*view%3Dmonthly
Disallow /calendar/showbirthday
Disallow /external
Disallow /tool/view/gb/private
Disallow /tool/view/gb/email
Disallow /tool/pm
Disallow /tool/members/
Disallow /tool/ticket/
Disallow /cgi/view/poll.cgi
Disallow /cgi/view/topsites.cgi
Disallow /cgi/view/member.cgi
Disallow /cgi/view/out.cgi
Disallow /?authtoken=
Disallow *?*&sort=
Disallow *?sort=
Allow /site/images/support/

Other Records

Field Value
crawl-delay 15
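
The `crawl-delay 15` record asks a crawler to wait 15 seconds between requests. It is a non-standard extension that some crawlers ignore. As a rough sketch, Python's standard `urllib.robotparser` can read the value back; the `SAMPLE` text below is a hand-written excerpt assumed to resemble the original file, not a copy of it.

```python
from urllib.robotparser import RobotFileParser

# Hand-written excerpt (an assumption), shaped like the group listed above.
SAMPLE = """User-agent: facebookexternalhit
Disallow: /pm
Crawl-delay: 15
"""

parser = RobotFileParser()
parser.parse(SAMPLE.splitlines())

# Delay in seconds for a matching agent; None when no group applies.
delay = parser.crawl_delay("facebookexternalhit")
no_delay = parser.crawl_delay("otherbot")
```

Note that `urllib.robotparser` does not interpret the `*` and `$` wildcards used elsewhere in this file, so it is suitable here only for reading the delay value.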

mediapartners-google

Rule Path
Disallow /images/
Disallow /pm
Disallow /album
Disallow /mbactions
Disallow /register
Disallow /email
Disallow /search
Disallow /profile
Disallow /file
Disallow /thumb
Disallow /upload
Disallow /printthread
Disallow /post/show_single_post
Disallow /post/printadd
Disallow /post/upost
Disallow /post/hpt
Disallow /subscribe
Disallow /calendar
Disallow /calendar/newevent
Disallow /calendar/daydetail?*&nav=
Disallow /calendar/display*view%3Dweekly
Disallow /calendar/display*view%3Dmonthly
Disallow /calendar/showbirthday
Disallow /external
Disallow /tool/view/gb/private
Disallow /tool/view/gb/email
Disallow /tool/pm
Disallow /tool/members/
Disallow /tool/ticket/
Disallow /cgi/view/poll.cgi
Disallow /cgi/view/topsites.cgi
Disallow /cgi/view/member.cgi
Disallow /cgi/view/out.cgi
Disallow /?authtoken=

Other Records

Field Value
crawl-delay 15

*

Rule Path
Disallow /images/
Disallow /pm
Disallow /album
Disallow /mbactions
Disallow /register
Disallow /email
Disallow /search
Disallow /profile
Disallow /file
Disallow /thumb
Disallow /upload
Disallow /printthread
Disallow /post/show_single_post
Disallow /post/printadd
Disallow /post/upost
Disallow /post/hpt
Disallow /subscribe
Disallow /calendar
Disallow /calendar/newevent
Disallow /calendar/daydetail?*&nav=
Disallow /calendar/display*view%3Dweekly
Disallow /calendar/display*view%3Dmonthly
Disallow /calendar/showbirthday
Disallow /external
Disallow /tags
Disallow /tool/view/gb/private
Disallow /tool/view/gb/email
Disallow /tool/pm
Disallow /tool/members/
Disallow /tool/ticket/
Disallow /cgi/view/poll.cgi
Disallow /cgi/view/topsites.cgi
Disallow /cgi/view/member.cgi
Disallow /cgi/view/out.cgi
Disallow /?authtoken=
Disallow /contact?*subject=
Disallow /post*?goto=
Disallow /post*%26goto%3D
Disallow /post*?id=
Disallow *?*&sort=
Disallow *?sort=
Disallow *?full_version=
Disallow *?*full_version=
Disallow /?s=
Disallow /?t=
Disallow /?p=
Disallow /?id=
Disallow /?profile=
Disallow /*.jpg$
Disallow /*.jpeg$
Disallow /*.gif$
Disallow /*.png$
Disallow /tool/members/login?action=logout
Disallow /new$
Disallow /top$
Disallow /top$
Disallow /top?period=
Allow /site/images/support/
Allow /tool/members/signup
Allow /tool/members/login
Allow /images/favicon/

Other Records

Field Value
crawl-delay 15
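
Several paths in the `*` group use wildcards: `*` stands for any run of characters, and a trailing `$` anchors the pattern to the end of the URL, so `/new$` blocks `/new` but not `/newest`. The sketch below translates such a pattern into a regular expression to show the matching behaviour. The helper names `pattern_to_regex` and `matches` are hypothetical, and the sketch assumes the path is already in the same percent-encoded form as the pattern (it performs no normalization).

```python
import re


def pattern_to_regex(pattern):
    """Compile a robots.txt path pattern into a regex anchored at the start."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" for each "*".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def matches(pattern, path):
    """Return True if `path` (including any query string) matches `pattern`."""
    return pattern_to_regex(pattern).match(path) is not None


# Patterns taken from the group above; the paths are made-up examples.
checks = [
    matches("/*.jpg$", "/album/photo.jpg"),
    matches("/*.jpg$", "/album/photo.jpg?size=large"),
    matches("/new$", "/new"),
    matches("/new$", "/newest"),
    matches("*?sort=", "/forum?sort=date"),
    matches("/top?period=", "/top?period=week"),
]
```

Without the `$`, a pattern behaves as a prefix match, which is why `/top?period=` covers every value of the `period` parameter.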

Comments

  • Most desirable images are hosted on S3. These images will just be icons and stuff, so block them.
  • allowing access to image scripts and topic pages for proper sharing
  • allowing access to topic pages for proper context ads
  • Disallow pages with very little content, duplicate content, or different links pointing to the same content
  • End