engineeringmix.discussion.community
robots.txt

Robots Exclusion Standard data for engineeringmix.discussion.community

Resource Scan

Scan Details

Site Domain engineeringmix.discussion.community
Base Domain discussion.community
Scan Status Ok
Last Scan 2025-10-21T01:04:50+00:00
Next Scan 2025-10-28T01:04:50+00:00

Last Scan

Scanned 2025-10-21T01:04:50+00:00
URL https://engineeringmix.discussion.community/robots.txt
Domain IPs 107.21.35.214, 18.213.166.18
Response IP 18.213.166.18
Found Yes
Hash 3c66c913a67966d68bc0c2bbdd414a02f511d9eacc59ad0ed7b3d49d63d80f2b
SimHash 7a429272d5c0

Groups

bubing
alphaseobot
ltx71
companybook-crawler
bdcbot
spbot
semrushbot
siteauditbot
ahrefsbot
mj12bot
dotbot
omgili
blexbot
magpie-crawler
extlinksbot
netseer
weborama-fetcher
linkfluence
sentibot
seokicks
ccbot
trendictionbot
amazonbot
serpstatbot
petalbot
dataforseobot
censysinspect
awariorssbot
awariosmartbot
awariobot
phxbot
bytespider
bl.uk_lddc_bot
ecoresearchcrawler
turnitinbot
zoominfobot
dataprovider
velenpublicwebcrawler
domainstatsbot
hypestat
panscient.com
yak
lcc
makemerrybot
ioncrawl
googleother
anderspinkbot
friendlycrawler
gptbot
meta-externalagent
chatglm
linkwalker
checkmarknetwork/1.0 (+https://www.checkmarknetwork.com/spider.html)
facebookexternalhit
surdotlybot
aliyunsecbot
imagesiftbot
timpibot
geedoproductsearch
barkrowler
scrapy
yacybot
archive.org_bot
zoombot
builtwith

Rule Path
Disallow /

googlebot-image
googleother-image
googleother-video

Rule Path
Disallow /
Allow /site/images/support/
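
The group above pairs a blanket Disallow with a narrower Allow. As a minimal sketch, assuming the crawler resolves conflicts by the longest-match rule described in RFC 9309 (the most specific matching path wins, and Allow wins a tie), the outcome can be worked out as follows. The function name is_allowed and the sample paths are hypothetical and chosen only for illustration.

```python
def is_allowed(path, rules):
    """Decide access by longest matching prefix (assumed RFC 9309 behaviour).

    rules is a list of (kind, prefix) pairs, kind being "allow" or "disallow".
    Wildcards are not handled here; the two rules in this group have none.
    """
    best_len = -1
    allowed = True  # a path that matches no rule is allowed
    for kind, prefix in rules:
        if not path.startswith(prefix):
            continue
        longer = len(prefix) > best_len
        tie_for_allow = len(prefix) == best_len and kind == "allow"
        if longer or tie_for_allow:
            best_len = len(prefix)
            allowed = (kind == "allow")
    return allowed


# Rules copied from the googlebot-image / googleother-image / googleother-video group.
IMAGE_GROUP = [("disallow", "/"), ("allow", "/site/images/support/")]

support_icon = is_allowed("/site/images/support/icon.png", IMAGE_GROUP)
other_image = is_allowed("/images/avatar.png", IMAGE_GROUP)
```

Under that assumption, only paths below /site/images/support/ remain open to these three agents; everything else falls back to the Disallow on /.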

mediapartners-google

Rule Path
Disallow /images/
Disallow /pm
Disallow /album
Disallow /mbactions
Disallow /register
Disallow /email
Disallow /search
Disallow /profile
Disallow /file
Disallow /files
Disallow /thumb
Disallow /upload
Disallow /printthread
Disallow /post/show_single_post
Disallow /post/printadd
Disallow /post/upost
Disallow /post/hpt
Disallow /post/hpti
Disallow /subscribe
Disallow /calendar
Disallow /calendar/newevent
Disallow /calendar/daydetail?*&nav=
Disallow /calendar/display*view%3Dweekly
Disallow /calendar/display*view%3Dmonthly
Disallow /calendar/showbirthday
Disallow /external
Disallow /tool/view/gb/private
Disallow /tool/view/gb/email
Disallow /tool/pm
Disallow /tool/members/
Disallow /tool/ticket/
Disallow /cgi/view/poll.cgi
Disallow /cgi/view/topsites.cgi
Disallow /cgi/view/member.cgi
Disallow /cgi/view/out.cgi
Disallow /?authtoken=
Disallow /chat
Disallow /tags

Other Records

Field Value
crawl-delay 15

*

Rule Path
Disallow /images/
Disallow /pm
Disallow /album
Disallow /mbactions
Disallow /register
Disallow /email
Disallow /search
Disallow /profile
Disallow /file
Disallow /files
Disallow /thumb
Disallow /upload
Disallow /printthread
Disallow /post/show_single_post
Disallow /post/printadd
Disallow /post/upost
Disallow /post/hpt
Disallow /post/hpti
Disallow /subscribe
Disallow /calendar
Disallow /calendar/newevent
Disallow /calendar/daydetail?*&nav=
Disallow /calendar/display*view%3Dweekly
Disallow /calendar/display*view%3Dmonthly
Disallow /calendar/showbirthday
Disallow /external
Disallow /tags
Disallow /tool/view/gb/private
Disallow /tool/view/gb/email
Disallow /tool/pm
Disallow /tool/members/
Disallow /tool/ticket/
Disallow /cgi/view/poll.cgi
Disallow /cgi/view/topsites.cgi
Disallow /cgi/view/member.cgi
Disallow /cgi/view/out.cgi
Disallow /?authtoken=
Disallow /contact?*subject=
Disallow /post*?goto=
Disallow /post*%26goto%3D
Disallow /post*?id=
Disallow *?*&sort=
Disallow *?sort=
Disallow *?full_version=
Disallow *?*full_version=
Disallow /?s=
Disallow /?t=
Disallow /?p=
Disallow /?id=
Disallow /?profile=
Disallow /*.jpg$
Disallow /*.jpeg$
Disallow /*.gif$
Disallow /*.png$
Disallow /tool/members/login?action=logout
Disallow /new$
Disallow /top$
Disallow /top$
Disallow /top?period=
Disallow /chat
Allow /site/images/support/
Allow /images/favicon/
Disallow /contact/
Disallow /support/
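
Several paths in this group use the * wildcard and the $ end anchor. The sketch below shows one way to evaluate such patterns, assuming the common interpretation in which * matches any run of characters and a trailing $ pins the pattern to the end of the URL path and query. The helper names pattern_to_regex and matches are hypothetical, and percent-encoding normalisation is deliberately left out.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regular expression."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape every literal piece, then rejoin the pieces with ".*" for each "*".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def matches(pattern, path):
    """Return True if the pattern matches from the start of the path."""
    return pattern_to_regex(pattern).match(path) is not None


# Patterns taken from the * group above; the sample paths are invented.
jpg_plain = matches("/*.jpg$", "/photos/cat.jpg")
jpg_query = matches("/*.jpg$", "/photos/cat.jpg?size=large")
top_exact = matches("/top$", "/top")
top_longer = matches("/top$", "/topics")
sort_param = matches("*?sort=", "/forum?sort=asc")
```

With this reading, /top$ blocks only the exact path /top, which is why the group also lists /top?period= separately.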

Other Records

Field Value
crawl-delay 15

Other Records

Field Value
sitemap https://engineeringmix.discussion.community/sitemap.xml
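
To illustrate how a client might pick its group and read the crawl-delay record, here is a small sketch using Python's standard urllib.robotparser. The EXCERPT text is a hand-written reconstruction of a few records from this report, not the site's literal file, and the agent name examplebot is hypothetical. Note that this parser does plain prefix matching and applies the first matching rule, so it does not model the wildcard patterns listed above.

```python
from urllib.robotparser import RobotFileParser

# Hand-written excerpt reconstructed from the report (an assumption, not the real file).
EXCERPT = """\
User-agent: gptbot
Disallow: /

User-agent: *
Disallow: /pm
Disallow: /search
Crawl-delay: 15

Sitemap: https://engineeringmix.discussion.community/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(EXCERPT.splitlines())

# An agent without its own group falls back to the * group.
delay = parser.crawl_delay("examplebot")
generic_topic_ok = parser.can_fetch("examplebot", "/topic/1")
generic_search_ok = parser.can_fetch("examplebot", "/search")

# An agent named in the first group is blocked everywhere.
gptbot_topic_ok = parser.can_fetch("gptbot", "/topic/1")
```

A polite client in the * group would wait the reported 15 seconds between requests, for example by sleeping for delay seconds after each fetch.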

Comments

  • Most desirable images are hosted on S3. These images will just be icons and stuff, so block them.
  • allowing access to topic pages for proper context ads
  • Disallow pages with very little content, duplicate content, or different links pointing to the same content