computerlist.biz
robots.txt

Robots Exclusion Standard data for computerlist.biz

Resource Scan

Scan Details

Site Domain computerlist.biz
Base Domain computerlist.biz
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Couldn't connect to server.
Last Scan 2025-12-14T12:04:55+00:00
Next Scan 2026-03-14T12:04:55+00:00

Last Successful Scan

Scanned 2025-05-19T01:44:09+00:00
URL https://computerlist.biz/robots.txt
Redirect https://www.computerlist.biz/robots.txt
Redirect Domain www.computerlist.biz
Redirect Base computerlist.biz
Domain IPs 87.118.100.62
Redirect IPs 87.118.100.62
Response IP 87.118.100.62
Found Yes
Hash 3183e3f2f15648794f085a51b168c2b670774270bd9ed53f0d9279add0b749e2
SimHash 2c94b583c764

Groups

adsbot-google
adsbot-google-mobile
adsbot-google-mobile-apps
apis-google
feedfetcher-google
duplexweb-google
google favicon
googlebot
googlebot-image
googlebot-mobile
googlebot-news
googlebot-video
google-read-aloud
googleweblight
storebot-google
googleother
applebot
bingbot
bingpreview
mediapartners
mediapartners-google
mmsnbot_mobile
msnbot
msnbot-media
adidxbot
duckduckbot
duplexweb-google
exabot
facebot
fast-webcrawler
ia_archiver
scooter
slurp
teoma
yahooseeker/m1a1-r2d2
yandexbot
yandex
swiftbot
ccbot/2.0

Rule Path
Allow /

*

Rule Path
Disallow /
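
Taken together, the two groups above act as an allowlist: the named crawlers get Allow /, and the catch-all * group gets Disallow /. The following is a minimal sketch of how a parser might evaluate that, using Python's standard urllib.robotparser on a shortened excerpt. Only three of the listed user agents are included, and SomeOtherBot is a hypothetical crawler name used for illustration.

```python
from urllib.robotparser import RobotFileParser

# Shortened excerpt of the reported groups; the real file lists many more agents.
ROBOTS_EXCERPT = """\
User-agent: googlebot
User-agent: bingbot
User-agent: applebot
Allow: /

User-agent: *
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_EXCERPT)
HOME = "https://www.computerlist.biz/"

# A crawler named in the first group matches that group's "Allow: /".
allowed_named = parser.can_fetch("googlebot", HOME)

# Any other crawler falls through to the "*" group and is refused.
# "SomeOtherBot" is a hypothetical name, not one taken from the report.
allowed_other = parser.can_fetch("SomeOtherBot", HOME)

print(allowed_named, allowed_other)
```

Parsers may differ on how they treat a second * group such as the one that follows. RFC 9309 says groups matching the same user agent are combined, while urllib.robotparser uses only the first * group it sees, so results for unlisted crawlers can depend on the parser.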

*

Rule Path
Allow /core/*.css$
Allow /core/*.css?
Allow /core/*.js$
Allow /core/*.js?
Allow /core/*.gif
Allow /core/*.jpg
Allow /core/*.jpeg
Allow /core/*.png
Allow /core/*.svg
Allow /profiles/*.css$
Allow /profiles/*.css?
Allow /profiles/*.js$
Allow /profiles/*.js?
Allow /profiles/*.gif
Allow /profiles/*.jpg
Allow /profiles/*.jpeg
Allow /profiles/*.png
Allow /profiles/*.svg
Disallow /core/
Disallow /profiles/
Disallow /README.txt
Disallow /web.config
Disallow /admin/
Disallow /comment/reply/
Disallow /filter/tips
Disallow /node/add/
Disallow /search/
Disallow /user/register
Disallow /user/password
Disallow /user/login
Disallow /user/logout
Disallow /media/oembed
Disallow /*/media/oembed
Disallow /index.php/admin/
Disallow /index.php/comment/reply/
Disallow /index.php/filter/tips
Disallow /index.php/node/add/
Disallow /index.php/search/
Disallow /index.php/user/password
Disallow /index.php/user/register
Disallow /index.php/user/login
Disallow /index.php/user/logout
Disallow /index.php/media/oembed
Disallow /index.php/*/media/oembed
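
Several rules in this group use the * wildcard and the $ end anchor. The sketch below shows one way such patterns could be matched, assuming RFC 9309 semantics: * matches any run of characters, a trailing $ anchors the end of the path, and the longest matching pattern wins, with Allow preferred on a tie. The helper names matches and is_allowed and the RULES excerpt are illustrative and are not part of the scanned file.

```python
import re

# Small excerpt of the rules reported for this group.
RULES = [
    ("allow", "/core/*.css$"),
    ("allow", "/core/*.css?"),
    ("disallow", "/core/"),
    ("disallow", "/admin/"),
    ("disallow", "/*/media/oembed"),
]


def matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches the given URL path."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape every literal piece (so "?" and "." stay literal), then
    # join the pieces with ".*" to stand in for each "*" wildcard.
    regex = ".*".join(re.escape(piece) for piece in body.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, so unanchored patterns act as prefixes.
    return re.match(regex, path) is not None


def is_allowed(path: str, rules=RULES) -> bool:
    """Apply the longest-match rule; paths matching no rule are allowed."""
    best = None  # (pattern length, is_allow); allow sorts above disallow on ties
    for kind, pattern in rules:
        if matches(pattern, path):
            candidate = (len(pattern), kind == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


print(is_allowed("/core/misc/normalize.css"))   # stylesheet under /core/
print(is_allowed("/core/install.php"))          # other file under /core/
```

Under these assumptions a stylesheet below /core/ is fetchable because /core/*.css$ is longer than /core/, while other files in that directory stay blocked. The ? in /core/*.css? is a literal question mark, which covers stylesheet URLs that carry a query string.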

Comments

  • CRAWLING ALLOWED ONLY FOR
  • https://www.keycdn.com/blog/web-crawlers
  • https://kinsta.com/de/blog/crawler-liste/#4-apple-bot
  • ORIGINAL DRUPAL 10.2.3
  • robots.txt
  • This file is to prevent the crawling and indexing of certain parts
  • of your site by web crawlers and spiders run by sites like Yahoo!
  • and Google. By telling these "robots" where not to go on your site,
  • you save bandwidth and server resources.
  • This file will be ignored unless it is at the root of your host:
  • Used: http://example.com/robots.txt
  • Ignored: http://example.com/site/robots.txt
  • For more information about the robots.txt standard, see:
  • http://www.robotstxt.org/robotstxt.html
  • CSS, JS, Images
  • Directories
  • Files
  • Paths (clean URLs)
  • Paths (no clean URLs)
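
The comments above note that a robots.txt file is only honoured at the root of the host. As a minimal sketch of what that means for a crawler, the hypothetical helper robots_url below derives the one location that would be consulted for any page URL, using only the standard library.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the host serving page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("http://example.com/site/robots.txt"))
```

A file placed at /site/robots.txt is therefore never looked up; only the root-level path is.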