automat.icu
robots.txt

Robots Exclusion Standard data for automat.icu

Resource Scan

Scan Details

Site Domain automat.icu
Base Domain automat.icu
Scan Status Ok
Last Scan 2025-10-18T21:21:45+00:00
Next Scan 2025-11-17T21:21:45+00:00

Last Scan

Scanned 2025-10-18T21:21:45+00:00
URL https://automat.icu/robots.txt
Domain IPs 149.202.77.211, 2001:41d0:d:36d3::1
Response IP 149.202.77.211
Found Yes
Hash 024ecf653779b5a4668a76bcb9badbe737e0ac4ddf86662ee013e1b81ab0388a
SimHash e6855ba532f1

Groups

googlebot
adsbot-google
adsbot-google-mobile
adsbot-google-mobile-apps
google favicon
googlebot-news
googlebot-image
googlebot-video
mediapartners-google
apis-google
duplexweb-google
bingbot
slurp
duckduckbot
baiduspider
ahrefsbot
rogerbot
yandexbot
dotbot
twitterbot
bingpreview
linkedinbot
yandexbot
facebot
facebookexternalhit
msnbot
msnbot-media

Rule Path
Allow /
Allow /matomo.php
Allow /piwik.php
Allow /matomo.js
Allow /piwik.js
Allow /js/
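
The group and rules above can be checked programmatically. The following is a minimal sketch, not the scanner's own method: it rebuilds a robots.txt fragment from a subset of the listed user agents and the six Allow rules, then queries it with Python's standard urllib.robotparser. The fragment text and the helper name can_fetch are assumptions made for illustration; the live file at https://automat.icu/robots.txt may differ in layout.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of the scanned group (only a few of the
# listed user agents are included to keep the example short).
ROBOTS_TXT = """\
User-agent: googlebot
User-agent: bingbot
User-agent: duckduckbot
Allow: /
Allow: /matomo.php
Allow: /piwik.php
Allow: /matomo.js
Allow: /piwik.js
Allow: /js/
"""


def can_fetch(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, "https://automat.icu" + path)


if __name__ == "__main__":
    for agent in ("googlebot", "bingbot", "unlistedbot"):
        print(agent, can_fetch(agent, "/js/app.js"))
```

Because the group contains only Allow rules, every listed agent is permitted everywhere, and an agent that matches no group is permitted by default, so the more specific Allow lines do not change the outcome. Note that urllib.robotparser applies the first matching rule rather than the longest match, which gives the same result here.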

Comments

  • See http://www.robotstxt.org/orig.html for documentation on how to use the robots.txt file
  • To ban all spiders from the entire site uncomment the next two lines:
  • User-Agent: *
  • Disallow: /
  • To ban all spiders from only specific directories such as /people /u or /tag etc.
  • User-Agent: *
  • Disallow: /people/
  • Disallow: /u/
  • Disallow: /camo/
  • Disallow: /
  • Disallow: /people/
  • Disallow: /u/
  • Disallow: /camo/
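
As a hedged illustration of the first commented-out option, the sketch below shows what uncommenting the "User-Agent: *" and "Disallow: /" lines would do when an explicit group is also present. The fragment and the helper name allowed_with_ban are hypothetical, and only two of the listed agents are included. Under the usual group-selection behaviour, an agent named in its own group follows that group, so the wildcard ban would only reach agents that are not listed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical variant: the named group from the scan plus the
# wildcard ban described in the comments, uncommented.
ROBOTS_WITH_BAN = """\
User-agent: googlebot
User-agent: bingbot
Allow: /

User-agent: *
Disallow: /
"""


def allowed_with_ban(agent: str, path: str = "/") -> bool:
    """Return True if `agent` may fetch `path` under ROBOTS_WITH_BAN."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_WITH_BAN.splitlines())
    return parser.can_fetch(agent, "https://automat.icu" + path)


if __name__ == "__main__":
    print("googlebot:", allowed_with_ban("googlebot"))
    print("examplebot:", allowed_with_ban("examplebot"))
```

In this sketch the named agents stay allowed while any other agent is blocked, so banning every spider would also require removing or changing the named group.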