forums.fark.com
robots.txt

Robots Exclusion Standard data for forums.fark.com

Resource Scan

Scan Details

Site Domain forums.fark.com
Base Domain fark.com
Scan Status Ok
Last Scan 2024-05-31T10:08:54+00:00
Next Scan 2024-06-14T10:08:54+00:00

Last Scan

Scanned 2024-05-31T10:08:54+00:00
URL https://forums.fark.com/robots.txt
Redirect https://www.fark.com/robots.txt
Redirect Domain www.fark.com
Redirect Base fark.com
Domain IPs 104.21.82.150, 172.67.203.12, 2606:4700:3030::ac43:cb0c, 2606:4700:3035::6815:5296
Redirect IPs 104.21.82.150, 172.67.203.12, 2606:4700:3030::ac43:cb0c, 2606:4700:3035::6815:5296
Response IP 172.67.203.12
Found Yes
Hash 041502eecf30c887d84954a3301b36ac5a38482698b1a2a95bfbf224f8284bf8
SimHash bc359971c5d3

Groups

googlebot

Rule Path
Disallow /nospam
Disallow /nospam/
Disallow /nomirror
Disallow /nomirror/
Disallow /admin/admin.php
Disallow /submit
Disallow /comments/8008135
Disallow /confirm
Disallow /unsub
Disallow /passwordreset
Disallow /ajax
Disallow /login
Disallow /archives/index-*
Disallow /archives/index.1*
Disallow /archives/index.2*
Disallow /*/archives/index-*
Disallow /*/archives/index.1*
Disallow /*/archives/index.2*
Allow /users
Allow /cgi/users.pl
Allow /cgi/fark/users.pl
Allow /comments
Allow /cgi/comments.pl
Allow /cgi/fark/comments.pl
Disallow /cgi/

mediapartners-google

Rule Path
Disallow /nospam
Disallow /nospam/
Disallow /nomirror
Disallow /nomirror/
Disallow /admin/admin.php
Disallow /comments/8008135
Disallow /passwordreset
Disallow /ajax
Disallow /login
Disallow /archives/index-*
Disallow /archives/index.1*
Disallow /archives/index.2*
Disallow /*/archives/index-*
Disallow /*/archives/index.1*
Disallow /*/archives/index.2*
Allow /users
Allow /cgi/users.pl
Allow /cgi/fark/users.pl
Allow /comments
Allow /cgi/comments.pl
Allow /cgi/fark/comments.pl
Allow /confirm
Allow /unsub
Allow /submit
Allow /cgi/submit.pl
Allow /cgi/feedback.pl
Allow /cgi/forgotpassword.pl
Allow /cgi/newuser.pl
Disallow /cgi/

gptbot

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

*

Rule Path
Disallow /nospam
Disallow /nospam/
Disallow /nomirror
Disallow /nomirror/
Disallow /admin/admin.php
Disallow /submit
Disallow /users
Disallow /comments/8008135
Disallow /confirm
Disallow /unsub
Disallow /passwordreset
Allow /ajax/headlines
Disallow /ajax
Disallow /login
Disallow /archives/index-*
Disallow /archives/index.1*
Disallow /archives/index.2*
Disallow /*/archives/index-*
Disallow /*/archives/index.1*
Disallow /*/archives/index.2*
Allow /comments
Allow /cgi/comments.pl
Allow /cgi/fark/comments.pl
Disallow /cgi/
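
The groups above mix Allow and Disallow rules whose paths overlap (for example Allow /ajax/headlines next to Disallow /ajax, or Allow /cgi/comments.pl next to Disallow /cgi/). A minimal sketch of how a crawler following the longest-match convention of RFC 9309 might resolve them is shown below. The names `pattern_to_regex`, `is_allowed` and `STAR_GROUP` are hypothetical, only a subset of the `*` group is copied in, and real crawlers may differ in details such as percent-encoding and case handling.

```python
import re

def pattern_to_regex(path_pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Everything else is a literal prefix match.
    """
    anchored = path_pattern.endswith("$")
    core = path_pattern[:-1] if anchored else path_pattern
    regex = ".*".join(re.escape(part) for part in core.split("*"))
    return re.compile(regex + ("$" if anchored else ""))

def is_allowed(rules, path):
    """Return True if `path` may be crawled under `rules`.

    `rules` is a list of (verdict, pattern) pairs, verdict being
    "allow" or "disallow". The longest matching pattern wins; on a
    tie, allow wins. No matching rule means the path is allowed.
    """
    best = None  # (pattern length, is_allow)
    for verdict, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), verdict == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]

# Subset of the "*" group listed above (assumption: order is irrelevant
# under longest-match evaluation).
STAR_GROUP = [
    ("disallow", "/users"),
    ("disallow", "/comments/8008135"),
    ("allow", "/ajax/headlines"),
    ("disallow", "/ajax"),
    ("disallow", "/archives/index-*"),
    ("allow", "/comments"),
    ("allow", "/cgi/comments.pl"),
    ("disallow", "/cgi/"),
]
```

Under these assumptions, `is_allowed(STAR_GROUP, "/ajax/headlines")` is allowed because the Allow pattern is longer than Disallow /ajax, while `/comments/8008135` is blocked because the specific Disallow is longer than Allow /comments.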

Other Records

Field Value
crawl-delay 5
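
The crawl-delay field is not part of RFC 9309, and crawlers that honor it usually read the value as a minimum number of seconds between requests. A minimal sketch of a client-side throttle under that assumption follows. The class name `CrawlDelayThrottle` and its injectable `clock` and `sleep` parameters are hypothetical and exist only to make the example testable.

```python
import time

class CrawlDelayThrottle:
    """Enforce a minimum gap, in seconds, between successive requests."""

    def __init__(self, delay_seconds, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay_seconds
        self.clock = clock      # returns the current time in seconds
        self.sleep = sleep      # blocks for the given number of seconds
        self.last_request = None

    def wait(self):
        """Block until at least `delay` seconds have passed since the last call."""
        now = self.clock()
        if self.last_request is not None:
            remaining = self.delay - (now - self.last_request)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last_request = now
```

A crawler would call `throttle.wait()` immediately before each request, with `throttle = CrawlDelayThrottle(5)` for the value recorded above.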

Comments

  • IMPORTANT NOTE: Fark user profiles have a meta tag on the page to tell search engines to NOT index them. But to read the meta tag, the engines have to be able to crawl the page.
  • A disallow means "don't crawl", NOT "don't index" -- if some other site has a link to a URL we have in our disallow list, search engines may still index it anyway.
  • So, counterintuitively, the reason we allow Googlebot to crawl user profiles is so that they WON'T index them.
  • Our intent is that Fark.com user profiles NOT appear in search engines. (So all you SEO spammers are wasting your time -- not that you'll ever read this)
  • This is the same reason we allow /go and /goto through here, as well as /api and /ajax. We block all these from indexing with either the meta tag or the X-Robots-Tag header.
  • Duke Sucks
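
The comments above distinguish "don't crawl" (robots.txt) from "don't index" (a robots meta tag or the X-Robots-Tag header), which a crawler can only see after fetching the page. A minimal sketch of how a crawler might detect the noindex signal in an already-fetched response is shown below. The names `RobotsMetaParser` and `is_noindex` are hypothetical, and the sketch ignores crawler-specific forms such as `googlebot: noindex` in the header or `<meta name="googlebot">`.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None for bare attributes.
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for token in attr.get("content", "").split(","):
                if token.strip():
                    self.directives.add(token.strip().lower())

def is_noindex(headers, html):
    """Return True if the response asks not to be indexed.

    `headers` is a mapping of response header names to values and
    `html` is the decoded response body.
    """
    header_directives = set()
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            header_directives.update(t.strip().lower() for t in value.split(","))
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in header_directives or "noindex" in parser.directives
```

This reflects the reasoning in the comments: a URL blocked by Disallow is never fetched, so neither signal is ever read, whereas an allowed URL can be fetched and then dropped from the index.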