fark.net
robots.txt

Robots Exclusion Standard data for fark.net

Resource Scan

Scan Details

Site Domain fark.net
Base Domain fark.net
Scan Status Ok
Last Scan 2024-06-24T06:05:45+00:00
Next Scan 2024-07-01T06:05:45+00:00

Last Scan

Scanned 2024-06-24T06:05:45+00:00
URL https://www.fark.net/robots.txt
Redirect https://www.fark.com/robots.txt
Redirect Domain www.fark.com
Redirect Base fark.com
Domain IPs 104.21.87.104, 172.67.169.94, 2606:4700:3035::6815:5768, 2606:4700:3037::ac43:a95e
Redirect IPs 104.21.82.150, 172.67.203.12, 2606:4700:3030::ac43:cb0c, 2606:4700:3035::6815:5296
Response IP 172.67.203.12
Found Yes
Hash 041502eecf30c887d84954a3301b36ac5a38482698b1a2a95bfbf224f8284bf8
SimHash bc359971c5d3

Groups

googlebot

Rule Path
Disallow /nospam
Disallow /nospam/
Disallow /nomirror
Disallow /nomirror/
Disallow /admin/admin.php
Disallow /submit
Disallow /comments/8008135
Disallow /confirm
Disallow /unsub
Disallow /passwordreset
Disallow /ajax
Disallow /login
Disallow /archives/index-*
Disallow /archives/index.1*
Disallow /archives/index.2*
Disallow /*/archives/index-*
Disallow /*/archives/index.1*
Disallow /*/archives/index.2*
Allow /users
Allow /cgi/users.pl
Allow /cgi/fark/users.pl
Allow /comments
Allow /cgi/comments.pl
Allow /cgi/fark/comments.pl
Disallow /cgi/

mediapartners-google

Rule Path
Disallow /nospam
Disallow /nospam/
Disallow /nomirror
Disallow /nomirror/
Disallow /admin/admin.php
Disallow /comments/8008135
Disallow /passwordreset
Disallow /ajax
Disallow /login
Disallow /archives/index-*
Disallow /archives/index.1*
Disallow /archives/index.2*
Disallow /*/archives/index-*
Disallow /*/archives/index.1*
Disallow /*/archives/index.2*
Allow /users
Allow /cgi/users.pl
Allow /cgi/fark/users.pl
Allow /comments
Allow /cgi/comments.pl
Allow /cgi/fark/comments.pl
Allow /confirm
Allow /unsub
Allow /submit
Allow /cgi/submit.pl
Allow /cgi/feedback.pl
Allow /cgi/forgotpassword.pl
Allow /cgi/newuser.pl
Disallow /cgi/

gptbot

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

*

Rule Path
Disallow /nospam
Disallow /nospam/
Disallow /nomirror
Disallow /nomirror/
Disallow /admin/admin.php
Disallow /submit
Disallow /users
Disallow /comments/8008135
Disallow /confirm
Disallow /unsub
Disallow /passwordreset
Allow /ajax/headlines
Disallow /ajax
Disallow /login
Disallow /archives/index-*
Disallow /archives/index.1*
Disallow /archives/index.2*
Disallow /*/archives/index-*
Disallow /*/archives/index.1*
Disallow /*/archives/index.2*
Allow /comments
Allow /cgi/comments.pl
Allow /cgi/fark/comments.pl
Disallow /cgi/
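
Several groups above pair a broad Disallow with narrower Allow rules (for example, Allow /ajax/headlines next to Disallow /ajax). The sketch below illustrates one way a crawler might resolve such overlaps, assuming the longest-match precedence described in RFC 9309, with Allow winning ties. The names RULES, pattern_to_regex, and is_allowed are hypothetical, the rule list is only a subset of the "*" group, and real crawlers may differ in detail.

```python
import re

# Subset of the "*" group listed above, as (rule, path pattern) pairs.
RULES = [
    ("allow", "/ajax/headlines"),
    ("disallow", "/ajax"),
    ("disallow", "/archives/index-*"),
    ("allow", "/cgi/comments.pl"),
    ("disallow", "/cgi/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex anchored at the start.

    "*" matches any run of characters; a trailing "$" anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under longest-match precedence."""
    best_len, verdict = -1, True  # no matching rule means the path is allowed
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            # The longer pattern wins; on a tie, allow wins.
            if length > best_len or (length == best_len and kind == "allow"):
                best_len, verdict = length, (kind == "allow")
    return verdict
```

Under these assumptions, is_allowed("/ajax/headlines") is True while is_allowed("/ajax/other") is False, because the Allow pattern is longer than the Disallow pattern it overlaps with.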

Other Records

Field Value
crawl-delay 5
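
The crawl-delay field is not part of the standardized protocol, and some crawlers ignore it. For a crawler that chooses to honor it, the value is conventionally read as a number of seconds between requests. A minimal sketch under that assumption follows; PoliteThrottle and its parameters are hypothetical names, and the clock and sleep functions are injectable only to make the behavior easy to check.

```python
import time

CRAWL_DELAY_SECONDS = 5  # value from the crawl-delay record above


class PoliteThrottle:
    """Block so that successive wait() calls are at least `delay` seconds apart."""

    def __init__(self, delay=CRAWL_DELAY_SECONDS,
                 clock=time.monotonic, sleep=time.sleep):
        self.delay = delay
        self.clock = clock
        self.sleep = sleep
        self.last = None  # time of the previous request, if any

    def wait(self):
        now = self.clock()
        if self.last is not None:
            remaining = self.delay - (now - self.last)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last = now
```

A crawler would call wait() immediately before each request to the host.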

Comments

  • IMPORTANT NOTE:
  • Fark user profiles have a meta tag on the page to tell search engines to NOT
  • index them. But to read the meta tag, the engines have to be able to crawl
  • the page. A disallow means "don't crawl", NOT "don't index" -- if some other
  • site has a link to a URL we have in our disallow list, search engines may
  • still index it anyway. So, counterintuitively, the reason we allow
  • Googlebot to crawl user profiles is so that they WON'T index them.
  • Our intent is that Fark.com user profiles NOT appear in search engines.
  • (So all you SEO spammers are wasting your time -- not that you'll ever read this)
  • This is the same reason we allow /go and /goto through here, as well as
  • /api and /ajax. We block all these from indexing with either the meta tag
  • or the X-Robots-Tag header.
  • Duke Sucks
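
The comments above distinguish "don't crawl" (a Disallow rule) from "don't index" (a robots meta tag or an X-Robots-Tag header that a crawler can only see if it is allowed to fetch the page). The sketch below shows how a client might check a fetched page for a noindex signal. RobotsMetaParser and is_noindex are hypothetical names, and the sketch ignores user-agent-specific forms such as "googlebot: noindex"; it is an illustration, not a description of how any search engine actually processes these signals.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {k.lower(): (v or "") for k, v in attrs}
        if attr.get("name", "").lower() == "robots":
            for token in attr.get("content", "").split(","):
                self.directives.add(token.strip().lower())


def is_noindex(headers, html):
    """Return True if the response headers or the HTML body carry noindex."""
    header_value = ""
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            header_value = value
    header_tokens = {t.strip().lower() for t in header_value.split(",")}

    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in header_tokens or "noindex" in parser.directives
```

Note that a check like this only ever runs on pages the crawler is permitted to fetch, which is the reason the comments give for allowing crawlers onto pages that should stay out of the index.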