discord.org
robots.txt

Robots Exclusion Standard data for discord.org

Resource Scan

Scan Details

Site Domain discord.org
Base Domain discord.org
Scan Status Ok
Last Scan 2025-06-21T19:45:14+00:00
Next Scan 2025-06-28T19:45:14+00:00

Last Scan

Scanned 2025-06-21T19:45:14+00:00
URL https://discord.org/robots.txt
Domain IPs 104.21.3.3, 172.67.129.247, 2606:4700:3034::6815:303, 2606:4700:3035::ac43:81f7
Response IP 104.21.3.3
Found Yes
Hash 39d9ddbe9771f24db3ab5c99572335657952eb8a0fb23b5d9c4d492061f1025c
SimHash ab1ed934b748
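
The Hash value above is 64 hexadecimal characters long, which matches the length of a SHA-256 digest. The report does not name the algorithm, so treating it as SHA-256 of the raw response body is an assumption. Under that assumption, a minimal sketch of how such a fingerprint could be computed to detect changes between scans (the function name `content_fingerprint` is hypothetical):

```python
import hashlib


def content_fingerprint(body: bytes) -> str:
    """Return the hex SHA-256 digest of a robots.txt response body.

    Assumption: the scanner hashes the raw bytes exactly as served;
    the report itself does not confirm the algorithm or the input.
    """
    return hashlib.sha256(body).hexdigest()


# Two scans can be compared by digest instead of by full content.
previous = content_fingerprint(b"User-agent: *\nDisallow: /cgi-bin/\n")
current = content_fingerprint(b"User-agent: *\nDisallow: /cgi-bin/\n")
unchanged = previous == current
```

A byte-exact digest changes on any edit, including whitespace. The separate SimHash field presumably serves as a similarity measure that tolerates small edits, but the report does not say how it is derived.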

Groups

emailwolf

Rule Path
Disallow /

extractorpro

Rule Path
Disallow /

mozilla.*newt

Rule Path
Disallow /

crescent

Rule Path
Disallow /

cherrypicker

Rule Path
Disallow /

webbandit

Rule Path
Disallow /

nicerspro

Rule Path
Disallow /

microsoft.url

Rule Path
Disallow /

emailcollector

Rule Path
Disallow /

daviesbot/1.7

Rule Path
Disallow /

e-societyrobot

Rule Path
Disallow /

ichiro/2.0

Rule Path
Disallow /

rufusbot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

becomebot

Rule Path
Disallow /cgi-bin/

Other Records

Field Value
crawl-delay 10

velenpublicwebcrawler

Rule Path
Disallow /cgi-bin/

scrapy

Rule Path
Disallow /

*

Rule Path
Disallow /cgi-bin/
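
The groups above can be checked programmatically. The following is a minimal sketch using Python's standard `urllib.robotparser`. The `ROBOTS_TXT` text is a hand-reconstructed subset of the groups listed in this report, not the file as served, and the agent name `examplebot` is hypothetical, chosen only to exercise the `*` group.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed subset of the groups in this report (an assumption:
# the real file's exact layout and ordering are not shown here).
ROBOTS_TXT = """\
User-agent: ahrefsbot
Disallow: /

User-agent: becomebot
Disallow: /cgi-bin/
Crawl-delay: 10

User-agent: scrapy
Disallow: /

User-agent: *
Disallow: /cgi-bin/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# ahrefsbot is blocked from the whole site.
blocked = parser.can_fetch("ahrefsbot", "https://discord.org/index.html")

# becomebot may fetch pages outside /cgi-bin/ but is asked to wait 10 seconds.
allowed = parser.can_fetch("becomebot", "https://discord.org/index.html")
delay = parser.crawl_delay("becomebot")

# An agent with no group of its own falls back to the "*" group.
fallback = parser.can_fetch("examplebot", "https://discord.org/cgi-bin/test")
```

Under these assumptions, `blocked` is `False`, `allowed` is `True`, `delay` is `10`, and `fallback` is `False`. Group names containing a version suffix or a pattern, such as `daviesbot/1.7` or `mozilla.*newt`, may be matched differently by different parsers, so results for those agents should be verified against the parser actually in use.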

Comments

  • Format of this file is described at http://info.webcrawler.com/mak/projects/robots/robots.html
  • Most of these web-scraping spam email collectors probably ignore the robots exclusion protocol, but they do so at their peril on this web site. These user agents all get special treatment if they come to www.discord.org.
  • Also see http://www.psychedelix.com/agents1.html
  • Hammering my site at night with over 300 GET requests.
  • http://www.yama.info.waseda.ac.jp/~yamana/es/index_eng.htm
  • Added 2013-03-26 (first change since Nov 5 2005)
  • http://www.become.com/site_owners.html
  • Added 2023-11-13
  • Added 2024-11-25