twiter.com
robots.txt

Robots Exclusion Standard data for twiter.com

Resource Scan

Scan Details

Site Domain twiter.com
Base Domain twiter.com
Scan Status Failed
Failure Reason Scan timed out.
Last Scan 2025-09-30T10:26:54+00:00
Next Scan 2025-12-29T10:26:54+00:00

Last Successful Scan

Scanned 2025-03-04T16:40:49+00:00
URL http://twiter.com/robots.txt
Redirect https://twitter.com/robots.txt
Redirect Domain twitter.com
Redirect Base twitter.com
Domain IPs 199.16.156.6, 199.16.156.70, 199.59.148.10
Redirect IPs 104.244.42.1, 104.244.42.129, 104.244.42.193, 104.244.42.65
Response IP 104.244.42.1
Found Yes
Hash da5b6efc5b11e34574d4359aa27114d473ead4cd527f73c8a2ebfd74b47189c2
SimHash 223efa19c4f5

Groups

googlebot

Rule Path
Allow /*?lang=
Allow /hashtag/*?src=
Allow /search?q=%23
Allow /i/api/
Disallow /search/realtime
Disallow /search/users
Disallow /search/*/grid
Disallow /*?
Disallow /*/followers
Disallow /*/following
Disallow /account/deactivated
Disallow /settings/deactivated
Disallow /%5B_0-9a-zA-Z%5D%2B/status/%5B0-9%5D%2B/likes
Disallow /%5B_0-9a-zA-Z%5D%2B/status/%5B0-9%5D%2B/retweets
Disallow /%5B_0-9a-zA-Z%5D%2B/likes
Disallow /%5B_0-9a-zA-Z%5D%2B/media
Disallow /%5B_0-9a-zA-Z%5D%2B/photo
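
The googlebot group mixes Allow and Disallow patterns that overlap (for example `/*?lang=` and `/*?`). As a rough illustration of how such a group is commonly evaluated, the sketch below applies the longest-match precedence with `*` wildcards described in RFC 9309. The names `GOOGLEBOT_RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical, only a subset of the listed rules is included, and percent-encoding normalization is ignored. It is not the scanner's or any crawler's actual implementation.

```python
import re

# Subset of the googlebot rules from the report: (directive, path pattern).
GOOGLEBOT_RULES = [
    ("allow", "/*?lang="),
    ("allow", "/hashtag/*?src="),
    ("allow", "/search?q=%23"),
    ("disallow", "/search/realtime"),
    ("disallow", "/*?"),
    ("disallow", "/*/followers"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules):
    """Return True if `path` may be crawled under `rules`.

    The longest matching pattern wins; on a tie, Allow beats Disallow.
    A path that matches no rule is allowed.
    """
    best = None  # (pattern length, is_allow)
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), directive == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]
```

Under these assumptions, `is_allowed("/jack?lang=en", GOOGLEBOT_RULES)` is true because `/*?lang=` is a longer match than `/*?`, while `/jack?ref=home` matches only `/*?` and is blocked.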

google-extended

Rule Path
Disallow *

facebookbot

Rule Path
Disallow *

facebookexternalhit

Rule Path
Disallow *

discordbot

Rule Path
Disallow *

bingbot

Rule Path
Disallow *

*

Rule Path
Disallow /
Disallow /i/u
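
The report lists seven groups, ending with the catch-all `*`. A crawler typically picks the group whose name matches its own product token (case-insensitively) and falls back to `*` when none does. The sketch below shows that selection in simplified form. `GROUPS` and `select_group` are hypothetical names, the rule lists are abbreviated, and exact-token matching is an assumption; real crawlers may match tokens more loosely.

```python
# Group names as listed in the report; rule lists abbreviated for illustration.
GROUPS = {
    "googlebot": [("disallow", "/search/realtime"), ("disallow", "/*?")],
    "google-extended": [("disallow", "*")],
    "facebookbot": [("disallow", "*")],
    "facebookexternalhit": [("disallow", "*")],
    "discordbot": [("disallow", "*")],
    "bingbot": [("disallow", "*")],
    "*": [("disallow", "/"), ("disallow", "/i/u")],
}


def select_group(product_token, groups):
    """Return the name of the group that applies to a crawler's product token.

    Matching is case-insensitive; unknown tokens fall back to the '*' group.
    """
    token = product_token.strip().lower()
    return token if token in groups else "*"
```

With this table, `select_group("Googlebot", GROUPS)` selects the `googlebot` group, while an unlisted crawler such as a hypothetical `examplebot` lands in `*`, where `Disallow /` blocks the whole site.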

Other Records

Field Value
crawl-delay 1
sitemap https://twitter.com/sitemap.xml
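
As one possible way to consume the `crawl-delay` and `sitemap` records, the sketch below reads them with Python's standard `urllib.robotparser` and spaces out requests accordingly. `ROBOTS_TXT` is a hypothetical reconstruction from the fields in this report, not the real file. In particular, it places `Crawl-delay` inside the `*` group so that `RobotFileParser` picks it up, whereas the report lists it as a standalone record. `load_policy`, `paced`, and the `examplebot` token are also illustrative names.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical reconstruction from the report's fields; not the real file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
Crawl-delay: 1

Sitemap: https://twitter.com/sitemap.xml
"""


def load_policy(text):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def paced(urls, delay_seconds, sleep=time.sleep):
    """Yield each URL, pausing `delay_seconds` between successive items."""
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)
        yield url


policy = load_policy(ROBOTS_TXT)
delay = policy.crawl_delay("examplebot") or 0  # seconds between requests
sitemaps = policy.site_maps() or []            # site_maps() needs Python 3.8+
```

A crawler would wrap its fetch loop in `paced(urls, delay)` after confirming each URL with `policy.can_fetch(...)`. Note that `urllib.robotparser` does not implement `*` wildcards in paths, so it is not a substitute for wildcard-aware matching of the googlebot rules.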

Comments

  • Google Search Engine Robot
  • Every bot that might possibly read and respect this file
  • WHAT-4882 - Block indexing of links in notification emails. This applies to all bots.
  • Wait 1 second between successive requests. See ONBOARD-2698 for details.
  • Independent of user agent. Links in the sitemap are full URLs using https:// and need to match the protocol of the sitemap.

Warnings

  • `noindex` is not a known field.