twitter.com
robots.txt

Robots Exclusion Standard data for twitter.com

Resource Scan

Scan Details

Site Domain	twitter.com
Base Domain	twitter.com
Scan Status	Ok
Last Scan	2024-11-05T22:10:04+00:00
Next Scan	2024-11-12T22:10:04+00:00

Last Scan

Scanned	2024-11-05T22:10:04+00:00
URL	https://twitter.com/robots.txt
Domain IPs	104.244.42.129
Response IP	104.244.42.129
Found	Yes
Hash	da5b6efc5b11e34574d4359aa27114d473ead4cd527f73c8a2ebfd74b47189c2
SimHash	223efa19c4f5
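
The Hash value above is 64 hexadecimal digits, which is the length of a SHA-256 digest. Assuming the scanner hashes the raw response body with SHA-256 (the report does not say so), a later fetch could be checked for changes roughly as follows. The helper names `sha256_hex` and `matches_recorded` are hypothetical, and the SimHash value is not reproduced here because its algorithm and parameters are not stated.

```python
import hashlib

# Hash value copied from the Last Scan table above.
RECORDED_HASH = "da5b6efc5b11e34574d4359aa27114d473ead4cd527f73c8a2ebfd74b47189c2"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the raw bytes as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_recorded(body: bytes, recorded: str = RECORDED_HASH) -> bool:
    """True if the body hashes to the recorded value (assumes SHA-256)."""
    return sha256_hex(body) == recorded.lower()
```

A mismatch would only indicate that the file differs byte-for-byte from the scanned copy, not what changed.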

Groups

googlebot

Rule	Path
Allow	/*?lang=
Allow	/hashtag/*?src=
Allow	/search?q=%23
Allow	/i/api/
Disallow	/search/realtime
Disallow	/search/users
Disallow	/search/*/grid
Disallow	/*?
Disallow	/*/followers
Disallow	/*/following
Disallow	/account/deactivated
Disallow	/settings/deactivated
Disallow	/%5B_0-9a-zA-Z%5D%2B/status/%5B0-9%5D%2B/likes
Disallow	/%5B_0-9a-zA-Z%5D%2B/status/%5B0-9%5D%2B/retweets
Disallow	/%5B_0-9a-zA-Z%5D%2B/likes
Disallow	/%5B_0-9a-zA-Z%5D%2B/media
Disallow	/%5B_0-9a-zA-Z%5D%2B/photo
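
The googlebot group mixes Allow and Disallow rules that can both match the same URL, for example `/*?lang=` and `/*?`. A minimal sketch of how such conflicts are commonly resolved, assuming the longest-match rule from RFC 9309 (the longest matching pattern wins, and Allow wins a tie), is shown below. The function names are hypothetical, only part of the group is included, and the sketch compares paths literally without normalizing percent-encoding.

```python
import re
from urllib.parse import unquote

# Subset of the googlebot group listed above: (rule, path pattern).
GOOGLEBOT_RULES = [
    ("allow", "/*?lang="),
    ("allow", "/hashtag/*?src="),
    ("allow", "/search?q=%23"),
    ("allow", "/i/api/"),
    ("disallow", "/search/realtime"),
    ("disallow", "/search/users"),
    ("disallow", "/search/*/grid"),
    ("disallow", "/*?"),
    ("disallow", "/*/followers"),
    ("disallow", "/*/following"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex.

    `*` matches any run of characters; a trailing `$` anchors the end.
    Everything else is matched literally from the start of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules):
    """Apply longest-match precedence; no matching rule means allowed."""
    best_len, best_rule = -1, "allow"
    for rule, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and rule == "allow"):
                best_len, best_rule = length, rule
    return best_rule == "allow"


print(is_allowed("/someuser?lang=en", GOOGLEBOT_RULES))    # True
print(is_allowed("/someuser?ref=x", GOOGLEBOT_RULES))      # False
print(is_allowed("/someuser/followers", GOOGLEBOT_RULES))  # False

# The percent-encoded paths in the table decode to regex-like text.
print(unquote("/%5B_0-9a-zA-Z%5D%2B/status/%5B0-9%5D%2B/likes"))
# /[_0-9a-zA-Z]+/status/[0-9]+/likes
```

Note that robots.txt patterns are not regular expressions under RFC 9309, so a parser following that specification would treat the decoded brackets and plus signs in the last five Disallow paths as literal characters rather than as character classes.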

google-extended

Rule	Path
Disallow	*

facebookbot

Rule	Path
Disallow	*

facebookexternalhit

Rule	Path
Disallow	*

discordbot

Rule	Path
Disallow	*

bingbot

Rule	Path
Disallow	*

*

Rule	Path
Disallow	/
Disallow	/i/u

Other Records

Field	Value
crawl-delay	1

Other Records

Field	Value
sitemap	https://twitter.com/sitemap.xml
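
The crawl-delay and sitemap records above can be read with Python's standard `urllib.robotparser`. The sketch below parses a short excerpt rebuilt from this report rather than the live file. Placing `Crawl-delay` inside the catch-all group is an assumption, since the report lists it separately under Other Records, and `EXCERPT`, `examplebot`, and `paced` are hypothetical names. `site_maps()` requires Python 3.8 or later.

```python
import time
from urllib.robotparser import RobotFileParser

# Excerpt rebuilt from the tables above (assumed layout, not the live file).
EXCERPT = """\
User-agent: *
Disallow: /
Disallow: /i/u
Crawl-delay: 1

Sitemap: https://twitter.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(EXCERPT.splitlines())

delay = parser.crawl_delay("examplebot")  # 1
sitemaps = parser.site_maps()             # ['https://twitter.com/sitemap.xml']
allowed = parser.can_fetch("examplebot", "https://twitter.com/anything")  # False


def paced(urls, delay_seconds, sleep=time.sleep):
    """Yield URLs one at a time, sleeping between successive items."""
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)
        yield url
```

A crawler that honors the file would skip disallowed URLs and wrap its fetch loop in something like `paced(urls, delay or 0)`. The standard-library parser does not implement `*` wildcards inside paths, so it is not suitable for evaluating the googlebot group.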

Comments

Google Search Engine Robot
==========================
Every bot that might possibly read and respect this file
========================================================
WHAT-4882 - Block indexing of links in notification emails. This applies to all bots.
=====================================================================================
Wait 1 second between successive requests. See ONBOARD-2698 for details.
Independent of user agent. Links in the sitemap are full URLs using https:// and need to match
the protocol of the sitemap.

Warnings

`noindex` is not a known field.
