india.gov.in
robots.txt

Robots Exclusion Standard data for india.gov.in

Resource Scan

Scan Details

Site Domain india.gov.in
Base Domain india.gov.in
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 4/2/2025, 8:43:31 AM
Next Scan 7/1/2025, 8:43:31 AM

Last Successful Scan

Scanned 12/10/2023, 7:19:47 AM
URL https://www.india.gov.in/robots.txt
Domain IPs 23.211.140.128, 23.211.140.171, 23.211.140.74, 2600:1413:b000:1e::17d1:2e54, 2600:1413:b000:1e::17d1:2e61
Response IP 23.47.190.169
Found Yes
Hash 7dfae0a2d5217213c394a62b8bce19aebce6704449fe2f5305417109d45c98cc
SimHash 781cbd1a8f7c

Groups

urlappendbot/1.0; +http://www.profound.net/urlappendbot.html

Rule Path
Disallow /

tweetmemebot/3.0; +http://tweetmeme.com/

Rule Path
Disallow /

mj12bot/v1.4.3, http://www.majestic12.co.uk/bot.php?+

Rule Path
Disallow /

sosospider/2.0

Rule Path
Disallow /

python-urllib/2.6

Rule Path
Disallow /

python-requests/1.2.3 cpython/2.7.3 linux/3.3.8-gcg-201308121035

Rule Path
Disallow /

python-requests/1.2.3 cpython/2.7.2+ linux/3.0.0-16-virtual

Rule Path
Disallow /

python-requests/1.2.3 cpython/2.7.4 linux/3.8.11-ec2

Rule Path
Disallow /

python-urllib/2.7

Rule Path
Disallow /

python-httplib2/0.7.7 (gzip)

Rule Path
Disallow /

python-requests/1.1.0 cpython/2.6.8 linux/3.4.48-45.46.amzn1.x86_64

Rule Path
Disallow /

*

Rule Path
Disallow /includes/
Disallow /misc/
Disallow /modules/
Disallow /profiles/
Disallow /scripts/
Disallow /themes/
Disallow /node/
Disallow /taxonomy/term/
Disallow /comment/
Disallow /*.txt$
Disallow /*.php$
Allow /index.php
Disallow /admin/
Disallow /comment/reply/
Disallow /filter/tips/
Disallow /node/add/
Disallow /search/
Disallow /user/register/
Disallow /user/password/
Disallow /user/login/
Disallow /user/logout/
Disallow /?q=admin%2F
Disallow /?q=comment%2Freply%2F
Disallow /?q=filter%2Ftips%2F
Disallow /?q=node%2Fadd%2F
Disallow /?q=search%2F
Disallow /?q=user%2Fpassword%2F
Disallow /?q=user%2Fregister%2F
Disallow /?q=user%2Flogin%2F
Disallow /?q=user%2Flogout%2F

Other Records

Field Value
crawl-delay 10
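For reference, the catch-all `*` group above combined with this crawl-delay record corresponds to a robots.txt block along these lines — a sketch reassembled from the scan data, not the verbatim file:

```text
User-agent: *
Crawl-delay: 10
Disallow: /includes/
Disallow: /misc/
Disallow: /modules/
Disallow: /*.txt$
Disallow: /*.php$
Allow: /index.php
Disallow: /admin/
# ...remaining Disallow lines follow the same pattern as the group listing above
```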

Comments

  • robots.txt
  • This file is to prevent the crawling and indexing of certain parts
  • of your site by web crawlers and spiders run by sites like Yahoo!
  • and Google. By telling these "robots" where not to go on your site,
  • you save bandwidth and server resources.
  • This file will be ignored unless it is at the root of your host:
  • Used: http://example.com/robots.txt
  • Ignored: http://example.com/site/robots.txt
  • For more information about the robots.txt standard, see:
  • http://www.robotstxt.org/wc/robots.html
  • For syntax checking, see:
  • http://www.sxw.org.uk/computing/robots/check.html
  • Directories
  • Files
  • Paths (clean URLs)
  • Paths (no clean URLs)
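The comment block above describes the standard informally; the rules recorded in the last successful scan can also be checked programmatically. A minimal sketch using Python's stdlib `urllib.robotparser`, against a hypothetical reconstruction of the catch-all group (note that `robotparser` does plain prefix matching and does not interpret the `*`/`$` wildcard lines):

```python
from urllib import robotparser

# Hypothetical reconstruction of part of the "*" group from the scan above.
RULES = """\
User-agent: *
Crawl-delay: 10
Disallow: /admin/
Disallow: /includes/
Allow: /index.php
"""

rp = robotparser.RobotFileParser()
rp.parse(RULES.splitlines())

# First matching rule wins: /admin/ is disallowed, /index.php is allowed.
print(rp.can_fetch("*", "https://www.india.gov.in/admin/"))     # False
print(rp.can_fetch("*", "https://www.india.gov.in/index.php"))  # True
print(rp.crawl_delay("*"))                                      # 10
```

A compliant crawler would also honor the 10-second crawl-delay between requests, as recorded under Other Records.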