/.well-known/


idc.com
robots.txt

Robots Exclusion Standard data for idc.com

Archived Snapshots

Resource Scan

Scan Details

Site Domain	idc.com
Base Domain	idc.com
Scan Status	Ok
Last Scan	2024-04-28T05:40:58+00:00
Next Scan	2024-05-28T05:40:58+00:00

Last Scan

Scanned	2024-04-28T05:40:58+00:00
URL	https://idc.com/robots.txt
Redirect	https://www.idc.com/robots.txt
Redirect Domain	www.idc.com
Redirect Base	idc.com
Domain IPs	18.155.68.111, 18.155.68.40, 18.155.68.77, 18.155.68.96
Redirect IPs	18.155.68.111, 18.155.68.40, 18.155.68.77, 18.155.68.96
Response IP	18.155.68.111
Found	Yes
Hash	98163158f6e85ce4e7118c629e95d33460f3b92259bd6100726be9fbe6b64e7d
SimHash	ea1dd8128f53

Groups

ahrefsbot

Rule	Path
Disallow	/

sitebot

Rule	Path
Disallow	/

*

Rule	Path
Disallow	/getdoc.jsp?containerId=SEV
Disallow	/search/
Disallow	/action/login

Other Records

Field	Value
crawl-delay	10
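
The groups listed above can be checked programmatically. The following is a minimal sketch, assuming the three groups map back to a robots.txt laid out as in `RECONSTRUCTED_ROBOTS_TXT`; that text is rebuilt from this report and is not the original file. The crawl-delay record is left out because the report lists it outside any group, so its position in the original file cannot be recovered. The helper names and the `ExampleBot` user agent are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Rebuilt from the Groups table in this report. The layout, casing, and
# ordering of the original file are assumptions.
RECONSTRUCTED_ROBOTS_TXT = """\
User-agent: ahrefsbot
Disallow: /

User-agent: sitebot
Disallow: /

User-agent: *
Disallow: /getdoc.jsp?containerId=SEV
Disallow: /search/
Disallow: /action/login
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(agent: str, url: str) -> bool:
    """Return True if the reconstructed rules let `agent` fetch `url`."""
    return build_parser(RECONSTRUCTED_ROBOTS_TXT).can_fetch(agent, url)


if __name__ == "__main__":
    # "ExampleBot" is a hypothetical crawler that only matches the "*" group.
    checks = [
        ("AhrefsBot", "https://www.idc.com/"),
        ("sitebot", "https://www.idc.com/"),
        ("ExampleBot", "https://www.idc.com/"),
        ("ExampleBot", "https://www.idc.com/search/results"),
        ("ExampleBot", "https://www.idc.com/action/login"),
    ]
    for agent, url in checks:
        verdict = "allowed" if is_allowed(agent, url) else "disallowed"
        print(f"{agent}\t{url}\t{verdict}")
```

Under these assumptions the two named crawlers are blocked from the whole site, while any other crawler is blocked only from the three listed path prefixes. The standard-library parser matches rules by path prefix, which may differ in edge cases from how a particular crawler interprets the same file.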


Comments

This is a file retrieved by webwalkers a.k.a. spiders that
conform to a defacto standard.
See <URL:http://www.robotstxt.org/wc/exclusion.html#robotstxt>
Comments to the webmaster should be sent to webmaster@idc.com
Format is:
User-agent: <name of spider>
Disallow: <nothing> | <path>
-----------------------------------------------------------------------------