deiequipment.com
robots.txt

Robots Exclusion Standard data for deiequipment.com

Resource Scan

Scan Details

Site Domain	deiequipment.com
Base Domain	deiequipment.com
Scan Status	Ok
Last Scan	2024-11-14T20:03:31+00:00
Next Scan	2024-11-21T20:03:31+00:00

Last Scan

Scanned	2024-11-14T20:03:31+00:00
URL	https://deiequipment.com/robots.txt
Domain IPs	192.0.78.24, 192.0.78.25
Response IP	192.0.78.24
Found	Yes
Hash	a3487056d54f7147bb3e8672f88a5922f9462d0e9e62a9cefebbe57cb27f177e
SimHash	389d9d1245c0
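
The Hash value above is 64 hexadecimal characters long, which matches the length of a SHA-256 digest. The report does not name the algorithm or say which bytes were hashed, so the following is only a sketch under the assumption that it is SHA-256 over the raw response body. The helper name `sha256_hex` and the sample body are hypothetical.

```python
import hashlib


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of a response body as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


# Hypothetical sample body; the real robots.txt content is not reproduced here.
sample_body = b"User-agent: *\nDisallow: /wp-admin\n"
digest = sha256_hex(sample_body)

# Value copied from the Hash row of the scan table.
reported_hash = "a3487056d54f7147bb3e8672f88a5922f9462d0e9e62a9cefebbe57cb27f177e"

# A SHA-256 hex digest always has the same length as the reported value.
print(len(digest) == len(reported_hash))
```

Comparing a freshly computed digest with the stored one would be one way to detect that the file changed between scans, if the assumption about the algorithm holds.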

Groups

amazonbot

Rule	Path
Disallow	/

ahrefsbot

Rule	Path
Disallow	/

blexbot

Rule	Path
Disallow	/

mj12bot

Rule	Path
Disallow	/

ia_archiver

Rule	Path
Disallow	/

alphaseobot-sa

Rule	Path
Disallow	/

alphaseobot

Rule	Path
Disallow	/

rogerbot

Rule	Path
Disallow	/

dotbot

Rule	Path
Disallow	/

semrushbot

Rule	Path
Disallow	/

semrushbot-sa

Rule	Path
Disallow	/

the knowledge ai

Rule	Path
Disallow	/

seznambot

Rule	Path
Disallow	/

trendictionbot

Rule	Path
Disallow	/

smtbot

Rule	Path
Disallow	/

dataforseobot

Rule	Path
Disallow	/

msnbot

No rules defined. All paths allowed.

Other Records

Field	Value
crawl-delay	10

*

Rule	Path
Disallow	/wp-admin
Disallow	/myaccount
Disallow	/myaccount/edit-account

Other Records

Field	Value
sitemap	https://deiequipment.com/sitemap.xml
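
To illustrate how the groups listed in this report would be evaluated by a crawler, here is a minimal sketch using Python's standard `urllib.robotparser`. The `ROBOTS_LINES` list is a hypothetical reconstruction of a subset of the records shown above, not the live file, whose ordering and wording may differ. The pairing of `crawl-delay` with `msnbot` follows the layout of this report, and the `/products` path is an invented example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical reconstruction of a subset of the records in this report.
ROBOTS_LINES = [
    "User-agent: ahrefsbot",
    "Disallow: /",
    "",
    "User-agent: msnbot",
    "Crawl-delay: 10",
    "",
    "User-agent: *",
    "Disallow: /wp-admin",
    "Disallow: /myaccount",
    "Disallow: /myaccount/edit-account",
    "",
    "Sitemap: https://deiequipment.com/sitemap.xml",
]

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)

BASE = "https://deiequipment.com"

# A blocked bot is denied everything, including the root path.
print(parser.can_fetch("AhrefsBot", BASE + "/"))          # False
# Other crawlers fall through to the "*" group.
print(parser.can_fetch("Googlebot", BASE + "/wp-admin"))  # False
print(parser.can_fetch("Googlebot", BASE + "/products"))  # True (invented path)
# msnbot has no Disallow rules, only a crawl delay.
print(parser.crawl_delay("msnbot"))                       # 10
print(parser.site_maps())
```

Note that `urllib.robotparser` matches user agents case-insensitively and applies the first rule whose path is a prefix of the requested path. Other parsers may resolve overlapping rules differently, although for the simple rules above the outcome should be the same.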

Comments

This file allows you to control web crawlers' access to specific pages on your site. Web crawlers are
programs that search engines run to view and analyze your site to index the content in their search engines.
Common crawlers include Googlebot and bingbot. These are the default rules defined for your site and include
pages and directories that crawlers do not need access to.
Unwanted robots
Amazon's user agent
Spam bot from outside the US
Alexa Web and Site Audit Crawlers
Slow down crawlers
Other rules
These rules apply to all crawlers
Crawlers do not need access to your console, so this rule disallows all console pages for crawlers
Login Pages should not be indexed
