appliance411.com
robots.txt

Robots Exclusion Standard data for appliance411.com

Resource Scan

Scan Details

Site Domain appliance411.com
Base Domain appliance411.com
Scan Status Ok
Last Scan 2024-06-06T06:09:06+00:00
Next Scan 2024-06-13T06:09:06+00:00

Last Scan

Scanned 2024-06-06T06:09:06+00:00
URL http://appliance411.com/robots.txt
Redirect http://www.appliance411.com/robots.txt
Redirect Domain www.appliance411.com
Redirect Base appliance411.com
Domain IPs 209.237.150.20
Redirect IPs 209.237.150.20
Response IP 209.237.150.20
Found Yes
Hash 3174568377fc8d2d572cd02de8ba91b340d0fe6174877d58a7390431c429bf30
SimHash 0b109d34a1b2

Groups

slurp

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 43200
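
As a rough illustration of how a crawler might read this record, the sketch below feeds a reconstructed slurp group to Python's standard urllib.robotparser. The robots.txt text is rebuilt from the report above and is an assumption about the original file's wording. Crawl-delay is a non-standard field; reading the value as seconds is the usual convention, not something this report confirms.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the report: one group for "slurp" with only a crawl-delay.
# The exact wording of the original file is an assumption.
SLURP_GROUP = [
    "User-agent: slurp",
    "Crawl-delay: 43200",
]

slurp_parser = RobotFileParser()
slurp_parser.parse(SLURP_GROUP)

# crawl_delay() returns the integer value recorded for the matching group.
delay_seconds = slurp_parser.crawl_delay("slurp")

# If the value is interpreted as seconds, convert it to hours.
delay_hours = delay_seconds / 3600

# With no Allow/Disallow rules in the group, every path is permitted.
slurp_can_fetch = slurp_parser.can_fetch(
    "slurp", "http://www.appliance411.com/tech/"
)
```

Under the seconds interpretation, 43200 corresponds to one request every 12 hours.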

weborama-fetcher

Rule Path
Disallow /

weborama-fetcher (+http://www.weborama.com)

Rule Path
Disallow /

*

Rule Path
Disallow /ads/
Disallow /item.php
Disallow /data.php
Disallow /display
Disallow /clinic
Disallow /work
Disallow /info
Disallow /refer
Disallow /test
Disallow /cgi-bin
Disallow /jump.cgi?
Disallow /links/jump.cgi?
Disallow /PDF
Disallow /PDFdocs
Disallow /tech
Disallow /tech/links
Disallow /tech/links/cgibin/
Disallow /tech/links/cgi-bin/jump.cgi?
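
The rules in this group are path prefixes, so a short entry such as /info also covers longer paths that begin with it. A minimal sketch of checking paths against this group with Python's urllib.robotparser follows. The rule list is copied from the report; the helper name is_allowed, the user agent examplebot, and the sample paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Disallow paths for the "*" group, copied from the report above.
DISALLOWED = [
    "/ads/", "/item.php", "/data.php", "/display", "/clinic", "/work",
    "/info", "/refer", "/test", "/cgi-bin", "/jump.cgi?",
    "/links/jump.cgi?", "/PDF", "/PDFdocs", "/tech", "/tech/links",
    "/tech/links/cgibin/", "/tech/links/cgi-bin/jump.cgi?",
]

# Rebuild a robots.txt group from the list (an assumed reconstruction).
robots_lines = ["User-agent: *"] + [f"Disallow: {p}" for p in DISALLOWED]

star_parser = RobotFileParser()
star_parser.parse(robots_lines)


def is_allowed(path, agent="examplebot"):
    """Return True if the hypothetical agent may fetch the given path."""
    return star_parser.can_fetch(agent, "http://www.appliance411.com" + path)


# Sample checks on hypothetical paths.
checks = {
    path: is_allowed(path)
    for path in ["/", "/ads/banner.gif", "/information", "/tech/links/"]
}
```

Because matching is by prefix, /tech already covers /tech/links and the two longer /tech/links/ entries, so those three rules are redundant for any crawler that follows this group.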

atomz/1.0

Rule Path
Allow /PDF
Disallow /PDFdocs
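
In this group both rules match any path that starts with /PDFdocs, so the outcome depends on how a crawler resolves the overlap. The sketch below assumes the longest-match rule described in RFC 9309, where the rule with the longest matching path wins and Allow wins a tie. How the atomz crawler actually resolved such overlaps is not stated in this report. The function name and sample paths are hypothetical, and wildcards are not handled.

```python
# Rules for the atomz/1.0 group, copied from the report above.
ATOMZ_RULES = [
    ("allow", "/PDF"),
    ("disallow", "/PDFdocs"),
]


def longest_match_allowed(path, rules):
    """Apply longest-prefix matching; a path matching no rule is allowed."""
    best_len = -1
    allowed = True
    for kind, prefix in rules:
        if not path.startswith(prefix):
            continue
        longer = len(prefix) > best_len
        tie_for_allow = len(prefix) == best_len and kind == "allow"
        if longer or tie_for_allow:
            best_len = len(prefix)
            allowed = kind == "allow"
    return allowed


pdf_ok = longest_match_allowed("/PDF/manual.pdf", ATOMZ_RULES)
pdfdocs_ok = longest_match_allowed("/PDFdocs/manual.pdf", ATOMZ_RULES)
```

A parser that applies rules in file order instead, as Python's urllib.robotparser does, would stop at Allow /PDF and permit /PDFdocs as well, so the two strategies disagree on this group.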

Comments

  • robot.txt file