appliance411.ca
robots.txt

Robots Exclusion Standard data for appliance411.ca

Resource Scan

Scan Details

Site Domain appliance411.ca
Base Domain appliance411.ca
Scan Status Ok
Last Scan 2024-05-29T17:29:53+00:00
Next Scan 2024-06-05T17:29:53+00:00

Last Scan

Scanned 2024-05-29T17:29:53+00:00
URL http://www.appliance411.ca/robots.txt
Domain IPs 209.237.150.20
Response IP 209.237.150.20
Found Yes
Hash 3174568377fc8d2d572cd02de8ba91b340d0fe6174877d58a7390431c429bf30
SimHash 0b109d34a1b2

Groups

slurp

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 43200

weborama-fetcher

Rule Path
Disallow /

weborama-fetcher (+http://www.weborama.com)

Rule Path
Disallow /

*

Rule Path
Disallow /ads/
Disallow /item.php
Disallow /data.php
Disallow /display
Disallow /clinic
Disallow /work
Disallow /info
Disallow /refer
Disallow /test
Disallow /cgi-bin
Disallow /jump.cgi?
Disallow /links/jump.cgi?
Disallow /PDF
Disallow /PDFdocs
Disallow /tech
Disallow /tech/links
Disallow /tech/links/cgibin/
Disallow /tech/links/cgi-bin/jump.cgi?

atomz/1.0

Rule Path
Allow /PDF
Disallow /PDFdocs
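
To see how a crawler might interpret the groups listed above, the following is a minimal sketch using Python's standard urllib.robotparser. The robots.txt text is reconstructed from this report, so its exact layout and group order are assumptions, and "ExampleBot" is a hypothetical user agent used only to exercise the "*" group. Other parsers may resolve Allow/Disallow precedence and user-agent matching differently.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the groups in this report; the original file's
# exact layout and group order are assumptions.
ROBOTS_TXT = """\
User-agent: slurp
Crawl-delay: 43200

User-agent: weborama-fetcher
Disallow: /

User-agent: weborama-fetcher (+http://www.weborama.com)
Disallow: /

User-agent: *
Disallow: /ads/
Disallow: /item.php
Disallow: /data.php
Disallow: /display
Disallow: /clinic
Disallow: /work
Disallow: /info
Disallow: /refer
Disallow: /test
Disallow: /cgi-bin
Disallow: /jump.cgi?
Disallow: /links/jump.cgi?
Disallow: /PDF
Disallow: /PDFdocs
Disallow: /tech
Disallow: /tech/links
Disallow: /tech/links/cgibin/
Disallow: /tech/links/cgi-bin/jump.cgi?

User-agent: atomz/1.0
Allow: /PDF
Disallow: /PDFdocs
"""

BASE = "http://www.appliance411.ca"

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` according to this parser."""
    return parser.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    # "ExampleBot" is hypothetical; it falls through to the "*" group.
    for agent, path in [
        ("ExampleBot", "/"),
        ("ExampleBot", "/ads/"),
        ("ExampleBot", "/tech/links"),
        ("weborama-fetcher", "/"),
        ("slurp", "/ads/"),
    ]:
        print(f"{agent:18} {path:14} allowed={allowed(agent, path)}")

    # The slurp group carries only a crawl-delay, expressed in seconds.
    delay = parser.crawl_delay("slurp")
    print(f"slurp crawl-delay: {delay} s ({delay / 3600:.0f} h)")
```

Under this parser, the slurp group's crawl-delay of 43200 seconds works out to 12 hours between requests, and because that group defines no rules, slurp is not restricted by the "*" group's Disallow lines.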

Comments

  • robot.txt file