reprap.org
robots.txt

Robots Exclusion Standard data for reprap.org

Resource Scan

Scan Details

Site Domain reprap.org
Base Domain reprap.org
Scan Status Ok
Last Scan 2024-09-12T05:10:36+00:00
Next Scan 2024-10-12T05:10:36+00:00

Last Scan

Scanned 2024-09-12T05:10:36+00:00
URL https://reprap.org/robots.txt
Domain IPs 104.21.32.230, 172.67.156.84, 2606:4700:3032::ac43:9c54, 2606:4700:3033::6815:20e6
Response IP 104.21.32.230
Found Yes
Hash 587e9a968953fbcb1802b5632177f720f86c9d6940855c7fbd3789bbd5bba6e7
SimHash 671cc861a76e

Groups

*

Rule Path
Disallow /wiki/index.php
Disallow /*action%3Dedit*
Disallow /*diff%3D*
Disallow /*printable%3Dyes*
Disallow /*action%3Dhistory*
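
The `*` in these paths is a wildcard that stands for any run of characters, and each rule is matched from the start of the URL path. The following is a minimal sketch of that matching, assuming the wildcard semantics described in RFC 9309. The helper names `rule_to_regex` and `is_disallowed` are hypothetical, and the comparison is purely literal: whether a given crawler treats `%3D` and `=` as equivalent depends on its own normalization, which this report does not state.

```python
import re

# Disallow paths of the "*" group, copied from the table above.
DISALLOW = [
    "/wiki/index.php",
    "/*action%3Dedit*",
    "/*diff%3D*",
    "/*printable%3Dyes*",
    "/*action%3Dhistory*",
]


def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex (hypothetical helper).

    '*' becomes '.*'; a trailing '$' anchors the end of the path.
    Everything else is matched literally.
    """
    anchored_end = pattern.endswith("$")
    if anchored_end:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored_end else ""))


def is_disallowed(path: str) -> bool:
    """Return True if any rule matches the path from its first character."""
    # re.match anchors at the start, which mirrors prefix matching.
    return any(rule_to_regex(rule).match(path) for rule in DISALLOW)
```

Under these assumptions, `is_disallowed("/wiki/index.php?title=Main_Page")` is true because of the plain prefix rule, while `is_disallowed("/wiki/RepRap")` is false because no rule matches.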

msnbot*

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 60

bingbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 60

yandexbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 60

ahrefsbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 60

heritrix

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 30

wget

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 1

httrack

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

sitesucker

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

spinn3r

Rule Path
Disallow /

paperlibot

Rule Path
Disallow /

riddler

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /
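
As a rough illustration of how a client might consume these groups, the sketch below feeds an abridged reconstruction of the rules into Python's standard `urllib.robotparser`. The robots.txt text is rebuilt from this report and is not the verbatim file: only some groups are included, and agent names are shown in the lowercased form the report uses. The page URL and the `summarize` helper are assumptions made for the example. Note that `urllib.robotparser` matches paths by plain prefix and does not interpret `*` inside paths, so only the non-wildcard rule is included.

```python
from urllib.robotparser import RobotFileParser

# Abridged reconstruction from the report above; not the verbatim file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wiki/index.php

User-agent: bingbot
Crawl-delay: 60

User-agent: heritrix
Crawl-delay: 30

User-agent: wget
Crawl-delay: 1

User-agent: amazonbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def summarize(agent, url="https://reprap.org/wiki/RepRap"):
    """Return (allowed, crawl_delay) for an agent; the URL is a hypothetical page."""
    return parser.can_fetch(agent, url), parser.crawl_delay(agent)


if __name__ == "__main__":
    for agent in ("bingbot", "heritrix", "wget", "amazonbot", "examplebot"):
        print(agent, summarize(agent))
```

An agent with its own group, such as `bingbot`, uses only that group, so it may fetch any path but is asked to wait 60 seconds between requests. An unlisted agent falls back to the `*` group.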