perma.cc
robots.txt

Robots Exclusion Standard data for perma.cc

Resource Scan

Scan Details

Site Domain perma.cc
Base Domain perma.cc
Scan Status Ok
Last Scan 2024-07-05T08:09:58+00:00
Next Scan 2024-08-04T08:09:58+00:00

Last Scan

Scanned 2024-07-05T08:09:58+00:00
URL https://perma.cc/robots.txt
Domain IPs 104.17.61.185, 104.17.62.185
Response IP 104.17.62.185
Found Yes
Hash b94698a4ba9a164487903f23501b118dfab1cd26a982825502461bc8492931f2
SimHash 0a919b29cde7

Groups

*

Rule Path
Allow /$
Allow /about
Allow /copyright-policy
Allow /terms-of-service
Allow /privacy-policy
Allow /return-policy
Allow /contingency-plan
Allow /report
Allow /contact
Allow /contact/thanks
Allow /docs
Allow /docs/perma-link-creation
Allow /docs/libraries
Allow /docs/faq
Allow /docs/accounts
Allow /docs/developer
Allow /libraries
Allow /robots.txt
Disallow /2
Disallow /3
Disallow /4
Disallow /5
Disallow /6
Disallow /7
Disallow /8
Disallow /9
Disallow /A
Disallow /B
Disallow /C
Disallow /D
Disallow /E
Disallow /F
Disallow /G
Disallow /H
Disallow /J
Disallow /K
Disallow /L
Disallow /M
Disallow /N
Disallow /P
Disallow /Q
Disallow /R
Disallow /S
Disallow /T
Disallow /U
Disallow /V
Disallow /W
Disallow /X
Disallow /Y
Disallow /Z
Disallow /_
Disallow /archive-
Disallow /api_key
Disallow /errors
Disallow /log
Disallow /manage
Disallow /password
Disallow /register
Disallow /service
Disallow /settings
Disallow /sign-up
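
The Allow and Disallow rows above can overlap: a path such as /docs/faq matches both "Allow /docs" and "Allow /docs/faq", and "Allow /$" matches only the bare root path. The following is a minimal sketch of how a crawler might evaluate such rules, assuming the longest-match precedence described in RFC 9309 (the longest matching pattern wins, and Allow wins a tie). The names RULES, pattern_to_regex and is_allowed are hypothetical, RULES holds only a subset of the rows listed above, and the test paths are made up for illustration.

```python
import re

# Illustrative subset of the "*" group rules listed above (not the full set).
RULES = [
    ("allow", "/$"),
    ("allow", "/about"),
    ("allow", "/docs"),
    ("allow", "/docs/faq"),
    ("disallow", "/2"),
    ("disallow", "/A"),
    ("disallow", "/manage"),
]


def pattern_to_regex(path):
    """Translate a robots.txt path pattern into a compiled regex.

    "*" matches any run of characters; a trailing "$" anchors the end
    of the URL path. Everything else is matched literally as a prefix.
    """
    anchored = path.endswith("$")
    body = path[:-1] if anchored else path
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + (r"\Z" if anchored else ""))


def is_allowed(url_path, rules=RULES):
    """Return True if url_path may be fetched under the given rules.

    Assumption: longest matching pattern wins; Allow wins ties.
    """
    best = None  # (pattern length, is_allow)
    for kind, path in rules:
        if pattern_to_regex(path).match(url_path):
            candidate = (len(path), kind == "allow")
            if best is None or candidate > best:
                best = candidate
    # A path that matches no rule is allowed by default.
    return True if best is None else best[1]


if __name__ == "__main__":
    # Hypothetical paths, chosen only to exercise the rules.
    for p in ["/", "/about", "/docs/faq", "/2ABC-DEFG", "/manage/links"]:
        print(p, is_allowed(p))
```

Under these assumptions the root path and the documentation pages are fetchable, while a path beginning with a disallowed prefix such as /2 or /manage is not. Python's standard urllib.robotparser applies rules in file order rather than by longest match, which is why this sketch does its own matching.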

siteimprovebot

Rule Path
Disallow /

siteimprovebot-crawler

Rule Path
Disallow /

googlebot/nutch-1.7

Rule Path
Disallow /
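
The scan lists four groups: the default "*" group and three named groups that each disallow the whole site. The following is a minimal sketch of how a crawler might choose which group applies to it. It assumes a case-insensitive substring match of the group token against the crawler's User-Agent string, prefers the longest matching token, and falls back to "*". Real parsers differ in how they match tokens, so this is only one plausible approach. The name select_group and the sample User-Agent strings are hypothetical.

```python
# Group tokens as they appear in the scan above.
GROUP_TOKENS = [
    "*",
    "siteimprovebot",
    "siteimprovebot-crawler",
    "googlebot/nutch-1.7",
]


def select_group(user_agent, tokens=GROUP_TOKENS):
    """Pick the group token that applies to the given User-Agent string.

    Assumption: case-insensitive substring match, longest token wins,
    and "*" is the fallback when no named group matches.
    """
    ua = user_agent.lower()
    matches = [t for t in tokens if t != "*" and t.lower() in ua]
    if matches:
        return max(matches, key=len)
    return "*" if "*" in tokens else None


if __name__ == "__main__":
    # Hypothetical User-Agent strings for illustration only.
    print(select_group("SiteimproveBot/1.0"))
    print(select_group("Mozilla/5.0 (compatible; SiteimproveBot-Crawler)"))
    print(select_group("ExampleBot/2.0"))
```

A crawler that lands in one of the named groups is subject only to that group's "Disallow /" rule. The Allow rows of the "*" group do not carry over to it.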