cgspace.cgiar.org
robots.txt
Robots Exclusion Standard data for cgspace.cgiar.org
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | cgspace.cgiar.org |
Base Domain | cgiar.org |
Scan Status | Ok |
Last Scan | 2024-11-11T15:54:50+00:00 |
Next Scan | 2024-11-25T15:54:50+00:00 |
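The two timestamps above are 14 days apart, which suggests a fortnightly rescan schedule. That interval is inferred from this record only and is not stated by the scanner. A minimal sketch of the arithmetic:

```python
from datetime import datetime, timedelta

# Timestamps copied from the Scan Details table.
last_scan = datetime.fromisoformat("2024-11-11T15:54:50+00:00")
next_scan = datetime.fromisoformat("2024-11-25T15:54:50+00:00")

# The interval is derived from these two values; it is not a documented setting.
interval = next_scan - last_scan
print(interval.days)
```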
Last Scan
Field | Value |
---|---|
Scanned | 2024-11-11T15:54:50+00:00 |
URL | https://cgspace.cgiar.org/robots.txt |
Domain IPs | 188.34.177.10, 2a01:4f8:c17:70c3::1 |
Response IP | 188.34.177.10 |
Found | Yes |
Hash | bf846d96177058b6a03c3020e6eaa4fbf817216a0b1691f93c89675d75a34663 |
SimHash | a49c5f1fe5f7 |
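The Hash value is 64 hexadecimal characters, the length of a SHA-256 digest. Assuming it is the SHA-256 of the raw response body (the record does not say so), a fetched copy of the file could be compared against it roughly as follows. The helper names `sha256_hex` and `body_matches` are hypothetical.

```python
import hashlib

# Hash value copied from the Last Scan table.
RECORDED_HASH = "bf846d96177058b6a03c3020e6eaa4fbf817216a0b1691f93c89675d75a34663"

def sha256_hex(body: bytes) -> str:
    """Return the hex SHA-256 digest of a raw response body."""
    return hashlib.sha256(body).hexdigest()

def body_matches(body: bytes, expected: str = RECORDED_HASH) -> bool:
    """True if the body hashes to the recorded value (assumes SHA-256)."""
    return sha256_hex(body) == expected.lower()
```

A mismatch would mean either that the file has changed since the scan or that the assumption about the hash algorithm or input is wrong.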
Groups
User-agent: *
Rule | Path |
---|---|
Disallow | /search |
Disallow | /admin/* |
Disallow | /processes |
Disallow | /submit |
Disallow | /workspaceitems |
Disallow | /profile |
Disallow | /workflowitems |
Disallow | /entities/*?f |
Disallow | /browse/* |
Disallow | /statistics |
Disallow | /contact |
Disallow | /feedback |
Disallow | /forgot |
Disallow | /login |
Disallow | /register |
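Several of these paths use the `*` wildcard. Under the Robots Exclusion Protocol (RFC 9309), `*` matches any run of characters, a trailing `$` anchors the end of the URL, and everything else matches as a prefix. The sketch below applies those rules to the patterns in this group. The function names are hypothetical, and because the group contains only Disallow rules, any match is treated as a block.

```python
import re

# Disallow patterns copied from the "*" group.
DISALLOW = [
    "/search", "/admin/*", "/processes", "/submit", "/workspaceitems",
    "/profile", "/workflowitems", "/entities/*?f", "/browse/*",
    "/statistics", "/contact", "/feedback", "/forgot", "/login", "/register",
]

def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a regex.

    '*' matches any run of characters; a trailing '$' pins the end of the URL.
    All other characters are literal, and matching starts at the path's start.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_disallowed(path: str) -> bool:
    """True if the path (including any query string) matches a Disallow pattern."""
    return any(pattern_to_regex(p).match(path) for p in DISALLOW)

print(is_disallowed("/entities/publication/123?f.author=x"))
print(is_disallowed("/entities/publication/123"))
```

Note that prefix matching means `/search` also covers paths such as `/searchable`, while `/browse/*` does not cover the bare path `/browse`.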
Other Records
Field | Value |
---|---|
sitemap | https://cgspace.cgiar.org/sitemap_index.xml |
sitemap | https://cgspace.cgiar.org/sitemap_index.html |
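Sitemap records sit outside any user-agent group, so they can be collected without parsing the rules. A minimal sketch follows. `SAMPLE` is an excerpt reconstructed from this record, not the verbatim file, and `extract_sitemaps` is a hypothetical helper.

```python
from typing import List

# Reconstructed excerpt; the real file's ordering and comments are not shown here.
SAMPLE = """\
User-agent: *
Disallow: /search

Sitemap: https://cgspace.cgiar.org/sitemap_index.xml
Sitemap: https://cgspace.cgiar.org/sitemap_index.html
"""

def extract_sitemaps(robots_txt: str) -> List[str]:
    """Return the URLs of all Sitemap records, in file order."""
    urls = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop trailing comments
        key, sep, value = line.partition(":")  # split at the first colon only
        if sep and key.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls

print(extract_sitemaps(SAMPLE))
```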