partner-gitlab.mioffice.cn
robots.txt

Robots Exclusion Standard data for partner-gitlab.mioffice.cn

Resource Scan

Scan Details

Site Domain partner-gitlab.mioffice.cn
Base Domain mioffice.cn
Scan Status Ok
Last Scan 2025-09-25T18:11:03+00:00
Next Scan 2025-10-09T18:11:03+00:00

Last Scan

Scanned 2025-09-25T18:11:03+00:00
URL https://partner-gitlab.mioffice.cn/robots.txt
Domain IPs 120.92.110.238
Response IP 120.92.110.238
Found Yes
Hash e0f245c34805eeea56ebaabdd4d0b6106b0f4a70970aaff600a1d76e0fc724b2
SimHash 661299560077

Groups

*

Rule Path
Disallow /autocomplete/users
Disallow /search
Disallow /api
Disallow /admin
Disallow /profile
Disallow /dashboard
Disallow /users
Disallow /help
Disallow /s/
Allow /users/sign_in

*

Rule Path
Disallow /*/new
Disallow /*/edit
Disallow /*/raw

*

Rule Path
Disallow /groups/*/analytics
Disallow /groups/*/contribution_analytics
Disallow /groups/*/group_members

*

Rule Path
Disallow /*/*.git
Disallow /*/archive/
Disallow /*/repository/archive*
Disallow /*/activity
Disallow /*/blame
Disallow /*/commits
Disallow /*/commit
Disallow /*/commit/*.patch
Disallow /*/commit/*.diff
Disallow /*/compare
Disallow /*/network
Disallow /*/graphs
Disallow /*/merge_requests/*.patch
Disallow /*/merge_requests/*.diff
Disallow /*/merge_requests/*/diffs
Disallow /*/deploy_keys
Disallow /*/hooks
Disallow /*/services
Disallow /*/protected_branches
Disallow /*/uploads/
Disallow /*/project_members
Disallow /*/settings
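The groups above can be checked programmatically. A minimal sketch using Python's standard-library `urllib.robotparser`, assuming the simple prefix rules from the first group (the scanner displays several `*` groups, but in the file itself they belong to a single `User-agent: *` group; the stdlib parser does plain prefix matching, so the wildcard rules are omitted here):

```python
from urllib import robotparser

# Reconstruction of the non-wildcard rules shown in the first group above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /autocomplete/users
Disallow: /search
Disallow: /api
Disallow: /admin
Disallow: /profile
Disallow: /dashboard
Disallow: /users
Disallow: /help
Disallow: /s/
Allow: /users/sign_in
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

base = "https://partner-gitlab.mioffice.cn"
print(rp.can_fetch("*", base + "/search"))   # False: /search is disallowed
print(rp.can_fetch("*", base + "/explore"))  # True: no rule matches
```

Note that `urllib.robotparser` applies rules in file order (first match wins), so with `Disallow: /users` listed before `Allow: /users/sign_in` it reports `/users/sign_in` as disallowed, whereas RFC 9309 longest-match semantics (used by Googlebot) would honor the more specific `Allow`.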

Comments

  • See http://www.robotstxt.org/robotstxt.html for documentation on how to use the robots.txt file
  • To ban all spiders from the entire site uncomment the next two lines:
  • User-Agent: *
  • Disallow: /
  • Add a 1 second delay between successive requests to the same server; this limits the resources used by the crawler
  • Only some crawlers respect this setting, e.g. Googlebot does not
  • Crawl-delay: 1
  • Based on details in https://gitlab.com/gitlab-org/gitlab/blob/master/config/routes.rb,
  • https://gitlab.com/gitlab-org/gitlab/blob/master/spec/routing, and using application
  • Global routes
  • Only specifically allow the Sign In page to avoid very ugly search results
  • Generic resource routes like new, edit, raw
  • This will block routes like:
  • - /projects/new
  • - /gitlab-org/gitlab-foss/issues/123/-/edit
  • Group details
  • Project details
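The comments confirm that wildcard rules such as `Disallow /*/new` are meant to block routes like `/projects/new`. Those patterns follow robots.txt path-matching semantics (RFC 9309: `*` matches any character sequence, a trailing `$` anchors the end, and a pattern matches if it matches from the start of the path). A minimal illustrative matcher, not the scanner's own implementation:

```python
import re

def robots_pattern_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches the given URL path.

    Implements RFC 9309 matching: "*" matches any sequence of characters,
    a trailing "$" anchors the pattern to the end of the path, and the
    pattern is matched against the beginning of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal segments, turn each "*" into ".*", anchor at the start.
    regex = "^" + ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, path) is not None

# The example routes listed in the comments above:
print(robots_pattern_matches("/*/new", "/projects/new"))                            # True
print(robots_pattern_matches("/*/edit", "/gitlab-org/gitlab-foss/issues/123/-/edit"))  # True
print(robots_pattern_matches("/*/new", "/new"))                                     # False
```

Because the pattern only needs to match a prefix of the path, `/*/commit/*.patch` also blocks deeper URLs such as `/group/project/commit/abc123.patch`.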