unitedrugby.com
robots.txt

Robots Exclusion Standard data for unitedrugby.com

Resource Scan

Scan Details

Site Domain unitedrugby.com
Base Domain unitedrugby.com
Scan Status Ok
Last Scan 2026-02-16T21:47:49+00:00
Next Scan 2026-02-23T21:47:49+00:00

Last Scan

Scanned 2026-02-16T21:47:49+00:00
URL https://unitedrugby.com/robots.txt
Domain IPs 141.193.213.10, 141.193.213.11
Response IP 141.193.213.11
Found Yes
Hash 6bae1e6607f5c12da0c7d9d2c5b76fd1b5946c24c2c468a98632092734bedbf8
SimHash 6d16d017f6d5

Groups

*

Rule Path Comment
Disallow /wp-sitemap.xml -
Disallow /wp-sitemap-posts-post-*.xml -
Disallow /wp-sitemap-posts-page-*.xml -
Disallow /wp-sitemap-posts-ad-*.xml -
Disallow /wp-sitemap-posts-poll-*.xml -
Disallow /wp-sitemap-taxonomies-category-*.xml -
Disallow /wp-sitemap-taxonomies-post_tag-*.xml -
Disallow /wp-sitemap-taxonomies-display_category-*.xml -
Disallow /wp-sitemap-users-*.xml -
Disallow /posts/ -
Disallow /latest/ -
Disallow /category/ -
Disallow /tag/ -
Allow / Homepage
Allow /about-urc/ -
Allow /membership/ -
Allow /competition-rules/ -
Allow /contact-us/ -
Allow /partners-suppliers/ -
Allow /ticketing/ -
Allow /where-to-watch/ -
Allow /favicon/ -
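
How a crawler would resolve these rules for a given path can be sketched as follows. This is a minimal illustration, assuming the longest-match evaluation described in RFC 9309 (the most specific matching rule wins, and Allow wins a tie) and assuming the paths are literally as listed in the table. The names RULES, pattern_to_regex and is_allowed are hypothetical helpers, not part of any library, and real crawlers may differ in details.

```python
import re

# Rules copied from the "*" group in the table above: (directive, path pattern).
RULES = [
    ("disallow", "/wp-sitemap.xml"),
    ("disallow", "/wp-sitemap-posts-post-*.xml"),
    ("disallow", "/wp-sitemap-posts-page-*.xml"),
    ("disallow", "/wp-sitemap-posts-ad-*.xml"),
    ("disallow", "/wp-sitemap-posts-poll-*.xml"),
    ("disallow", "/wp-sitemap-taxonomies-category-*.xml"),
    ("disallow", "/wp-sitemap-taxonomies-post_tag-*.xml"),
    ("disallow", "/wp-sitemap-taxonomies-display_category-*.xml"),
    ("disallow", "/wp-sitemap-users-*.xml"),
    ("disallow", "/posts/"),
    ("disallow", "/latest/"),
    ("disallow", "/category/"),
    ("disallow", "/tag/"),
    ("allow", "/"),
    ("allow", "/about-urc/"),
    ("allow", "/membership/"),
    ("allow", "/competition-rules/"),
    ("allow", "/contact-us/"),
    ("allow", "/partners-suppliers/"),
    ("allow", "/ticketing/"),
    ("allow", "/where-to-watch/"),
    ("allow", "/favicon/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Everything else is a literal prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under longest-match evaluation."""
    best_len = -1
    allowed = True  # a path matched by no rule is allowed by default
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            is_allow = directive == "allow"
            # Longer pattern is more specific; on equal length, Allow wins.
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len = len(pattern)
                allowed = is_allow
    return allowed
```

Under this reading, is_allowed("/about-urc/") is true and is_allowed("/posts/some-article/") is false, because "/posts/" is a longer match than "/". Note that "Allow /" is a prefix match, so a path that appears in neither list (for example a hypothetical "/fixtures/") would also be allowed.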

Other Records

Field Value
sitemap https://unitedrugby.com/sitemap.xml

Comments

  • Block all sitemaps that contain posts
  • Block all post content
  • Allow only these specific pages
  • Point only to the clean sitemap (must match the above 9 URLs)
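
The last comment implies that the sitemap should list exactly the nine allowed URLs. One way to check that is sketched below, assuming the sitemap follows the standard sitemaps.org urlset format and that the nine Allow paths are the nine URLs the comment refers to. The function names and the SAMPLE document are hypothetical and for illustration only; SAMPLE is not the site's real sitemap.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# The nine Allow paths from the rules table (assumed to be the "9 URLs").
ALLOWED_PATHS = {
    "/",
    "/about-urc/",
    "/membership/",
    "/competition-rules/",
    "/contact-us/",
    "/partners-suppliers/",
    "/ticketing/",
    "/where-to-watch/",
    "/favicon/",
}

# Namespace used by sitemaps.org urlset documents.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_paths(xml_text):
    """Return the set of URL paths found in <loc> elements of a sitemap."""
    root = ET.fromstring(xml_text)
    return {
        urlparse(loc.text.strip()).path or "/"
        for loc in root.iter(SITEMAP_NS + "loc")
        if loc.text
    }


def compare_sitemap(xml_text, allowed=ALLOWED_PATHS):
    """Report allowed paths missing from the sitemap and extra paths in it."""
    found = sitemap_paths(xml_text)
    return {"missing": allowed - found, "unexpected": found - allowed}


# Hypothetical sample input, NOT the real sitemap of the site.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://unitedrugby.com/</loc></url>
  <url><loc>https://unitedrugby.com/posts/example/</loc></url>
</urlset>"""

if __name__ == "__main__":
    report = compare_sitemap(SAMPLE)
    print("missing:", sorted(report["missing"]))
    print("unexpected:", sorted(report["unexpected"]))
```

A sitemap that matches the comment would produce two empty sets. With the sample above, the post URL is reported as unexpected and the eight pages other than the homepage are reported as missing.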