Robots Exclusion Standard data for pancan.org
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | pancan.org |
Base Domain | pancan.org |
Scan Status | Ok |
Last Scan | 2025-05-08T22:42:26+00:00 |
Next Scan | 2025-06-07T22:42:26+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2025-05-08T22:42:26+00:00 |
URL | https://pancan.org/robots.txt |
Domain IPs | 20.40.202.9 |
Response IP | 20.40.202.9 |
Found | Yes |
Hash | 2ec2ebd15d6d58dcb037ff4b89884462f11b2fff88695d57d1c0ce0106e76f34 |
SimHash | c095bd409033 |
Groups
User-agent: *
Rule | Path |
---|---|
Disallow | /wp-admin/ |
Disallow | /email/ |
Disallow | /purpleride/ |
Disallow | /section_about/ |
Disallow | /section_facing_pancreatic_cancer/ |
Disallow | /section_get_involved/ |
Disallow | /section_stories/ |
Disallow | /timeforhope/ |
Disallow | /outreach/ |
Other Records
Field | Value |
---|---|
sitemap | https://www.pancan.org/sitemap_index.xml |
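Example
The records above can be checked against a crawler's view of the site. The sketch below rebuilds a robots.txt from the listed group and sitemap record and evaluates a few URLs with Python's standard `urllib.robotparser`. The rebuilt text is an assumption: the scan lists the rules but not the file's exact bytes, so ordering, comments, and whitespace may differ from the live file. The `/about/` path is a hypothetical URL used only to show an allowed case.

```python
from urllib.robotparser import RobotFileParser

# Rebuilt from the Groups and Other Records tables above.
# Assumption: the live file may differ in ordering, comments, or whitespace.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /email/
Disallow: /purpleride/
Disallow: /section_about/
Disallow: /section_facing_pancreatic_cancer/
Disallow: /section_get_involved/
Disallow: /section_stories/
Disallow: /timeforhope/
Disallow: /outreach/
Sitemap: https://www.pancan.org/sitemap_index.xml
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Paths under a listed prefix are disallowed for every user agent.
print(parser.can_fetch("*", "https://pancan.org/wp-admin/"))       # False
print(parser.can_fetch("*", "https://pancan.org/section_about/"))  # False

# Hypothetical path that matches no Disallow rule, so it is allowed.
print(parser.can_fetch("*", "https://pancan.org/about/"))          # True

# Sitemap records are exposed separately from the rule groups.
print(parser.site_maps())  # ['https://www.pancan.org/sitemap_index.xml']
```

Matching here is by path prefix, so a rule such as `/email/` covers everything beneath that directory but not a path like `/email-signup/`.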