pancan.org
robots.txt

Robots Exclusion Standard data for pancan.org

Resource Scan

Scan Details

Site Domain pancan.org
Base Domain pancan.org
Scan Status Ok
Last Scan 2025-05-08T22:42:26+00:00
Next Scan 2025-06-07T22:42:26+00:00

Last Scan

Scanned 2025-05-08T22:42:26+00:00
URL https://pancan.org/robots.txt
Domain IPs 20.40.202.9
Response IP 20.40.202.9
Found Yes
Hash 2ec2ebd15d6d58dcb037ff4b89884462f11b2fff88695d57d1c0ce0106e76f34
SimHash c095bd409033
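
The 64-hex-digit Hash value above is consistent with a SHA-256 digest. The report does not say how it is computed, so the sketch below assumes it is SHA-256 over the raw response body; the helper names `sha256_hex` and `matches_recorded_hash` are hypothetical. It shows how a later fetch of the file might be compared against the recorded value to detect a change.

```python
import hashlib

# Hash value copied from the scan report above.
RECORDED_HASH = "2ec2ebd15d6d58dcb037ff4b89884462f11b2fff88695d57d1c0ce0106e76f34"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the raw bytes as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_recorded_hash(body: bytes, recorded: str = RECORDED_HASH) -> bool:
    """True if the body hashes to the recorded value.

    Assumption: the scanner hashes the unmodified response body with
    SHA-256. If it normalizes line endings or uses another algorithm,
    this comparison will not match even for an unchanged file.
    """
    return sha256_hex(body) == recorded.lower()
```

A mismatch would only suggest that the file changed (or that the assumption about the hashing scheme is wrong); it says nothing about which rules differ.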

Groups

*

Rule Path
Disallow /wp-admin/
Disallow /email/
Disallow /purpleride/
Disallow /section_about/
Disallow /section_facing_pancreatic_cancer/
Disallow /section_get_involved/
Disallow /section_stories/
Disallow /timeforhope/
Disallow /outreach/

Other Records

Field Value
sitemap https://www.pancan.org/sitemap_index.xml
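
As a rough illustration of how the group and records above behave, the following sketch rebuilds an equivalent robots.txt and queries it with Python's standard `urllib.robotparser`. The reconstructed text is an assumption: the report lists parsed rules rather than the raw file, so the real file may differ in ordering, comments, or whitespace. The `/about/` path is a hypothetical example of a URL not covered by any rule.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the parsed rules in the report; not the raw file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /email/
Disallow: /purpleride/
Disallow: /section_about/
Disallow: /section_facing_pancreatic_cancer/
Disallow: /section_get_involved/
Disallow: /section_stories/
Disallow: /timeforhope/
Disallow: /outreach/

Sitemap: https://www.pancan.org/sitemap_index.xml
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A path under a Disallow prefix is blocked for every user agent.
blocked = parser.can_fetch("*", "https://pancan.org/wp-admin/")

# A path matching no rule is allowed by default (hypothetical path).
allowed = parser.can_fetch("*", "https://pancan.org/about/")

# Sitemap records are exposed separately from the rule groups.
sitemaps = parser.site_maps()

print(blocked, allowed, sitemaps)
```

Matching is by path prefix, so each rule above covers everything beneath the listed directory while leaving the rest of the site crawlable.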