Robots Exclusion Standard data for landmark.edu

Resource Scan

Scan Details

Site Domain landmark.edu
Base Domain landmark.edu
Scan Status Ok
Last Scan 2025-12-19T12:30:12+00:00
Next Scan 2026-01-18T12:30:12+00:00

Last Scan

Scanned 2025-12-19T12:30:12+00:00
URL https://landmark.edu/robots.txt
Redirect https://www.landmark.edu/robots.txt
Redirect Domain www.landmark.edu
Redirect Base landmark.edu
Domain IPs 104.26.0.208, 104.26.1.208, 172.67.71.32, 2606:4700:20::681a:1d0, 2606:4700:20::681a:d0, 2606:4700:20::ac43:4720
Redirect IPs 104.26.0.208, 104.26.1.208, 172.67.71.32, 2606:4700:20::681a:1d0, 2606:4700:20::681a:d0, 2606:4700:20::ac43:4720
Response IP 172.67.71.32
Found Yes
Hash c509d01d310b03e88a8764950edb8e55e386049e5105734b23a4358a6c2859f4
SimHash 61549b2ace36

Groups

*

Rule Path
Disallow /cpresources/
Disallow /vendor/
Disallow /.env
Disallow /cache/
Disallow /sitemap
Disallow /sitemap.xml

gptbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

Other Records

Field Value
sitemap https://www.landmark.edu/sitemaps-1-sitemap.xml
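
The groups and the sitemap record above can be checked offline with Python's standard `urllib.robotparser`. This is a minimal sketch: the `ROBOTS_TXT` text is reconstructed from the fields in this report, so its layout, casing, and ordering are assumptions and not a copy of the live file. `ExampleBot` is a hypothetical crawler name used to exercise the `*` group, and `build_parser` is an illustrative helper. `site_maps()` needs Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the groups and records in this report.
# Assumption: the live file's exact layout, casing, and order may differ.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cpresources/
Disallow: /vendor/
Disallow: /.env
Disallow: /cache/
Disallow: /sitemap
Disallow: /sitemap.xml

User-agent: gptbot
Disallow: /

User-agent: google-extended
Disallow: /

User-agent: perplexitybot
Disallow: /

Sitemap: https://www.landmark.edu/sitemaps-1-sitemap.xml
"""

BASE = "https://www.landmark.edu"


def build_parser(text: str = ROBOTS_TXT) -> RobotFileParser:
    """Parse robots.txt text locally; nothing is fetched over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser()

if __name__ == "__main__":
    # "ExampleBot" is a hypothetical agent that only matches the "*" group.
    checks = [
        ("GPTBot", "/"),
        ("Google-Extended", "/"),
        ("PerplexityBot", "/"),
        ("ExampleBot", "/"),
        ("ExampleBot", "/vendor/"),
        ("ExampleBot", "/sitemaps-1-sitemap.xml"),
    ]
    for agent, path in checks:
        verdict = "allowed" if parser.can_fetch(agent, BASE + path) else "blocked"
        print(f"{agent:16} {path:26} {verdict}")
    print("sitemaps:", parser.site_maps())
```

With this parser's prefix matching, the `Disallow /sitemap` rule in the `*` group also covers the path `/sitemaps-1-sitemap.xml` named in the sitemap record. How individual crawlers treat that overlap is not stated in the report.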

Comments

  • robots.txt for https://www.landmark.edu/
  • live - don't allow web crawlers to index cpresources/ or vendor/
  • Disallow ChatGPT bot, as there's no benefit to allowing it to index your site
  • Disallow Google Bard and Vertex AI bots, as there's no benefit to allowing it to index your site
  • Disallow Perplexity bot, as there's no benefit to allowing it to index your site