concordia.ca
robots.txt

Robots Exclusion Standard data for concordia.ca

Resource Scan

Scan Details

Site Domain concordia.ca
Base Domain concordia.ca
Scan Status Ok
Last Scan 2025-07-02T02:45:27+00:00
Next Scan 2025-08-01T02:45:27+00:00

Last Scan

Scanned 2025-07-02T02:45:27+00:00
URL https://concordia.ca/robots.txt
Redirect https://www.concordia.ca/robots.txt
Redirect Domain www.concordia.ca
Redirect Base concordia.ca
Domain IPs 132.205.244.185
Redirect IPs 132.205.244.70
Response IP 132.205.244.70
Found Yes
Hash 349c5e8d6064838051d41fa153f0b30135a89d92dd88a5e94c1dd677dcb88a44
SimHash 25b809810fe5

Groups

*

Rule Path
Disallow *scientific-monitoring.html?*
Disallow *veille-scientifique.html?*
Disallow *calendar-deseve.html?*
Disallow *calendar-va114.html?*
Disallow /search.html?search-mode=*
Disallow /cuevents/*/200*
Disallow /cuevents/*/201*
Disallow /cuevents/*/2020/*
Disallow /ucactualites/*/200*
Disallow /ucactualites/*/201*
Disallow /ucactualites/*/2020/*
Disallow */all-groups.html*.com
Disallow /faculty.html$

Comments

  • control parameterization on scientific monitoring pages
  • control calendar indexation
  • control search spam
  • disallow crawling of old events pages - 2021 & 2022 still produce traffic
  • Control crawling of errant social media site URLs being appended to all-groups.html
  • Block empty root faculty page from crawling so primary faculty page ranks
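
The Disallow paths above rely on two pattern characters: `*` (any run of characters) and a trailing `$` (end of URL). A minimal sketch of how such patterns can be evaluated follows, assuming the wildcard semantics described in RFC 9309. The helper names `rule_to_regex` and `is_disallowed` and the sample paths in the usage note are hypothetical, not taken from the scan. Python's built-in `urllib.robotparser` is not used here because it does not interpret these wildcards.

```python
import re

# Disallow patterns copied from the "*" group above.
DISALLOW = [
    "*scientific-monitoring.html?*",
    "*veille-scientifique.html?*",
    "*calendar-deseve.html?*",
    "*calendar-va114.html?*",
    "/search.html?search-mode=*",
    "/cuevents/*/200*",
    "/cuevents/*/201*",
    "/cuevents/*/2020/*",
    "/ucactualites/*/200*",
    "/ucactualites/*/201*",
    "/ucactualites/*/2020/*",
    "*/all-groups.html*.com",
    "/faculty.html$",
]


def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate one robots.txt path pattern into a compiled regex.

    Assumed semantics: '*' matches any sequence of characters and a
    trailing '$' anchors the match at the end of the URL. Everything
    else, including '?', is treated as a literal character.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path_and_query: str) -> bool:
    """Return True if any Disallow pattern matches from the start of the path.

    This group has no Allow rules, so longest-match precedence between
    Allow and Disallow is deliberately not implemented in this sketch.
    """
    return any(rule_to_regex(p).match(path_and_query) for p in DISALLOW)
```

Under these assumptions, `is_disallowed("/faculty.html")` is true while `is_disallowed("/faculty.html?x=1")` is false, because the `$` anchor only blocks the bare page. Likewise a hypothetical `/cuevents/artsci/2015/03/talk.html` is blocked by `/cuevents/*/201*`, but the same path under `2021` is not, which agrees with the comment that 2021 and 2022 pages remain crawlable.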