Robots Exclusion Standard data for ngcm.soton.ac.uk

Resource Scan

Scan Details

Site Domain ngcm.soton.ac.uk
Base Domain soton.ac.uk
Scan Status Ok
Last Scan 2025-10-20T18:42:22+00:00
Next Scan 2025-11-19T18:42:22+00:00

Last Scan

Scanned 2025-10-20T18:42:22+00:00
URL https://ngcm.soton.ac.uk/robots.txt
Domain IPs 152.78.118.225
Response IP 152.78.118.225
Found Yes
Hash 056617e815d10e48d38e6efda546961909a214e0ffaa4fbc182015b006498189
SimHash fa220d315dbb

Groups

scrapy

Rule Path
Allow /

googlebot

Rule Path
Disallow /*.php$
Disallow /*.js$
Disallow /*.inc$
Disallow /*.swf$
Disallow /*.zip$
Disallow /southamptonconnects/
Disallow /test/
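
The googlebot paths above use two pattern characters: `*`, which matches any run of characters, and a trailing `$`, which anchors the match to the end of the URL. The following is a minimal sketch of that matching, assuming the wildcard semantics described in RFC 9309 and Google's robots.txt documentation. The helper names `pattern_to_regex` and `is_disallowed` are hypothetical and not part of any library, and the sketch ignores Allow rules because this group has none.

```python
import re

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Assumed semantics: `*` matches any run of characters and a trailing
    `$` anchors the end of the URL; everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" for each "*".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))

# Disallow paths copied from the googlebot group in this scan.
GOOGLEBOT_DISALLOW = [
    "/*.php$", "/*.js$", "/*.inc$", "/*.swf$", "/*.zip$",
    "/southamptonconnects/", "/test/",
]

def is_disallowed(path: str) -> bool:
    """Return True if any googlebot Disallow pattern matches the path prefix."""
    # re.match anchors at the start, which gives prefix matching.
    return any(pattern_to_regex(p).match(path) for p in GOOGLEBOT_DISALLOW)

print(is_disallowed("/index.php"))      # ends in .php
print(is_disallowed("/index.php?x=1"))  # "$" requires the URL to end in .php
print(is_disallowed("/test/page"))      # prefix match on /test/
```

Because of the `$` anchor, a URL such as `/index.php?x=1` is not covered by `/*.php$`, whereas `/test/` without an anchor covers everything beneath that directory.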

*

Rule Path
Disallow /cgi-bin/
Disallow /wp-admin/
Disallow /wp-includes/
Disallow /wp-content/plugins/
Disallow /wp-content/cache/
Disallow /wp-login.php
Disallow /wp-register.php
Disallow /trackback/
Disallow /feed/
Disallow /author/
Disallow /comments/
Disallow /calendar/action~posterboard/
Disallow /calendar/action~agenda/
Disallow /calendar/action~oneday/
Disallow /calendar/action~month/
Disallow /calendar/action~week/
Disallow /calendar/action~stream/
Disallow /calendar/action~undefined/
Disallow /calendar/action~http%3A/
Disallow /calendar/action~default/
Disallow /calendar/action~poster/
Disallow /calendar/action~*/
Disallow /*controller%3Dai1ec_exporter_controller*
Disallow /*/action~*/

semrushbot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

*

Rule Path
Disallow /ics/ourevents/*
Disallow /ai3sd-events/*
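
The scan lists the `*` group twice. RFC 9309 says that groups naming the same user agent are to be combined into one, and that a crawler with its own named group follows that group instead of `*`. The sketch below illustrates both points on an abbreviated copy of the rules in this scan. The names `SCANNED`, `merge_groups`, and `rules_for` are hypothetical, and exact lowercase name matching is a simplifying assumption; real crawlers match on their product token.

```python
from collections import defaultdict

# (user-agent, rule, path) triples in the order the scan lists them (abbreviated).
SCANNED = [
    ("scrapy", "allow", "/"),
    ("*", "disallow", "/cgi-bin/"),
    ("*", "disallow", "/wp-admin/"),
    ("gptbot", "disallow", "/"),
    ("*", "disallow", "/ics/ourevents/*"),
    ("*", "disallow", "/ai3sd-events/*"),
]

def merge_groups(entries):
    """Combine rules from repeated groups that name the same user agent."""
    merged = defaultdict(list)
    for agent, rule, path in entries:
        merged[agent.lower()].append((rule, path))
    return dict(merged)

def rules_for(agent, groups):
    """Return the agent's own group if present, otherwise the `*` group."""
    return groups.get(agent.lower(), groups.get("*", []))

groups = merge_groups(SCANNED)
print(rules_for("GPTBot", groups))   # only the gptbot group applies
print(rules_for("bingbot", groups))  # falls back to the merged * group
```

Under these assumptions, a crawler without a named group sees the rules from both `*` groups as a single list.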

Comments

  • Disallow: /*.css$
  • Disallow: /*?*
  • Disallow: /*?
  • Disallow: /wp-content/themes/