airquality.co.uk
robots.txt

Robots Exclusion Standard data for airquality.co.uk

Resource Scan

Scan Details

Site Domain airquality.co.uk
Base Domain airquality.co.uk
Scan Status Ok
Last Scan 2024-06-12T16:39:08+00:00
Next Scan 2024-07-12T16:39:08+00:00

Last Scan

Scanned 2024-06-12T16:39:08+00:00
URL http://airquality.co.uk/robots.txt
Redirect https://uk-air.defra.gov.uk/robots.txt
Redirect Domain uk-air.defra.gov.uk
Redirect Base defra.gov.uk
Domain IPs 195.211.92.160
Redirect IPs 143.244.49.179, 2400:52e0:1a01::994:1
Response IP 143.244.50.87
Found Yes
Hash 378f253b96ec302994638ad183888bfda9658d8daf5b3d3607ec2fae0c00286b
SimHash 3e74eb780f99

Groups

*

Rule Path
Disallow /assets/downloads/
Disallow /datastore/
Disallow /assets/weekly_graphs/
Disallow /assets/graphs/
Disallow /data-providers/
Disallow /forecasting/locations
Disallow /data/data_selector
Disallow /data/exceedence
Disallow /data/data-availability
Disallow /data/DAQI-regional-data
Disallow /data/non-auto-data
Disallow /data/gis-mapping
Disallow /data/openair
Disallow /data/laqm-background-maps
Disallow /data/ozone-data
Disallow /data/uv-data
Disallow /data/uv-index-graphs
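
As a rough illustration of how a crawler might apply the rule group above, the following sketch rebuilds a robots.txt body from the listed paths and checks a few URLs with Python's standard urllib.robotparser. The rebuilt file is an approximation of the scanned one (comments omitted, not the byte-for-byte original), and the names DISALLOWED, BASE, build_parser, allowed and the "ExampleBot" agent are hypothetical, chosen only for this example.

```python
from urllib.robotparser import RobotFileParser

# Paths copied from the "*" group in the scan above.
DISALLOWED = [
    "/assets/downloads/",
    "/datastore/",
    "/assets/weekly_graphs/",
    "/assets/graphs/",
    "/data-providers/",
    "/forecasting/locations",
    "/data/data_selector",
    "/data/exceedence",
    "/data/data-availability",
    "/data/DAQI-regional-data",
    "/data/non-auto-data",
    "/data/gis-mapping",
    "/data/openair",
    "/data/laqm-background-maps",
    "/data/ozone-data",
    "/data/uv-data",
    "/data/uv-index-graphs",
]

# Assumption: rules are evaluated against the redirect target host.
BASE = "https://uk-air.defra.gov.uk"


def build_parser(paths):
    """Build a parser from a reconstructed single-group robots.txt."""
    lines = ["User-agent: *"] + [f"Disallow: {p}" for p in paths]
    parser = RobotFileParser()
    parser.parse(lines)  # parse in-memory lines; no network request is made
    return parser


parser = build_parser(DISALLOWED)


def allowed(path, agent="ExampleBot"):
    """Return True if the hypothetical agent may fetch BASE + path."""
    return parser.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    for path in ["/", "/datastore/file.csv", "/data/openair", "/data/"]:
        print(path, allowed(path))
```

Matching here is by path prefix, so /data/openair is blocked while /data/ itself stays crawlable, because no listed rule is a prefix of it.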

Comments

  • All robots will spider the domain
  • Disallow directories
  • Disallow interactive data sections to stop bots hammering the databases