peta.org.uk
robots.txt

Robots Exclusion Standard data for peta.org.uk

Resource Scan

Scan Details

Site Domain peta.org.uk
Base Domain peta.org.uk
Scan Status Ok
Last Scan 2024-09-29T22:27:15+00:00
Next Scan 2024-10-29T22:27:15+00:00

Last Scan

Scanned 2024-09-29T22:27:15+00:00
URL https://peta.org.uk/robots.txt
Redirect https://www.peta.org.uk//robots.txt
Redirect Domain www.peta.org.uk
Redirect Base peta.org.uk
Domain IPs 104.18.30.128, 104.18.31.128
Redirect IPs 104.18.30.128, 104.18.31.128
Response IP 104.18.31.128
Found Yes
Hash 7d0a7d8937918aa68e1d1ebfda9ee0a14273bf78b94164bc1303ef91d6149b7a
SimHash a90edea008f8

Groups

*

Rule Path
Disallow /*.bmp$
Disallow /*.axd$
Disallow */feed/
Disallow */trackback/
Disallow /wp-login.php
Disallow /page-not-found/
Disallow /scripts/
Disallow /themes/
Disallow /languages/
Disallow /ext/
Disallow /administrator/
Disallow /cache/
Disallow /components/
Disallow /includes/
Disallow /installation/
Disallow /language/
Disallow /libraries/
Disallow /modules/
Disallow /plugins/
Disallow /templates/
Disallow /tmp/
Disallow /xmlrpc/
Disallow /?s=
Disallow /?sort=
Disallow /?replytocom=
Disallow /?send_to=
Disallow /?c=
Disallow /?r=
Disallow /?post_type=
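
Several of the paths above use the `*` and `$` wildcards. The following is a minimal Python sketch of how a crawler might evaluate such patterns, assuming the common convention that `*` matches any run of characters, a trailing `$` anchors the end of the URL, and every other pattern is a prefix match from the start of the path. The names `pattern_to_regex`, `is_disallowed`, and `RULES` are illustrative, and only a subset of the rules listed above is included.

```python
import re

# Illustrative subset of the Disallow paths listed for the "*" group.
RULES = ["/*.bmp$", "/*.axd$", "*/feed/", "*/trackback/", "/wp-login.php", "/?s="]


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a compiled regular expression."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal text and turn each "*" into ".*".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        body += r"\Z"  # the pattern must consume the whole path
    return re.compile(body)


def is_disallowed(path: str, patterns=RULES) -> bool:
    """Return True if any pattern matches the path, starting at its first character."""
    return any(pattern_to_regex(p).match(path) for p in patterns)


if __name__ == "__main__":
    for candidate in ["/images/logo.bmp", "/images/logo.bmp?v=2",
                      "/blog/feed/", "/?s=vegan", "/blog/?s=vegan"]:
        print(candidate, is_disallowed(candidate))
```

Under these assumptions, `/?s=` only blocks search URLs at the site root, while `*/feed/` blocks feed URLs at any depth because of its leading wildcard.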

sogou

Product Comment
sogou www.sogou.com crawler
Rule Path Comment
Disallow / block it entirely

synapse

Product Comment
synapse synapse crawler
Rule Path Comment
Disallow / block it entirely

mj12bot

Rule Path
Disallow (empty path)

python-urllib

Rule Path
Disallow /
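
The per-agent groups above can be checked with Python's standard `urllib.robotparser`. This is a sketch only: the `ROBOTS_TXT` string below is a partial reconstruction from this report, not the site's actual file, and the user-agent strings passed to `can_fetch` are example values. It illustrates that a `Disallow` rule with an empty path (as in the mj12bot group) permits everything, while `Disallow /` blocks the whole site.

```python
from urllib.robotparser import RobotFileParser

# Partial reconstruction from the scan report (an assumption, not the real file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-login.php

User-agent: sogou
Disallow: /

User-agent: mj12bot
Disallow:

User-agent: python-urllib
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

BASE = "https://www.peta.org.uk/"


def report(user_agent: str, url: str) -> bool:
    """Print and return whether the given user agent may fetch the URL."""
    allowed = parser.can_fetch(user_agent, url)
    print(f"{user_agent!r} -> {url}: {'allowed' if allowed else 'blocked'}")
    return allowed


if __name__ == "__main__":
    report("Sogou web spider/4.0", BASE)       # matches the sogou group
    report("MJ12bot", BASE)                    # empty Disallow: nothing is blocked
    report("Python-urllib/3.11", BASE)         # matches the python-urllib group
    report("ExampleBot", BASE + "wp-login.php")  # falls back to the "*" group
```

Agents that match no named group fall back to the `*` group, so they are only subject to the general path rules.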

Other Records

Field Value
sitemap http://www.peta.org.uk/sitemap_index.xml

Warnings

  • 3 invalid lines.
  • `reference` is not a known field.