daysoftheyear.co.uk
robots.txt

Robots Exclusion Standard data for daysoftheyear.co.uk

Resource Scan

Scan Details

Site Domain daysoftheyear.co.uk
Base Domain daysoftheyear.co.uk
Scan Status Ok
Last Scan 2024-05-11T01:02:19+00:00
Next Scan 2024-05-18T01:02:19+00:00

Last Scan

Scanned 2024-05-11T01:02:19+00:00
URL https://daysoftheyear.co.uk/robots.txt
Redirect https://www.daysoftheyear.com/robots.txt
Redirect Domain www.daysoftheyear.com
Redirect Base daysoftheyear.com
Domain IPs 104.21.84.168, 172.67.195.80, 2606:4700:3036::6815:54a8, 2606:4700:3036::ac43:c350
Redirect IPs 172.66.40.137, 172.66.43.119, 2606:4700:3108::ac42:2889, 2606:4700:3108::ac42:2b77
Response IP 172.66.43.119
Found Yes
Hash ff2d8df025ddb047a9d3461c77fbd7da9541ac9be8d3418b3a1bb75aa074649e
SimHash aeef4e70b4b8

Groups

*
adsbot-google

Rule Path
Disallow /wp-json/
Disallow /?rest_route=
Disallow /wp-admin/
Disallow /wp-content/cache/
Disallow /wp-content/plugins/
Disallow /wp-login.php
Disallow /xmlrpc.php

*
adsbot-google

Rule Path
Disallow /wp-includes/
Allow /wp-includes/css/
Allow /wp-includes/js/
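
The group above blocks /wp-includes/ but re-opens two subdirectories. A minimal sketch of how a crawler following RFC 9309 precedence might resolve this (the longest matching path wins, and Allow wins a tie) is shown here. The function names and the sample paths in the usage note are hypothetical, and the sketch handles plain prefix rules only, not wildcards.

```python
def matching_rule(path, rules):
    """Return the winning (verdict, pattern) pair for `path`, or None.

    The longest matching pattern wins; on equal length, "allow" beats
    "disallow".
    """
    best_key = None
    best_rule = None
    for verdict, pattern in rules:
        if path.startswith(pattern):
            key = (len(pattern), verdict == "allow")
            if best_key is None or key > best_key:
                best_key = key
                best_rule = (verdict, pattern)
    return best_rule


# Rules copied from the /wp-includes/ group in this report.
WP_INCLUDES_RULES = [
    ("disallow", "/wp-includes/"),
    ("allow", "/wp-includes/css/"),
    ("allow", "/wp-includes/js/"),
]


def is_allowed(path, rules=WP_INCLUDES_RULES):
    """A path with no matching rule is allowed by default."""
    winner = matching_rule(path, rules)
    return winner is None or winner[0] == "allow"
```

With these rules, a script path such as /wp-includes/js/example.js is allowed because the Allow pattern is longer than the Disallow pattern, while /wp-includes/example.php stays blocked.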

*
adsbot-google

Rule Path
Disallow /?s=
Disallow /page/*/?s=
Disallow /search/

adsbot-google

Rule Path
Disallow /*.woff2
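
The pattern /*.woff2 (like /page/*/?s= in the search group) relies on the * wildcard. The following sketch shows one way to test such patterns by translating them to regular expressions, assuming the common convention that * matches any run of characters and a trailing $ anchors the end of the path. The helper names and sample paths are hypothetical.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any sequence of characters; a trailing '$' anchors the
    pattern to the end of the path. Everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def rule_matches(pattern, path):
    """True if `pattern` matches `path` starting from its first character."""
    return pattern_to_regex(pattern).match(path) is not None
```

Because /*.woff2 has no trailing $, it also matches font URLs that carry a query string after the extension.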

*
adsbot-google

Rule Path
Disallow /ezoic/
Disallow /porpoiseant/
Disallow /detroitchicago/

*
adsbot-google

Rule Path
Disallow /workers/

*
adsbot-google

Rule Path
Disallow /~partytown

*
adsbot-google

Rule Path
Disallow /wp-content/uploads/complianz/
Disallow /?wp-ajax=

*
adsbot-google

Rule Path
Disallow /cdn-cgi/bm/cv/
Disallow /cdn-cgi/challenge-platform/
Disallow /cdn-cgi/images/trace/
Disallow /cdn-cgi/rum
Disallow /cdn-cgi/scripts/
Disallow /cdn-cgi/styles/
Disallow /cdn-cgi/zaraz/

nuclei
wikido
riddler
petalbot
zoominfobot
go-http-client
node/simplecrawler
cazoodlebot
dotbot/1.0
gigabot
barkrowler
blexbot
magpie-crawler
mj12bot
ahrefsbot

Rule Path
Disallow /
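
To illustrate the effect of this last group, the sketch below feeds a small excerpt to Python's standard urllib.robotparser and checks a few agents. The excerpt is a reconstruction from the rule tables in this report, not the raw file, and examplebot is a hypothetical crawler name. Agents are passed as bare product tokens.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed excerpt: two of the banned agents plus one rule from
# the catch-all group. The real file contains many more lines.
ROBOTS_EXCERPT = """\
User-agent: mj12bot
User-agent: ahrefsbot
Disallow: /

User-agent: *
Disallow: /wp-admin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_EXCERPT.splitlines())


def can_fetch(agent, path):
    """Ask the parsed rules whether `agent` may request `path`."""
    return parser.can_fetch(agent, path)
```

Here can_fetch("ahrefsbot", "/") is False because the banned group disallows everything, while can_fetch("examplebot", "/") is True because only the catch-all group applies to it.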

Other Records

Field Value
sitemap https://www.daysoftheyear.com/sitemap_index.xml

Comments

  • Block some general WP endpoints
  • Special handling for /wp-includes/
  • Block internal search
  • Adsbot doesn't ever need to crawl fonts
  • Block legacy Ezoic URLs
  • Block workers
  • Block partytown
  • Block leaky plugins
  • Block leaky Cloudflare endpoints
  • Ban noisy bots
  • Sitemap