broadsheet.ie
robots.txt

Robots Exclusion Standard data for broadsheet.ie

Resource Scan

Scan Details

Site Domain broadsheet.ie
Base Domain broadsheet.ie
Scan Status Ok
Last Scan 2024-09-21T13:48:31+00:00
Next Scan 2024-09-28T13:48:31+00:00

Last Scan

Scanned 2024-09-21T13:48:31+00:00
URL https://broadsheet.ie/robots.txt
Redirect http://www.broadsheet.ie/robots.txt
Redirect Domain www.broadsheet.ie
Redirect Base broadsheet.ie
Domain IPs 91.210.235.12
Redirect IPs 91.210.235.12
Response IP 91.210.235.12
Found Yes
Hash 1fcacf446d28f4da8dbad56546a869408f293aa4a67f01fc9eaa0453e4af381b
SimHash fe30bd3afdcd
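
The Hash value above is 64 hexadecimal characters, which is the length of a SHA-256 digest. The report does not name the algorithm or say which bytes were hashed, so the following is only a sketch under the assumption that it is SHA-256 over the raw response body. The helper names `body_digest` and `matches_report` are hypothetical and not part of any scanner API.

```python
import hashlib

# Hash value copied from the scan record above.
REPORTED_HASH = "1fcacf446d28f4da8dbad56546a869408f293aa4a67f01fc9eaa0453e4af381b"


def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


def matches_report(body: bytes, reported: str = REPORTED_HASH) -> bool:
    """Check whether a freshly fetched body still matches the reported hash."""
    return body_digest(body) == reported.lower()
```

If the assumption holds, a mismatch on a later fetch would suggest the file changed after the scan; if it does not hold, the comparison is simply not meaningful.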

Groups

mediapartners-google*

Rule Path
Disallow

*

Rule Path
Disallow /*blackhole
Disallow /?blackhole
Disallow /*?share=*
Disallow /*?s=*
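
The rules in the `*` group rely on wildcard matching, where `*` stands for any run of characters and `?` is a literal question mark. A minimal sketch of how a crawler might evaluate them is shown here. It assumes the common wildcard semantics (`*` as wildcard, a trailing `$` as end anchor) and ignores Allow rules and rule precedence, since this group contains only Disallow rules. The function names are hypothetical.

```python
import re

# Disallow patterns copied from the "*" group above.
DISALLOW_PATTERNS = ["/*blackhole", "/?blackhole", "/*?share=*", "/*?s=*"]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start."""
    anchored_end = pattern.endswith("$")
    if anchored_end:
        pattern = pattern[:-1]
    # Escape each literal piece (so "?" stays literal), then join with ".*"
    # so that every "*" in the pattern matches any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored_end else ""))


def is_disallowed(path: str, patterns=DISALLOW_PATTERNS) -> bool:
    """Return True if the path (including query string) matches any pattern."""
    return any(pattern_to_regex(p).match(path) for p in patterns)


if __name__ == "__main__":
    for path in ["/?s=cats", "/2024/09/post/?share=twitter",
                 "/2024/09/post/", "/post/?page=2&s=x"]:
        print(path, is_disallowed(path))
```

Under these assumptions, `/*?s=*` only matches when `s` is the first query parameter (directly after `?`), so a URL such as `/post/?page=2&s=x` is not covered by the rule.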

Other Records

Field Value
sitemap https://www.broadsheet.ie/wp-sitemap.xml

Comments

  • robots.txt
  • Sitemap
  • Description : Google AdSense delivers advertisements to a broad network of affiliated sites.
  • A robot analyses the pages that display the ads in order to target the ads to the page content.
  • Disallow: /wp-content/
  • Will block any links that contain share as a GET parameter