businesspost.ie
robots.txt

Robots Exclusion Standard data for businesspost.ie

Resource Scan

Scan Details

Site Domain businesspost.ie
Base Domain businesspost.ie
Scan Status Ok
Last Scan 2024-09-27T12:08:02+00:00
Next Scan 2024-10-04T12:08:02+00:00

Last Scan

Scanned 2024-09-27T12:08:02+00:00
URL https://businesspost.ie/robots.txt
Domain IPs 13.226.2.51, 13.226.2.58, 13.226.2.61, 13.226.2.77
Response IP 3.163.125.22
Found Yes
Hash 2448cff0421606e9d619f722db7826eb76cfacfcc0305828d947bf434c0f0370
SimHash 38294850e4c5
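
A content hash like the one above lets successive scans detect whether robots.txt changed. The 64-hex-character length is consistent with SHA-256, but the scanner's actual algorithm and the exact bytes it hashes are not stated here, so the following is only a sketch under that assumption. The function name `fingerprint` and both sample bodies are hypothetical.

```python
import hashlib


def fingerprint(body: bytes) -> str:
    """Return a hex digest of a robots.txt response body.

    Assumption: SHA-256 over the raw bytes. The scanner may use a
    different algorithm or normalise the body first.
    """
    return hashlib.sha256(body).hexdigest()


# Hypothetical bodies for illustration, not the real file contents.
old_body = b"User-agent: gptbot\nDisallow: /\n"
new_body = b"User-agent: gptbot\nDisallow: /\nDisallow: /walls/premium/*\n"

# Any byte-level difference yields a different digest.
changed = fingerprint(old_body) != fingerprint(new_body)
print("changed:", changed)
```

An exact hash flags any change, however small; a similarity hash such as the SimHash value is typically used alongside it to judge how large the change was.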

Groups

*

Rule Path
Disallow /_alive/
Disallow /_alive
Disallow /bors/search*
Disallow /*?tab=
Disallow /*?page=
Disallow /konto/*
Disallow /insider-bolag/
Disallow /finansiell-information/pressreleaser-per-foretag/
Disallow /finansiell-information/pressreleaser/?page=
Disallow /insider-person/
Disallow /Comments/WebServices/
Disallow /*?fhtab=
Disallow /*%26fhtab%3D
Disallow /*?allakommentarer=
Disallow /*%26allakommentarer%3D
Disallow /*?timestamp=
Disallow /*%26timestamp%3D
Disallow /*?t=
Disallow /*%26t%3D
Disallow /*?flik=
Disallow /*%26flik%3D
Disallow /*?qr=
Disallow /*%26qr%3D
Disallow /*?screenwidth
Disallow /*%26screenwidth
Disallow /*?screenheight
Disallow /*%26screenheight
Disallow /*?ctl
Disallow /*%26ctl
Disallow /*?currentIndex=
Disallow /*?ns_
Disallow /*%26ns_
Disallow /bip/
Disallow /bip-callback*
Disallow /akademi/anmalan/*
Disallow /akademi/info-utbildningar/
Disallow /34405621/bn/*
Disallow /di-fonder/*
Disallow /_akademi/
Disallow /_akademi
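
Several of the rules above rely on the `*` wildcard. A minimal sketch of how such patterns can be matched, assuming the common convention that a rule is a prefix match, `*` matches any run of characters, and a trailing `$` anchors the end. It ignores Allow rules and percent-encoding normalisation. The helper names and the sample paths are hypothetical.

```python
import re


def rule_to_regex(rule_path: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    '*' becomes '.*'; a trailing '$' anchors the end of the path;
    everything else is matched literally from the start of the path.
    """
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    body = ".*".join(re.escape(part) for part in rule_path.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str, disallow_rules: list[str]) -> bool:
    """True if any Disallow pattern matches the start of the path."""
    return any(rule_to_regex(rule).match(path) for rule in disallow_rules)


# A few rules copied from the '*' group above.
rules = ["/bors/search*", "/*?tab=", "/*?page=", "/konto/*", "/_alive"]

# Hypothetical request paths, for illustration only.
for path in ["/bors/search?q=abc", "/news/article?tab=2",
             "/_alive/status", "/news/article", "/news?a=1&page=2"]:
    print(path, is_disallowed(path, rules))
```

Under these assumptions, `/_alive` already covers `/_alive/`, and the trailing `*` in `/bors/search*` is redundant. A pattern such as `/*?page=` matches only when the parameter comes first in the query string, so the last sample path is not blocked.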

gptbot

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /
Disallow /walls/freemium/*
Disallow /walls/premium/*

ia_archiver

Rule Path
Disallow /

archive.today

Rule Path
Disallow /
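
The per-crawler groups above can be checked with Python's standard `urllib.robotparser`. The robots.txt text below is a reconstruction from the tables in this report, not the file's actual layout, and the `*` group is cut down to a single rule for brevity. `ExampleBot` is a hypothetical crawler name. Note that `urllib.robotparser` does plain prefix matching and does not interpret `*` inside paths.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the groups listed above (assumption: the real
# file's ordering, spacing and comments may differ).
ROBOTS_TXT = """\
User-agent: *
Disallow: /bip/

User-agent: gptbot
Disallow: /

User-agent: ccbot
Disallow: /

User-agent: google-extended
Disallow: /
Disallow: /walls/freemium/*
Disallow: /walls/premium/*

User-agent: ia_archiver
Disallow: /

User-agent: archive.today
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Ask the parser whether `agent` may fetch `path` on the site."""
    return parser.can_fetch(agent, "https://businesspost.ie" + path)


for agent in ["GPTBot", "CCBot", "Google-Extended", "ia_archiver", "ExampleBot"]:
    print(agent, allowed(agent, "/any/page"))
```

A crawler named in its own group is governed by that group only, so the five named agents are blocked from every path while other crawlers fall back to the `*` group. In the google-extended group, `Disallow /` already covers the two `/walls/` rules.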

Comments

  • Block the Internet Archive (archive.org) from crawling the site
  • Block archive.ph from crawling the site