r.thejournal.ie
robots.txt

Robots Exclusion Standard data for r.thejournal.ie

Resource Scan

Scan Details

Site Domain r.thejournal.ie
Base Domain thejournal.ie
Scan Status Ok
Last Scan 2024-11-18T01:16:48+00:00
Next Scan 2024-11-25T01:16:48+00:00

Last Scan

Scanned 2024-11-18T01:16:48+00:00
URL https://r.thejournal.ie/robots.txt
Domain IPs 52.30.123.31, 54.76.182.75, 54.76.191.163
Response IP 52.30.123.31
Found Yes
Hash 910489bebeff5fa64e3f1f0b9875ec88f643a53e12277f7ce7611e21d116a7d0
SimHash eb3f5f107236

Groups

*

Rule Path
Disallow

*

Rule Path
Disallow /search/*
Disallow /article-search*
Disallow */feed/
Disallow */feed/*
Disallow *oauth*
Disallow *subscription-admin%3D*
Disallow *switcher%3D*
Disallow *logout.php*
Disallow *category*Khadr*
Disallow *category*What%20are%20we%20voting%20on%20and%20why*
Disallow *category*Should%20the%20President%20be%20more%20than%20an%20ambassadorial%20role*
Disallow *category*AP%20Photo*
Disallow *category*who%20had%20been%20in%20power%20for%2023%20years*
Disallow *category*currentvacancies*
Disallow *category*THE%20BIGGEST%20GAME%20of%20the*
Disallow /profile/
Disallow /profile/*
Disallow /topic/*
Disallow /*/news
Disallow /*/news/page/*
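
The path patterns in this group rely on the * wildcard. The following is a minimal sketch of how such a pattern could be checked against a URL path, assuming the matching semantics described in RFC 9309: * matches any run of characters, a trailing $ anchors the end of the path, and matching starts at the beginning of the path. The function name rule_matches and the sample paths are hypothetical and are not part of the scan data.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path.

    Assumes RFC 9309 semantics; this is an illustration, not the
    scanner's or any crawler's actual implementation.
    """
    if not pattern:
        # An empty Disallow value matches nothing.
        return False
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal segments and join them with ".*" for each wildcard.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start of the path only, so an unanchored
    # pattern behaves as a prefix match.
    return re.match(regex, path) is not None


# Hypothetical sample paths checked against rules from the group above.
for rule, sample in [
    ("/search/*", "/search/budget"),
    ("*/feed/", "/politics/feed/"),
    ("/profile/", "/profiles"),
    ("/*/news", "/sport/news"),
]:
    print(rule, sample, rule_matches(rule, sample))
```

The sketch compares the pattern with the path exactly as given, so a percent-encoded pattern such as *category*AP%20Photo* only matches a path supplied in the same encoded form.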

bingbot

Rule Path
Disallow */*news*
Disallow /author*

chatgpt-user

Rule Path
Disallow /

gptbot

Rule Path
Disallow /
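
Which of the groups above applies to a given crawler depends on its user-agent token. As a rough sketch, assuming the selection rules in RFC 9309 (tokens compared case-insensitively, groups with the same user-agent combined, and the * group used only when no named group matches), group selection could look like the code below. The function name select_rules and the list layout are hypothetical, the rule lists are abbreviated from the scan, and real crawlers may behave differently.

```python
def select_rules(groups, user_agent):
    """Return the Disallow paths that apply to a crawler's product token.

    groups is a list of (user_agent, [paths]) pairs in file order.
    """
    token = user_agent.lower()
    # Prefer groups that name this crawler explicitly.
    specific = [paths for name, paths in groups if name.lower() == token]
    # Fall back to the catch-all groups only if no named group matched.
    chosen = specific or [paths for name, paths in groups if name == "*"]
    # Groups sharing a user-agent are combined into a single rule list.
    return [path for paths in chosen for path in paths]


# Abbreviated copy of the groups listed in this scan (assumption: only
# a few of the * rules are reproduced here for brevity).
groups = [
    ("*", [""]),
    ("*", ["/search/*", "/profile/"]),
    ("bingbot", ["*/*news*", "/author*"]),
    ("chatgpt-user", ["/"]),
    ("gptbot", ["/"]),
]

print(select_rules(groups, "GPTBot"))
print(select_rules(groups, "bingbot"))
print(select_rules(groups, "examplebot"))
```

Under these assumptions, bingbot would follow only its own two rules rather than the longer * list, and the two * groups would be treated as one.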

Warnings

  • 4 invalid lines.