thechurchnews.com
robots.txt

Robots Exclusion Standard data for thechurchnews.com

Resource Scan

Scan Details

Site Domain thechurchnews.com
Base Domain thechurchnews.com
Scan Status Ok
Last Scan 2024-11-06T14:47:54+00:00
Next Scan 2024-11-20T14:47:54+00:00

Last Scan

Scanned 2024-11-06T14:47:54+00:00
URL https://thechurchnews.com/robots.txt
Redirect https://www.thechurchnews.com/robots.txt
Redirect Domain www.thechurchnews.com
Redirect Base thechurchnews.com
Domain IPs 104.22.22.125, 104.22.23.125, 172.67.11.56, 2606:4700:10::6816:167d, 2606:4700:10::6816:177d, 2606:4700:10::ac43:b38
Redirect IPs 23.209.46.80, 23.209.46.88, 2600:1413:b000:13::b857:c192, 2600:1413:b000:13::b857:c196
Response IP 42.99.140.194
Found Yes
Hash 5ecb910f49cf2ec7bd34d1dd931a68d88f416ae53e43379cbef893b8e24368e0
SimHash 640091684511

Groups

gptbot

Rule Path
Allow /almanac/
Disallow /

google-extended

Rule Path
Allow /almanac/
Disallow /

anthropic-ai

Rule Path
Allow /almanac/
Disallow /

cohere-ai

Rule Path
Allow /almanac/
Disallow /

omgili

Rule Path
Allow /almanac/
Disallow /

omgilibot

Rule Path
Allow /almanac/
Disallow /

piplbot

Rule Path
Allow /almanac/
Disallow /

bytespider

Rule Path
Disallow /

*

Rule Path
Disallow
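
The groups above can be checked with a minimal sketch using Python's standard urllib.robotparser. The robots.txt text here is a reconstruction from the parsed groups in this scan, not the original file: the exact casing, ordering, and comments of the live file are assumptions, and the path /some-article and the agent name examplebot are hypothetical. Python's parser applies the first matching rule in a group, whereas RFC 9309 prefers the longest match; for these groups both approaches should give the same answers, because the more specific Allow /almanac/ rule is also listed first.

```python
from urllib.robotparser import RobotFileParser

# Agents the scan lists with "Allow /almanac/" followed by "Disallow /".
ALMANAC_ONLY_AGENTS = [
    "gptbot", "google-extended", "anthropic-ai", "cohere-ai",
    "omgili", "omgilibot", "piplbot",
]

BASE = "https://www.thechurchnews.com"


def build_robots_txt() -> str:
    """Rebuild robots.txt text from the scanned groups (a reconstruction)."""
    lines = []
    for agent in ALMANAC_ONLY_AGENTS:
        lines += [f"User-agent: {agent}", "Allow: /almanac/", "Disallow: /", ""]
    # bytespider has no Allow rule, so it is blocked everywhere.
    lines += ["User-agent: bytespider", "Disallow: /", ""]
    # An empty Disallow value places no restriction on other agents.
    lines += ["User-agent: *", "Disallow:", ""]
    return "\n".join(lines)


def make_parser() -> RobotFileParser:
    """Parse the reconstructed text without any network access."""
    parser = RobotFileParser()
    parser.parse(build_robots_txt().splitlines())
    return parser


parser = make_parser()
checks = [
    ("gptbot", "/almanac/"),
    ("gptbot", "/some-article"),    # hypothetical path
    ("bytespider", "/almanac/"),
    ("examplebot", "/some-article"),  # hypothetical agent, falls under "*"
]
for agent, path in checks:
    print(agent, path, parser.can_fetch(agent, BASE + path))
```

Under these assumptions, the listed AI crawlers may fetch only paths under /almanac/, bytespider may fetch nothing, and any agent not named in a group is unrestricted.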

Other Records

Field Value
sitemap https://www.thechurchnews.com/arc/outboundfeeds/sitemap-index/
sitemap https://www.thechurchnews.com/arc/outboundfeeds/sitemap-news-index/
sitemap https://www.thechurchnews.com/arc/outboundfeeds/sitemap-section-index/
sitemap https://www.thechurchnews.com/arc/outboundfeeds/sitemap-index-year/
sitemap https://media.thechurchnews.com/sitemaps/churchnews/sitemap-index.xml
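
As a rough illustration of how a crawler might read the sitemap records listed above, the sketch below feeds them to Python's urllib.robotparser and reads them back with site_maps(), which is available from Python 3.8. The "Sitemap:" line format is the standard robots.txt syntax; the exact lines of the live file are assumed, since this scan shows only the parsed values.

```python
from urllib.robotparser import RobotFileParser

# Sitemap URLs copied from the "Other Records" table of this scan.
SITEMAP_URLS = [
    "https://www.thechurchnews.com/arc/outboundfeeds/sitemap-index/",
    "https://www.thechurchnews.com/arc/outboundfeeds/sitemap-news-index/",
    "https://www.thechurchnews.com/arc/outboundfeeds/sitemap-section-index/",
    "https://www.thechurchnews.com/arc/outboundfeeds/sitemap-index-year/",
    "https://media.thechurchnews.com/sitemaps/churchnews/sitemap-index.xml",
]


def sitemap_records(urls):
    """Format each URL as a robots.txt Sitemap record (assumed line format)."""
    return [f"Sitemap: {url}" for url in urls]


sitemap_parser = RobotFileParser()
sitemap_parser.parse(sitemap_records(SITEMAP_URLS))

# site_maps() returns None when no Sitemap records were found.
found = sitemap_parser.site_maps() or []
for url in found:
    print(url)
```

Sitemap records are independent of the user-agent groups, so they apply to every crawler regardless of which group it matches.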