celticsblog.net
robots.txt

Robots Exclusion Standard data for celticsblog.net

Resource Scan

Scan Details

Site Domain celticsblog.net
Base Domain celticsblog.net
Scan Status Ok
Last Scan 2025-09-13T21:42:14+00:00
Next Scan 2025-10-13T21:42:14+00:00

Last Scan

Scanned 2025-09-13T21:42:14+00:00
URL http://www.celticsblog.net/robots.txt
Redirect https://www.celticsblog.com/robots.txt
Redirect Domain www.celticsblog.com
Redirect Base celticsblog.com
Domain IPs 151.101.1.34, 151.101.129.34, 151.101.193.34, 151.101.65.34
Redirect IPs 199.232.193.246, 199.232.197.246
Response IP 146.75.93.246
Found Yes
Hash a39ade2d53363eb13a8bb290f41167b3045662ccb137fa07f2f243f111dfc34c
SimHash 6880cb60a5f1

Groups

*

Rule Path
Disallow /wp-admin/
Allow /wp-admin/admin-ajax.php
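
The Disallow and Allow rules in this group overlap, so the result for a given path depends on how a crawler resolves the conflict. The sketch below assumes the longest-match reading described in RFC 9309 (the most specific rule wins, and Allow wins a tie); individual crawlers may behave differently. The names RULES and is_allowed are hypothetical and are not part of the scan data.

```python
# Rules copied from the "*" group above. The evaluation logic is an assumed
# longest-match interpretation, not a statement of how any crawler works.
RULES = [
    ("disallow", "/wp-admin/"),
    ("allow", "/wp-admin/admin-ajax.php"),
]


def is_allowed(path, rules=RULES):
    """Return True if `path` may be fetched under plain prefix rules."""
    best_len = -1
    allowed = True  # a path matched by no rule is allowed
    for kind, prefix in rules:
        if not path.startswith(prefix):
            continue
        longer = len(prefix) > best_len
        tie_for_allow = len(prefix) == best_len and kind == "allow"
        if longer or tie_for_allow:
            best_len = len(prefix)
            allowed = kind == "allow"
    return allowed


if __name__ == "__main__":
    for p in ("/wp-admin/admin-ajax.php", "/wp-admin/options.php", "/"):
        print(p, is_allowed(p))
```

Under this reading, /wp-admin/admin-ajax.php stays fetchable because the Allow path is longer than the Disallow path that also matches it.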

gptbot

Rule Path
Allow /

google-extended

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

applebot

Rule Path
Allow /

anthropic-ai

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

diffbot

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

imagesiftbot

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

omgili

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

youbot

Rule Path
Disallow /

googlebot-news

Rule Path
Disallow /ad
Disallow /sponsored

*

Rule Path
Disallow /admin
Disallow /newfanshot
Disallow /users/*/replies
Disallow /users/*/comments
Disallow /login
Disallow /account
Disallow /auth/*
Disallow /chorus_auth
Disallow /sso
Disallow /search
Disallow /the-highlight$
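
Several paths in this group use the wildcard characters * (any run of characters) and $ (end of path). As a minimal sketch of how such patterns can be tested against a URL path, the helper below translates a pattern into a regular expression. The function names pattern_to_regex and matches are hypothetical, and the translation assumes the common RFC 9309 meaning of the two characters.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumes "*" matches any run of characters and a trailing "$"
    anchors the pattern to the end of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" for each "*".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + (r"\Z" if anchored else ""))


def matches(pattern, path):
    """Return True if `path` is covered by the robots.txt `pattern`."""
    return pattern_to_regex(pattern).match(path) is not None


if __name__ == "__main__":
    print(matches("/users/*/replies", "/users/example/replies"))
    print(matches("/the-highlight$", "/the-highlight/archive"))
```

With this interpretation, /the-highlight$ covers only the exact path /the-highlight, while /auth/* covers anything beneath /auth/.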

*

Rule Path
Disallow /share$
Disallow /share/*
Disallow /share?*
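
The listing contains three separate groups for the * user agent alongside the named groups. RFC 9309 describes combining groups that share a user agent, with a crawler falling back to * when no group names it. The sketch below illustrates that reading with a subset of the records above; RECORDS, merge_groups, and rules_for are hypothetical names, and real crawlers may select groups differently.

```python
from collections import defaultdict

# (user-agent token, rule, path) records: a subset copied from the listing.
RECORDS = [
    ("*", "disallow", "/wp-admin/"),
    ("*", "allow", "/wp-admin/admin-ajax.php"),
    ("gptbot", "allow", "/"),
    ("claudebot", "disallow", "/"),
    ("googlebot-news", "disallow", "/ad"),
    ("googlebot-news", "disallow", "/sponsored"),
    ("*", "disallow", "/admin"),
    ("*", "disallow", "/share$"),
]


def merge_groups(records):
    """Combine records that share a user-agent token into one rule list."""
    groups = defaultdict(list)
    for agent, rule, path in records:
        groups[agent.lower()].append((rule, path))
    return dict(groups)


def rules_for(agent, groups):
    """Return the rules for `agent`, falling back to the "*" group."""
    return groups.get(agent.lower(), groups.get("*", []))


if __name__ == "__main__":
    groups = merge_groups(RECORDS)
    print(rules_for("GPTBot", groups))
    print(rules_for("examplebot", groups))
```

Here a crawler identifying as gptbot would see only its own Allow / rule, while an unnamed crawler would see the merged * rules from all three groups.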

Other Records

Field Value
sitemap https://www.celticsblog.com/sitemaps/google_news
sitemap https://www.celticsblog.com/sitemaps

Comments

  • Google news sitemap
  • Sitemap archive