celiac.com
robots.txt

Robots Exclusion Standard data for celiac.com

Resource Scan

Scan Details

Site Domain celiac.com
Base Domain celiac.com
Scan Status Ok
Last Scan 2024-10-02T13:01:10+00:00
Next Scan 2024-10-09T13:01:10+00:00

Last Scan

Scanned 2024-10-02T13:01:10+00:00
URL https://celiac.com/robots.txt
Domain IPs 198.24.145.124
Response IP 198.24.145.124
Found Yes
Hash 0e8cf6c3a15f5c5a973c9f1cf0c1fa77a6c241d0ca65a9aca7a755408e8ec403
SimHash b0bc2013848a

Groups

*

Rule Path
Disallow /startTopic/
Disallow /discover/unread/
Disallow /markallread/
Disallow /cookie/
Disallow /online/
Disallow /discover/
Disallow /leaderboard/
Disallow /search/
Disallow /tags/
Disallow /*?advancedSearchForm=
Disallow /register/
Disallow /lostpassword/
Disallow /login/
Disallow /*?sortby=
Disallow /*?filter=
Disallow /*?tab=
Disallow /*?do=
Disallow /*ref%3D
Disallow /*?forumId*
Disallow /*?&controller=embed
Disallow /pm/
Disallow /notifications/
Disallow /*%26do%3DretrieveUrl*
Disallow /*%26amp%3Bdo%3DretrieveUrl*
Disallow /*?app=dp47badlinksfixer*
Disallow /submit?url=*
Disallow /lofiversion/
Disallow /Page1.html/
Disallow /followed/
Disallow /settings/
Disallow /*//1000$
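
Several of the rules above rely on the `*` wildcard and the `$` end anchor. The following is a minimal sketch of how such patterns can be matched against a URL path, assuming the common interpretation in which `*` matches any run of characters and a trailing `$` anchors the end of the path. The helper names `rule_to_regex` and `is_blocked` and the sample paths are hypothetical and only illustrate the idea; this is not the matcher any particular crawler uses.

```python
import re


def rule_to_regex(path_pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: '*' matches any run of characters and a trailing '$'
    anchors the end of the path; everything else is a literal prefix.
    """
    anchored = path_pattern.endswith("$")
    body = path_pattern[:-1] if anchored else path_pattern
    # Escape the literal pieces and join them with '.*' for each wildcard.
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_blocked(url_path, disallow_patterns):
    """Return True if any Disallow pattern matches from the start of the path."""
    return any(rule_to_regex(p).match(url_path) for p in disallow_patterns)


# A few patterns copied from the '*' group above; the paths are made up.
rules = ["/search/", "/*?sortby=", "/*?forumId*", "/*//1000$"]

print(is_blocked("/search/?q=gluten", rules))             # True
print(is_blocked("/forums/topic/1/?sortby=date", rules))  # True
print(is_blocked("/forums/topic/1/", rules))              # False
```

Note that Python's standard `urllib.robotparser` treats rule paths as plain prefixes, so wildcard rules like these need a custom matcher along these lines.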

amazonbot

Rule Path
Disallow /profile/

seekportbot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 2

sogou

Rule Path
Allow /

Other Records

Field Value
crawl-delay 5

baiduspider

Rule Path
Allow /

Other Records

Field Value
crawl-delay 5

ahrefsbot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 5

careerbot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 5

liebaofast

Rule Path
Allow /

Other Records

Field Value
crawl-delay 5

mb2345browser

Rule Path
Allow /

Other Records

Field Value
crawl-delay 5

snappy

Rule Path
Allow /

Other Records

Field Value
crawl-delay 5

theworld

Rule Path
Disallow /

yeti

Rule Path
Disallow /

youdaobot

Rule Path
Disallow /

iccrawler

Rule Path
Disallow /

gigabot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 5

naverbot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 5

moget

Rule Path
Allow /

Other Records

Field Value
crawl-delay 5

ichiro

Rule Path
Allow /

Other Records

Field Value
crawl-delay 5

exabot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 5

yandexbot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 5

aspiegelbot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 5

mj12bot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 5

blexbot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 5

semrushbot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 5

ccbot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 5

dataforseobot

Rule Path
Allow /robots.txt
Disallow /
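
The dataforseobot group combines an Allow and a Disallow rule. A minimal sketch of how the two can be resolved, assuming the longest-match precedence described in RFC 9309 (the matching rule with the longest path wins, and Allow wins a tie). The function name `decide` and the sample paths are hypothetical, and only plain prefixes are handled here.

```python
def decide(path, rules):
    """Return True if `path` may be fetched under `rules`.

    `rules` is a list of (directive, prefix) pairs with directive
    'allow' or 'disallow'. Assumption: longest matching prefix wins,
    Allow wins ties, and no match at all means the path is allowed.
    """
    best_directive, best_prefix = "allow", ""
    for directive, prefix in rules:
        if not path.startswith(prefix):
            continue
        longer = len(prefix) > len(best_prefix)
        tie_for_allow = len(prefix) == len(best_prefix) and directive == "allow"
        if longer or tie_for_allow:
            best_directive, best_prefix = directive, prefix
    return best_directive == "allow"


# The dataforseobot group as listed above.
dataforseobot = [("allow", "/robots.txt"), ("disallow", "/")]

print(decide("/robots.txt", dataforseobot))  # True
print(decide("/forums/", dataforseobot))     # False
```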

dotbot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 10

rogerbot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 10

Other Records

Field Value
sitemap https://www.celiac.com/sitemap.php
sitemap https://www.celiac.com/sitemap_posts.php
sitemap https://www.celiac.com/sitemap_pages.php
sitemap https://www.celiac.com/sitemap_comments.php
sitemap https://www.celiac.com/sitemap_cmscategories_pages.php
sitemap https://www.celiac.com/sitemap_forumspages.php
sitemap https://www.celiac.com/sitemap_archive.php
sitemap https://www.celiac.com/sitemap_news.php
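
The crawl-delay and sitemap records listed in this report can also be read programmatically. The sketch below uses Python's standard `urllib.robotparser` (with `site_maps()`, available from Python 3.8) on a small fragment. The fragment is a hand-written reconstruction of three records from this report, not the actual file, and the URL passed to `can_fetch` is a made-up example.

```python
from urllib.robotparser import RobotFileParser

# Hand-written fragment reconstructed from the records in this report.
fragment = """\
User-agent: dotbot
Allow: /
Crawl-delay: 10

User-agent: theworld
Disallow: /

Sitemap: https://www.celiac.com/sitemap.php
"""

parser = RobotFileParser()
parser.parse(fragment.splitlines())

delay = parser.crawl_delay("dotbot")                                      # 10
blocked = not parser.can_fetch("theworld", "https://celiac.com/forums/")  # True
sitemaps = parser.site_maps()                                             # one URL

print(delay, blocked, sitemaps)
```

To check the live file instead, `set_url()` followed by `read()` would replace the `parse()` call, at the cost of a network request.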

Comments

  • START Default Rules for Invision Community (https://invisioncommunity.com)
  • Block pages with no unique content
  • Disallow: /staff/
  • added 1-25-24
  • Block faceted pages and 301 redirect pages
  • added 1-25-2024
  • Block profile pages as these have little unique value, consume a lot of crawl time and contain hundreds of 301 links
  • Disallow: /profile/
  • END Default Rules for Invision Community (https://invisioncommunity.com)
  • START CUSTOM SCOTT RULES
  • Disallow: /profile/*/?do=*
  • added 10-25-2023
  • Disallow: /profile/*/?status=*
  • Disallow: /profile/*/content/
  • Disallow: /profile/*/followers/
  • Disallow: /profile/*/reputation*
  • added 10-15-2022 (DP47) Bad Link Fixer for Bots
  • added 11-11-2022 to stop social share links
  • added 11-8-2023
  • added 11-28-2023
  • added 1-25-2024
  • Sitemaps
  • BLOCK AI BOTS BELOW
  • User-agent: ClaudeBot
  • Allow: /robots.txt
  • Disallow: /
  • User-agent: CCBot
  • Allow: /robots.txt
  • Disallow: /
  • User-agent: ChatGPT-User
  • Allow: /robots.txt
  • Disallow: /
  • User-agent: GPTBot
  • Allow: /robots.txt
  • Disallow: /
  • User-agent: Google-Extended
  • Allow: /robots.txt
  • Disallow: /
  • User-agent: anthropic-ai
  • Allow: /robots.txt
  • Disallow: /
  • User-agent: Omgilibot
  • Allow: /robots.txt
  • Disallow: /
  • User-agent: Omgili
  • Allow: /robots.txt
  • Disallow: /
  • User-agent: Diffbot
  • Allow: /robots.txt
  • Disallow: /
  • User-agent: Bytespider
  • Allow: /robots.txt
  • Disallow: /
  • User-agent: ImagesiftBot
  • Allow: /robots.txt
  • Disallow: /
  • User-agent: cohere-ai
  • Allow: /robots.txt
  • Disallow: /
  • User-agent: FacebookBot
  • Allow: /robots.txt
  • Disallow: /
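
The AI-bot entries above appear under Comments rather than under Groups, which suggests they are commented out in the file and so have no effect; the active ccbot group listed earlier is `Allow /` with a crawl-delay of 5. A minimal sketch of that behaviour with Python's standard `urllib.robotparser`, assuming the lines are prefixed with `#` in the original file. The fragment is a hypothetical reconstruction that omits the `*` group, and the URL is a made-up example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical reconstruction: one active group plus commented-out rules.
fragment = """\
User-agent: ccbot
Allow: /
Crawl-delay: 5

# BLOCK AI BOTS BELOW
# User-agent: GPTBot
# Allow: /robots.txt
# Disallow: /
"""

parser = RobotFileParser()
parser.parse(fragment.splitlines())

# Commented lines are ignored, so no rule in this fragment applies to GPTBot.
gptbot_allowed = parser.can_fetch("GPTBot", "https://celiac.com/forums/")
ccbot_delay = parser.crawl_delay("ccbot")

print(gptbot_allowed, ccbot_delay)  # True 5
```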