tip411.com
robots.txt

Robots Exclusion Standard data for tip411.com

Resource Scan

Scan Details

Site Domain tip411.com
Base Domain tip411.com
Scan Status Ok
Last Scan 2025-08-23T04:50:53+00:00
Next Scan 2025-09-22T04:50:53+00:00

Last Scan

Scanned 2025-08-23T04:50:53+00:00
URL https://tip411.com/robots.txt
Domain IPs 3.160.196.22, 3.160.196.53, 3.160.196.59, 3.160.196.76
Response IP 18.165.72.19
Found Yes
Hash 75cc461b8e0b2440f950370e80b19dbf2310a075809582de309a8bbcd9380ee1
SimHash 7434e972c4e0

Groups

amazonbot

Rule Path
Disallow /

magpie-crawler

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

img2dataset

Rule Path
Disallow /

omgili

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

omgili

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

scrapy

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

perplexity-user

Rule Path
Disallow /

google-cloudvertexbot

Rule Path
Disallow /

meta-externalagent

Rule Path
Disallow /

oai-searchbot

Rule Path
Disallow /

yandexadditional

Rule Path
Disallow /

yandexadditionalbot

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /
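
Each named group above carries a single Disallow / rule, which blocks that crawler from the whole site. A minimal sketch of how such a group can be checked with Python's standard urllib.robotparser follows. The fragment is rebuilt by hand from two of the groups listed above and is an assumption, not the file as served; the helper name blocked and the sample path /posts/1 are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical fragment rebuilt from two of the named groups in this scan.
# The real file's casing, ordering, and comments are not shown in the report,
# and the "*" group is deliberately left out of this sketch.
ROBOTS_FRAGMENT = """\
User-agent: gptbot
Disallow: /

User-agent: ccbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_FRAGMENT.splitlines())


def blocked(agent, url="https://tip411.com/posts/1"):
    """Return True if the fragment forbids `agent` from fetching `url`."""
    return not parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ("GPTBot", "CCBot/2.0"):
        print(agent, "blocked" if blocked(agent) else "allowed")
```

The standard-library parser matches user-agent names case-insensitively and ignores a trailing version token, so GPTBot and CCBot/2.0 both fall under their lowercase groups. It does not interpret the * and $ wildcards used in the general group, so it is only a reasonable fit for the plain Disallow / groups.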

*

Rule Path
Allow /$
Allow /?*
Allow /posts/*
Allow /resources/*
Allow /a/*
Allow /alerts/*
Allow /assets/*
Allow /landing/*
Allow /agencies/*/groups/*
Disallow /

Comments

  • https://www.robotstxt.org/robotstxt.html
  • General rules for all other bots
  • Place allows first to avoid bots skipping after Disallow: /
  • Allow exactly the homepage
  • Allow the homepage with any query parameters
  • Now block everything else
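
The rules of the * group combine wildcards (*), an end anchor ($), and a final Disallow /. The sketch below shows one way to evaluate them, assuming the longest-match precedence described in RFC 9309, where the matching rule with the longest pattern wins and Allow wins a tie. The names RULES, pattern_to_regex, and is_allowed are hypothetical, and individual crawlers may resolve rules differently.

```python
import re

# Rules of the "*" group, in the order listed in this scan.
RULES = [
    ("allow", "/$"),
    ("allow", "/?*"),
    ("allow", "/posts/*"),
    ("allow", "/resources/*"),
    ("allow", "/a/*"),
    ("allow", "/alerts/*"),
    ("allow", "/assets/*"),
    ("allow", "/landing/*"),
    ("allow", "/agencies/*/groups/*"),
    ("disallow", "/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex anchored at the start."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # "*" matches any run of characters; everything else is literal.
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile("^" + regex + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Apply longest-match precedence; Allow wins when lengths are equal."""
    best_len, best_allow = -1, True  # no matching rule means allowed
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len, best_allow = len(pattern), allow
    return best_allow


if __name__ == "__main__":
    for path in ("/", "/?ref=x", "/posts/1", "/agencies/42/groups/7", "/login"):
        print(path, is_allowed(path))
```

Under longest-match precedence the order of the rules does not change the outcome, so reversing RULES gives the same answers. Listing the Allow lines first, as the comments describe, matters for parsers that stop at the first matching rule.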