hd4k.top
robots.txt

Robots Exclusion Standard data for hd4k.top

Resource Scan

Scan Details

Site Domain hd4k.top
Base Domain hd4k.top
Scan Status Ok
Last Scan 2026-01-22T21:13:18+00:00
Next Scan 2026-02-21T21:13:18+00:00

Last Scan

Scanned 2026-01-22T21:13:18+00:00
URL https://hd4k.top/robots.txt
Domain IPs 104.21.45.86, 172.67.212.164, 2606:4700:3031::6815:2d56, 2606:4700:3035::ac43:d4a4
Response IP 104.21.45.86
Found Yes
Hash 95a8ba516d3f9d6f0b6cfff3d25209275adb059ffc9dee25dd8c4a01c812e843
SimHash 2bbe9b74b0e6

Groups

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

bingbot

Rule Path
Disallow /

bingpreview

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

googlebot

Rule Path
Disallow /tag*
Disallow /reembed/
Disallow /index.php*

Other Records

Field Value
crawl-delay 40

bingbot

Rule Path
Disallow /tag*
Disallow /reembed/
Disallow /index.php*

Other Records

Field Value
crawl-delay 1

*

Rule Path
Disallow /tag*
Disallow /reembed/
Disallow /index.php*

Other Records

Field Value
crawl-delay 60
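
The googlebot, bingbot, and * groups above share three path patterns, two of which end in a `*` wildcard. The following is a minimal sketch of how such patterns are usually matched against a URL path, assuming the common interpretation in which `*` matches any run of characters, a trailing `$` anchors the end, and everything else is a prefix match. The helper names (`rule_matches`, `is_disallowed`) and the sample paths are illustrative and do not come from the scan; percent-encoding normalization is left out.

```python
import re

# Path patterns listed for the googlebot, bingbot, and * groups in the scan.
DISALLOWED = ["/tag*", "/reembed/", "/index.php*"]


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path.

    Assumed semantics: '*' matches any run of characters, a trailing '$'
    anchors the end of the path, and otherwise the pattern is a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start only, which gives prefix matching.
    return re.match(regex, path) is not None


def is_disallowed(path: str) -> bool:
    """True if any pattern matches (these groups contain no Allow rules)."""
    return any(rule_matches(pattern, path) for pattern in DISALLOWED)


if __name__ == "__main__":
    # Hypothetical paths, chosen only to exercise each pattern.
    for sample in ["/tag/4k", "/tags", "/index.php?page=2",
                   "/reembed/abc", "/reembed", "/video/123"]:
        print(sample, is_disallowed(sample))
```

Under these assumptions a trailing `*` adds nothing to a prefix match, so `/tag*` behaves like `/tag` and also covers paths such as `/tags`, while `/reembed/` does not cover `/reembed` without the trailing slash.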

Comments

  • Block ChatGPT-related crawlers
  • Block ClaudeBot-related crawlers
  • Block Google crawlers, including Google-Extended
  • Block Microsoft generative AI (Copilot, etc.) crawlers
  • Block the Common Crawl crawler
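
One way to check that the blanket rules behave as the comments describe is to feed them to Python's standard `urllib.robotparser`. The robots.txt text below is a reconstruction from the groups in this scan, not the file as served: its ordering and exact formatting are assumptions, the bingbot groups are omitted because the agent appears twice with different rules, and the wildcard paths are left out because this parser does not expand `*` in paths. `ExampleBot` is a hypothetical agent used to reach the `*` group.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scanned groups; layout and ordering are assumed.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Googlebot
Disallow: /reembed/
Crawl-delay: 40

User-agent: *
Disallow: /reembed/
Crawl-delay: 60
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

HOME = "https://hd4k.top/"

if __name__ == "__main__":
    # Agents with "Disallow: /" should be refused everywhere.
    for agent in ["GPTBot", "ChatGPT-User", "ClaudeBot",
                  "Google-Extended", "CCBot"]:
        print(agent, parser.can_fetch(agent, HOME))

    # Googlebot is only restricted on specific paths and has its own delay.
    print("Googlebot", parser.can_fetch("Googlebot", HOME),
          parser.crawl_delay("Googlebot"))

    # Any agent without its own group falls back to the * group.
    print("ExampleBot", parser.crawl_delay("ExampleBot"))
```

Crawlers differ in how they honor `crawl-delay` and in how they treat an agent that appears in more than one group, so this check reflects only the behavior of Python's parser.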