jdkg.org
robots.txt

Robots Exclusion Standard data for jdkg.org

Resource Scan

Scan Details

Site Domain jdkg.org
Base Domain jdkg.org
Scan Status Ok
Last Scan 2025-02-12T23:14:43+00:00
Next Scan 2025-03-14T23:14:43+00:00

Last Scan

Scanned 2025-02-12T23:14:43+00:00
URL https://jdkg.org/robots.txt
Domain IPs 104.21.63.212, 172.67.150.96, 2606:4700:3032::ac43:9660, 2606:4700:3034::6815:3fd4
Response IP 172.67.150.96
Found Yes
Hash 1493972bb388474f4fae4950aca406bd71dd10c17fe4677974c1d8ce989201d3
SimHash ca120d85d1e7
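
The Hash field above is 64 hexadecimal characters long, which is the length of a SHA-256 digest. The report does not name the algorithm, so the following is only a sketch under the assumption that the hash is SHA-256 over the raw response body. The function names `body_fingerprint` and `has_changed` are hypothetical and exist only for illustration.

```python
import hashlib


def body_fingerprint(body: bytes) -> str:
    """Return a hex SHA-256 digest of a robots.txt response body.

    Assumption: the report's Hash field is SHA-256 of the raw bytes.
    The report itself does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_hash: str, body: bytes) -> bool:
    """Compare a stored hash with the digest of a freshly fetched body."""
    return body_fingerprint(body) != previous_hash.lower()
```

A scanner could store the digest from one scan and call `has_changed` on the next scan to decide whether the rules need to be re-parsed.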

Groups

*

Rule Path Comment
Disallow /works? cruel but efficient
Disallow /autocomplete/ -
Disallow /downloads/ -
Disallow /external_works/ -
Disallow /bookmarks/search? -
Disallow /people/search? -
Disallow /tags/search? -
Disallow /works/search? -
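
The rules in the * group contain no wildcards, so each one acts as a plain prefix on the URL path plus query string. A minimal sketch of that check follows. It assumes simple prefix matching with no Allow rules, as listed above, and the names `STAR_GROUP_DISALLOW` and `is_allowed` are hypothetical.

```python
# Disallow prefixes copied from the * group in this report.
STAR_GROUP_DISALLOW = [
    "/works?",
    "/autocomplete/",
    "/downloads/",
    "/external_works/",
    "/bookmarks/search?",
    "/people/search?",
    "/tags/search?",
    "/works/search?",
]


def is_allowed(path_and_query: str) -> bool:
    """Return True if no Disallow prefix matches the path plus query."""
    return not any(
        path_and_query.startswith(rule) for rule in STAR_GROUP_DISALLOW
    )
```

Under this reading, `/works/123` stays crawlable while `/works?page=2` is blocked, because the trailing `?` in the rule restricts it to query-string variants of the listing page.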

googlebot

Rule Path
Disallow /autocomplete/
Disallow /downloads/
Disallow /external_works/
Disallow /works/*?
Disallow /*search?
Disallow /*?*query=
Disallow /*?*sort_
Disallow /*?*selected_tags
Disallow /*?*view_adult
Disallow /*?*tag_id
Disallow /*?*pseud_id
Disallow /*?*user_id
Disallow /*?*pseud=
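
The googlebot group relies on the `*` wildcard, which matches any run of characters. One way to evaluate such rules is to translate each one into a regular expression anchored at the start of the path. The sketch below assumes the common wildcard semantics (`*` for any sequence, optional trailing `$` for end of URL) and is not taken from any crawler's actual implementation. The names `rule_to_regex` and `googlebot_allowed` are hypothetical.

```python
import re

# Disallow patterns copied from the googlebot group in this report.
GOOGLEBOT_DISALLOW = [
    "/autocomplete/",
    "/downloads/",
    "/external_works/",
    "/works/*?",
    "/*search?",
    "/*?*query=",
    "/*?*sort_",
    "/*?*selected_tags",
    "/*?*view_adult",
    "/*?*tag_id",
    "/*?*pseud_id",
    "/*?*user_id",
    "/*?*pseud=",
]


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape literal pieces and join them with ".*" where "*" appeared.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))


PATTERNS = [rule_to_regex(rule) for rule in GOOGLEBOT_DISALLOW]


def googlebot_allowed(path_and_query: str) -> bool:
    """Return True if no Disallow pattern matches from the path start."""
    return not any(p.match(path_and_query) for p in PATTERNS)
```

With these patterns, a plain work URL such as `/works/123` is allowed, while the same URL with any query string is blocked by `/works/*?`.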

ccbot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

slurp

No rules defined. All paths allowed.
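
A crawler follows only the group that names it and falls back to the * group when no group does. The sketch below illustrates that selection for the groups in this report, assuming a case-insensitive exact match on the crawler's product token. The names `GROUPS` and `select_group` are hypothetical.

```python
# Disallow lists per group, copied from this report.
GROUPS = {
    "*": [
        "/works?", "/autocomplete/", "/downloads/", "/external_works/",
        "/bookmarks/search?", "/people/search?", "/tags/search?",
        "/works/search?",
    ],
    "googlebot": [
        "/autocomplete/", "/downloads/", "/external_works/", "/works/*?",
        "/*search?", "/*?*query=", "/*?*sort_", "/*?*selected_tags",
        "/*?*view_adult", "/*?*tag_id", "/*?*pseud_id", "/*?*user_id",
        "/*?*pseud=",
    ],
    "ccbot": ["/"],
    "gptbot": ["/"],
    "chatgpt-user": ["/"],
    "slurp": [],  # group exists but defines no rules
}


def select_group(product_token: str) -> list:
    """Return the Disallow list for a crawler, falling back to '*'."""
    return GROUPS.get(product_token.lower(), GROUPS["*"])
```

Because slurp has its own (empty) group, it does not inherit the * rules, which is why the report lists all paths as allowed for it.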

Other Records

Field Value
crawl-delay 30
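
The crawl-delay value is conventionally read as a minimum number of seconds between successive requests, although it is a non-standard field and not every crawler honors it. A minimal sketch of a polite scheduler follows, assuming that interpretation. The names `CRAWL_DELAY_SECONDS` and `seconds_to_wait` are hypothetical.

```python
from typing import Optional

# Value taken from the "Other Records" table in this report.
CRAWL_DELAY_SECONDS = 30.0


def seconds_to_wait(
    last_request_at: Optional[float],
    now: float,
    delay: float = CRAWL_DELAY_SECONDS,
) -> float:
    """Return how long to pause before the next request.

    Both timestamps are in seconds, for example from time.monotonic().
    """
    if last_request_at is None:
        return 0.0  # first request, no pause needed
    elapsed = now - last_request_at
    return max(0.0, delay - elapsed)
```

A fetch loop could pass the result to `time.sleep` before each request, so that requests to the site are spaced at least 30 seconds apart.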

Comments

  • See https://www.robotstxt.org/robotstxt.html for documentation on how to use the robots.txt file
  • disallow indexing of search results
  • Googlebot is smart and knows pattern matching