jigsawplanet.com
robots.txt

Robots Exclusion Standard data for jigsawplanet.com

Resource Scan

Scan Details

Site Domain jigsawplanet.com
Base Domain jigsawplanet.com
Scan Status Ok
Last Scan 2024-10-09T19:12:36+00:00
Next Scan 2024-10-16T19:12:36+00:00

Last Scan

Scanned 2024-10-09T19:12:36+00:00
URL https://jigsawplanet.com/robots.txt
Redirect https://www.jigsawplanet.com/robots.txt
Redirect Domain www.jigsawplanet.com
Redirect Base jigsawplanet.com
Domain IPs 69.46.22.74
Redirect IPs 69.46.22.74
Response IP 69.46.22.74
Found Yes
Hash e62fcd855e916d537c706f05410e02ccb9add40ce7be4ee37963122eb14d05db
SimHash 74bedefad619
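
The Hash value above is 64 hexadecimal characters, which is consistent with a SHA-256 digest, although the report does not name the algorithm. A minimal sketch of how such a digest could be computed to detect changes between scans, assuming SHA-256 over the raw response body (the helper name `body_digest` and the sample body are hypothetical):

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt response body.

    Assumption: the scanner hashes the raw bytes with SHA-256. The report
    only shows a 64-character hex string and does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


# Hypothetical sample body, not the real file content.
sample = b"User-agent: *\nDisallow: /api/\n"
previous = body_digest(sample)
current = body_digest(sample + b"Disallow: /private/\n")

# Differing digests signal that the file changed between two scans.
print(previous != current)
```

Comparing digests from consecutive scans is a cheap way to tell whether the file changed at all. The SimHash field presumably serves a different purpose (measuring how similar two versions are), and it is not reproduced here.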

Groups

mediapartners-google
adsbot-google
adsbot-google-mobile

Rule Path
Disallow /api/

googlebot-image
ccbot
chatgpt-user
gptbot
google-extended
perplexitybot

Rule Path
Disallow /api/
Disallow /i/*/jp.jpg$
Disallow /?rc=img&
Disallow /?rc=search&

dotbot
riddler
blexbot
ahrefsbot
mauibot

Rule Path
Disallow /

*

Rule Path
Disallow /api/
Disallow /?rc=embedpuzzle&
Disallow /?rc=emailpuzzle&
Disallow /?rc=search&
Disallow /?rc=signin&
Disallow /?rc=settings&
Disallow /*?rc=embeduser$
Disallow /?rc=contact&
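
Several of the paths above use the `*` wildcard and the `$` end anchor (for example `/i/*/jp.jpg$` and `/*?rc=embeduser$`). The following is a minimal sketch of how a crawler might evaluate such Disallow patterns. The function names and the sample request paths are hypothetical, and because the groups listed here contain no Allow rules, the sketch skips the longest-match precedence a full parser would need.

```python
import re


def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters and a trailing '$' anchors the end
    of the URL. Everything else is literal and matched as a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str, disallow_rules: list[str]) -> bool:
    """Return True if any Disallow pattern matches the start of the path."""
    return any(rule_to_regex(rule).match(path) for rule in disallow_rules)


# Disallow rules copied from the googlebot-image / AI-bot group above.
AI_BOT_RULES = ["/api/", "/i/*/jp.jpg$", "/?rc=img&", "/?rc=search&"]
# The dotbot / ahrefsbot group blocks the whole site.
BLOCKED_BOT_RULES = ["/"]

# The request paths below are made up for illustration.
print(is_disallowed("/i/abc123/jp.jpg", AI_BOT_RULES))
print(is_disallowed("/i/abc123/jp.jpg?size=big", AI_BOT_RULES))
print(is_disallowed("/?rc=play&pid=1", BLOCKED_BOT_RULES))
```

Note that Python's standard `urllib.robotparser` treats rule paths as plain prefixes and does not interpret `*` or `$`, which is why the sketch does its own translation.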

Other Records

Field Value Comment
crawl-delay 2 non standard
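
A crawler that chooses to honor the non-standard `crawl-delay` field could read it with Python's standard `urllib.robotparser`. This is a sketch under assumptions: the report does not show which group the record belongs to, so the reconstructed snippet below places it in the `*` group, and the user agent name `examplebot` is hypothetical.

```python
import urllib.robotparser

# Partial reconstruction of the file for illustration only.
# Assumption: the crawl-delay record sits inside the '*' group.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /api/",
    "Crawl-delay: 2",
]

parser = urllib.robotparser.RobotFileParser()
parser.parse(ROBOTS_LINES)

# 'examplebot' matches no named group, so it falls back to '*'.
delay = parser.crawl_delay("examplebot")
print(delay)

# A polite crawler would sleep for `delay` seconds between requests
# and skip any path that can_fetch() rejects.
print(parser.can_fetch("examplebot", "/api/puzzles"))
```

Not every crawler reads this field, so the value is best treated as a request to crawlers that do support it rather than a guarantee.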

Other Records

Field Value
sitemap https://www.jigsawplanet.com/?rc=sitemap

Comments

  • --- ad bots
  • --- AI bots
  • ---

Warnings

  • `clean-param` is not a known field.