paperjam.lu
robots.txt

Robots Exclusion Standard data for paperjam.lu

Resource Scan

Scan Details

Site Domain paperjam.lu
Base Domain paperjam.lu
Scan Status Ok
Last Scan 2025-11-22T21:22:04+00:00
Next Scan 2025-12-22T21:22:04+00:00

Last Scan

Scanned 2025-11-22T21:22:04+00:00
URL https://paperjam.lu/robots.txt
Domain IPs 104.26.10.249, 104.26.11.249, 172.67.71.48, 2606:4700:20::681a:af9, 2606:4700:20::681a:bf9, 2606:4700:20::ac43:4730
Response IP 104.26.11.249
Found Yes
Hash 50583696f6a99367b90c8087057bf65574739c710dde3b734aca60582f9c7e6c
SimHash 071dd140a4ee
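
The Hash value above is 64 hexadecimal characters long, which is the length of a SHA-256 digest. The report does not state which algorithm was used or which bytes were hashed, so the sketch below assumes SHA-256 over the raw response body. The function name matches_reported_hash is hypothetical and only serves this illustration.

```python
import hashlib

# Hash value copied from the scan report above.
REPORTED_HASH = "50583696f6a99367b90c8087057bf65574739c710dde3b734aca60582f9c7e6c"


def matches_reported_hash(body: bytes, reported: str = REPORTED_HASH) -> bool:
    """Return True if the SHA-256 hex digest of body equals the reported value.

    Assumption: the scanner hashed the raw robots.txt bytes with SHA-256.
    The report itself does not confirm this.
    """
    return hashlib.sha256(body).hexdigest() == reported.lower()
```

Passing the bytes of a freshly downloaded robots.txt to this function would indicate whether the file has changed since the scan, provided the assumption about the algorithm holds.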

Groups

*

Rule Path
Disallow /search
Disallow /search/
Disallow /search?
Disallow /_next/
Disallow /assets/fonts/
Disallow /assets/img/
Disallow /paperjam/icons/
Disallow /api/

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

oai-searchbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /

meta-externalagent

Rule Path
Disallow /

meta-externalfetcher

Rule Path
Disallow /
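
The groups listed above can be checked with Python's standard urllib.robotparser module. The sketch below rebuilds a robots.txt text from the rules in this report rather than fetching the live file, so the exact layout, casing, and ordering of the real file are assumptions. The helper name allowed and the agent name examplebot are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Paths disallowed for every crawler (the "*" group in the report).
GENERIC_DISALLOW = [
    "/search", "/search/", "/search?", "/_next/",
    "/assets/fonts/", "/assets/img/", "/paperjam/icons/", "/api/",
]

# Crawlers that the report lists with "Disallow /".
BLOCKED_AGENTS = [
    "gptbot", "chatgpt-user", "oai-searchbot", "google-extended",
    "applebot-extended", "claudebot", "perplexitybot", "amazonbot",
    "meta-externalagent", "meta-externalfetcher",
]

# Rebuild a robots.txt text from the report (layout is an assumption).
lines = ["User-agent: *"]
lines += [f"Disallow: {path}" for path in GENERIC_DISALLOW]
for agent in BLOCKED_AGENTS:
    lines += ["", f"User-agent: {agent}", "Disallow: /"]

parser = RobotFileParser()
parser.parse(lines)


def allowed(agent: str, path: str) -> bool:
    """Return True if the agent may fetch the path under the rebuilt rules."""
    return parser.can_fetch(agent, "https://paperjam.lu" + path)
```

For example, allowed("examplebot", "/api/v1") evaluates the "*" group, while allowed("gptbot", "/") evaluates the group dedicated to that crawler. The standard-library parser matches user agents case-insensitively by substring, which may differ from how individual crawlers interpret the file.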

Other Records

Field Value
sitemap https://paperjam.lu/sitemap.xml
sitemap https://paperjam.lu/club/sitemap.xml

Comments

  • robots.txt for paperjam.lu
  • Last updated: 4 May 2025
  • Disallow technical paths
  • Disallow Large Language Models
  • List Sitemaps