alquds.com
robots.txt

Robots Exclusion Standard data for alquds.com

Resource Scan

Scan Details

Site Domain alquds.com
Base Domain alquds.com
Scan Status Ok
Last Scan 2026-02-17T14:26:26+00:00
Next Scan 2026-02-24T14:26:26+00:00

Last Scan

Scanned 2026-02-17T14:26:26+00:00
URL https://alquds.com/robots.txt
Domain IPs 104.21.92.114, 172.67.192.25, 2606:4700:3031::6815:5c72, 2606:4700:3033::ac43:c019
Response IP 172.67.192.25
Found Yes
Hash 00c57e9027b0c11c25e784622e0f5a0301c07076f65a1323d5d9ad61b55e703d
SimHash 64585682e421

Groups

*

Rule Path
Allow /
Disallow /admin/
Disallow /master/
Disallow /sidekiq/
Disallow /users/sign_in
Disallow /users/sign_up
Disallow /users/password
Disallow /api/
Disallow /search?
Disallow /up
Disallow /health
Allow /rss
Allow /feed

Other Records

Field Value
crawl-delay 1
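
To make the effect of the "*" group concrete, here is a minimal Python sketch that evaluates a path against the rules listed above using the longest-match rule from RFC 9309 (the most specific matching path wins, and Allow wins a tie). The names STAR_GROUP and is_allowed are hypothetical and not part of the scan. The sketch assumes plain prefix matching, which is enough here because none of the listed paths use the * or $ wildcards, and it ignores percent-encoding.

```python
# Rules of the "*" group exactly as listed in the scan: (directive, path prefix).
STAR_GROUP = [
    ("allow", "/"),
    ("disallow", "/admin/"),
    ("disallow", "/master/"),
    ("disallow", "/sidekiq/"),
    ("disallow", "/users/sign_in"),
    ("disallow", "/users/sign_up"),
    ("disallow", "/users/password"),
    ("disallow", "/api/"),
    ("disallow", "/search?"),
    ("disallow", "/up"),
    ("disallow", "/health"),
    ("allow", "/rss"),
    ("allow", "/feed"),
]


def is_allowed(path, rules):
    """Return True if `path` may be fetched under `rules`.

    Longest-match evaluation: the rule with the longest matching prefix
    decides; on equal length, "allow" wins. No matching rule means allowed.
    """
    best_len = -1
    allowed = True
    for directive, prefix in rules:
        if not path.startswith(prefix):
            continue
        longer = len(prefix) > best_len
        tie_for_allow = len(prefix) == best_len and directive == "allow"
        if longer or tie_for_allow:
            best_len = len(prefix)
            allowed = directive == "allow"
    return allowed


if __name__ == "__main__":
    for p in ["/posts/1", "/admin/users", "/search?q=x", "/rss", "/uploads/a.png"]:
        print(p, is_allowed(p, STAR_GROUP))
```

Because matching is by prefix, "Disallow /up" and "Disallow /health" also cover any other path that begins with those strings (for example a hypothetical /uploads/ directory), which may or may not be what the site intends.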

googlebot-news

Rule Path
Allow /posts/
Allow /ar/posts/
Allow /en/posts/
Allow /he/posts/
Disallow /admin/
Disallow /master/

gptbot

Rule Path
Allow /posts/
Allow /categories/
Disallow /admin/

Other Records

Field Value
crawl-delay 2

chatgpt-user

Rule Path
Allow /posts/
Allow /categories/
Disallow /admin/

Other Records

Field Value
crawl-delay 2

claude-web

Rule Path
Allow /posts/
Allow /categories/
Disallow /admin/

Other Records

Field Value
crawl-delay 2

anthropic-ai

Rule Path
Allow /posts/
Allow /categories/
Disallow /admin/

Other Records

Field Value
crawl-delay 2

Other Records

Field Value
sitemap https://alquds.com/sitemaps/alquds/sitemap.xml
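
The groups above interact in a way that is easy to miss: under RFC 9309 a crawler obeys only the group that matches its own user-agent token and falls back to "*" only when no named group matches. The rules are not merged. The following Python sketch illustrates this with the groups from the scan. The names GROUPS, select_group and is_allowed are hypothetical, and the token match is simplified to a case-insensitive exact comparison. crawl-delay is carried along as data only; it is a non-standard field that RFC 9309 does not define, and not every crawler honours it.

```python
# Shared rule list for the four AI crawler groups, as listed in the scan.
AI_RULES = [("allow", "/posts/"), ("allow", "/categories/"), ("disallow", "/admin/")]

# Groups keyed by user-agent token; crawl_delay is None where the scan lists none.
GROUPS = {
    "*": {
        "crawl_delay": 1,
        "rules": [
            ("allow", "/"), ("disallow", "/admin/"), ("disallow", "/master/"),
            ("disallow", "/sidekiq/"), ("disallow", "/users/sign_in"),
            ("disallow", "/users/sign_up"), ("disallow", "/users/password"),
            ("disallow", "/api/"), ("disallow", "/search?"), ("disallow", "/up"),
            ("disallow", "/health"), ("allow", "/rss"), ("allow", "/feed"),
        ],
    },
    "googlebot-news": {
        "crawl_delay": None,
        "rules": [
            ("allow", "/posts/"), ("allow", "/ar/posts/"), ("allow", "/en/posts/"),
            ("allow", "/he/posts/"), ("disallow", "/admin/"), ("disallow", "/master/"),
        ],
    },
    "gptbot": {"crawl_delay": 2, "rules": AI_RULES},
    "chatgpt-user": {"crawl_delay": 2, "rules": AI_RULES},
    "claude-web": {"crawl_delay": 2, "rules": AI_RULES},
    "anthropic-ai": {"crawl_delay": 2, "rules": AI_RULES},
}


def select_group(product_token):
    """Pick the group for a crawler token; fall back to "*" if none matches."""
    return GROUPS.get(product_token.lower(), GROUPS["*"])


def is_allowed(path, rules):
    """Longest matching prefix decides; Allow wins ties; no match means allowed."""
    best_len, allowed = -1, True
    for directive, prefix in rules:
        if path.startswith(prefix):
            if len(prefix) > best_len or (len(prefix) == best_len and directive == "allow"):
                best_len, allowed = len(prefix), directive == "allow"
    return allowed


if __name__ == "__main__":
    for bot in ["GPTBot", "Googlebot-News", "SomeOtherBot"]:
        group = select_group(bot)
        print(bot, group["crawl_delay"], is_allowed("/api/v1", group["rules"]))
```

Read this way, the named groups are less restrictive than "*" for paths they do not mention: /api/, /sidekiq/ and /search? are disallowed only in the "*" group, so a crawler that matches gptbot or googlebot-news has no rule covering them and they default to allowed.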

Comments

  • robots.txt for جريدة القدس (Al-Quds newspaper)
  • Domain: alquds.com
  • Generated: 2026-02-17T16:26:29+02:00
  • Disallow admin and private areas
  • Disallow search results (can create duplicate content)
  • Disallow health check endpoints
  • Allow RSS feeds
  • Crawl delay for politeness
  • Sitemap location
  • Google News specific - allow full access to news content
  • AI Crawlers - allow with rate limiting respect