gracenote.com
robots.txt

Robots Exclusion Standard data for gracenote.com

Resource Scan

Scan Details

Site Domain gracenote.com
Base Domain gracenote.com
Scan Status Ok
Last Scan 2025-09-24T14:56:01+00:00
Next Scan 2025-10-24T14:56:01+00:00

Last Scan

Scanned 2025-09-24T14:56:01+00:00
URL https://gracenote.com/robots.txt
Domain IPs 192.0.66.38
Response IP 192.0.66.38
Found Yes
Hash d5ddc0cf5c4e0826ae13780f2cac890387fa07d6699ace6474d07b9868467834
SimHash a815bb006dbb

Groups

*

Rule Path
Disallow /gracenote-login/
Allow /gracenote-login/admin-ajax.php
Disallow /*.pdf$
Disallow /*?wg-choose-original=*
Disallow /feed
Disallow /*feed
Disallow /*?feed=
Disallow /*%26feed%3D
Disallow /*feed?*
Disallow /feed%26
Disallow /*/feed
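
To illustrate how the rules in the group above might be evaluated, here is a minimal Python sketch. It assumes the common interpretation in which `*` matches any run of characters, a trailing `$` anchors the end of the path, the longest matching pattern wins, and Allow wins a tie. The names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical, and the sketch does no percent-encoding normalization, so a real crawler may decide differently on edge cases.

```python
import re

# Rules copied from the "*" group above, in listed order.
RULES = [
    ("disallow", "/gracenote-login/"),
    ("allow", "/gracenote-login/admin-ajax.php"),
    ("disallow", "/*.pdf$"),
    ("disallow", "/*?wg-choose-original=*"),
    ("disallow", "/feed"),
    ("disallow", "/*feed"),
    ("disallow", "/*?feed="),
    ("disallow", "/*%26feed%3D"),
    ("disallow", "/*feed?*"),
    ("disallow", "/feed%26"),
    ("disallow", "/*/feed"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: '*' is a wildcard and a trailing '$' anchors the end;
    every other character is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under `rules`.

    Assumption: the longest matching pattern decides, and Allow wins
    when an Allow and a Disallow pattern have equal length.
    """
    best_len = -1
    allowed = True  # no matching rule means the path is allowed
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len = length
                allowed = kind == "allow"
    return allowed
```

Under these assumptions, `is_allowed("/gracenote-login/admin-ajax.php")` is true because the Allow pattern is longer than the Disallow it overlaps, while `is_allowed("/docs/report.pdf")` and `is_allowed("/blog/feed/")` are both false.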

Other Records

Field Value
sitemap https://gracenote.com/sitemap_index.xml
sitemap https://gracenote.com/nl/sitemap_index.xml
sitemap https://gracenote.com/it/sitemap_index.xml
sitemap https://gracenote.com/de/sitemap_index.xml
sitemap https://gracenote.com/ar/sitemap_index.xml
sitemap https://gracenote.com/pl/sitemap_index.xml
sitemap https://gracenote.com/ja/sitemap_index.xml
sitemap https://gracenote.com/ko/sitemap_index.xml
sitemap https://gracenote.com/pt/sitemap_index.xml
sitemap https://gracenote.com/fr/sitemap_index.xml
sitemap https://gracenote.com/es/sitemap_index.xml
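
As a rough sketch of how the sitemap records above could be read programmatically, the Python snippet below feeds equivalent `Sitemap:` lines to the standard library's `urllib.robotparser` (the `site_maps()` method requires Python 3.8 or later). The lines are reconstructed from this report rather than copied from the live file, and `LOCALES`, `lines`, and `locale_of` are hypothetical names used only for illustration.

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

# Locale prefixes taken from the sitemap records listed above.
LOCALES = ["nl", "it", "de", "ar", "pl", "ja", "ko", "pt", "fr", "es"]

# Reconstructed Sitemap lines (an assumption, not the raw robots.txt).
lines = ["Sitemap: https://gracenote.com/sitemap_index.xml"] + [
    f"Sitemap: https://gracenote.com/{loc}/sitemap_index.xml" for loc in LOCALES
]

parser = RobotFileParser()
parser.parse(lines)
sitemaps = parser.site_maps() or []  # site_maps() returns None if none exist


def locale_of(url):
    """Return the first path segment as a locale code, or None for the root sitemap."""
    parts = urlparse(url).path.strip("/").split("/")
    return parts[0] if len(parts) > 1 else None


for url in sitemaps:
    print(locale_of(url), url)
```

Each printed line pairs a locale prefix (or `None` for the default sitemap) with its sitemap index URL.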

Comments

  • Block all PDF files from being crawled
  • Prevent search engines from crawling wg-choose-original
  • Block all URLs containing feed
  • Sitemap locations