cda.org
robots.txt

Robots Exclusion Standard data for cda.org

Resource Scan

Scan Details

Site Domain cda.org
Base Domain cda.org
Scan Status Ok
Last Scan 2024-11-18T20:44:58+00:00
Next Scan 2024-12-18T20:44:58+00:00

Last Scan

Scanned 2024-11-18T20:44:58+00:00
URL https://cda.org/robots.txt
Redirect https://www.cda.org/robots.txt
Redirect Domain www.cda.org
Redirect Base cda.org
Domain IPs 141.193.213.20, 141.193.213.21
Redirect IPs 141.193.213.20, 141.193.213.21
Response IP 141.193.213.21
Found Yes
Hash a97b7bd7cafc8e45a0d57a3680c33e4a5763c9197ec0ecc8fcc1bf37bb3ce219
SimHash 24388210f6fe
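
The Hash value above is 64 hexadecimal characters long, which is the length of a SHA-256 digest. The report does not name the algorithm, so the sketch below only assumes that the value is a SHA-256 digest of the raw robots.txt response body. The function names `content_hash` and `has_changed` are hypothetical and are used here only to illustrate how a rescan might detect a changed file.

```python
import hashlib


def content_hash(body: bytes) -> str:
    """Return the hex SHA-256 digest of a robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_hash: str, body: bytes) -> bool:
    """Return True when the freshly fetched body no longer matches the stored hash."""
    return content_hash(body) != previous_hash
```

A scanner could store the digest from one scan and call `has_changed` with the body fetched at the next scan. Near-duplicate detection, which is what a SimHash is normally used for, is not covered by this sketch.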

Groups

*

Rule Path
Disallow /dlm_uploads/

*

Rule Path
Disallow /author/
Disallow /trackback
Disallow /comments
Disallow */trackback
Disallow */comments
Disallow /cgi-bin*
Disallow /cdn-cgi/
Disallow *?s=
Disallow *%26s%3D
Disallow /*%26attr_*
Allow /wp-content/uploads
Allow /wp-*.png
Allow /wp-*.jpg
Allow /wp-*.jpeg
Allow /wp-*.gif
Allow /wp-*.svg
Allow /wp-*.pdf
Allow /wp-*.webp

Other Records

Field Value
crawl-delay 3

googlebot

Rule Path
Disallow /*.php$
Disallow /*.inc$
Disallow /*.gz$
Disallow */feed/$
Disallow *?s=
Disallow *%26s%3D
Disallow /*%26attr_*
Disallow /*?*
Disallow /*?
Allow /*?ver=*
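
The googlebot group above mixes broad Disallow patterns (`/*?*`, `/*?`) with a narrower Allow (`/*?ver=*`). The sketch below illustrates how such rules could be evaluated, assuming the crawler follows the precedence described in RFC 9309 and in Google's documentation: `*` matches any run of characters, a trailing `$` anchors the end of the path, the longest matching pattern wins, and Allow wins a tie. The helper names `pattern_to_regex` and `is_allowed` are hypothetical, and this is not the scanner's own implementation.

```python
import re

# Rules copied from the googlebot group in this report.
GOOGLEBOT_RULES = [
    ("disallow", "/*.php$"),
    ("disallow", "/*.inc$"),
    ("disallow", "/*.gz$"),
    ("disallow", "*/feed/$"),
    ("disallow", "*?s="),
    ("disallow", "*%26s%3D"),
    ("disallow", "/*%26attr_*"),
    ("disallow", "/*?*"),
    ("disallow", "/*?"),
    ("allow", "/*?ver=*"),
]


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" for each "*".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(rules, url_path: str) -> bool:
    """Apply longest-match precedence; paths with no matching rule are allowed."""
    best_len = -1
    best_allow = True
    for kind, pattern in rules:
        if not pattern:
            continue  # an empty path matches nothing
        # re.match anchors at the start of the path, as robots.txt rules do.
        if pattern_to_regex(pattern).match(url_path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len = length
                best_allow = kind == "allow"
    return best_allow
```

Under these assumptions, a versioned asset such as `/style.css?ver=6.4` stays crawlable because the Allow pattern is longer than either query-string Disallow, while other query-string URLs are blocked.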

googlebot-image

Rule Path
Allow /*

mediapartners-google*

Rule Path
Disallow (empty path; nothing is disallowed)
Allow /*

Other Records

Field Value
sitemap https://cda1dev.wpengine.com/sitemap_index.xml
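
The non-rule records listed in this report (crawl-delay and sitemap) can be read with Python's standard `urllib.robotparser` module. The sketch below is illustrative only: `SAMPLE` is a simplified reconstruction containing one `*` group rather than the exact file contents, `read_other_records` is a hypothetical helper, and `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Simplified reconstruction for illustration; not the verbatim robots.txt.
SAMPLE = """\
User-agent: *
Disallow: /author/
Crawl-delay: 3

Sitemap: https://cda1dev.wpengine.com/sitemap_index.xml
"""


def read_other_records(robots_text: str, user_agent: str = "*"):
    """Return (crawl delay, sitemap URLs) for the given user agent."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    # site_maps() returns None when no sitemap record is present.
    return parser.crawl_delay(user_agent), parser.site_maps() or []


delay, sitemaps = read_other_records(SAMPLE)
```

Note that `urllib.robotparser` does not interpret the `*` and `$` wildcards used in several rules of this file, so it is used here only for the crawl-delay and sitemap records.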

Comments

  • ALL USER AGENTS EXCEPT IF THERE IS ANOTHER "User-agent" RULE AFTER
  • Disallow author archive
  • Disallow indexation of sensitive folders
  • Standard for wp
  • Disallow search :
  • Disallow filters
  • Allow to index images
  • Allow images in plugins, cache
  • GOOGLEBOTS SPECIFIC
  • Googlebot
  • Disallow indexation of sensitive files
  • Disallow search :
  • Disallow filters
  • Disallow indexation of URLs having duplicate content parameters
  • Allow Google Image Bot
  • Allow Google AdSense
  • Show to spiders our sitemap
  • ---------------------------
  • END YOAST BLOCK