deloitte.wsj.com
robots.txt

Robots Exclusion Standard data for deloitte.wsj.com

Resource Scan

Scan Details

Site Domain deloitte.wsj.com
Base Domain wsj.com
Scan Status Ok
Last Scan 2024-05-29T15:06:46+00:00
Next Scan 2024-06-12T15:06:46+00:00

Last Scan

Scanned 2024-05-29T15:06:46+00:00
URL https://deloitte.wsj.com/robots.txt
Domain IPs 13.33.88.117, 13.33.88.121, 13.33.88.47, 13.33.88.86
Response IP 13.33.88.86
Found Yes
Hash 04bc95503950d915d084d7c6caad195d22970298a3caa47cef3b23abc4dcaa5b
SimHash d88f08c7a6db
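
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the report does not name the algorithm. As a minimal sketch under that assumption, a content fingerprint like this could be used to detect whether a robots.txt file changed between scans. The function names `content_fingerprint` and `has_changed` are hypothetical and are not part of the scanner.

```python
import hashlib


def content_fingerprint(body: bytes) -> str:
    """Return a hex digest of the raw robots.txt bytes.

    Assumption: SHA-256 is used only because the report's hash is
    64 hex characters long; the scanner's actual algorithm is not stated.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_hash: str, body: bytes) -> bool:
    """Compare a stored fingerprint against freshly fetched content."""
    return content_fingerprint(body) != previous_hash


if __name__ == "__main__":
    # Illustrative content only, not the real file from deloitte.wsj.com.
    first = b"User-agent: *\nDisallow: /private/\n"
    second = b"User-agent: *\nDisallow: /private/\nDisallow: /print/\n"
    stored = content_fingerprint(first)
    print(has_changed(stored, first))
    print(has_changed(stored, second))
```

An exact hash changes on any byte difference, which is presumably why the report also lists a SimHash, a fingerprint intended to stay similar when content changes only slightly.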

Groups

*

Rule Path
Disallow */cgi-bin/
Disallow */wp-admin/
Disallow */wp-includes/
Disallow */wp-content/plugins/
Disallow */wp-content/cache/
Disallow */wp-content/themes/
Disallow */trackback/
Disallow */feed/
Disallow */docs/
Disallow */tab/comments/
Disallow */tab/djml/
Disallow /api-video/
Disallow /community/
Disallow /auth/
Disallow /img/
Disallow /akamai/
Disallow /doubleclick/
Disallow /static/
Disallow /static_html_files/
Disallow /static_js/
Disallow /utilities/
Disallow /search/
Disallow */search/
Disallow /votenview.php
Disallow /indiarealtime/category/artculture/
Disallow /private/
Disallow /print/
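
Several of the paths above begin with `*`, a wildcard that matches any sequence of characters, so `*/wp-admin/` blocks that directory at any depth. The following is a rough sketch of how a crawler might test a URL path against these rules, assuming the common wildcard semantics (`*` matches any run of characters, a trailing `$` anchors the end of the path). The helper names are hypothetical, and only a subset of the listed rules is included.

```python
import re

# A subset of the Disallow paths from the "*" group listed above.
DISALLOW = [
    "*/cgi-bin/",
    "*/wp-admin/",
    "*/feed/",
    "/search/",
    "*/search/",
    "/votenview.php",
    "/print/",
]


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a prefix-matching regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with ".*" in place of each "*".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(path: str, patterns=DISALLOW) -> bool:
    """Return True if any Disallow pattern matches the start of the path.

    This group contains no Allow rules, so no precedence logic is needed here.
    """
    return any(pattern_to_regex(p).match(path) for p in patterns)


if __name__ == "__main__":
    for path in ["/blog/wp-admin/options.php", "/search/results",
                 "/articles/some-story", "/printing"]:
        print(path, is_disallowed(path))
```

Because matching is by prefix, `/print/` blocks `/print/page1` but not `/printing`, and `/search/` is already covered by `*/search/`, since the wildcard can match an empty string.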

Other Records

Field Value
sitemap http://deloitte.wsj.com/sitemap-index.xml

Warnings

  • `acap-crawler` is not a known field.
  • `acap-disallow-crawl` is not a known field.
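
Warnings like these typically arise when a parser meets a field name outside the set it recognizes. The `acap-` prefix suggests directives from the Automated Content Access Protocol (ACAP), which robots.txt parsers generally do not implement. The sketch below shows one way such a check could work. The `KNOWN_FIELDS` set and the sample file are assumptions for illustration, not the scanner's actual list or the real contents of the scanned file.

```python
# Assumed set of recognized fields; the scanner's real list is not documented here.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap", "crawl-delay"}


def unknown_fields(robots_txt: str) -> list:
    """Return unrecognized field names in order of first appearance."""
    seen = []
    for raw in robots_txt.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        # Field names are case-insensitive; split only on the first colon
        # so that URL values such as "http://..." stay intact.
        field = line.split(":", 1)[0].strip().lower()
        if field not in KNOWN_FIELDS and field not in seen:
            seen.append(field)
    return seen


# Hypothetical sample, written only to exercise the function.
SAMPLE = """\
User-agent: *
Disallow: /private/
ACAP-crawler: *
ACAP-disallow-crawl: /private/
Sitemap: http://deloitte.wsj.com/sitemap-index.xml
"""

if __name__ == "__main__":
    for name in unknown_fields(SAMPLE):
        print(f"`{name}` is not a known field.")
```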