sec.online.wsj.com
robots.txt

Robots Exclusion Standard data for sec.online.wsj.com

Resource Scan

Scan Details

Site Domain sec.online.wsj.com
Base Domain wsj.com
Scan Status Ok
Last Scan 2024-05-21T15:54:09+00:00
Next Scan 2024-06-20T15:54:09+00:00

Last Scan

Scanned 2024-05-21T15:54:09+00:00
URL http://sec.online.wsj.com/robots.txt
Domain IPs 205.203.140.1
Response IP 205.203.140.1
Found Yes
Hash 832d1f6fdb2d812267f4565cda60bb5b92a61b73f056692ddc6d485329b9869c
SimHash 13807c7f0761

Groups

applebot

Rule Path
Disallow /articles/
Disallow /news/articles/
Disallow /article/
Disallow /news/
Disallow /article_email/*
Disallow /user/*
Disallow /login/*
Disallow /acct/*
Disallow /msgcenter/*
Disallow /setup/*
Disallow /marketing/*
Disallow /public/article/*
Disallow /public/search/
Disallow /public/search*
Disallow /search*
Disallow /public/page/wsj-x-marketing.html
Disallow /public/page/news-media-marketing.html
Disallow /public/page/0_0_WP_RT_MARKETING.html
Disallow /news/articles/SB2*
Disallow /news/articles/SB3*
Disallow /news/articles/SB4*
Disallow /articles/SB2*
Disallow /articles/SB3*
Disallow /articles/SB4*
Disallow /article/AP*
Disallow /article/BT-CO*
Disallow /article/DN-CO*
Disallow /article/PR-CO*
Disallow /article/HUG*
Disallow /video/search/*
Disallow /articles/BT-CO*
Disallow /articles/DN-CO*
Disallow /articles/PR-CO*
Disallow /news/articles/BT-CO*
Disallow /news/articles/DN-CO*
Disallow /news/articles/PR-CO*
Disallow /topics*
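
Several of the applebot paths above use the `*` wildcard. As a rough illustration of how such patterns are commonly evaluated, the sketch below converts each pattern to a regular expression and tests a request path against it. The names `APPLEBOT_DISALLOW`, `pattern_to_regex`, and `is_disallowed` are hypothetical, the list is only a subset of the rules shown, and the logic assumes a group made up solely of Disallow rules (as this one is), so no Allow/Disallow precedence is handled.

```python
import re

# A subset of the applebot Disallow paths from the table above.
APPLEBOT_DISALLOW = [
    "/articles/",
    "/news/",
    "/article_email/*",
    "/public/search*",
    "/search*",
    "/article/AP*",
    "/topics*",
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters and a trailing '$' anchors the
    end of the path. Everything else is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path, disallow_patterns):
    """Return True if any pattern matches from the start of the path."""
    # re.match anchors at the beginning, giving prefix semantics.
    return any(pattern_to_regex(p).match(path) for p in disallow_patterns)
```

Under these assumptions, `is_disallowed("/articles/SB2123", APPLEBOT_DISALLOW)` is true, while `is_disallowed("/newsletter", APPLEBOT_DISALLOW)` is false because `/news/` requires the trailing slash.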

*

Rule Path
Disallow /
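
The two groups listed above mean that a crawler identifying as applebot is governed by its own rule list, while every other crawler falls through to the `*` group, whose single `Disallow /` rule covers the whole site. The sketch below shows one simple way that group selection could be modelled. The `GROUPS` dictionary and the `select_group` function are hypothetical, the applebot list is abbreviated, and the substring comparison is a simplification of the product-token matching that real crawlers perform.

```python
# Group name -> Disallow paths. The applebot list is abbreviated here.
GROUPS = {
    "applebot": ["/articles/", "/news/", "/search*", "/topics*"],
    "*": ["/"],
}


def select_group(user_agent, groups):
    """Return the name of the group that applies to a user-agent string.

    A named group wins over the '*' catch-all. Matching is a
    case-insensitive substring test, which is a simplification.
    """
    ua = user_agent.lower()
    for name in groups:
        if name != "*" and name in ua:
            return name
    # Fall back to the catch-all group when one is present.
    return "*" if "*" in groups else None


print(select_group("Mozilla/5.0 (compatible; Applebot/0.1)", GROUPS))
print(select_group("ExampleBot/1.0", GROUPS))
```

The first call selects the `applebot` group and the second falls back to `*`. `ExampleBot` is a made-up crawler name used only for illustration.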