propublica.net
robots.txt

Robots Exclusion Standard data for propublica.net

Resource Scan

Scan Details

Site Domain propublica.net
Base Domain propublica.net
Scan Status Ok
Last Scan 2024-06-25T04:37:34+00:00
Next Scan 2024-07-02T04:37:34+00:00

Last Scan

Scanned 2024-06-25T04:37:34+00:00
URL https://propublica.net/robots.txt
Redirect https://www.propublica.org:443/robots.txt
Redirect Domain www.propublica.org
Redirect Base propublica.org
Domain IPs 2600:1f18:12d2:f300:fb5c:a8ff:86bb:adb2, 2600:1f18:12d2:f301:cb19:1b82:dacf:80ef, 2600:1f18:12d2:f302:8017:6a9f:737a:8417, 2600:1f18:12d2:f303:3287:3c:88ce:b7ae, 2600:1f18:12d2:f304:de22:63c9:5f83:2a52, 3.229.22.140, 34.202.75.48, 44.206.104.176, 52.21.176.158, 54.208.147.122
Redirect IPs 104.16.251.51, 104.16.252.51, 2606:4700::6810:fb33, 2606:4700::6810:fc33
Response IP 104.16.252.51
Found Yes
Hash be01d51e5e6a2d8b037335884698350536b0183d5a07fd8fe9a7e6962f44ff93
SimHash a3081c0626d2
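
The redirect fields above can be cross-checked with a few lines of Python. This is a minimal sketch using only the standard library. The constant names are illustrative, and the values are copied from the scan record rather than fetched live. It covers only the host, port, and IP checks. Deriving the "Redirect Base" value would need a public suffix list, which is not attempted here.

```python
import ipaddress
from urllib.parse import urlsplit

# Values copied from the scan record above (not fetched live).
REDIRECT_URL = "https://www.propublica.org:443/robots.txt"
REDIRECT_IPS = [
    "104.16.251.51",
    "104.16.252.51",
    "2606:4700::6810:fb33",
    "2606:4700::6810:fc33",
]
RESPONSE_IP = "104.16.252.51"

# Split the redirect target into host and port.
parts = urlsplit(REDIRECT_URL)
redirect_domain = parts.hostname
redirect_port = parts.port

# Parse the addresses so IPv4 and IPv6 entries compare by value, not by string.
redirect_addrs = {ipaddress.ip_address(ip) for ip in REDIRECT_IPS}
response_in_redirect_set = ipaddress.ip_address(RESPONSE_IP) in redirect_addrs
ipv4_count = sum(1 for addr in redirect_addrs if addr.version == 4)
ipv6_count = sum(1 for addr in redirect_addrs if addr.version == 6)

print(redirect_domain, redirect_port, response_in_redirect_set, ipv4_count, ipv6_count)
```

With the recorded values, the host matches the "Redirect Domain" field and the response IP is one of the listed redirect IPs.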

Groups

*

Rule Path
Disallow /cpresources/
Disallow /vendor/
Disallow /.env
Disallow /cache/
Disallow /static/projects/investigating-digital-advertising/

Other Records

Field Value
sitemap https://www.propublica.org/sitemap.xml

Comments

  • robots.txt for https://www.propublica.org/
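
To illustrate how a crawler might apply the group, sitemap record, and comment listed above, here is a hedged sketch using Python's standard `urllib.robotparser`. The `ROBOTS_TXT` text is a reconstruction from the scan fields, so the exact layout of the real file is an assumption. The user agent `ExampleBot` and the tested URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan record; the real file's exact layout is an assumption.
ROBOTS_TXT = """\
# robots.txt for https://www.propublica.org/
User-agent: *
Disallow: /cpresources/
Disallow: /vendor/
Disallow: /.env
Disallow: /cache/
Disallow: /static/projects/investigating-digital-advertising/

Sitemap: https://www.propublica.org/sitemap.xml
"""


def build_parser(text):
    """Parse robots.txt text in memory, without any network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
base = "https://www.propublica.org"

# Hypothetical URLs: one under a disallowed prefix, one outside every rule.
blocked = parser.can_fetch("ExampleBot", base + "/vendor/autoload.php")
allowed = parser.can_fetch("ExampleBot", base + "/article/example-story")
sitemaps = parser.site_maps()  # available in Python 3.8+

print(blocked, allowed, sitemaps)
```

Because the only group targets `*`, every user agent gets the same answer. Matching here is by path prefix, so any URL under `/vendor/` is refused while paths outside the five listed prefixes are permitted.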