stlpublicradio.org
robots.txt

Robots Exclusion Standard data for stlpublicradio.org

Resource Scan

Scan Details

Site Domain stlpublicradio.org
Base Domain stlpublicradio.org
Scan Status Ok
Last Scan 2024-05-07T20:07:35+00:00
Next Scan 2024-06-06T20:07:35+00:00

Last Scan

Scanned 2024-05-07T20:07:35+00:00
URL https://stlpublicradio.org/robots.txt
Redirect https://www.stlpr.org/robots.txt
Redirect Domain www.stlpr.org
Redirect Base stlpr.org
Domain IPs 44.230.85.241, 52.33.207.7
Redirect IPs 18.155.68.129, 18.155.68.16, 18.155.68.57, 18.155.68.70
Response IP 18.155.68.129
Found Yes
Hash 013d5c037cbcaf29ddf52e412461cece8b2dfd2c6fb33fb40fa394689ea0e260
SimHash 205c9bd40532

Groups

*

Rule Path
Disallow (empty)

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

google-inspectiontool

Rule Path
Allow /

google-image

Rule Path
Allow /

google-video

Rule Path
Allow /

googlebot

Rule Path
Allow /
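
As a rough illustration of how the groups above resolve for a given crawler, the sketch below feeds a reconstruction of the rules into Python's standard `urllib.robotparser`. The `ROBOTS_TXT` text is an assumption rebuilt from the scanned groups (the scan lowercases user-agent tokens, and the live file may differ in order, case, and comments). The path `/news/example-story` and the agent name `ExampleBot` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of the scanned groups; not a copy of the live file.
ROBOTS_TXT = """\
User-agent: *
Disallow:

User-agent: gptbot
Disallow: /

User-agent: chatgpt-user
Disallow: /

User-agent: google-extended
Disallow: /

User-agent: google-inspectiontool
Allow: /

User-agent: google-image
Allow: /

User-agent: google-video
Allow: /

User-agent: googlebot
Allow: /
"""

rules = RobotFileParser()
rules.parse(ROBOTS_TXT.splitlines())

# Hypothetical URL used only to exercise the rules.
TEST_URL = "https://www.stlpr.org/news/example-story"


def is_allowed(agent: str) -> bool:
    """Return True if the reconstructed rules let `agent` fetch TEST_URL."""
    return rules.can_fetch(agent, TEST_URL)


if __name__ == "__main__":
    for agent in ("GPTBot", "ChatGPT-User", "Google-Extended",
                  "Googlebot", "ExampleBot"):
        verdict = "allowed" if is_allowed(agent) else "blocked"
        print(f"{agent:16} {verdict}")
```

Note that `urllib.robotparser` matches user-agent tokens case-insensitively and falls back to the `*` group when no named group applies, so a crawler not listed here is governed by the empty `Disallow`, which blocks nothing.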

Other Records

Field Value
sitemap https://www.stlpr.org/sitemap.xml
sitemap https://www.stlpr.org/sitemap-latest.xml
sitemap https://www.stlpr.org/news-sitemap-content.xml
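
The sitemap records above can also be read programmatically. This is a minimal sketch, assuming the records appear in the file as ordinary `Sitemap:` lines. It uses `RobotFileParser.site_maps()` from the Python standard library (available in Python 3.8 and later); `SITEMAP_RECORDS` is a reconstruction, not the live file.

```python
from urllib.robotparser import RobotFileParser

# Assumed form of the three sitemap records listed in the scan.
SITEMAP_RECORDS = """\
Sitemap: https://www.stlpr.org/sitemap.xml
Sitemap: https://www.stlpr.org/sitemap-latest.xml
Sitemap: https://www.stlpr.org/news-sitemap-content.xml
"""

sitemap_parser = RobotFileParser()
sitemap_parser.parse(SITEMAP_RECORDS.splitlines())

# site_maps() returns None when no Sitemap records are present.
sitemaps = sitemap_parser.site_maps() or []

if __name__ == "__main__":
    for url in sitemaps:
        print(url)
```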

Comments

  • Disallowing the OpenAI web crawler
  • Disallowing OpenAI plugins
  • Disallowing Google Bard and Vertex AI web crawlers
  • Allow Google Search Console for sitemap crawling