provokemedia.com
robots.txt

Robots Exclusion Standard data for provokemedia.com

Resource Scan

Scan Details

Site Domain provokemedia.com
Base Domain provokemedia.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2025-11-28T04:46:01+00:00
Next Scan 2026-02-26T04:46:01+00:00

Last Successful Scan

Scanned 2025-04-09T02:51:06+00:00
URL https://provokemedia.com/robots.txt
Redirect https://www.provokemedia.com/robots.txt
Redirect Domain www.provokemedia.com
Redirect Base provokemedia.com
Domain IPs 104.26.8.48, 104.26.9.48, 172.67.72.67, 2606:4700:20::681a:830, 2606:4700:20::681a:930, 2606:4700:20::ac43:4843
Redirect IPs 104.26.8.48, 104.26.9.48, 172.67.72.67, 2606:4700:20::681a:830, 2606:4700:20::681a:930, 2606:4700:20::ac43:4843
Response IP 104.26.8.48
Found Yes
Hash 3a915e978cf29ce26736233f0d293ebf6467cbf08c2bf7090e975b3780136d8f
SimHash 711dd870ef53

Groups

ahrefsbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 5

*

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 2

amazonbot

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /
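The groups listed so far can be checked with Python's standard-library urllib.robotparser. The sketch below is illustrative only: the ROBOTS_TXT text is an assumed reconstruction from the records above (the report does not show the original file's layout or the casing of the agent names), and ExampleBot is a hypothetical crawler name used to exercise the "*" group.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of the groups reported above; the real file's
# layout, ordering, and agent-name casing are not shown in the report.
ROBOTS_TXT = """\
User-agent: AhrefsBot
Crawl-delay: 5

User-agent: *
Crawl-delay: 2

User-agent: Amazonbot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: MJ12bot
Disallow: /
"""


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)
HOME = "https://www.provokemedia.com/"

# Named bots with "Disallow: /" are blocked everywhere.
blocked = [bot for bot in ("Amazonbot", "ClaudeBot", "MJ12bot")
           if not rules.can_fetch(bot, HOME)]

# Groups with no path rules allow everything but may carry a crawl delay.
ahrefs_delay = rules.crawl_delay("AhrefsBot")
default_delay = rules.crawl_delay("ExampleBot")  # hypothetical bot, falls back to "*"
```

Note that urllib.robotparser matches agent names case-insensitively and uses only the first "*" group it encounters, so it is not a good fit for the wildcard path rules that appear in the later "*" groups.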

*

Rule Path
Allow /

*

Rule Path
Disallow /*?utm_source*

*

Rule Path
Disallow /RestApi*

*

Rule Path
Disallow /rest-api*

Other Records

Field Value
sitemap https://www.provokemedia.com/sitemap.xml
sitemap https://www.provokemedia.com/gnewssitemap.xml
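The path rules in the "*" groups use the "*" wildcard, so a plain prefix comparison is not enough to evaluate them. The following is a minimal sketch of longest-match evaluation in the style of RFC 9309, assuming the separate "*" groups are merged into a single rule list; STAR_RULES, pattern_to_regex, and is_allowed are names invented for this illustration, and a given crawler may not evaluate the file this way.

```python
import re

# Path rules reported for the "*" groups, merged into one list
# (merging same-agent groups is an assumption based on RFC 9309).
STAR_RULES = [
    ("allow", "/"),
    ("disallow", "/*?utm_source*"),
    ("disallow", "/RestApi*"),
    ("disallow", "/rest-api*"),
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern: '*' matches any run of
    characters, and a trailing '$' anchors the end of the path."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path: str, rules=STAR_RULES) -> bool:
    """Return True if the longest matching rule is an allow rule.
    With no matching rule the path is allowed; on a length tie,
    allow wins."""
    best_len, allowed = -1, True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len, allowed = length, kind == "allow"
    return allowed
```

Under these assumptions, is_allowed("/some/article") is allowed by the "/" rule, while is_allowed("/some/article?utm_source=feed") and is_allowed("/RestApi/items") are blocked by the longer disallow patterns. Matching is case-sensitive, which is presumably why the file lists both /RestApi* and /rest-api*.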

Comments

  • www.robotstxt.org/
  • http://code.google.com/web/controlcrawlindex/