wpmaven.net
robots.txt

Robots Exclusion Standard data for wpmaven.net

Resource Scan

Scan Details

Site Domain wpmaven.net
Base Domain wpmaven.net
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2025-11-30T12:12:07+00:00
Next Scan 2025-12-07T12:12:07+00:00

Last Successful Scan

Scanned 2025-10-28T21:57:04+00:00
URL https://wpmaven.net/robots.txt
Domain IPs 104.21.77.141, 172.67.208.234, 2606:4700:3032::ac43:d0ea, 2606:4700:3036::6815:4d8d
Response IP 104.21.77.141
Found Yes
Hash 54882d604807d707f7fd2bc0ab60118086cf590aea006b683c94da00a8b26558
SimHash 19114912e6de
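
The Hash value above is 64 hexadecimal characters, which is the length of a SHA-256 digest. The report does not say which algorithm or input the scanner uses, so the sketch below is only an assumption: it treats the field as SHA-256 over the raw response body and shows how a locally fetched copy could be compared against it. The names `sha256_hex` and `matches_report` are hypothetical helpers, not part of any scanner API.

```python
import hashlib

# Value of the "Hash" field in the report above.
REPORTED_HASH = "54882d604807d707f7fd2bc0ab60118086cf590aea006b683c94da00a8b26558"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the raw bytes as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_report(body: bytes, reported: str = REPORTED_HASH) -> bool:
    """Compare a locally fetched body with the reported hash.

    Assumption: the scanner hashes the unmodified response bytes with
    SHA-256. If it normalizes the text first, this check will not match.
    """
    return sha256_hex(body) == reported.lower()
```

A mismatch would not prove the file changed; it may only mean the scanner computes its hash differently.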

Groups

*

Rule Path
Disallow /cgi-bin/
Disallow /report/
Disallow /wp-content-FSMS/
Disallow /keyword-counter/
Disallow /keyword-counter1/
Disallow /keyword-counter2/
Disallow /keyword-counter3/
Disallow /keyword-counter4/
Allow /wp-content/templates/%20-%20assets/
Allow /wp-content/templates/blogs%20-%20assets/
Disallow /wp-content/
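
The effect of listing the Allow rules before the broader Disallow can be checked with a short Python sketch. The robots.txt text below is reconstructed from a subset of the rules in this report, so its exact wording is an assumption, as are the helper name `allowed` and the agent name `SomeOtherBot`. Python's standard-library `urllib.robotparser` applies the first matching rule in file order, whereas RFC 9309 specifies that the longest matching path wins; for these rules both approaches give the same answer. A crawler that has its own group (googlebot here) follows that group instead of the * group.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the groups listed in this report (subset only).
# The original file's exact text is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /report/
Allow: /wp-content/templates/blogs%20-%20assets/
Disallow: /wp-content/

User-agent: googlebot
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` on wpmaven.net."""
    return parser.can_fetch(agent, "https://wpmaven.net" + path)


if __name__ == "__main__":
    checks = [
        ("SomeOtherBot", "/cgi-bin/test"),
        ("SomeOtherBot", "/wp-content/plugins/example.js"),
        ("SomeOtherBot", "/wp-content/templates/blogs%20-%20assets/style.css"),
        ("SomeOtherBot", "/about/"),
        ("googlebot", "/wp-content/plugins/example.js"),
    ]
    for agent, path in checks:
        print(agent, path, allowed(agent, path))
```

In this sketch an unlisted crawler is blocked from /wp-content/ except for the explicitly allowed assets directory, while googlebot, which matches its own "Allow /" group, is not restricted by the * rules.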

bingbot

Rule Path
Allow /

googlebot

Rule Path
Allow /

archive.org_bot

Rule Path
Allow /

applebot

Rule Path
Allow /

oai-searchbot

Rule Path
Allow /

gptbot

Rule Path
Allow /

perplexitybot

Rule Path
Allow /

bytespider

Rule Path
Allow /

claudebot

Rule Path
Allow /

claude-searchbot

Rule Path
Allow /

claude-user

Rule Path
Allow /

facebookbot

Rule Path
Allow /

meta-externalagent

Rule Path
Allow /

meta-externalfetcher

Rule Path
Allow /

google-cloudvertexbot

Rule Path
Allow /

petalbot

Rule Path
Allow /

chatgpt-user

Rule Path
Allow /

duckassistbot

Rule Path
Allow /

mistralai-user

Rule Path
Allow /

perplexity-user

Rule Path
Allow /

Comments

  • Rules for /wp-content/ - Allow specific subdirectories first
  • Then block the rest of the /wp-content/ directory
  • Specifically allow the following bots (they can access all allowed content):
  • Microsoft
  • Google
  • Internet Archive
  • Apple
  • OpenAI
  • Perplexity
  • ByteDance
  • Anthropic
  • Meta (Facebook)
  • Google Cloud
  • Huawei
  • Additional bots