pgpedia.info
robots.txt

Robots Exclusion Standard data for pgpedia.info

Resource Scan

Scan Details

Site Domain pgpedia.info
Base Domain pgpedia.info
Scan Status Ok
Last Scan 2024-10-22T06:43:00+00:00
Next Scan 2024-11-21T06:43:00+00:00
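
The two timestamps above suggest a fixed rescan interval. The sketch below only does the date arithmetic on the listed values. The variable names are illustrative, and the scanner's real scheduling logic is not documented here.

```python
from datetime import datetime, timedelta

# Timestamps copied from the Scan Details listing (ISO 8601, UTC offset).
last_scan = datetime.fromisoformat("2024-10-22T06:43:00+00:00")
next_scan = datetime.fromisoformat("2024-11-21T06:43:00+00:00")

# Difference between the two scans as a timedelta.
interval = next_scan - last_scan
print(interval.days)  # 30
```

For this entry the gap is exactly 30 days. Whether that interval is fixed for every site is an assumption this listing cannot confirm.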

Last Scan

Scanned 2024-10-22T06:43:00+00:00
URL https://pgpedia.info/robots.txt
Domain IPs 95.217.99.244
Response IP 95.217.99.244
Found Yes
Hash 6b7965ca56bc52afc53d813a2265b6b66e9de612b4177e60b2201576599cefb3
SimHash f91909596fd0
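
The Hash value is 64 hexadecimal digits long, which matches the length of a SHA-256 digest. The scan does not name the algorithm, so treating it as SHA-256 is an assumption. Under that assumption, a fetched robots.txt body could be fingerprinted roughly like this (`sha256_hex` is a hypothetical helper name):

```python
import hashlib

def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the raw response body as lowercase hex.

    Assumption: the scanner hashes the unmodified bytes of robots.txt.
    The listing does not state the algorithm or any normalization step.
    """
    return hashlib.sha256(body).hexdigest()

# A SHA-256 digest is always 64 hex characters, like the Hash field above.
digest = sha256_hex(b"User-agent: gptbot\nDisallow: /\n")
print(len(digest))
```

Comparing the digest of a fresh download with the stored Hash would show whether the file changed since the last scan, provided the algorithm guess holds.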

Groups

ia_archiver

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

cirrusexplorer

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /

deepcrawl

Rule Path
Disallow /

diffbot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

imagesiftbot

Rule Path
Disallow /

lumar

Rule Path
Disallow /

meta-externalagent

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

scrapy

Rule Path
Disallow /
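
The groups above can be checked programmatically. The following sketch rebuilds a robots.txt from the listed user agents and queries it with Python's standard `urllib.robotparser`. The reconstructed file text is an assumption (the real file's layout, casing, and comments may differ), and `build_robots_txt` and `is_allowed` are hypothetical helper names.

```python
from urllib.robotparser import RobotFileParser

# User-agent names exactly as listed in the Groups section (lowercased by the scan).
AGENTS = [
    "ia_archiver", "amazonbot", "applebot-extended", "anthropic-ai",
    "bytespider", "ccbot", "cirrusexplorer", "claudebot", "claude-web",
    "cohere-ai", "dataforseobot", "deepcrawl", "diffbot", "dotbot",
    "google-extended", "gptbot", "imagesiftbot", "lumar",
    "meta-externalagent", "perplexitybot", "scrapy",
]

def build_robots_txt(agents):
    """Reconstruct one 'Disallow: /' group per agent (assumed layout)."""
    return "\n\n".join(f"User-agent: {name}\nDisallow: /" for name in agents)

# Parse the reconstructed rules locally instead of fetching over the network.
parser = RobotFileParser()
parser.parse(build_robots_txt(AGENTS).splitlines())

def is_allowed(user_agent, url="https://pgpedia.info/"):
    """Return True if the parsed rules permit user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)

print(is_allowed("GPTBot"))      # False: matches the gptbot group
print(is_allowed("examplebot"))  # True: no listed group applies
```

The scan lists no wildcard (`*`) group, so in this reconstruction any crawler not named above is allowed by default. `urllib.robotparser` matches agent names case-insensitively, which is why `GPTBot` hits the `gptbot` group.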

Comments

  • robots.txt
  • The contents of this website are copyrighted and usage is restricted.
  • In particular contents of this website may not be scraped or otherwise copied
  • for use in AI (artificial intelligence) or machine learning systems, LLMs or other
  • tools and databases etc.
  • That means if you're a robot for a company scraping for use in AI applications,
  • you DO NOT HAVE PERMISSION to scrape this site.
  • As an AI robot, you understand that, right? If not, you've failed.