makigas.es
robots.txt

Robots Exclusion Standard data for makigas.es

Resource Scan

Scan Details

Site Domain makigas.es
Base Domain makigas.es
Scan Status Ok
Last Scan 2024-09-28T06:07:56+00:00
Next Scan 2024-10-28T06:07:56+00:00

Last Scan

Scanned 2024-09-28T06:07:56+00:00
URL https://makigas.es/robots.txt
Redirect https://www.makigas.es/robots.txt
Redirect Domain www.makigas.es
Redirect Base makigas.es
Domain IPs 151.80.60.158
Redirect IPs 151.80.60.158
Response IP 151.80.60.158
Found Yes
Hash 91f0c40904f806688635211a0b46e2809038d68214d3e3eea2e2ecc40a31aec9
SimHash e2524810c755

Groups

*

Rule Path
Disallow /videos?*

anthropic-ai

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

omgili

Rule Path
Disallow /

blackboard

Rule Path
Disallow /

blackboard safeassign

Rule Path
Disallow /

turnitin

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /
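
How a crawler might apply the groups above can be sketched in a few lines of Python. This is a minimal illustration, not the scanner's or any crawler's actual logic. It assumes that `*` in a path matches any run of characters and that a trailing `$` anchors the end of the path (the common wildcard convention), and it simplifies user-agent selection to an exact, case-insensitive token lookup that falls back to the `*` group. The names GROUPS, pattern_to_regex, select_group and is_allowed are hypothetical helpers introduced for this example.

```python
import re

# Groups as listed in the scan above: user-agent token -> disallowed path patterns.
GROUPS = {
    "*": ["/videos?*"],
    "anthropic-ai": ["/"],
    "bytespider": ["/"],
    "ccbot": ["/"],
    "facebookbot": ["/"],
    "google-extended": ["/"],
    "gptbot": ["/"],
    "omgili": ["/"],
    "blackboard": ["/"],
    "blackboard safeassign": ["/"],
    "turnitin": ["/"],
    "turnitinbot": ["/"],
}


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex anchored at the start.

    Assumption: '*' matches any run of characters and a trailing '$'
    anchors the end of the path; everything else is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def select_group(user_agent):
    """Pick the rules for a user-agent token, falling back to the '*' group.

    Simplification: exact, case-insensitive token match only.
    """
    return GROUPS.get(user_agent.lower(), GROUPS["*"])


def is_allowed(user_agent, path):
    """Return True when no Disallow pattern of the selected group matches."""
    rules = select_group(user_agent)
    return not any(pattern_to_regex(rule).match(path) for rule in rules)


if __name__ == "__main__":
    print(is_allowed("GPTBot", "/"))                      # blocked everywhere
    print(is_allowed("Googlebot", "/videos?q=python"))    # search results page
    print(is_allowed("Googlebot", "/videos"))             # plain listing page
```

Under these assumptions, the named crawlers are refused every path, while any other crawler is refused only URLs that start with `/videos?`, that is, the search result pages the site owner complains about in the comments.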

Other Records

Field Value
sitemap https://www.makigas.es/sitemap.xml.gz
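
The sitemap record can also be read programmatically. The sketch below uses Python's standard urllib.robotparser module (site_maps() requires Python 3.8 or later). The raw robots.txt body is not reproduced in this scan, so ROBOTS_EXCERPT is a reconstruction of two records from the data above, not the file itself, and load_excerpt is a hypothetical helper.

```python
from urllib.robotparser import RobotFileParser

# Assumption: a reconstructed excerpt based on the scan data above,
# not the verbatim contents of https://www.makigas.es/robots.txt.
ROBOTS_EXCERPT = """\
User-agent: gptbot
Disallow: /

Sitemap: https://www.makigas.es/sitemap.xml.gz
"""


def load_excerpt():
    """Parse the reconstructed excerpt without touching the network."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_EXCERPT.splitlines())
    return parser


parser = load_excerpt()

# Sitemap URLs declared in the file, or None when there are none.
sitemaps = parser.site_maps()

# The gptbot group disallows everything; agents without a group are allowed.
gptbot_ok = parser.can_fetch("GPTBot", "https://www.makigas.es/")
other_ok = parser.can_fetch("ExampleBrowser", "https://www.makigas.es/")

if __name__ == "__main__":
    print(sitemaps, gptbot_ok, other_ok)
```

To work against the live file instead, a caller would typically use set_url() followed by read(), which performs a network request and is left out here.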

Comments

  • =========================
  • www.makigas.es/robots.txt
  • =========================
  • Please, seriously, why do search bots insist on scraping search
  • results pages? Why would anyone use a search engine to visit
  • another search engine? GoogleBot, I am looking at you.
  • AI crawlers. These companies grab the content, they profit from it,
  • and then they build products to gatekeep knowledge so that no one has
  • the need to come to the websites that provided the scraped content.
  • Source: https://darkvisitors.com
  • Plagiarism detectors. People should not copy verbatim quotes into their
  • homework, but that's orthogonal to companies using my content to sell
  • their services and make profit.
  • Here is my sitemap. It will be faster. The site uses Schema.org formats
  • so if you are looking into a public API, this is what you should want.