beautiful.ai
robots.txt

Robots Exclusion Standard data for beautiful.ai

Resource Scan

Scan Details

Site Domain beautiful.ai
Base Domain beautiful.ai
Scan Status Ok
Last Scan 2024-05-22T04:39:46+00:00
Next Scan 2024-06-21T04:39:46+00:00

Last Scan

Scanned 2024-05-22T04:39:46+00:00
URL https://beautiful.ai/robots.txt
Redirect https://www.beautiful.ai/robots.txt
Redirect Domain www.beautiful.ai
Redirect Base beautiful.ai
Domain IPs 2001:4860:4802:32::15, 2001:4860:4802:34::15, 2001:4860:4802:36::15, 2001:4860:4802:38::15, 216.239.32.21, 216.239.34.21, 216.239.36.21, 216.239.38.21
Redirect IPs 2404:6800:4003:c05::79, 74.125.24.121
Response IP 172.217.194.121
Found Yes
Hash e27ff454129b32a3d0049ee601830ac8b54fe5d5e3748acf55739f6ba644e17e
SimHash bc0d8344e9f1
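
Assuming the 64-hex-character Hash above is a SHA-256 digest of the raw response body (the length matches, but this is an inference, not something the scan states), a minimal change check against the recorded digest might look like this Python sketch:

    import hashlib
    import urllib.request

    # Fetch the live file; urlopen follows the redirect to www.beautiful.ai.
    with urllib.request.urlopen("https://beautiful.ai/robots.txt") as resp:
        body = resp.read()

    # Digest recorded by the 2024-05-22 scan (Hash field above).
    recorded = "e27ff454129b32a3d0049ee601830ac8b54fe5d5e3748acf55739f6ba644e17e"
    print("changed" if hashlib.sha256(body).hexdigest() != recorded else "unchanged")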

Groups

User-agent: *

Rule      Path
Disallow  /player

User-agent: *

Rule      Path
Disallow  /api
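
In the file itself these two groups correspond to the directives below. Whether they appear as one merged User-agent: * block or two separate ones is not recoverable from this view, so this is a minimal reconstruction:

    User-agent: *
    Disallow: /player
    Disallow: /api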

Comments

  • See http://www.robotstxt.org/wc/norobots.html for documentation on how to use the robots.txt file
  • To ban all spiders from the entire site, uncomment the next two lines:
  • User-agent: *
  • Disallow: /
  • By adding the following line to the robots.txt file on www.beautiful.ai, we can prevent search engines from crawling the presentations.
  • This is important because Google is unable to render the actual content of a presentation; all it sees is a blank page with different titles.
  • Also, private presentations could contain sensitive data; we want to make sure they do not get crawled and indexed.
  • This should also push Google and other search engines to crawl other, more valuable pages on Beautiful.ai more frequently.
  • It seems that Google has indexed some api routes, such as /api/user/permissions/:id
  • I'm not sure how this makes sense, but I think we can safely disallow /api
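
As a sanity check, these rules can be replayed through Python's standard-library robots.txt parser. This is an illustrative sketch, not part of the scan data; note that urllib.robotparser honors only the first User-agent: * group it encounters, so the two groups above are merged into one here:

    from urllib.robotparser import RobotFileParser

    rp = RobotFileParser()
    # Merged form of the two "*" groups reported by the scan.
    rp.parse([
        "User-agent: *",
        "Disallow: /player",
        "Disallow: /api",
    ])

    # Hypothetical URLs, for illustration only.
    print(rp.can_fetch("*", "https://www.beautiful.ai/player/some-deck"))  # False
    print(rp.can_fetch("*", "https://www.beautiful.ai/api/user"))          # False
    print(rp.can_fetch("*", "https://www.beautiful.ai/"))                  # True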