bedcrypto.com
robots.txt

Robots Exclusion Standard data for bedcrypto.com

Resource Scan

Scan Details

Site Domain bedcrypto.com
Base Domain bedcrypto.com
Scan Status Ok
Last Scan 2024-11-14T21:20:50+00:00
Next Scan 2024-11-21T21:20:50+00:00

Last Scan

Scanned 2024-11-14T21:20:50+00:00
URL https://bedcrypto.com/robots.txt
Domain IPs 104.21.0.94, 172.67.185.154, 2606:4700:3032::6815:5e, 2606:4700:3032::ac43:b99a
Response IP 104.21.0.94
Found Yes
Hash f5c6565c6e9d8be0f5d6d518b52245d88528b78525df2a16b5349d755933131a
SimHash be147d00c6f8

Groups

*

Rule Path
Allow /misc/*.css$
Allow /misc/*.css?
Allow /misc/*.js$
Allow /misc/*.js?
Allow /misc/*.gif
Allow /misc/*.jpg
Allow /misc/*.jpeg
Allow /misc/*.png
Allow /modules/*.css$
Allow /modules/*.css?
Allow /modules/*.js$
Allow /modules/*.js?
Allow /modules/*.gif
Allow /modules/*.jpg
Allow /modules/*.jpeg
Allow /modules/*.png
Allow /profiles/*.css$
Allow /profiles/*.css?
Allow /profiles/*.js$
Allow /profiles/*.js?
Allow /profiles/*.gif
Allow /profiles/*.jpg
Allow /profiles/*.jpeg
Allow /profiles/*.png
Allow /themes/*.css$
Allow /themes/*.css?
Allow /themes/*.js$
Allow /themes/*.js?
Allow /themes/*.gif
Allow /themes/*.jpg
Allow /themes/*.jpeg
Allow /themes/*.png
Disallow /includes/
Disallow /misc/
Disallow /modules/
Disallow /profiles/
Disallow /scripts/
Disallow /themes/
Disallow /CHANGELOG.txt
Disallow /cron.php
Disallow /INSTALL.mysql.txt
Disallow /INSTALL.pgsql.txt
Disallow /INSTALL.sqlite.txt
Disallow /install.php
Disallow /INSTALL.txt
Disallow /LICENSE.txt
Disallow /MAINTAINERS.txt
Disallow /update.php
Disallow /UPGRADE.txt
Disallow /xmlrpc.php
Disallow /wp-login.php
Disallow %5E.*%5C/wp-includes%5C/wlwmanifest.xml
Disallow /admin/
Disallow /comment/reply/
Disallow /filter/tips/
Disallow /node/add/
Disallow /search/
Disallow /user/register/
Disallow /user/password/
Disallow /user/login/
Disallow /user/logout/
Disallow /wp-json/wp/v2/users/1
Disallow /?q=admin%2F
Disallow /?q=comment%2Freply%2F
Disallow /?q=filter%2Ftips%2F
Disallow /?q=node%2Fadd%2F
Disallow /?q=search%2F
Disallow /?q=user%2Fpassword%2F
Disallow /?q=user%2Fregister%2F
Disallow /?q=user%2Flogin%2F
Disallow /?q=user%2Flogout%2F
Disallow /*?
Allow /*?page=
Disallow /*?page=*&*
Disallow /*?page=0*
Disallow /resources/search/*/*/*
Disallow /*/resources/search/*/*/*

Other Records

Field Value
crawl-delay 10
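
The query-string rules at the end of this group (`/*?`, `/*?page=`, `/*?page=*&*`, `/*?page=0*`) only make sense under longest-match precedence. The sketch below is one way to evaluate them, assuming the crawler follows the RFC 9309 convention that the longest matching pattern wins and Allow wins a tie; not every crawler does. The helper names `pattern_to_regex` and `is_allowed` and the sample paths are illustrative and do not come from the scanned file.

```python
import re

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_allowed(rules, path: str) -> bool:
    """Return True if `path` may be crawled under `rules`.

    `rules` is a list of (kind, pattern) pairs, kind being "allow" or
    "disallow". Assumed semantics: longest matching pattern wins, Allow
    wins ties, and a path that matches nothing is allowed.
    """
    best_len, allowed = -1, True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            is_allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed

# The four query-string rules from the "*" group above.
QUERY_RULES = [
    ("disallow", "/*?"),
    ("allow", "/*?page="),
    ("disallow", "/*?page=*&*"),
    ("disallow", "/*?page=0*"),
]

# Hypothetical paths, chosen only to exercise each rule.
for sample in ["/news", "/news?sort=asc", "/news?page=2",
               "/news?page=2&sort=asc", "/news?page=0"]:
    print(sample, is_allowed(QUERY_RULES, sample))
```

Under these assumptions a plain pager URL such as `?page=2` stays crawlable, while any other query string, a pager combined with further parameters, or a page value starting with 0 is blocked.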

a6-indexer

Rule Path
Disallow /

alphaseobot

Rule Path
Disallow /

alphaseobot-sa

Rule Path
Disallow /

applebot

Rule Path
Disallow /

aspiegelbot

Rule Path
Disallow /

barkrowler

Rule Path
Disallow /

bingbot/2.0

Rule Path
Disallow /

blackboard safeassign

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

crawler4j

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

gigabot

Rule Path
Disallow /

liebaofast

Rule Path
Disallow /

mauibot

Rule Path
Disallow /

mauibot (crawler.feedback+wc@gmail.com)

Rule Path
Disallow /

megaindex.ru/2.0

Rule Path
Disallow /

mqqbrowser

Rule Path
Disallow /

nimbostratus-bot/v1.3.2

Rule Path
Disallow /

qwant-news

Rule Path
Disallow /

qwantify

Rule Path
Disallow /

seekport crawler

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

semrushbot-sa

Rule Path
Disallow /

seznambot

Rule Path
Disallow /

sputnikbot/2.3

Rule Path
Disallow /

the knowledge ai

Rule Path
Disallow /

timpibot/0.8

Rule Path
Disallow /

tinytestbot

Rule Path
Disallow /

ucbrowser

Rule Path
Disallow /

yacybot

Rule Path
Disallow /

yandexbot

Rule Path
Disallow /

yandexbot/3.0

Rule Path
Disallow /

yeti

Rule Path
Disallow /

yisouspider

Rule Path
Disallow /

Other Records

Field Value
sitemap https://bedcrypto.com/map/map_0.txt
sitemap https://bedcrypto.com/map/map_1000.txt
sitemap https://bedcrypto.com/map/map_10000.txt
sitemap https://bedcrypto.com/map/map_11000.txt
sitemap https://bedcrypto.com/map/map_12000.txt
sitemap https://bedcrypto.com/map/map_2000.txt
sitemap https://bedcrypto.com/map/map_3000.txt
sitemap https://bedcrypto.com/map/map_4000.txt
sitemap https://bedcrypto.com/map/map_5000.txt
sitemap https://bedcrypto.com/map/map_6000.txt
sitemap https://bedcrypto.com/map/map_7000.txt
sitemap https://bedcrypto.com/map/map_8000.txt
sitemap https://bedcrypto.com/map/map_9000.txt
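
As a rough illustration of how a client might read the per-agent groups, the crawl-delay, and the sitemap records listed above, the sketch below feeds a shortened, hand-written excerpt to Python's standard `urllib.robotparser`. The excerpt is an assumption reconstructed from the tables in this report, not the raw file, and `examplebot` is a made-up agent name. Note that `urllib.robotparser` does plain prefix matching and does not implement the `*` and `$` wildcards, so only non-wildcard rules are included.

```python
from urllib.robotparser import RobotFileParser

# Shortened excerpt reconstructed from the report (assumption: the raw
# file uses the usual "Field: value" syntax with blank lines between groups).
SAMPLE = """\
User-agent: *
Disallow: /admin/
Disallow: /search/
Crawl-delay: 10

User-agent: yandexbot
Disallow: /

User-agent: semrushbot
Disallow: /

Sitemap: https://bedcrypto.com/map/map_0.txt
Sitemap: https://bedcrypto.com/map/map_1000.txt
"""

parser = RobotFileParser()
parser.parse(SAMPLE.splitlines())

# A named agent with "Disallow: /" is blocked everywhere.
yandex_ok = parser.can_fetch("yandexbot/3.0", "https://bedcrypto.com/")

# An agent without its own group falls back to the "*" group.
admin_ok = parser.can_fetch("examplebot", "https://bedcrypto.com/admin/")
about_ok = parser.can_fetch("examplebot", "https://bedcrypto.com/about")

delay = parser.crawl_delay("examplebot")   # crawl-delay from the "*" group
sitemaps = parser.site_maps()              # requires Python 3.8 or newer

print(yandex_ok, admin_ok, about_ok, delay, sitemaps)
```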

Comments

  • robots.txt
  • This file is to prevent the crawling and indexing of certain parts
  • of your site by web crawlers and spiders run by sites like Yahoo!
  • and Google. By telling these "robots" where not to go on your site,
  • you save bandwidth and server resources.
  • This file will be ignored unless it is at the root of your host:
  • Used: http://example.com/robots.txt
  • Ignored: http://example.com/site/robots.txt
  • For more information about the robots.txt standard, see:
  • http://www.robotstxt.org/robotstxt.html
  • CSS, JS, Images
  • Directories
  • Files
  • RH, 06.30.21: these are files that bad bots are likely requesting
  • Paths (clean URLs)
  • RH, 06.30.21: these are paths that bad bots are likely requesting
  • Paths (no clean URLs)
  • RH, 07.01.21: Views has URL parameters from exposed filters (Archives and Publications Search views); https://www.drupal.org/node/345620
  • Disallow all URL variables except for page
  • RH, 04.30.24: crawling of archives search URLs can kill the sites, especially with multiple facets. It might be how bots discover Archives items, so allow single-faceted browse and search URLs but block everything deeper (/resources/search/domain/anthropology is allowed but not /resources/search/domain/anthropology/contributor/maranz-david-e). If bots find Archives items by crawling the site, then this should allow all items to be found via browse URLs and search URLs. If there continue to be issues, we can consider blocking all. Note that /*/resources/search is for multilingual URLs
  • Disallow: /resources/browse/*
  • Disallow: /resources/search/*
  • Block bots
  • RDH, 08.19.19: I really don't want to block Applebot, but for now, I am. It is crawling us too much
  • RDH, 05.13.20: I really don't want to block bing, but for now, I am. It is also already in htaccess rules
  • RDH, 06.30.21: Very temporary to get some relief.
  • User-Agent: Googlebot
  • Disallow: /
  • RDH, 03/11/22:
  • Comment this out for JOT, who applied for a Crossref Similarity Check account with Turnitin;
  • User-agent: TurnitinBot
  • Disallow: /
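
The 04.30.24 comment describes the intent of the two `/resources/search/*/*/*` rules: one facet is crawlable, anything deeper is not. The sketch below checks that reading against the two example paths quoted in the comment, assuming `*` matches any run of characters including `/`. The helper names and the `/fr` language prefix are hypothetical and are used only for illustration.

```python
import re

def to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt pattern; '*' is assumed to match any characters."""
    return re.compile(".*".join(re.escape(part) for part in pattern.split("*")))

# The two search rules from the "*" group.
SEARCH_RULES = [
    "/resources/search/*/*/*",
    "/*/resources/search/*/*/*",
]

def is_blocked(path: str) -> bool:
    """True if any of the search Disallow patterns matches the path."""
    return any(to_regex(rule).match(path) for rule in SEARCH_RULES)

single_facet = "/resources/search/domain/anthropology"
double_facet = "/resources/search/domain/anthropology/contributor/maranz-david-e"

print(is_blocked(single_facet))          # one facet: not matched by either rule
print(is_blocked(double_facet))          # second facet adds the extra segments
print(is_blocked("/fr" + single_facet))  # hypothetical multilingual prefix
print(is_blocked("/fr" + double_facet))
```

Under this translation a trailing slash after the first facet (`/resources/search/domain/anthropology/`) also matches the pattern, so whether such URLs are blocked depends on how a given crawler normalizes paths.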