cheapbooks.com
robots.txt

Robots Exclusion Standard data for cheapbooks.com

Resource Scan

Scan Details

Site Domain cheapbooks.com
Base Domain cheapbooks.com
Scan Status Ok
Last Scan 2024-10-12T02:07:10+00:00
Next Scan 2024-10-19T02:07:10+00:00

Last Scan

Scanned 2024-10-12T02:07:10+00:00
URL http://cheapbooks.com/robots.txt
Domain IPs 143.198.141.174
Response IP 143.198.141.174
Found Yes
Hash 7199d2707d91e7fd05a80e5a36664739854999bdd7d9b1bc93cffbf62210b290
SimHash ca5d5ce24610
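
The Hash value above is 64 hexadecimal characters, which is the length of a SHA-256 digest. The report does not name the algorithm, so SHA-256 is an assumption here. Under that assumption, a fetched copy of robots.txt could be compared against the reported value roughly as follows. The helper names sha256_hex and matches_report are hypothetical.

```python
import hashlib

# Hash value copied from the scan report above.
REPORTED_HASH = "7199d2707d91e7fd05a80e5a36664739854999bdd7d9b1bc93cffbf62210b290"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the raw response body as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_report(body: bytes, reported: str = REPORTED_HASH) -> bool:
    """True if the body hashes to the reported value (assumes SHA-256)."""
    return sha256_hex(body) == reported.lower()
```

A mismatch would not necessarily mean the file changed: the scanner may hash a normalized form of the file or use a different algorithm.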

Groups

*

Rule Path
Disallow /click*.cgi
Disallow /price*.cgi
Disallow /pics/*
Disallow /thumbnails/*

Other Records

Field Value
crawl-delay 1
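
The path rules in the * group rely on the * wildcard. As a minimal sketch, assuming the interpretation described in RFC 9309 (each rule is matched from the start of the URL path, * matches any run of characters, and a trailing $ anchors the end), the four rules can be checked like this. The names pattern_to_regex and is_disallowed are hypothetical helpers, and the sketch ignores Allow rules and longest-match precedence because this group contains only Disallow rules.

```python
import re

# Disallow patterns copied from the "*" group above.
DISALLOW = ["/click*.cgi", "/price*.cgi", "/pics/*", "/thumbnails/*"]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex for prefix matching."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with ".*" for each "*".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str, patterns=DISALLOW) -> bool:
    """True if any pattern matches the path starting at its first character."""
    return any(pattern_to_regex(p).match(path) for p in patterns)


if __name__ == "__main__":
    for path in ["/click123.cgi", "/pics/cover.jpg", "/search.cgi", "/books/click.cgi"]:
        print(path, is_disallowed(path))
```

Python's urllib.robotparser is not used here because it compares rule paths as plain prefixes and does not expand the * wildcard.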

adbeat_bot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

amazonadbot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

clickagy

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

grapeshotcrawler

Rule Path
Disallow /

linguee

Rule Path
Disallow /

linkdexbot

Rule Path
Disallow /

proximic

Rule Path
Disallow /

rytebot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

seokicks

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

wotbox

Rule Path
Disallow /
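
Each named group above blocks one crawler from the whole site, while every other crawler falls back to the * group. A rough sketch of that group selection follows, assuming a crawler is matched when its group name appears, case-insensitively, anywhere in the User-Agent string. Real crawlers match on their own product token, so this substring test is a simplification, and group_for is a hypothetical helper.

```python
# Group names copied from the report above; each one carries "Disallow /".
BLOCKED_AGENTS = [
    "adbeat_bot", "ahrefsbot", "amazonadbot", "blexbot", "clickagy",
    "dotbot", "grapeshotcrawler", "linguee", "linkdexbot", "proximic",
    "rytebot", "semrushbot", "seokicks", "turnitinbot", "wotbox",
]


def group_for(user_agent: str) -> str:
    """Return the name of the group that applies to this User-Agent string."""
    ua = user_agent.lower()
    for name in BLOCKED_AGENTS:
        if name in ua:
            return name
    # No named group matched, so the wildcard group applies.
    return "*"


def is_fully_blocked(user_agent: str) -> bool:
    """True if the crawler falls in one of the 'Disallow /' groups."""
    return group_for(user_agent) != "*"
```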

Other Records

Field Value
sitemap http://cdn.cheapbooks.com/sub/sitemaps/index.xml

Comments

  • $Id: robots.txt,v 1.1 2013/05/19 12:45:03 root Exp $
  • Disallow: /search*.cgi