deals.mostlycoupons.com
robots.txt

Robots Exclusion Standard data for deals.mostlycoupons.com

Resource Scan

Scan Details

Site Domain	deals.mostlycoupons.com
Base Domain	mostlycoupons.com
Scan Status	Failed
Failure Stage	Fetching resource.
Failure Reason	Server returned a server error.
Last Scan	2024-08-22T13:55:33+00:00
Next Scan	2024-11-20T13:55:33+00:00

Last Successful Scan

Scanned	2022-06-29T12:02:05+00:00
URL	https://deals.mostlycoupons.com/robots.txt
Response IP	162.242.223.52
Found	Yes
Hash	0ad5664192b0b12d27353f1a449fe3b55125b07834dfb28a1ecec1aec5e8c17e
SimHash	ad94dc3166f1

Groups

*

No rules defined. All paths allowed.

Other Records

Field	Value
crawl-delay	10
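
The crawl-delay record above can be read programmatically. The following is a minimal sketch, assuming Python's standard-library `urllib.robotparser` and a two-line reconstruction of the `*` group. The scan does not reproduce the file itself, so the text fed to the parser is an assumption based on the reported results, and the URL path is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of the "*" group reported by the scan:
# no path rules, only a crawl-delay of 10 seconds.
ROBOTS_LINES = [
    "User-agent: *",
    "Crawl-delay: 10",
]

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)

# Seconds a polite crawler should wait between requests.
delay = parser.crawl_delay("*")

# With no Disallow rules in the group, every path is allowed.
# "/some/page" is a hypothetical path used only for illustration.
allowed = parser.can_fetch("*", "https://deals.mostlycoupons.com/some/page")

print(delay, allowed)
```

Crawl-delay is a non-standard extension, and some crawlers (Googlebot among them) ignore it.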

googlebot-image

Rule	Path
Disallow
Disallow	/admin/
Disallow	/app/
Disallow	/cgi-bin/
Disallow	/downloader/
Disallow	/errors/
Disallow	/includes/
Disallow	/pkginfo/
Disallow	/shell/
Disallow	/var/
Disallow	/index.php/
Disallow	/catalog/product_compare/
Disallow	/catalog/category/view/
Disallow	/catalog/product/view/
Disallow	/catalogsearch/
Disallow	/checkout/
Disallow	/control/
Disallow	/contacts/
Disallow	/customer/
Disallow	/customize/
Disallow	/newsletter/
Disallow	/poll/
Disallow	/review/
Disallow	/sendfriend/
Disallow	/tag/
Disallow	/wishlist/
Disallow	/catalog/product/gallery/
Disallow	/cron.php
Disallow	/cron.sh
Disallow	/error_log
Disallow	/index.php.sample
Disallow	/install.php
Disallow	/LICENSE.html
Disallow	/LICENSE.txt
Disallow	/LICENSE_AFL.txt
Disallow	/RELEASE_NOTES.txt
Disallow	/.buildpath
Disallow	/.gitignore
Disallow	/.project
Disallow	/*.php$
Disallow	/*?p=*&
Disallow	/*?SID=
Disallow	/*?dir*
Disallow	/*?dir=desc
Disallow	/*?dir=asc
Disallow	/*?limit=all
Disallow	/*?mode*
Disallow	/CVS
Disallow	/*.svn$
Disallow	/*.idea$
Disallow	/*.sql$
Disallow	/*.tgz$
Disallow	/index.php/review/product/list/id/
Disallow	/review/product/list/id/
Disallow	/auto-populate-mail
Disallow	/static-html
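
Several of the paths above use the `*` wildcard and the `$` end-of-URL anchor. The following is a rough sketch of how such patterns could be matched, assuming the common convention that `*` matches any run of characters and a trailing `$` anchors the end of the path. The function names and the sample paths are hypothetical, only a handful of the listed rules are included, and Allow rules and longest-match precedence are deliberately left out.

```python
import re


def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regular expression."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with ".*" for each "*".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(path: str, patterns: list[str]) -> bool:
    """Return True if any non-empty Disallow pattern matches the path."""
    # An empty Disallow value blocks nothing, so it is skipped.
    return any(rule_to_regex(p).match(path) for p in patterns if p)


# A small subset of the Disallow paths from the table above.
RULES = ["", "/admin/", "/*.php$", "/*?SID=", "/*?dir=asc", "/CVS"]

print(is_disallowed("/admin/login", RULES))    # True
print(is_disallowed("/index.php", RULES))      # True
print(is_disallowed("/shoes?SID=abc", RULES))  # True
print(is_disallowed("/deals/today", RULES))    # False
```

Python's `urllib.robotparser` does plain prefix matching and does not interpret these wildcards, which is why the sketch translates patterns to regular expressions instead.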

Other Records

Field	Value
sitemap	https://ihotoffers.com/sitemap.xml

Comments

****************************************************************************
robots.txt
: Robots, spiders, and search engines use this file to determine which
content they should *not* crawl while indexing your website.
: This system is called "The Robots Exclusion Standard."
: It is strongly encouraged to use a robots.txt validator to check
for valid syntax before any robots read it!
Examples:
Instruct all robots to stay out of the admin area.
: User-agent: *
: Disallow: /admin/
Restrict Google and MSN from indexing your images.
: User-agent: Googlebot
: Disallow: /images/
: User-agent: MSNBot
: Disallow: /images/
****************************************************************************
Crawlers Setup
Google Image Crawler Setup
Disallow: /
Allow: /media/catalog/product/
Sitemap
Magento admin page
Directories
Disallow: /js/
Disallow: /lib/
Disallow: /media/
Disallow: /skin/
Paths (clean URLs)
Files
Paths (no clean URLs)
Disallow: /*.js$
Disallow: /*.css$
Sub category pages that are sorted or filtered.
Development files and folders: CVS, svn directories and dump files
