trespass.com
robots.txt

Robots Exclusion Standard data for trespass.com

Resource Scan

Scan Details

Site Domain trespass.com
Base Domain trespass.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2024-09-07T11:52:25+00:00
Next Scan 2024-12-06T11:52:25+00:00

Last Successful Scan

Scanned 2024-04-18T10:25:53+00:00
URL https://trespass.com/robots.txt
Redirect https://www.trespass.com/robots.txt
Redirect Domain www.trespass.com
Redirect Base trespass.com
Domain IPs 107.154.215.106, 107.154.80.106
Redirect IPs 45.60.126.183
Response IP 45.60.126.183
Found Yes
Hash d046fd633f5e5cc0b27ecb8640ba998c59a82d5e4d7da8eaec96cfa4e857415e
SimHash 1954985adca4

Groups

*

Rule Path
Disallow /app/
Disallow /bin/
Disallow /directory/
Disallow /downloader/
Disallow /install/
Disallow /js/
Disallow /lib/
Disallow /phpserver/
Disallow /pkginfo/
Disallow /private/
Disallow /setup/
Disallow /skin/
Disallow /update/
Disallow /var/
Disallow /vendor/
Disallow /admin/
Disallow */api/
Disallow */catalog/category/view/
Disallow */catalog/product/view/
Disallow */catalogsearch
Disallow */checkout
Disallow */contacts
Disallow */customer/
Disallow */newsletter
Disallow */order/
Disallow */poll/
Disallow */report/
Disallow */review/
Disallow */rma/
Disallow */sendfriend
Disallow */storelocator/index/view/id/
Disallow */wishlist
Disallow /cron
Disallow /cron.php
Disallow */cat/
Disallow /*/pricematch/
Disallow /*/colour/
Disallow /*/show/
Disallow /*/size/
Disallow */sort-by/
Disallow /*?*product_list_dir=
Disallow /*?*product_list_limit=
Disallow /*?*product_list_mode=
Disallow /*?*product_list_order=
Disallow /*.php
Disallow /*.sh$
Disallow /*.CSV$
Disallow /*.csv$
Disallow /*.gitignore$
Disallow /*.sample$
Disallow /*.sql$
Disallow /*.zip$
Disallow /*%26carrier%3D
Disallow /*%26cat%3D
Disallow /*%26color_filter%3D
Disallow /*%26display_size%3D
Disallow /*%26name%3D
Disallow /*%26network%3D
Disallow /*%26os%3D
Disallow /*%26price%3D
Disallow /*%26timestamp%3D
Disallow /*%26top_seller%3D
Disallow /*?SID=
Disallow /*?___SID=
Disallow /*?___store=
Disallow /*?carrier=
Disallow /*?cat=
Disallow /*?color_filter=
Disallow /*?display_size=
Disallow /*?name=
Disallow /*?network=
Disallow /*?os=
Disallow /*?price=
Disallow /*?timestamp=
Disallow /*?top_seller=
Disallow /*?size_legacy=*
Disallow /*?base_color=*
Disallow /*?manufacturer=*
Disallow /*?insulation_type=*
Disallow /*?gender=*
Disallow /*?p=2&size_legacy=*
Disallow *?size_legacy=*
Disallow *?base_color=*
Disallow *?manufacturer=*
Disallow *?insulation_type=*
Disallow *?gender=*
Disallow *?p=2&size_legacy=*
Disallow /?adnetwork=
Disallow /%26adnetwork%3D
Disallow /?affc=
Disallow /%26affc%3D
Disallow /?cto_pld=
Disallow /%26cto_pld%3D
Allow /*?p=
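
The rules above rely on two wildcard conventions: `*` matches any run of characters, and a trailing `$` anchors the pattern to the end of the URL. The sketch below shows one way such patterns might be evaluated, assuming the common "longest matching pattern wins, Allow wins ties" precedence. The helper names (`pattern_to_regex`, `is_allowed`) and the small `RULES` subset are illustrative assumptions, not part of the scanned file or of any particular crawler.

```python
import re

# Hypothetical subset of the "*" group listed above, as (kind, pattern) pairs.
RULES = [
    ("disallow", "/admin/"),
    ("disallow", "*/checkout"),
    ("disallow", "/*.csv$"),
    ("disallow", "/*?price="),
    ("allow", "/*?p="),
]

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex."""
    # A trailing '$' anchors the match to the end of the URL.
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # '*' matches any run of characters; everything else is literal.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_allowed(path: str, rules=RULES) -> bool:
    """Return True if `path` may be crawled under `rules`.

    Assumed precedence: the longest matching pattern wins, and an
    Allow rule wins over a Disallow rule of the same length.
    """
    best_len = -1
    best_allow = True  # no matching rule means the path is allowed
    for kind, pattern in rules:
        # re.match anchors at the start of the path, as robots.txt does.
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            longer = len(pattern) > best_len
            tie_for_allow = len(pattern) == best_len and allow
            if longer or tie_for_allow:
                best_len = len(pattern)
                best_allow = allow
    return best_allow
```

With this subset, a paginated listing such as `/jackets?p=2` stays crawlable through the Allow rule, while `/jackets?price=10` is blocked by `/*?price=`.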

Other Records

Field Value
crawl-delay 10

ahrefsbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 20
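
The two crawl-delay values (10 for the `*` group, 20 for ahrefsbot) can be read back with Python's standard `urllib.robotparser`. This is only a sketch: `ROBOTS_EXCERPT` is a hand-written reconstruction of a few lines, not the fetched file, and `delay_for` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Hand-written excerpt for illustration; the real file is much longer.
ROBOTS_EXCERPT = """\
User-agent: *
Crawl-delay: 10
Disallow: /admin/

User-agent: AhrefsBot
Crawl-delay: 20
"""

parser = RobotFileParser()
parser.parse(ROBOTS_EXCERPT.splitlines())

def delay_for(user_agent: str):
    """Return the crawl delay in seconds for `user_agent`, or None if unset."""
    return parser.crawl_delay(user_agent)

if __name__ == "__main__":
    for agent in ("AhrefsBot", "SomeOtherBot"):
        print(agent, delay_for(agent))
```

Because the ahrefsbot group carries no Disallow rules of its own, this parser treats every path as allowed for it, which matches the "No rules defined. All paths allowed." note above.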

amazonbot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

mauibot

Rule Path
Disallow /

megaindex.ru

Rule Path
Disallow /

nutch

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

seekbot

Rule Path
Disallow /

sogou web spider

Rule Path
Disallow /

sogou inst spider

Rule Path
Disallow /

yandex

Rule Path
Disallow /

yandexbot

Rule Path
Disallow /

adsbot

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

baiduspider-image

Rule Path
Disallow /

baiduspider-mobile

Rule Path
Disallow /

baiduspider-news

Rule Path
Disallow /

baiduspider-video

Rule Path
Disallow /

barkrowler

Rule Path
Disallow /

botify

Rule Path
Disallow /

ia_archiver

Rule Path
Disallow /

idealo-bot

Rule Path
Disallow /

megaindex.com

Rule Path
Disallow /

newspaper/0.2.8

Rule Path
Disallow /

Other Records

Field Value
sitemap https://www.trespass.com/sitemap.xml

Comments

  • Crawlers Setup
  • Directories
  • Paths (clean URLs)
  • Files
  • Do not index pages that are sorted or filtered.
  • Clean URLs - Suffixes
  • Disallow: /*?
  • CVS, SVN directory and dump files
  • Parameters
  • Disallow: /*?PageSpeed=noscript
  • Allow pagination
  • Few bots obey the crawl delay but we will set it to 10
  • Google ignores this so it does not matter that it is universally set
  • Bad bots
  • Ahrefs is > 5% of traffic and can be disallowed on Black Friday. 20 seconds crawl delay specified for normal days
  • Amazon bot for Alexa does not respect crawl rate and is not behaving on N

Warnings

  • 2 invalid lines.