trailspace.com
robots.txt

Robots Exclusion Standard data for trailspace.com

Resource Scan

Scan Details

Site Domain trailspace.com
Base Domain trailspace.com
Scan Status Ok
Last Scan 2024-09-20T23:41:53+00:00
Next Scan 2024-09-27T23:41:53+00:00

Last Scan

Scanned 2024-09-20T23:41:53+00:00
URL https://trailspace.com/robots.txt
Redirect https://www.trailspace.com/robots.txt
Redirect Domain www.trailspace.com
Redirect Base trailspace.com
Domain IPs 54.86.140.68
Redirect IPs 54.86.140.68
Response IP 54.86.140.68
Found Yes
Hash 9290b3d18f3848b34aa4385f69cc7ae0084405e9ba974e00b046299e8133583d
SimHash 02365bd34453
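
The Hash value above is 64 hexadecimal characters, which is the length of a SHA-256 digest. The report does not name the algorithm, so the sketch below only assumes SHA-256 over the raw response body; `content_fingerprint` is a hypothetical helper, not part of the scanning service. It illustrates how a change between two scans could be detected by comparing digests.

```python
import hashlib


def content_fingerprint(body: bytes) -> str:
    """Return a hex digest of the robots.txt body.

    Assumption: SHA-256 over the raw bytes. The scan report does not
    state which algorithm produced its Hash field.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_digest: str, body: bytes) -> bool:
    """Compare a stored digest against a freshly fetched body."""
    return content_fingerprint(body) != previous_digest


# Illustrative input only; this is not the real file content.
sample = b"User-agent: *\nDisallow: /out/\n"
digest = content_fingerprint(sample)
print(len(digest), has_changed(digest, sample))
```

An exact digest like this flags any byte-level change, while the SimHash field is presumably meant to capture near-duplicate similarity; that second fingerprint is not reproduced here.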

Groups

becomebot

Rule Path
Disallow /

voyager

Rule Path
Disallow /

exabot

Rule Path
Disallow /

geniebot

Rule Path
Disallow /

ichiro

Rule Path
Disallow /

accoona-ai-agent

Rule Path
Disallow /

psbot

Rule Path
Disallow /

spiderman

Rule Path
Disallow /

converacrawler

Rule Path
Disallow /

nutch

Rule Path
Disallow /

http://www.almaden.ibm.com/cs/crawler

Rule Path
Disallow /

boitho.com-dc

Rule Path
Disallow /

nextgensearchbot

Rule Path
Disallow /

sensis web crawler

Rule Path
Disallow /

voilabot

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

sdcresearchlabs-testbot

Rule Path
Disallow /

shopwiki

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

fatbot

Rule Path
Disallow /

tmcrawler

Rule Path
Disallow /

sogou spider

Rule Path
Disallow /

snapbot

Rule Path
Disallow /

naverbot

Rule Path
Disallow /

sproose

Rule Path
Disallow /

shim-crawler

Rule Path
Disallow /

omniexplorer_bot

Rule Path
Disallow /

*

Rule Path
Disallow /brands?*
Disallow /out/
Disallow /search/
Disallow /site-utilities/
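
The groups listed above can be checked programmatically. The following is a minimal sketch using Python's standard `urllib.robotparser`. The `ROBOTS_LINES` text is reconstructed from this listing (two of the named groups plus the `*` group), not copied from the live file, and `ExampleBot` is a hypothetical user agent. The `/brands?*` rule is left out on purpose: the standard-library parser does plain prefix matching and does not interpret `*` as a wildcard, so its result for that rule would not reflect how wildcard-aware crawlers treat it.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan listing above; NOT the verbatim robots.txt.
ROBOTS_LINES = [
    "User-agent: baiduspider",
    "Disallow: /",
    "",
    "User-agent: mj12bot",
    "Disallow: /",
    "",
    "User-agent: *",
    "Disallow: /out/",
    "Disallow: /search/",
    "Disallow: /site-utilities/",
]


def build_parser(lines=ROBOTS_LINES):
    """Parse robots.txt lines without making a network request."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


parser = build_parser()
base = "https://www.trailspace.com"

# "ExampleBot" is a made-up agent that falls through to the "*" group.
checks = [
    ("baiduspider", "/"),
    ("mj12bot", "/search/"),
    ("ExampleBot", "/out/some-link"),
    ("ExampleBot", "/search/"),
    ("ExampleBot", "/"),
]
for agent, path in checks:
    allowed = parser.can_fetch(agent, base + path)
    print(f"{agent:12} {path:16} allowed={allowed}")
```

Agents with their own group are blocked from every path, while any other agent is blocked only from the prefixes listed under `*`.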

Comments

  • https://www.trailspace.com
  • Robot Exclusion File -- robots.txt