fxtop.com
robots.txt

Robots Exclusion Standard data for fxtop.com

Resource Scan

Scan Details

Site Domain fxtop.com
Base Domain fxtop.com
Scan Status Ok
Last Scan 2024-10-01T12:47:19+00:00
Next Scan 2024-10-08T12:47:19+00:00

Last Scan

Scanned 2024-10-01T12:47:19+00:00
URL https://fxtop.com/robots.txt
Domain IPs 135.125.180.177, 2001:41d0:700:4eb1::
Response IP 135.125.180.177
Found Yes
Hash a59dc4a2ce06bfc4fb0099245c2c9527c3b69c2e0084841a5871659dcc4caf02
SimHash bafc15568013
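
The 64-character Hash value has the length of a SHA-256 hex digest. The report does not name the algorithm, so the following is only a minimal sketch, assuming the scanner hashes the raw response body with SHA-256. It shows how such a digest could be used to detect a changed robots.txt between scans. The names `body_digest` and `has_changed` are hypothetical.

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt response body.

    Assumption: the report's Hash field is computed this way; the
    report itself does not confirm the algorithm or the input.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_digest: str, body: bytes) -> bool:
    """Report whether the body differs from a previously stored digest."""
    return body_digest(body) != previous_digest.lower()
```

A scanner following this approach would store the digest after each scan and re-parse the file only when `has_changed` returns true.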

Groups

sosospider

Rule Path
Disallow /php3
Allow /

ec2linkfinder

Rule Path
Disallow /

yandex

Rule Path
Disallow /

sogou

Rule Path
Disallow /php3
Allow /

youdaobot

Rule Path
Disallow /php3
Allow /

naverbot

Rule Path
Disallow /php3
Allow /

yeti

Rule Path
Disallow /php3
Allow /

ichiro

Rule Path
Disallow /php3
Allow /

spinn3r

Rule Path
Disallow /

googlebot

Rule Path
Disallow (empty)
Allow /

mediapartners-google

Rule Path
Disallow (empty)
Allow /

proximic

Rule Path
Disallow /

*

Rule Path
Disallow /php3
Allow /

Other Records

Field Value
crawl-delay 600

Other Records

Field Value
sitemap https://fxtop.com/sitemap.xml
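
The groups and records above can be checked with Python's standard `urllib.robotparser`. The sketch below rebuilds only three of the listed groups (yandex, googlebot, and `*`) as robots.txt text; it is a partial reconstruction, not the site's actual file. It assumes the crawl-delay of 600 belongs to the `*` group, which the report does not state explicitly. `ExampleBot` is a hypothetical user agent standing in for any crawler without its own group.

```python
from urllib.robotparser import RobotFileParser

# Partial reconstruction of three groups from the report.
# Assumption: the crawl-delay of 600 belongs to the "*" group.
ROBOTS_TXT = """\
User-agent: yandex
Disallow: /

User-agent: googlebot
Disallow:
Allow: /

User-agent: *
Disallow: /php3
Allow: /
Crawl-delay: 600

Sitemap: https://fxtop.com/sitemap.xml
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# yandex is denied everywhere.
print(parser.can_fetch("yandex", "https://fxtop.com/"))               # False
# googlebot has an empty Disallow, so nothing is blocked.
print(parser.can_fetch("googlebot", "https://fxtop.com/php3/page"))   # True
# Any other agent falls back to the "*" group.
print(parser.can_fetch("ExampleBot", "https://fxtop.com/php3/page"))  # False
print(parser.can_fetch("ExampleBot", "https://fxtop.com/"))           # True
print(parser.crawl_delay("ExampleBot"))                               # 600
```

Python's parser applies the first matching rule in file order. For these groups that gives the same answers as longest-match evaluation, because `/php3` is both listed first and the more specific path.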

Comments

  • Deny Soso spider in the site <http://help.soso.com/webspider.htm>
  • allowed on 23/12/2015
  • Disallow: /
  • Deny EC2LinkFinder in the site
  • deny 80legs.com webcrawler
  • allowed on 23/12/2015
  • Disallow: /
  • deny http://www.metadatalabs.com/mlbot
  • User-agent: MLBot
  • Disallow: /
  • deny yandex http://yandex.com/bots
  • allowed on 23/12/2015
  • User-agent: Yandex
  • Crawl-delay: 500 # specifies a 500 seconds timeout
  • Disallow: /
  • Disallow: /php3
  • Allow: /
  • prohibits downloading anything except for the pages
  • starting with '/cgi-bin'
  • deny Sogou spider http://www.sogou.com/docs/help/webmasters.htm#07
  • allowed on 23/12/2015
  • Disallow: /
  • <http://www.youdao.com/help/webmaster/spider/>
  • allowed on 23/12/2015
  • Disallow: /
  • <http://help.naver.com/robots/>
  • allowed on 23/12/2015
  • Disallow: /
  • user agent Yeti
  • allowed on 23/12/2015
  • Disallow: /
  • <http://help.goo.ne.jp/door/crawler.html>
  • allowed on 23/12/2015
  • Disallow: /
  • They charge for other people's data.
  • <http://spinn3r.com/robot>
  • google case
  • User-agent: Googlebot
  • Disallow: /php3
  • Allow: /
  • proximic used by amazon EC2 cloud, we already blacklisted some of their IPs for abuse
  • general case
  • specifies a 600 seconds timeout

Warnings

  • 3 invalid lines.