fileformat.info
robots.txt

Robots Exclusion Standard data for fileformat.info

Resource Scan

Scan Details

Site Domain fileformat.info
Base Domain fileformat.info
Scan Status Ok
Last Scan 2024-06-13T21:55:59+00:00
Next Scan 2024-06-20T21:55:59+00:00

Last Scan

Scanned 2024-06-13T21:55:59+00:00
URL https://fileformat.info/robots.txt
Redirect https://www.fileformat.info/robots.txt
Redirect Domain www.fileformat.info
Redirect Base fileformat.info
Domain IPs 104.21.3.2, 172.67.129.246, 2606:4700:3031::ac43:81f6, 2606:4700:3035::6815:302
Redirect IPs 104.21.3.2, 172.67.129.246, 2606:4700:3031::ac43:81f6, 2606:4700:3035::6815:302
Response IP 104.21.3.2
Found Yes
Hash 89d5c0c67d7c361290542ddefa2b366e3482cc96e344e477e9a3e95dc06c1232
SimHash 0cf8cae39851
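
The Hash value above is 64 hexadecimal characters, which is consistent with a SHA-256 digest of the response body. The report does not name the algorithm, so this is an assumption. Under that assumption, a fingerprint of the same shape could be computed as in the sketch below; `body_digest` is a hypothetical helper name, and the sample body is made up.

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Return a 64-character hex digest of a robots.txt body.

    Assumption: the scanner hashes the raw response bytes with SHA-256.
    The report only shows the digest, not the algorithm.
    """
    return hashlib.sha256(body).hexdigest()


# Made-up body, for illustration only; not the real file.
sample = b"User-agent: *\nDisallow: /user\n"
digest = body_digest(sample)
print(len(digest))  # 64
```

Any change to the body, even a single byte, yields a different digest, which is what makes such a value useful for detecting edits between scans.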

Groups

*

Rule Path
Disallow /_
Disallow /about/feed
Disallow /about/javad
Disallow /about/webal
Disallow /down
Disallow /mirror/news
Disallow /other/bookm
Disallow /security
Disallow /user
Disallow /honeypot.txt
Disallow /format/unipage/sample/
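
Disallow rules match by path prefix. The sketch below shows this using Python's standard `urllib.robotparser`. It is built from a subset of the rules listed above and is not the verbatim file. The agent name `ExampleBot` and the probe paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Subset of the "*" group as listed in this report (reconstructed, not verbatim).
ROBOTS_TXT = """\
User-agent: *
Disallow: /_
Disallow: /security
Disallow: /user
Disallow: /honeypot.txt
Disallow: /format/unipage/sample/
"""

rules = RobotFileParser()
rules.parse(ROBOTS_TXT.splitlines())


def allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Check a path on www.fileformat.info against the parsed rules."""
    return rules.can_fetch(agent, "https://www.fileformat.info" + path)


print(allowed("/"))                          # True: no rule matches
print(allowed("/security/index.htm"))        # False: prefix /security
print(allowed("/format/unipage/sample/x"))   # False: prefix match
print(allowed("/format/index.htm"))          # True: not under /format/unipage/sample/
```

Because matching is by prefix, `Disallow /user` also covers any path that merely begins with `/user`, such as a hypothetical `/users`.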

npbot

Rule Path
Disallow /

digimarc

Rule Path
Disallow /

marcspider

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

becomebot

Rule Path
Disallow /

updated

Rule Path
Disallow /

sbider

Rule Path
Disallow /info
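
Under the Robots Exclusion Protocol, a crawler that matches a named group follows that group instead of the `*` group. A crawler matching `sbider` would therefore be bound only by `Disallow /info`. The sketch below illustrates this with `urllib.robotparser`, using a reduced rule set drawn from this report. The agent name `ExampleBot` and the probe URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reduced rule set: one rule from the "*" group plus the sbider group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /user

User-agent: sbider
Disallow: /info
"""

groups = RobotFileParser()
groups.parse(ROBOTS_TXT.splitlines())

BASE = "https://www.fileformat.info"

# sbider matches its own group, so only /info is off limits for it.
print(groups.can_fetch("sbider", BASE + "/info/"))      # False
print(groups.can_fetch("sbider", BASE + "/user/"))      # True

# Any other agent falls back to the "*" group.
print(groups.can_fetch("ExampleBot", BASE + "/user/"))  # False
print(groups.can_fetch("ExampleBot", BASE + "/info/"))  # True
```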

voyager/1.0

Rule Path
Disallow /

stubhub

Rule Path
Disallow /

gsa-crawler

Rule Path
Disallow /

sygolbot http://www.sygol.net

Rule Path
Disallow /

virus_detector

Rule Path
Disallow /

nimblecrawler

Rule Path
Disallow /

shopwiki

Rule Path
Disallow /

wbsearchbot

Rule Path
Disallow /

ahrefsbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 120
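
The layout suggests that the `crawl-delay` record belongs to the `ahrefsbot` group, though the report does not say so explicitly. Under that assumption, a crawler could read the delay as in the sketch below. The reduced rule text is reconstructed from this report, and `OtherBot` is a hypothetical agent.

```python
from urllib.robotparser import RobotFileParser

# Assumption: crawl-delay 120 is part of the ahrefsbot group.
ROBOTS_TXT = """\
User-agent: AhrefsBot
Crawl-delay: 120

User-agent: npbot
Disallow: /
"""

records = RobotFileParser()
records.parse(ROBOTS_TXT.splitlines())

delay = records.crawl_delay("AhrefsBot")
print(delay)                             # 120 (seconds between requests)
print(records.crawl_delay("OtherBot"))   # None: no group applies
print(records.can_fetch("AhrefsBot", "https://www.fileformat.info/info/"))  # True
```

A group with a delay but no Disallow lines leaves every path allowed, which matches the "No rules defined. All paths allowed." note for this group.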

Other Records

Field Value
sitemap http://www.fileformat.info/sitemap.xml

Comments

  • robots.txt for FileFormat.Info (www.fileformat.info)
  • Disallow: /css
  • Disallow: /images
  • Disallow: /js
  • https://yandex.ru/support/webmaster/robot-workings/clean-param.html
  • some samples have relative links that cause crawlers to go nuts
  • Useless busybody: http://www.nameprotect.com/botinfo.html
  • DigiMarc watermark checker
  • yet another robo-snitch
  • shopping bots: nothing here to buy
  • if you're not going to send links my way, don't crush me
  • http://www.sitesell.com/sbider.html
  • nothing health-related here
  • http://www.kosmix.com/crawler.html
  • no tickets for sale
  • http://www.stubhub.com/
  • no classifieds here
  • http://www.sygol.net/
  • no viruses here
  • securecomputing.com
  • nothing healthy here
  • healthline.com
  • http://www.shopwiki.com/wiki/Help:Bot
  • nothing to buy here
  • http://www.warebay.com/bot.html
  • nothing to buy here
  • annoying but harmless SEO monitor
  • https://ahrefs.com/robot/index.php

Warnings

  • `clean-param` is not a known field.
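
A warning like this one can come from a simple field-name check. The sketch below is not the scanner's actual logic. The `KNOWN_FIELDS` set, the helper name `unknown_fields`, and the `Clean-param` value in the sample are all assumptions made for illustration. `Clean-param` is the Yandex extension documented at the URL in the comments above.

```python
# Assumed set of fields a scanner might recognise; not taken from the report.
KNOWN_FIELDS = {"user-agent", "disallow", "allow", "crawl-delay", "sitemap"}


def unknown_fields(text: str) -> list[str]:
    """Return unrecognised field names, lower-cased, in first-seen order."""
    found: list[str] = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        if ":" not in line:
            continue                          # blank or malformed line
        field = line.split(":", 1)[0].strip().lower()
        if field and field not in KNOWN_FIELDS and field not in found:
            found.append(field)
    return found


# Hypothetical sample; the Clean-param value is invented.
SAMPLE = """\
# Disallow: /css
User-agent: *
Clean-param: ref /some/path
Disallow: /user
Sitemap: http://www.fileformat.info/sitemap.xml
"""

print(unknown_fields(SAMPLE))  # ['clean-param']
```

Commented-out lines such as `# Disallow: /css` are skipped, so they end up in the Comments list rather than in the rules or the warnings.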