techbriefs.com
robots.txt

Robots Exclusion Standard data for techbriefs.com

Resource Scan

Scan Details

Site Domain techbriefs.com
Base Domain techbriefs.com
Scan Status Ok
Last Scan 2024-05-05T00:57:53+00:00
Next Scan 2024-06-04T00:57:53+00:00

Last Scan

Scanned 2024-05-05T00:57:53+00:00
URL https://techbriefs.com/robots.txt
Redirect https://www.techbriefs.com/robots.txt
Redirect Domain www.techbriefs.com
Redirect Base techbriefs.com
Domain IPs 52.5.61.5
Redirect IPs 52.5.61.5
Response IP 52.5.61.5
Found Yes
Hash 06f4276354099c57f780953f7f483ff6297c6cb0c9a81ebdb1315c6e19e6d9c6
SimHash 6a3f0c584fd8

Groups

*

Rule Path
Disallow /administrator/
Disallow /api/
Disallow /bin/
Disallow /cache/
Disallow /cli/
Disallow /components/
Disallow /includes/
Disallow /installation/
Disallow /language/
Disallow /layouts/
Disallow /libraries/
Disallow /logs/
Disallow /modules/
Disallow /plugins/
Disallow /tmp/
Disallow /tb/search
Disallow /mdb/search
Disallow /search
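
As a minimal sketch, the `*` group above can be checked with Python's standard `urllib.robotparser`. The robots.txt text below is reconstructed from the rule table, so its exact spelling, case, and ordering in the live file are assumptions, and the sample paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the rule table above (assumption: the live file may
# differ in case, ordering, or whitespace).
ROBOTS_TXT = """\
User-agent: *
Disallow: /administrator/
Disallow: /api/
Disallow: /bin/
Disallow: /cache/
Disallow: /cli/
Disallow: /components/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /layouts/
Disallow: /libraries/
Disallow: /logs/
Disallow: /modules/
Disallow: /plugins/
Disallow: /tmp/
Disallow: /tb/search
Disallow: /mdb/search
Disallow: /search
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without touching the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, path: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `path` on the scanned host."""
    return parser.can_fetch(agent, "https://www.techbriefs.com" + path)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    # Hypothetical sample paths, chosen only to exercise the rules.
    for path in ("/", "/administrator/index.php", "/tb/search", "/search?q=robots"):
        print(path, is_allowed(parser, path))
```

The standard-library parser matches rules by simple path prefix, which is enough for this group because none of the rules use wildcards.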

turnitinbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

chrome-lighthouse

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

ahrefsbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

semrushbot-sa

Rule Path
Disallow /

semrushbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

bingbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

rogerbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

barkrowler

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 8

serpstatbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

megaindex.ru/2.0

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

magpie-crawler

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

dataforseobot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

barkrowler

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

neevabot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

seekportbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

friendlycrawler

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

yandexbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10
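
Most of the per-agent groups above carry only a crawl-delay, `semrushbot-sa` is disallowed entirely, and `barkrowler` appears twice with delays of 8 and 10. The sketch below shows one way a client might read these values. It uses a robots.txt fragment reconstructed from four of the groups (spelling and order are assumptions) and a hypothetical helper, `delays_by_agent`, that collects every delay per agent so the conflict is visible. It does not claim which value a given crawler actually honors.

```python
from collections import defaultdict
from urllib.robotparser import RobotFileParser

# Fragment reconstructed from the groups above (assumed spelling and order).
SAMPLE = """\
User-agent: semrushbot-sa
Disallow: /

User-agent: semrushbot
Crawl-delay: 10

User-agent: barkrowler
Crawl-delay: 8

User-agent: barkrowler
Crawl-delay: 10
"""


def delays_by_agent(text: str) -> dict:
    """Collect every Crawl-delay value seen for each user-agent."""
    delays = defaultdict(list)
    agents = []
    in_rules = False  # True once a group has moved past its User-agent lines
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                agents, in_rules = [], False
            agents.append(value.lower())
        else:
            in_rules = True
            if field == "crawl-delay" and value.isdigit():
                for agent in agents:
                    delays[agent].append(int(value))
    return dict(delays)


parser = RobotFileParser()
parser.parse(SAMPLE.splitlines())

if __name__ == "__main__":
    print(delays_by_agent(SAMPLE))
    print(parser.can_fetch("semrushbot-sa", "https://www.techbriefs.com/"))
    print(parser.crawl_delay("semrushbot"))
```

Agents with more than one collected value, such as `barkrowler` here, are candidates for cleanup in the source file.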

Comments

  • If the Joomla site is installed within a folder, e.g. www.example.com/joomla/, then the robots.txt file MUST be moved to the site root, e.g. www.example.com/robots.txt, AND the joomla folder name MUST be prefixed to all of the paths.
  • E.g. the Disallow rule for the /administrator/ folder MUST be changed to read: Disallow: /joomla/administrator/
  • For more information about the robots.txt standard, see: https://www.robotstxt.org/orig.html
  • try to keep bots away from search
  • the below needs to be added to svn
  • https://moz.com/help/moz-procedures/crawlers/rogerbot
  • Disallow: /
  • https://babbar.tech/crawler
  • https://serpstatbot.com/
  • https://megaindex.com/crawler
  • https://www.brandwatch.com/legal/magpie-crawler/
  • https://www.linkdex.com/en-us/about/bots/
  • User-agent: linkdexbot
  • Crawl-Delay: 10
  • https://dataforseo.com/dataforseo-bot
  • https://www.babbar.tech/crawler
  • see https://www.abuseipdb.com/check/154.54.249.204
  • https://neeva.com/neevabot
  • 100.26.127.17
  • https://bot.seekport.com
  • http://yandex.com/bots
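
The Joomla comment at the top of this list says that, when the site lives in a subfolder, the folder name must be prefixed to every rule path. A small sketch of that rewrite follows. The function name `prefix_rules` and the folder name `joomla` are illustrative and not part of the scanned file.

```python
def prefix_rules(robots_txt: str, folder: str) -> str:
    """Prefix `folder` to every Allow/Disallow path in a robots.txt text."""
    prefix = "/" + folder.strip("/")
    out = []
    for line in robots_txt.splitlines():
        field, sep, value = line.partition(":")
        path = value.strip()
        is_rule = sep and field.strip().lower() in ("allow", "disallow")
        if is_rule and path.startswith("/"):
            out.append(f"{field.strip()}: {prefix}{path}")
        else:
            # User-agent lines, comments, blanks, and empty rules are kept as-is.
            out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    original = "User-agent: *\nDisallow: /administrator/\nDisallow: /search"
    print(prefix_rules(original, "joomla"))
```

For the `/administrator/` rule this yields `Disallow: /joomla/administrator/`, matching the example given in the comment.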