isms.online
robots.txt

Robots Exclusion Standard data for isms.online

Resource Scan

Scan Details

Site Domain isms.online
Base Domain isms.online
Scan Status Ok
Last Scan 2024-09-20T07:16:45+00:00
Next Scan 2024-10-20T07:16:45+00:00

Last Scan

Scanned 2024-09-20T07:16:45+00:00
URL https://isms.online/robots.txt
Redirect https://www.isms.online/robots.txt
Redirect Domain www.isms.online
Redirect Base isms.online
Domain IPs 193.105.61.5
Redirect IPs 151.101.130.132, 151.101.194.132, 151.101.2.132, 151.101.66.132
Response IP 199.232.46.132
Found Yes
Hash a0eeb897fd02d6f05110d92e5be34735fb3452838a6044b96065f5677dcddda0
SimHash ae96d123fca0
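
The Hash value above is 64 hexadecimal digits, which is the length of a SHA-256 digest. The report does not name the algorithm, so the following is only a sketch under the assumption that the scanner hashes the raw response body with SHA-256; the helper name body_digest is hypothetical.

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a fetched robots.txt body.

    Assumption: the scanner hashes the raw bytes of the response body.
    The report does not confirm the algorithm or any normalisation.
    """
    return hashlib.sha256(body).hexdigest()


# Example with a made-up body, not the real file contents.
sample = b"User-agent: *\nDisallow: /cgi-bin/\n"
digest = body_digest(sample)
print(len(digest))  # a SHA-256 hex digest is always 64 characters
```

Comparing a digest like this between two scans is a cheap way to detect that the file changed at all; the shorter SimHash field presumably serves to measure how similar two versions are, which an exact hash cannot do.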

Groups

*

Rule Path
Disallow /alliantist-login/
Allow /alliantist-login/admin-ajax.php
Disallow /cgi-bin/
Disallow /wp-json/
Disallow /tag/
Allow /wp/wp-includes/js/jquery/jquery.min.js?ver=3.6.1
Disallow /?s=*
Disallow /search/*
Disallow /cdn-cgi/bm/cv/
Disallow /cdn-cgi/challenge-platform/
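
The Allow entries in this group only make sense under the longest-match rule of RFC 9309: when several rules match a path, the one with the longest pattern wins, and Allow wins a tie. A minimal sketch of that evaluation follows, using the rules listed above. The function names are hypothetical, and real crawlers may differ in details such as percent-encoding, which this sketch ignores.

```python
import re

# Rules copied from the "*" group above, as (rule, path pattern) pairs.
STAR_GROUP = [
    ("disallow", "/alliantist-login/"),
    ("allow", "/alliantist-login/admin-ajax.php"),
    ("disallow", "/cgi-bin/"),
    ("disallow", "/wp-json/"),
    ("disallow", "/tag/"),
    ("allow", "/wp/wp-includes/js/jquery/jquery.min.js?ver=3.6.1"),
    ("disallow", "/?s=*"),
    ("disallow", "/search/*"),
    ("disallow", "/cdn-cgi/bm/cv/"),
    ("disallow", "/cdn-cgi/challenge-platform/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt pattern ('*' wildcard, optional '$' end anchor)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules):
    """Apply longest-match precedence; a path matching no rule is allowed."""
    best_len, allowed = -1, True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            # A longer pattern wins; on equal length, Allow wins.
            if length > best_len or (length == best_len and kind == "allow"):
                best_len, allowed = length, (kind == "allow")
    return allowed


for path in ["/alliantist-login/", "/alliantist-login/admin-ajax.php",
             "/?s=iso27001", "/search/policies", "/about/"]:
    print(path, is_allowed(path, STAR_GROUP))
```

Note that Python's standard urllib.robotparser applies rules in file order and does not expand the `*` wildcard, so it would not reproduce this behaviour for the group above.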

nuclei
wikido
riddler
petalbot
go-http-client
node/simplecrawler
cazoodlebot
dotbot/1.0
gigabot
barkrowler
blexbot
magpie-crawler

Rule Path
Disallow /

linkedinbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10
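
crawl-delay is not part of RFC 9309, and crawlers that honour it generally read the value as a minimum number of seconds between requests. A minimal sketch of a client-side throttle under that interpretation is shown below; the Throttle class is hypothetical, and its clock and sleep parameters exist only so the behaviour can be exercised without actually waiting.

```python
import time

CRAWL_DELAY_SECONDS = 10  # value from the linkedinbot record above


class Throttle:
    """Enforce a minimum gap between consecutive requests."""

    def __init__(self, delay, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block until at least `delay` seconds have passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.delay - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


throttle = Throttle(CRAWL_DELAY_SECONDS)
throttle.wait()  # the first call returns immediately
# fetch(...) would go here; a second throttle.wait() would pause as needed
```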

siteauditbot
semrushbot-ba
semrushbot-si
semrushbot-swa
semrushbot-ct
splitsignalbot
semrushbot-coub
screaming frog seo spider

Rule Path
Allow /
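
Before any path rule applies, a crawler has to pick the group that names it, falling back to the `*` group otherwise. The groups listed above can be sketched as a lookup like the one below. The function name and the group labels "blocked" and "allowed" are hypothetical, and the case-insensitive substring test is a simplification of the product-token matching described in RFC 9309.

```python
# Tokens copied from the groups above (the report lists them in lowercase).
BLOCKED_TOKENS = [
    "nuclei", "wikido", "riddler", "petalbot", "go-http-client",
    "node/simplecrawler", "cazoodlebot", "dotbot/1.0", "gigabot",
    "barkrowler", "blexbot", "magpie-crawler",
]
ALLOWED_TOKENS = [
    "siteauditbot", "semrushbot-ba", "semrushbot-si", "semrushbot-swa",
    "semrushbot-ct", "splitsignalbot", "semrushbot-coub",
    "screaming frog seo spider",
]


def select_group(user_agent):
    """Return a label for the group that applies to a User-Agent string."""
    ua = user_agent.lower()
    if any(token in ua for token in BLOCKED_TOKENS):
        return "blocked"      # Disallow /
    if "linkedinbot" in ua:
        return "linkedinbot"  # no path rules, crawl-delay 10
    if any(token in ua for token in ALLOWED_TOKENS):
        return "allowed"      # Allow /
    return "*"                # fall back to the global group


for ua in ["Mozilla/5.0 (compatible; PetalBot)", "LinkedInBot/1.0",
           "Screaming Frog SEO Spider/19.0", "Googlebot/2.1"]:
    print(ua, "->", select_group(ua))
```

A specific group replaces the `*` group rather than adding to it, so the bots in the allowed group are not bound by the Disallow rules of the global group.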

Other Records

Field Value
sitemap https://www.isms.online/sitemaps.xml

Comments

  • Global rules
  • We're experimenting with blocking search results to prevent search result spam
  • Prevent crawling CF challenge URLs
  • Ban bots that don't benefit us.
  • LinkedIn Bot
  • Allowed bots