trustnet.com
robots.txt

Robots Exclusion Standard data for trustnet.com

Resource Scan

Scan Details

Site Domain trustnet.com
Base Domain trustnet.com
Scan Status Ok
Last Scan 2024-11-17T01:09:53+00:00
Next Scan 2024-11-24T01:09:53+00:00

Last Scan

Scanned 2024-11-17T01:09:53+00:00
URL https://trustnet.com/robots.txt
Redirect https://www.trustnet.com/robots.txt
Redirect Domain www.trustnet.com
Redirect Base trustnet.com
Domain IPs 104.18.81.112, 104.18.82.112, 2606:4700::6812:5170, 2606:4700::6812:5270
Redirect IPs 104.18.81.112, 104.18.82.112, 2606:4700::6812:5170, 2606:4700::6812:5270
Response IP 104.18.81.112
Found Yes
Hash 6a1718b0a37753f7ef851b2219fb9966f1b76867507406341e97266a742cdb87
SimHash b8021d0b4744

Groups

*

Rule Path
Disallow /aspnet_client/
Disallow /bin/
Disallow /config/
Disallow /data/
Disallow /install/
Disallow /macroScripts/
Disallow /masterpages/
Disallow /umbraco/
Disallow /umbraco_client/
Disallow /usercontrols/
Disallow /xslt/

libwww-perl

Rule Path
Disallow /
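The two groups above can be checked programmatically with Python's standard-library robots.txt parser. This is a sketch using a subset of the Disallow rules transcribed from the scan (the example URLs are hypothetical paths on the redirect host, not paths taken from the scan):

```python
from urllib.robotparser import RobotFileParser

# A subset of the rules shown above: the wildcard group and the
# libwww-perl group, which is disallowed from the whole site.
rules = """\
User-agent: *
Disallow: /aspnet_client/
Disallow: /umbraco/

User-agent: libwww-perl
Disallow: /
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# The wildcard group blocks only the listed paths...
print(rp.can_fetch("*", "https://www.trustnet.com/umbraco/login"))    # False
print(rp.can_fetch("*", "https://www.trustnet.com/some-page"))        # True
# ...while libwww-perl is blocked everywhere.
print(rp.can_fetch("libwww-perl", "https://www.trustnet.com/some-page"))  # False
```

Note that `can_fetch` matches an agent against its own group first and falls back to the `*` group only when no specific group applies, which is why `libwww-perl` is denied even for paths the wildcard group allows.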

Other Records

Field Value
sitemap http://{HTTP_HOST}/sitemap
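Taken together, the groups and records above correspond to a served robots.txt along these lines (a reconstruction from the scan data, not a verbatim copy; note that `{HTTP_HOST}` appears to be an unexpanded server variable in the live Sitemap line):

```
User-agent: *
Disallow: /aspnet_client/
Disallow: /bin/
Disallow: /config/
Disallow: /data/
Disallow: /install/
Disallow: /macroScripts/
Disallow: /masterpages/
Disallow: /umbraco/
Disallow: /umbraco_client/
Disallow: /usercontrols/
Disallow: /xslt/

User-agent: libwww-perl
Disallow: /

Sitemap: http://{HTTP_HOST}/sitemap
```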

Comments

  • This file is to prevent the crawling and indexing of certain parts of your site by web crawlers and spiders run by sites like Yahoo! and Google. By telling these "Robots" where not to go on your site, you save bandwidth and server resources.
  • This file will be ignored unless it is at the root of your host:
    Used: https://example.com/robots.txt
    Ignored: https://example.com/site/robots.txt
  • For more information about the robots.txt standard, see: http://www.robotstxt.org/robotstxt.html