thedirect.com
robots.txt

Robots Exclusion Standard data for thedirect.com

Resource Scan

Scan Details

Site Domain thedirect.com
Base Domain thedirect.com
Scan Status Ok
Last Scan 2024-09-21T16:45:23+00:00
Next Scan 2024-09-28T16:45:23+00:00

Last Scan

Scanned 2024-09-21T16:45:23+00:00
URL https://thedirect.com/robots.txt
Domain IPs 172.66.40.139, 172.66.43.117, 2606:4700:3108::ac42:288b, 2606:4700:3108::ac42:2b75
Response IP 172.66.43.117
Found Yes
Hash 733da0925ee610460e657da2fc4ccb5d93fd4a5ffb4af5e2d6f16750ef20de73
SimHash f79d8d420792

Groups

*

Rule Path
Disallow /testHome/*
Disallow /user-info/*
Disallow /articles/article-api-info/
Disallow /accounts/
Disallow /search/
Disallow *?itm_source=*
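
The rules above can be checked against a URL path with a small matcher. The following is a minimal sketch, assuming the wildcard semantics commonly used by major crawlers (`*` matches any run of characters, a trailing `$` anchors the end, and everything else is a prefix match). The names `DISALLOW_RULES`, `rule_to_regex`, and `is_disallowed` are illustrative and not part of the scan data. Because this group lists no Allow rules, any match is treated as disallowed.

```python
import re

# Disallow rules copied from the "*" group listed above.
DISALLOW_RULES = [
    "/testHome/*",
    "/user-info/*",
    "/articles/article-api-info/",
    "/accounts/",
    "/search/",
    "*?itm_source=*",
]


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate one robots.txt path rule into an anchored regex.

    Assumption: '*' is a wildcard and a trailing '$' anchors the end of
    the path; without '$' the rule is a prefix match.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(path: str) -> bool:
    """Return True if the path (including any query string) matches a rule."""
    return any(rule_to_regex(rule).match(path) for rule in DISALLOW_RULES)


if __name__ == "__main__":
    for candidate in ["/search/?q=x", "/news/story?itm_source=home", "/news/story"]:
        print(candidate, is_disallowed(candidate))
```

Note that `/accounts` without the trailing slash would not match `/accounts/` under this prefix interpretation.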

Other Records

Field Value
sitemap https://thedirect.com/sitemap.xml
sitemap https://thedirect.com/sitemap/index/
sitemap https://thedirect.com/googleNewsSitemap.xml
sitemap https://thedirect.com/tagSitemap.xml
sitemap https://thedirect.com/sitemap/wiki/
sitemap https://thedirect.com/author_sitemap.xml
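
As a rough illustration of how the sitemap records above could be pulled out of a robots.txt body, here is a short sketch. The `SAMPLE` text is a hypothetical reconstruction using two of the listed URLs, not the actual file contents, and `extract_sitemaps` is an illustrative name. The sketch assumes field names are case-insensitive and that `#` starts a comment.

```python
# Hypothetical robots.txt fragment built from two of the records listed above.
SAMPLE = """\
User-agent: *
Disallow: /search/

Sitemap: https://thedirect.com/sitemap.xml
sitemap: https://thedirect.com/tagSitemap.xml
"""


def extract_sitemaps(robots_txt: str) -> list[str]:
    """Collect the values of all 'sitemap' records, in file order."""
    sitemaps = []
    for raw in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        # Split on the first ':' only, so the URL's own colons stay intact.
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap":
            sitemaps.append(value.strip())
    return sitemaps


if __name__ == "__main__":
    for url in extract_sitemaps(SAMPLE):
        print(url)
```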

Warnings

  • `host` is not a known field.
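
A warning like the one above could be produced by comparing each record's field name against a list of recognized fields. The sketch below assumes that list is `user-agent`, `allow`, `disallow`, and `sitemap`; the scanner's real list is not given in this report. The `Host` value in `SAMPLE` is a placeholder, since the scan does not show the actual value, and `unknown_fields` is an illustrative name.

```python
# Assumed set of recognized fields; the scanner's real list is not shown here.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap"}

# Hypothetical input; the Host value is a placeholder, not the site's record.
SAMPLE = """\
User-agent: *
Disallow: /search/
Host: example.com
Sitemap: https://thedirect.com/sitemap.xml
"""


def unknown_fields(robots_txt: str) -> list[tuple[int, str]]:
    """Return (line number, lowercased field name) for unrecognized fields."""
    found = []
    for lineno, raw in enumerate(robots_txt.splitlines(), start=1):
        # Ignore comments and blank lines.
        line = raw.split("#", 1)[0].strip()
        field, sep, _ = line.partition(":")
        name = field.strip().lower()
        if sep and name and name not in KNOWN_FIELDS:
            found.append((lineno, name))
    return found


if __name__ == "__main__":
    for lineno, name in unknown_fields(SAMPLE):
        print(f"line {lineno}: `{name}` is not a known field.")
```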