scoopforwork.com
robots.txt

Robots Exclusion Standard data for scoopforwork.com

Resource Scan

Scan Details

Site Domain scoopforwork.com
Base Domain scoopforwork.com
Scan Status Ok
Last Scan 2024-09-19T13:22:41+00:00
Next Scan 2024-10-19T13:22:41+00:00

Last Scan

Scanned 2024-09-19T13:22:41+00:00
URL https://scoopforwork.com/robots.txt
Redirect https://www.flexindex.com/robots.txt
Redirect Domain www.flexindex.com
Redirect Base flexindex.com
Domain IPs 13.227.254.26, 13.227.254.56, 13.227.254.61, 13.227.254.79
Redirect IPs 76.76.21.123, 76.76.21.241
Response IP 76.76.21.241
Found Yes
Hash f960d305be1e7e9b74282a001a41613e20669f16110e3c3fef5eb0e6f90f5e65
SimHash 49063f454a51

Groups

*

Rule Path
Disallow *%2C*
Disallow *rangeOfEmployees*
Disallow *headquarterAddress.city*headquarterAddress.state*
Disallow *headquarterAddress.state*headquarterAddress.city*
Disallow /explore?headquarterAddress.state*
Disallow /explore?headquarterAddress.city*
Disallow /explore?industry*
Disallow *?headquarterAddress.state=*&*
Disallow *?headquarterAddress.city=*&*
Disallow *?industry=*&*
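
The Disallow patterns above rely on the `*` wildcard, so it can help to test them against sample URLs. The following is a minimal Python sketch, not the crawler logic of any particular search engine. It assumes the common convention that `*` matches any run of characters, that a trailing `$` anchors the end, and that a pattern is matched from the start of the path plus query string. The helper names (`pattern_to_regex`, `is_disallowed`) and the sample paths are hypothetical and chosen only for illustration. Matching here is case-sensitive and does no percent-encoding normalization, so `%2c` would not match `%2C`.

```python
import re

# Disallow patterns copied from the "*" group listed above.
DISALLOW_PATTERNS = [
    "*%2C*",
    "*rangeOfEmployees*",
    "*headquarterAddress.city*headquarterAddress.state*",
    "*headquarterAddress.state*headquarterAddress.city*",
    "/explore?headquarterAddress.state*",
    "/explore?headquarterAddress.city*",
    "/explore?industry*",
    "*?headquarterAddress.state=*&*",
    "*?headquarterAddress.city=*&*",
    "*?industry=*&*",
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: "*" matches any sequence of characters and a trailing
    "$" anchors the end of the URL; everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path_and_query, patterns=DISALLOW_PATTERNS):
    """Return True if any pattern matches from the start of the URL path."""
    # re.match anchors at the start of the string, which mirrors
    # prefix-style matching of robots.txt rules.
    return any(pattern_to_regex(p).match(path_and_query) for p in patterns)


if __name__ == "__main__":
    # Hypothetical sample paths, not taken from the scanned site.
    for sample in [
        "/explore",
        "/explore?industry=software",
        "/remote/companies?industry=software",
        "/remote/companies?industry=software&page=2",
    ]:
        print(sample, is_disallowed(sample))
```

Under these assumptions a single filter outside `/explore` stays crawlable, while the same filter followed by `&` and a second parameter is blocked.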

Other Records

Field Value
sitemap https://www.flexindex.com/sitemap.xml

Comments

  • blocks crawling of 2+ filters of the same type
  • blocks crawling of company size variations in all forms
  • blocks crawling of state & city combos
  • blocks crawling from non-flex type root folders
  • blocks crawling of 2+ filter combos within flex types