scoopforwork.com
robots.txt

Robots Exclusion Standard data for scoopforwork.com

Resource Scan

Scan Details

Site Domain	scoopforwork.com
Base Domain	scoopforwork.com
Scan Status	Ok
Last Scan	2024-09-19T13:22:41+00:00
Next Scan	2024-10-19T13:22:41+00:00

Last Scan

Scanned	2024-09-19T13:22:41+00:00
URL	https://scoopforwork.com/robots.txt
Redirect	https://www.flexindex.com/robots.txt
Redirect Domain	www.flexindex.com
Redirect Base	flexindex.com
Domain IPs	13.227.254.26, 13.227.254.56, 13.227.254.61, 13.227.254.79
Redirect IPs	76.76.21.123, 76.76.21.241
Response IP	76.76.21.241
Found	Yes
Hash	f960d305be1e7e9b74282a001a41613e20669f16110e3c3fef5eb0e6f90f5e65
SimHash	49063f454a51

Groups

*

Rule	Path
Disallow	*%2C*
Disallow	*rangeOfEmployees*
Disallow	*headquarterAddress.city*headquarterAddress.state*
Disallow	*headquarterAddress.state*headquarterAddress.city*
Disallow	/explore?headquarterAddress.state*
Disallow	/explore?headquarterAddress.city*
Disallow	/explore?industry*
Disallow	*?headquarterAddress.state=*&*
Disallow	*?headquarterAddress.city=*&*
Disallow	*?industry=*&*
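
The Disallow paths in this group rely on `*` wildcards, so a plain prefix comparison is not enough to tell whether a URL is blocked. The sketch below is one way to test a path against these rules, assuming the usual robots.txt convention that `*` matches any run of characters, a trailing `$` anchors the end, and matching starts at the beginning of the path. The function names and the sample paths in the usage note are hypothetical, and the sketch compares raw strings without the percent-encoding normalization or Allow/Disallow precedence a real crawler would apply.

```python
import re

# Disallow patterns copied from the `*` group of this scan.
DISALLOW = [
    "*%2C*",
    "*rangeOfEmployees*",
    "*headquarterAddress.city*headquarterAddress.state*",
    "*headquarterAddress.state*headquarterAddress.city*",
    "/explore?headquarterAddress.state*",
    "/explore?headquarterAddress.city*",
    "/explore?industry*",
    "*?headquarterAddress.state=*&*",
    "*?headquarterAddress.city=*&*",
    "*?industry=*&*",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex (hypothetical helper).

    `*` becomes `.*`; a trailing `$` anchors the end of the path.
    Every other character is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path_and_query: str, patterns=DISALLOW) -> bool:
    """Return True if any Disallow pattern matches from the start of the path."""
    # re.match anchors at the beginning, which mirrors prefix-style matching.
    return any(pattern_to_regex(p).match(path_and_query) for p in patterns)
```

With made-up paths, `is_disallowed("/explore?industry=Technology")` is true because of the `/explore?industry*` rule, while `is_disallowed("/remote?industry=Technology")` is false: outside `/explore`, a single filter is only blocked once a second parameter follows it after `&`.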

Other Records

Field	Value
sitemap	https://www.flexindex.com/sitemap.xml

Comments

blocks crawling of 2+ filters of the same type
blocks crawling of company size variations in all forms
blocks crawling of state & city combos
blocks crawling from non-flex type root folders
blocks crawling of 2+ filter combos within flex types
