transparentcalifornia.com
robots.txt
Robots Exclusion Standard data for transparentcalifornia.com
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | transparentcalifornia.com |
Base Domain | transparentcalifornia.com |
Scan Status | Ok |
Last Scan | 2024-11-15T06:33:24+00:00 |
Next Scan | 2024-11-22T06:33:24+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2024-11-15T06:33:24+00:00 |
URL | https://transparentcalifornia.com/robots.txt |
Domain IPs | 104.26.2.128, 104.26.3.128, 172.67.69.48, 2606:4700:20::681a:280, 2606:4700:20::681a:380, 2606:4700:20::ac43:4530 |
Response IP | 104.26.2.128 |
Found | Yes |
Hash | 1d09ea61395048ebe99849fbf92402574fbd1d675d7cd2393f739c35cdadb8f1 |
SimHash | 7644d052c707 |
Groups
baiduspider
yandex
clickagy intelligence bot v2
dotbot
sogou web spider
updownerbot
surveybot
webmeupbot
sogou spider
seoscanners.net
petalbot
Rule | Path |
---|---|
Disallow | / |
bingbot
Rule | Path |
---|---|
Disallow | /salaries/all/ |
Disallow | /pensions/all/ |
Disallow | /salaries/search/ |
Disallow | /pensions/search/ |
Other Records
Field | Value |
---|---|
crawl-delay | 1 |
*
Rule | Path |
---|---|
Disallow | /salaries/all/ |
Disallow | /pensions/all/ |
Disallow | /salaries/search/ |
Disallow | /pensions/search/ |
Other Records
Field | Value |
---|---|
sitemap | https://transcal.s3.amazonaws.com/public/sitemaps/sitemap-index.xml |
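The groups above can be checked programmatically. The following is a minimal sketch, assuming Python 3.8+ and its standard `urllib.robotparser` module. It rebuilds an approximate robots.txt from the scanned groups and asks which URLs each crawler may fetch. The rebuilt text is a reconstruction, not the original file: the order of lines within each group and the position of the sitemap record are assumptions, and `examplebot` is a hypothetical user agent standing in for any crawler that falls under the `*` group.

```python
from urllib.robotparser import RobotFileParser

SITE = "https://transparentcalifornia.com"
SITEMAP_URL = "https://transcal.s3.amazonaws.com/public/sitemaps/sitemap-index.xml"

# User agents that share the single "Disallow: /" rule in the scan.
BLOCKED_AGENTS = [
    "baiduspider", "yandex", "clickagy intelligence bot v2", "dotbot",
    "sogou web spider", "updownerbot", "surveybot", "webmeupbot",
    "sogou spider", "seoscanners.net", "petalbot",
]

# Paths disallowed for both bingbot and the catch-all "*" group.
LIMITED_PATHS = [
    "/salaries/all/", "/pensions/all/",
    "/salaries/search/", "/pensions/search/",
]


def build_robots_txt() -> str:
    """Rebuild an approximate robots.txt (line order is an assumption)."""
    # Consecutive User-agent lines with no blank line form one group.
    lines = [f"User-agent: {agent}" for agent in BLOCKED_AGENTS]
    lines += ["Disallow: /", ""]
    lines.append("User-agent: bingbot")
    lines += [f"Disallow: {path}" for path in LIMITED_PATHS]
    lines += ["Crawl-delay: 1", ""]
    lines.append("User-agent: *")
    lines += [f"Disallow: {path}" for path in LIMITED_PATHS]
    lines += ["", f"Sitemap: {SITEMAP_URL}"]
    return "\n".join(lines)


parser = RobotFileParser()
parser.parse(build_robots_txt().splitlines())

if __name__ == "__main__":
    # "examplebot" is hypothetical; it matches only the "*" group.
    for agent in ("baiduspider", "bingbot", "examplebot"):
        for path in ("/", "/salaries/search/"):
            allowed = parser.can_fetch(agent, SITE + path)
            print(f"{agent:12} {path:20} allowed={allowed}")
        print(f"{agent:12} crawl-delay={parser.crawl_delay(agent)}")
    print("sitemaps:", parser.site_maps())
```

Note that `urllib.robotparser` matches a group when the group's user-agent token is a substring of the queried agent name, which is looser than some crawlers' own matching, so results for real crawlers may differ from this sketch.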