gene.com
robots.txt
Robots Exclusion Standard data for gene.com
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | gene.com |
Base Domain | gene.com |
Scan Status | Ok |
Last Scan | 2024-11-02T05:19:18+00:00 |
Next Scan | 2024-11-16T05:19:18+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2024-11-02T05:19:18+00:00 |
URL | https://gene.com/robots.txt |
Redirect | https://www.gene.com/robots.txt |
Redirect Domain | www.gene.com |
Redirect Base | gene.com |
Domain IPs | 72.34.128.111 |
Redirect IPs | 104.18.20.119, 104.18.21.119, 2606:4700::6812:1477, 2606:4700::6812:1577 |
Response IP | 104.18.21.119 |
Found | Yes |
Hash | 7563d148b6ac662a985f5ccfca430d3e4810591a104570298afd4e177d39cdc5 |
SimHash | 251450401312 |
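
The Hash value is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the scan does not state which algorithm or which byte normalization it uses. As a rough sketch under that assumption, a digest of the same shape could be computed from a locally saved copy of the file. The function name `sha256_hex` and the sample bytes are illustrative only and are not part of the scan data, so the result is not expected to reproduce the value in the table.

```python
import hashlib


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the raw bytes as lowercase hex.

    Assumption: the scanner hashes the unmodified response body.
    The report does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


# Illustrative bytes only, not the real robots.txt content.
sample = b"User-agent: *\nDisallow: /*.json\n"
digest = sha256_hex(sample)
print(len(digest), digest)
```

Comparing digests from two scans is a cheap way to detect that the file changed at all. The SimHash field presumably serves the complementary purpose of measuring how similar two versions are, but its exact construction is not given here.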
Groups
User-agent: *
Rule | Path |
---|---|
Disallow | /*.json |
Disallow | /scientists/mta/* |
Disallow | /about-us/suppliers/supplier-registration/* |
Disallow | /meta/search* |
Disallow | /careers/find-a-job?* |
Disallow | /assets/frontend/downloads/media/fs18/* |
Disallow | /download/pdf/nocrawl/* |
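
Several of these paths use the `*` wildcard, which Python's standard `urllib.robotparser` does not interpret. The sketch below is one possible way to check a URL path against the rules listed above. It assumes the matching semantics of RFC 9309: `*` matches any run of characters, a trailing `$` anchors the end, and every other character, including `?`, is literal. The helper names `pattern_to_regex` and `is_disallowed` are hypothetical and do not come from the scan.

```python
import re

# Disallow patterns copied from the "*" group above.
DISALLOW = [
    "/*.json",
    "/scientists/mta/*",
    "/about-us/suppliers/supplier-registration/*",
    "/meta/search*",
    "/careers/find-a-job?*",
    "/assets/frontend/downloads/media/fs18/*",
    "/download/pdf/nocrawl/*",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    '*' becomes '.*', a trailing '$' anchors the end of the path,
    and all other characters are escaped so they match literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str, patterns=DISALLOW) -> bool:
    """Return True if the path (plus query string) matches any Disallow rule.

    re.match anchors at the start, so each rule acts as a prefix match.
    The group has no Allow rules, so no precedence handling is needed.
    """
    return any(pattern_to_regex(p).match(path) for p in patterns)


for candidate in [
    "/data/config.json",
    "/careers/find-a-job?q=chemist",
    "/careers/find-a-job",
    "/about-us/",
]:
    print(candidate, is_disallowed(candidate))
```

Because rules are prefix matches, the trailing `*` on most of these paths is redundant: `/scientists/mta/*` blocks the same URLs as `/scientists/mta/`. The `/careers/find-a-job?*` rule only blocks URLs that carry a query string, so the bare job search page remains crawlable.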