cell.com
robots.txt
Robots Exclusion Standard data for cell.com
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | cell.com |
Base Domain | cell.com |
Scan Status | Ok |
Last Scan | 2024-10-21T14:28:28+00:00 |
Next Scan | 2024-11-20T14:28:28+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2024-10-21T14:28:28+00:00 |
URL | https://cell.com/robots.txt |
Redirect | https://www.cell.com/robots.txt |
Redirect Domain | www.cell.com |
Redirect Base | cell.com |
Domain IPs | 65.156.1.100 |
Redirect IPs | 162.159.140.114, 172.66.0.112 |
Response IP | 172.66.0.112 |
Found | Yes |
Hash | d7b32924864bec31da6f142970cc0af2ba90eb1149770c71eaa40ec4ef7beea3 |
SimHash | 631c0860c7f3 |
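The Hash value is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the scan record does not name the algorithm or say which bytes were hashed. As a minimal sketch, assuming the hash is SHA-256 over the raw response body, a locally fetched copy of the file could be compared against the recorded value like this. The function names `sha256_hex` and `matches_recorded_hash` are hypothetical helpers, not part of any scanner API.

```python
import hashlib


def sha256_hex(body: bytes) -> str:
    """Return the lowercase hex SHA-256 digest of the raw body bytes."""
    return hashlib.sha256(body).hexdigest()


def matches_recorded_hash(body: bytes, recorded: str) -> bool:
    """Compare a body against a recorded hex digest.

    Assumption: the recorded Hash field is SHA-256 over the unmodified
    response body. If the scanner normalizes the file first, this
    comparison will not match.
    """
    return sha256_hex(body) == recorded.strip().lower()
```

A mismatch would not by itself show that the file changed, since the assumption about the algorithm or the hashed bytes may be wrong.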
Groups
User-agent: *
Rule | Path |
---|---|
Disallow | /action |
Disallow | /help |
Disallow | /search |
Disallow | /feedback |
Disallow | /rss |
Disallow | /action/clickThrough |
Disallow | /action/showLogin |
Disallow | /page/account-confirmation-thanks |
Disallow | /media |
Disallow | /medical-research |
Disallow | /servlet/linkout |
Disallow | /na101/ |
Disallow | /na101v1/ |
Disallow | /na102/ |
Disallow | /doi/mlt/ |
Allow | /action/showJournal |
Allow | /action/showXml |
Allow | /series |
Allow | /isbn |
Allow | /doi/book |
Allow | /.well-known/tdmrep.json |
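Several Allow paths in the table above sit underneath a broader Disallow (for example `/action/showJournal` under `/action`), so the outcome for a given URL depends on rule precedence. The sketch below assumes the longest-match rule from RFC 9309, where the most specific matching path wins and Allow wins a tie. It treats every path as a plain prefix, which is enough here because none of the listed rules use `*` or `$`. The names `RULES` and `is_allowed` are illustrative, and individual crawlers may evaluate rules differently.

```python
# Rules copied from the "*" group in the table above.
RULES = [
    ("disallow", "/action"),
    ("disallow", "/help"),
    ("disallow", "/search"),
    ("disallow", "/feedback"),
    ("disallow", "/rss"),
    ("disallow", "/action/clickThrough"),
    ("disallow", "/action/showLogin"),
    ("disallow", "/page/account-confirmation-thanks"),
    ("disallow", "/media"),
    ("disallow", "/medical-research"),
    ("disallow", "/servlet/linkout"),
    ("disallow", "/na101/"),
    ("disallow", "/na101v1/"),
    ("disallow", "/na102/"),
    ("disallow", "/doi/mlt/"),
    ("allow", "/action/showJournal"),
    ("allow", "/action/showXml"),
    ("allow", "/series"),
    ("allow", "/isbn"),
    ("allow", "/doi/book"),
    ("allow", "/.well-known/tdmrep.json"),
]


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under longest-match precedence.

    Assumption: plain prefix matching only; wildcards are not handled.
    """
    best_len = -1
    allowed = True  # a path matched by no rule is allowed
    for kind, prefix in rules:
        if not path.startswith(prefix):
            continue
        longer = len(prefix) > best_len
        tie_for_allow = len(prefix) == best_len and kind == "allow"
        if longer or tie_for_allow:
            best_len = len(prefix)
            allowed = kind == "allow"
    return allowed
```

Under these assumptions, `is_allowed("/action/showJournal")` is true because the 19-character Allow path outranks the 7-character `/action` Disallow, while `is_allowed("/action/showLogin")` is false.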
Other Records
Field | Value |
---|---|
sitemap | https://www.cell.com/sitemap-index-1.txt |
sitemap | https://www.cell.com/custom_pages.gz |