cybermen.com
robots.txt

Robots Exclusion Standard data for cybermen.com

Resource Scan

Scan Details

Site Domain cybermen.com
Base Domain cybermen.com
Scan Status Ok
Last Scan 2024-05-22T17:13:48+00:00
Next Scan 2024-05-29T17:13:48+00:00
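
The two timestamps above are exactly seven days apart, which is consistent with a weekly rescan. The source does not state the scheduling policy, so the short Python sketch below only checks the arithmetic on the two listed values:

```python
from datetime import datetime, timedelta

# Timestamps copied from the Scan Details above.
last_scan = datetime.fromisoformat("2024-05-22T17:13:48+00:00")
next_scan = datetime.fromisoformat("2024-05-29T17:13:48+00:00")

# Gap between the last scan and the next scheduled scan.
interval = next_scan - last_scan
print(interval)
```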

Last Scan

Scanned 2024-05-22T17:13:48+00:00
URL https://cybermen.com/robots.txt
Redirect https://www.cybermen.com/robots.txt
Redirect Domain www.cybermen.com
Redirect Base cybermen.com
Domain IPs 185.40.101.43
Redirect IPs 185.40.101.43
Response IP 185.40.101.43
Found Yes
Hash 315165b8539e111dfe0eb00502399c35c646657c25da30f2c31a3f856cda010a
SimHash 4155c3102633
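
The Hash value is 64 hexadecimal digits long, which matches the length of a SHA-256 digest. The scan does not name the algorithm or say whether the body is normalized first, so the following is only a sketch of how such a fingerprint could be computed, assuming SHA-256 over the raw response bytes. The `body_digest` helper and the sample body are hypothetical and are not the real cybermen.com file.

```python
import hashlib

def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a raw robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()

# Hypothetical body for illustration, not the file that was scanned.
sample = b"User-agent: *\nAllow: /\n"
digest = body_digest(sample)
print(len(digest), digest)
```

Comparing digests between scans would show whether the file changed at all. SimHash is a separate similarity fingerprint and is not reproduced here.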

Groups

mj12bot

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

baiduspider-image

Rule Path
Disallow /

baiduspider-video

Rule Path
Disallow /

geekystats.com crawler

Rule Path
Disallow /

jugendschutzprogramm-crawler; info: http://www.jugendschutzprogramm.de

Rule Path
Disallow /

hybridbot (hybrid.ru/about. if our bot caused problems please contact us. contact email: m.lyashkov@targetix.net)

Rule Path
Disallow /

mozilla/5.0 (compatible; grapeshotcrawler/2.0; +http://www.grapeshot.co.uk/crawler.php)

Rule Path
Disallow /

proximic

Rule Path
Disallow /

sirdatabot (+https://semantic-api.docs.sirdata.net/contextual-api/contextual-api/introduction)

Rule Path
Disallow /Account/Activate/

*

Rule Path
Allow /
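
To see how the groups above play out for a given crawler, the sketch below feeds a partial reconstruction of them to Python's standard `urllib.robotparser`. The reconstruction is an assumption: the scan does not show the file's actual layout or ordering, only some groups are included, and the long `sirdatabot (...)` token is shortened to `sirdatabot` so the parser can match it. The `allowed` helper and the `examplebot` agent are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Partial reconstruction from the groups listed above (assumed layout).
ROBOTS_TXT = """\
User-agent: mj12bot
Disallow: /

User-agent: baiduspider
Disallow: /

User-agent: sirdatabot
Disallow: /Account/Activate/

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def allowed(agent: str, path: str) -> bool:
    """Check whether `agent` may fetch `path` on www.cybermen.com."""
    return parser.can_fetch(agent, "https://www.cybermen.com" + path)

for agent, path in [
    ("mj12bot", "/"),                       # fully blocked group
    ("sirdatabot", "/Account/Activate/x"),  # blocked prefix only
    ("sirdatabot", "/products"),            # outside the blocked prefix
    ("examplebot", "/"),                    # falls through to the * group
]:
    print(agent, path, allowed(agent, path))
```

Crawlers with their own group are governed by that group alone; everything else falls through to the `*` group, which allows the whole site.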

Other Records

Field Value
sitemap https://www.cybermen.com/sitemap.xml
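
The sitemap record is independent of the user-agent groups. As a small sketch, `urllib.robotparser` (Python 3.8 or later) can read it back from a robots.txt body; the single-line body here is an assumed reconstruction of the record above, not the full file.

```python
from urllib.robotparser import RobotFileParser

# Assumed one-line reconstruction of the sitemap record listed above.
lines = ["Sitemap: https://www.cybermen.com/sitemap.xml"]

sitemap_parser = RobotFileParser()
sitemap_parser.parse(lines)

# site_maps() returns a list of sitemap URLs, or None if there are none.
sitemaps = sitemap_parser.site_maps()
print(sitemaps)
```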