turkman.cc
robots.txt

Robots Exclusion Standard data for turkman.cc

Resource Scan

Scan Details

Site Domain turkman.cc
Base Domain turkman.cc
Scan Status Ok
Last Scan 2024-10-05T20:42:46+00:00
Next Scan 2024-10-12T20:42:46+00:00

Last Scan

Scanned 2024-10-05T20:42:46+00:00
URL https://turkman.cc/robots.txt
Domain IPs 104.21.76.172, 172.67.197.249, 2606:4700:3030::ac43:c5f9, 2606:4700:3035::6815:4cac
Response IP 172.67.197.249
Found Yes
Hash a71e188bc9de955cd5648ca1ced80e78db23dcc50861269083edc0c82280b48c
SimHash 7d41b4731131
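
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest. The report does not say which algorithm or which bytes are hashed, so the following is only a sketch under the assumption that the digest is SHA-256 over the raw response body. The function name `body_digest` and the sample body are hypothetical.

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body.

    Assumption: the scanner hashes the raw bytes with SHA-256.
    The report itself does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


# Hypothetical sample body, not the real file served by the site.
sample = b"User-agent: *\nDisallow: /engine/\n"
digest = body_digest(sample)
print(len(digest))  # a SHA-256 hex digest always has 64 characters
```

Comparing digests from two scans is a cheap way to tell whether the file changed byte for byte between them.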

Groups

*

Rule Path
Allow /engine/classes/min/*
Allow /engine/classes/js/*
Allow /engine/data/emoticons/*
Disallow /engine/
Disallow /user/
Disallow /newposts/
Disallow /statistics.html
Disallow /*subaction%3Duserinfo
Disallow /*subaction%3Dnewposts
Disallow /*do%3Dlastcomments
Disallow /*do%3Dfeedback
Disallow /*do%3Dregister
Disallow /*do%3Dlostpassword
Disallow /*do%3Daddnews
Disallow /*do%3Dstats
Disallow /*do%3Dpm
Disallow /*do%3Dsearch
Disallow */page/*
Disallow /xfsearch/*
Disallow */?*
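
Several rules above overlap: for example, /engine/ is disallowed while three subpaths under it are allowed. A minimal sketch of how a crawler might resolve this, assuming the longest-match precedence described in RFC 9309 (the most specific matching rule wins, and Allow wins a tie). The helper names `pattern_to_regex` and `is_allowed` are hypothetical, and only a subset of the rules is included. The percent-encoded rules are left out because how a given crawler compares `%3D` with `=` is not shown in this report.

```python
import re

# Subset of the rules listed in the "*" group above.
RULES = [
    ("allow", "/engine/classes/min/*"),
    ("allow", "/engine/classes/js/*"),
    ("allow", "/engine/data/emoticons/*"),
    ("disallow", "/engine/"),
    ("disallow", "/user/"),
    ("disallow", "*/page/*"),
    ("disallow", "*/?*"),
]


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path: str, rules=RULES) -> bool:
    """Apply longest-match precedence; Allow wins when lengths tie."""
    best = None  # (pattern length, is_allow)
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), kind == "allow")
            if best is None or candidate > best:
                best = candidate
    # A path that matches no rule is allowed by default.
    return True if best is None else best[1]


for path in ["/engine/classes/js/app.js", "/engine/ajax.php",
             "/news/page/2/", "/index.html"]:
    print(path, is_allowed(path))
```

Note that Python's standard `urllib.robotparser` does not implement wildcard matching, which is why this sketch uses its own small matcher.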

Other Records

Field Value
sitemap https://turkman.cc/sitemap.xml

Warnings

  • `host` is not a known field.
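
A warning like the one above can be produced by checking each field name against a list of recognized fields. The sketch below assumes a known-field set of user-agent, allow, disallow and sitemap. The scanner's actual list is not shown in this report, and the `Host` value in the sample is hypothetical.

```python
# Assumed set of recognized fields; the scanner's real list may differ.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap"}


def unknown_fields(text: str) -> list:
    """Return unrecognized field names in order of first appearance."""
    found = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue  # blank or malformed line
        field = line.split(":", 1)[0].strip().lower()
        if field not in KNOWN_FIELDS and field not in found:
            found.append(field)
    return found


# Hypothetical robots.txt excerpt; the Host value is made up.
sample = (
    "User-agent: *\n"
    "Disallow: /engine/\n"
    "Host: https://turkman.cc\n"
    "Sitemap: https://turkman.cc/sitemap.xml\n"
)
for name in unknown_fields(sample):
    print(f"`{name}` is not a known field.")
```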