getsetclean.in
robots.txt

Robots Exclusion Standard data for getsetclean.in

Resource Scan

Scan Details

Site Domain getsetclean.in
Base Domain getsetclean.in
Scan Status Ok
Last Scan 2024-10-03T17:20:14+00:00
Next Scan 2024-10-10T17:20:14+00:00

Last Scan

Scanned 2024-10-03T17:20:14+00:00
URL https://getsetclean.in/robots.txt
Redirect https://www.getsetclean.in/robots.txt
Redirect Domain www.getsetclean.in
Redirect Base getsetclean.in
Domain IPs 54.169.69.170
Redirect IPs 13.33.88.24, 13.33.88.35, 13.33.88.82, 13.33.88.99
Response IP 13.33.88.24
Found Yes
Hash c87a077662f333b9788cea4967632700ebf3572e62094692efc827f7afb4b9cf
SimHash a88d5860e117

Groups

*

Rule Path
Disallow /analytics-iframe/
Disallow /?
Disallow *utm_%3D

gptbot

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

omgili

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /
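
The groups above can be checked against sample paths with a small matcher. This is a minimal sketch, not the scanner's own logic: it assumes exact, case-insensitive lookup of the group name with a fallback to the `*` group, treats `*` in a path as "any run of characters" and a trailing `$` as an end anchor, and compares percent-encoded text literally. The names `GROUPS`, `rule_to_regex`, and `is_allowed` are hypothetical, and the sample paths are made up for illustration.

```python
import re

# Rules transcribed from the groups listed in this scan.
GROUPS = {
    "*": ["/analytics-iframe/", "/?", "*utm_%3D"],
    "gptbot": ["/"],
    "ccbot": ["/"],
    "omgilibot": ["/"],
    "omgili": ["/"],
    "anthropic-ai": ["/"],
    "cohere-ai": ["/"],
}


def rule_to_regex(rule):
    """Turn one Disallow path into a regex anchored at the start of the path."""
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    # Escape each literal piece; every '*' becomes '.*'.
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))


def is_allowed(user_agent, path):
    """Return True if no Disallow rule of the selected group matches the path.

    Assumption: the group is chosen by exact lowercase name, else '*'.
    The scan lists no Allow rules, so no precedence handling is needed here.
    """
    rules = GROUPS.get(user_agent.lower(), GROUPS["*"])
    return not any(rule_to_regex(rule).match(path) for rule in rules)
```

With these assumptions, `is_allowed("GPTBot", "/in/en/")` is false because that group disallows `/`, while a crawler without its own group falls under `*` and is only blocked from `/analytics-iframe/`, paths beginning with `/?`, and paths containing the literal text `utm_%3D`.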

Other Records

Field Value
sitemap https://www.getsetclean.in/in/en/sitemap.xml
sitemap https://www.getsetclean.in/in/en/images.sitemap.xml
sitemap https://www.getsetclean.in/in/en/videos.sitemap.xml
sitemap https://www.getsetclean.in/in/hi/sitemap.xml
sitemap https://www.getsetclean.in/in/hi/images.sitemap.xml
sitemap https://www.getsetclean.in/in/hi/videos.sitemap.xml
sitemap https://www.getsetclean.in/in/ta/sitemap.xml
sitemap https://www.getsetclean.in/in/ta/images.sitemap.xml
sitemap https://www.getsetclean.in/in/ta/videos.sitemap.xml
sitemap https://www.getsetclean.in/in/te/sitemap.xml
sitemap https://www.getsetclean.in/in/te/images.sitemap.xml
sitemap https://www.getsetclean.in/in/te/videos.sitemap.xml
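
The twelve sitemap records above follow a regular pattern: four path segments (`en`, `hi`, `ta`, `te`, which look like language codes) combined with three sitemap file names. As a rough sketch, the same list can be rebuilt from those two sets; `BASE`, `LANGS`, `KINDS`, and `sitemap_urls` are hypothetical names, and the pattern is inferred from this scan only.

```python
from itertools import product

# Values copied from the sitemap records in this scan.
BASE = "https://www.getsetclean.in/in"
LANGS = ["en", "hi", "ta", "te"]
KINDS = ["sitemap.xml", "images.sitemap.xml", "videos.sitemap.xml"]


def sitemap_urls():
    """Build every language/file-name combination, in the order listed above."""
    return [f"{BASE}/{lang}/{kind}" for lang, kind in product(LANGS, KINDS)]
```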