cinestaan.com
robots.txt

Robots Exclusion Standard data for cinestaan.com

Resource Scan

Scan Details

Site Domain cinestaan.com
Base Domain cinestaan.com
Scan Status Ok
Last Scan 2024-11-13T11:57:16+00:00
Next Scan 2024-11-20T11:57:16+00:00

Last Scan

Scanned 2024-11-13T11:57:16+00:00
URL https://cinestaan.com/robots.txt
Redirect https://www.cinestaan.com/robots.txt
Redirect Domain www.cinestaan.com
Redirect Base cinestaan.com
Domain IPs 13.33.28.111, 13.33.28.43, 13.33.28.96, 13.33.28.98
Redirect IPs 13.33.28.111, 13.33.28.43, 13.33.28.96, 13.33.28.98
Response IP 13.33.28.111
Found Yes
Hash f768cc2ead9b670eb1d1b215f9d79e4440f0e8c696c1544ab27924ea8881e229
SimHash 001455125405
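
The Hash field is 64 hexadecimal characters long, which is consistent with a SHA-256 digest of the fetched file, although the report does not name the algorithm. A minimal sketch of how such a fingerprint could be computed, assuming SHA-256 over the raw response body (the function name `robots_fingerprint` and the sample body are hypothetical, not taken from the scan):

```python
import hashlib


def robots_fingerprint(body: bytes) -> str:
    """Return a SHA-256 hex digest of a robots.txt body (assumed scheme)."""
    return hashlib.sha256(body).hexdigest()


# Hypothetical body for illustration only; this is not the scanned file.
sample = b"User-agent: googlebot\nAllow: /\n"
digest = robots_fingerprint(sample)
print(len(digest))  # 64 hex characters, the same length as the Hash field
```

Comparing digests from two scans is a cheap way to tell whether the file changed at all; a similarity hash such as the SimHash field would be needed to judge how much it changed.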

Groups

googlebot

Rule Path
Allow /

googlebot-news

Rule Path
Allow /

googlebot-image

Rule Path
Allow /
Allow /

googlebot-mobile

Rule Path
Allow /

msnbot

Rule Path
Allow /

slurp

Rule Path
Allow /

teoma

Rule Path
Allow /

twiceler

Rule Path
Allow /

gigabot

Rule Path
Allow /

scrubby

Rule Path
Allow /

robozilla

Rule Path
Allow /

nutch

Rule Path
Allow /

ia_archiver

Rule Path
Allow /

baiduspider

Rule Path
Allow /

naverbot

Rule Path
Allow /

yeti

Rule Path
Allow /

yahoo-mmcrawler

Rule Path
Allow /

psbot

Rule Path
Allow /

asterias

Rule Path
Allow /

yahoo-blogs/v3.9

Rule Path
Allow /

twitterbot

Rule Path
Allow /

rogerbot

Rule Path
Allow /

dotbot

Rule Path
Allow /

Other Records

Field Value
sitemap https://www.cinestaan.com/sitemap.xml
sitemap https://www.cinestaan.com/news.xml
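
The groups and sitemap records above can be checked with Python's standard `urllib.robotparser`. The sketch below feeds the parser a hand-written fragment that mirrors two of the reported groups and both sitemap records; `ROBOTS_LINES` is a hypothetical reconstruction, not the actual file served by the site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical reconstruction of part of the file, based on the report above.
ROBOTS_LINES = [
    "User-agent: googlebot",
    "Allow: /",
    "",
    "User-agent: dotbot",
    "Allow: /",
    "",
    "Sitemap: https://www.cinestaan.com/sitemap.xml",
    "Sitemap: https://www.cinestaan.com/news.xml",
]

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)

# "Allow: /" permits every path for the named crawler.
allowed = parser.can_fetch("googlebot", "https://www.cinestaan.com/")

# site_maps() (Python 3.8+) returns the Sitemap records in file order.
sitemaps = parser.site_maps()

print(allowed)
print(sitemaps)
```

Because every reported group contains only `Allow /`, each listed crawler is permitted to fetch any path under these rules.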

Warnings

  • 2 invalid lines.
  • `user-gent` is not a known field.
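
A warning like the one for `user-gent` typically comes from comparing each field name against a list of recognised names. A minimal sketch of such a check follows; the `KNOWN_FIELDS` set, the function name `find_invalid_lines`, and the sample text are assumptions for illustration, not the scanner's actual rules or the site's actual file.

```python
# Assumed set of recognised field names; a real scanner may accept more.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap", "crawl-delay"}


def find_invalid_lines(text: str) -> list[tuple[int, str]]:
    """Return (line number, field) pairs for lines with an unknown field."""
    problems = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue  # blank lines and comment-only lines are valid
        field, sep, _value = line.partition(":")
        field = field.strip().lower()
        if not sep or field not in KNOWN_FIELDS:
            problems.append((number, field))
    return problems


# Hypothetical input containing the same misspelling the report flags.
sample = "user-gent: examplebot\nAllow: /\n\nUser-agent: googlebot\nAllow: /\n"
problems = find_invalid_lines(sample)
print(problems)
```

A misspelled `User-agent` line matters beyond the warning itself: a parser that skips the line does not start the group its author intended, so the rules that follow may not apply to that crawler.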