firstsing.com
robots.txt

Robots Exclusion Standard data for firstsing.com

Resource Scan

Scan Details

Site Domain firstsing.com
Base Domain firstsing.com
Scan Status Ok
Last Scan 2025-09-18T00:52:29+00:00
Next Scan 2025-10-18T00:52:29+00:00

Last Scan

Scanned 2025-09-18T00:52:29+00:00
URL https://firstsing.com/robots.txt
Domain IPs 104.21.56.76, 172.67.180.73, 2606:4700:3031::6815:384c, 2606:4700:3035::ac43:b449
Response IP 104.21.56.76
Found Yes
Hash 3de3555060cbd3016650bccf2ec6591c32602050575758fcedc9735cd6520fdf
SimHash 2f213ff346f5

Groups

*

Rule Path
Disallow /includes/

baiduspider

Rule Path
Allow /includes/

baiduspider-image

Rule Path
Allow /includes/

baiduspider-mobile

Rule Path
Allow /includes/

baiduspider-news

Rule Path
Allow /includes/

baiduspider-video

Rule Path
Allow /includes/

bingbot

Rule Path
Allow /includes/

msnbot-media

Rule Path
Allow /includes/

adidxbot

Rule Path
Allow /includes/

googlebot

Rule Path
Allow /includes/

googlebot-image

Rule Path
Allow /includes/

googlebot-mobile

Rule Path
Allow /includes/

googlebot-news

Rule Path
Allow /includes/

googlebot-video

Rule Path
Allow /includes/

mediapartners-google

Rule Path
Allow /includes/

adsbot-google

Rule Path
Allow /includes/

slurp

Rule Path
Allow /includes/

yandex

Rule Path
Allow /includes/

Other Records

Field Value
sitemap https://firstsing.com/sitemap.xml

Comments

  • This robots.txt file controls crawling of URLs under https://firstsing.com.
  • All crawlers are disallowed from crawling files in the "includes" directory, such as .css and .js files, but Google needs them for rendering, so Googlebot is allowed to crawl them.
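
The effect of the groups listed above can be checked with a short sketch using Python's standard urllib.robotparser module. The robots.txt text below is reconstructed from the scanned groups and is an assumption, not the verbatim file: only three of the named crawlers are included, and the helper name `allowed`, the agent name `SomeOtherBot`, and the sample paths are hypothetical. Individual crawlers may resolve groups differently from Python's parser, so treat the result as illustrative.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the groups in the scan; the layout of the live file
# is an assumption, and only three of the named agents are shown here.
ROBOTS_TXT = """\
User-agent: *
Disallow: /includes/

User-agent: googlebot
User-agent: bingbot
User-agent: baiduspider
Allow: /includes/

Sitemap: https://firstsing.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path="/includes/style.css"):
    """Return True if `agent` may fetch `path` (hypothetical sample path)."""
    return parser.can_fetch(agent, "https://firstsing.com" + path)


if __name__ == "__main__":
    # A named crawler falls into its own group and may fetch /includes/.
    print("googlebot   ", allowed("googlebot"))
    # An unnamed crawler falls back to the "*" group and is blocked there.
    print("SomeOtherBot", allowed("SomeOtherBot"))
    # Paths outside /includes/ are not restricted for any crawler.
    print("SomeOtherBot", allowed("SomeOtherBot", "/index.html"))
```

Under these assumptions, a crawler with its own group is allowed into /includes/, while any crawler without one falls back to the "*" group and is blocked from that directory only.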