itgyani.com
robots.txt

Robots Exclusion Standard data for itgyani.com

Resource Scan

Scan Details

Site Domain itgyani.com
Base Domain itgyani.com
Scan Status Ok
Last Scan 2025-11-20T19:16:25+00:00
Next Scan 2025-11-27T19:16:25+00:00

Last Scan

Scanned 2025-11-20T19:16:25+00:00
URL https://itgyani.com/robots.txt
Domain IPs 104.21.40.41, 172.67.175.48, 2606:4700:3030::6815:2829, 2606:4700:3031::ac43:af30
Response IP 104.21.40.41
Found Yes
Hash e6bdbdeee2d54fd918919a8ab576ed454ce23a4d6efcdc0fe1fbb9b5ed7d2b12
SimHash 68a21ad0a4b0
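
The report does not name the algorithm behind the Hash field. Its 64 hexadecimal digits are consistent with a SHA-256 digest, so the sketch below assumes that. It shows how a fingerprint like this could be used to tell whether robots.txt changed between two scans. The names `fingerprint` and `has_changed` are hypothetical and are not part of the scanner.

```python
import hashlib


def fingerprint(body: bytes) -> str:
    """Return a hex digest of the raw robots.txt bytes.

    Assumption: SHA-256, chosen only because it yields 64 hex digits
    like the Hash value in the report.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_hash: str, body: bytes) -> bool:
    """True if the freshly fetched body differs from the stored hash."""
    return fingerprint(body) != previous_hash


if __name__ == "__main__":
    old_body = b"User-agent: *\nAllow: /\n"
    new_body = b"User-agent: *\nDisallow: /admin/\n"
    stored = fingerprint(old_body)
    print(has_changed(stored, old_body))  # False: identical bytes
    print(has_changed(stored, new_body))  # True: content differs
```

An exact hash only signals that something changed. A similarity hash such as the SimHash field is the usual complement for judging how much changed, but its computation is not shown here.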

Groups

*

Rule Path
Allow /
Disallow /admin/
Disallow /auth/
Disallow /api/
Allow /assets/
Allow /*.css$
Allow /*.js$
Allow /*.jpg$
Allow /*.jpeg$
Allow /*.png$
Allow /*.gif$
Allow /*.webp$
Allow /*.svg$

Other Records

Field Value
crawl-delay 1
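
The `*` group mixes a blanket Allow, three Disallow prefixes, and several wildcard Allow patterns, so some paths match more than one rule. The following is a minimal sketch of how a crawler might resolve those overlaps. It assumes the longest-match precedence described in RFC 9309, where the matching pattern with the most characters wins and Allow is preferred on a tie. `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical names, and individual crawlers may behave differently.

```python
import re

# Rules of the "*" group, copied from the table above.
RULES = [
    ("allow", "/"),
    ("disallow", "/admin/"),
    ("disallow", "/auth/"),
    ("disallow", "/api/"),
    ("allow", "/assets/"),
    ("allow", "/*.css$"),
    ("allow", "/*.js$"),
    ("allow", "/*.jpg$"),
    ("allow", "/*.jpeg$"),
    ("allow", "/*.png$"),
    ("allow", "/*.gif$"),
    ("allow", "/*.webp$"),
    ("allow", "/*.svg$"),
]


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a regex.

    "*" matches any run of characters; a trailing "$" anchors the end.
    Without "$" the pattern is a plain prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_allowed(path: str, rules=RULES) -> bool:
    """Apply longest-match precedence; Allow wins ties; no match means allowed."""
    best = None  # (pattern length, is_allow)
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), kind == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


if __name__ == "__main__":
    for path in ["/blog/post", "/api/users", "/assets/app.js", "/api/logo.png"]:
        print(path, is_allowed(path))
```

Under this reading, `/api/users` is blocked because `/api/` is longer than `/`, while `/api/logo.png` is allowed because `/*.png$` (7 characters) is longer than `/api/` (5 characters). The file-extension Allow rules can therefore reopen files inside the disallowed directories.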

googlebot

Rule Path
Allow /

bingbot

Rule Path
Allow /

facebookexternalhit

Rule Path
Allow /

twitterbot

Rule Path
Allow /

Other Records

Field Value
sitemap https://itgyani.com/sitemap.xml
sitemap https://itgyani.com/blog-sitemap.xml
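
The named groups above each contain only `Allow /`. A minimal sketch of group selection follows, assuming the RFC 9309 behavior in which a crawler obeys the group that matches its own user-agent token and falls back to `*` only when no named group matches. `GROUPS` and `select_group` are hypothetical names. The `*` entry is abbreviated to its first four rules, and real crawlers match product tokens more loosely than the exact comparison used here.

```python
# Groups from the report; the "*" group is abbreviated for brevity.
GROUPS = {
    "*": [
        ("allow", "/"),
        ("disallow", "/admin/"),
        ("disallow", "/auth/"),
        ("disallow", "/api/"),
    ],
    "googlebot": [("allow", "/")],
    "bingbot": [("allow", "/")],
    "facebookexternalhit": [("allow", "/")],
    "twitterbot": [("allow", "/")],
}


def select_group(user_agent_token: str, groups=GROUPS) -> str:
    """Return the name of the group a crawler would obey.

    Matching is case-insensitive; unknown crawlers fall back to "*".
    """
    token = user_agent_token.lower()
    if token != "*" and token in groups:
        return token
    return "*"


if __name__ == "__main__":
    for agent in ["Googlebot", "Twitterbot", "SomeOtherCrawler"]:
        name = select_group(agent)
        print(agent, "->", name, GROUPS[name])
```

If crawlers do select groups this way, the Disallow rules for `/admin/`, `/auth/`, and `/api/` apply only to crawlers without a named group, and the `crawl-delay` record in the `*` group is likewise not inherited by the named ones.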

Comments

  • Sitemaps
  • Block admin and private areas
  • Allow important resources
  • Crawl delay for respectful crawling
  • Specific rules for different bots