genpact.com
robots.txt

Robots Exclusion Standard data for genpact.com

Resource Scan

Scan Details

Site Domain genpact.com
Base Domain genpact.com
Scan Status Ok
Last Scan 2024-10-30T18:21:45+00:00
Next Scan 2024-11-29T18:21:45+00:00

Last Scan

Scanned 2024-10-30T18:21:45+00:00
URL https://genpact.com/robots.txt
Redirect https://www.genpact.com/robots.txt
Redirect Domain www.genpact.com
Redirect Base genpact.com
Domain IPs 72.32.21.154
Redirect IPs 104.18.0.204, 104.18.1.204, 2606:4700::6812:1cc, 2606:4700::6812:cc
Response IP 104.18.1.204
Found Yes
Hash b4992af79b1cf32f9afc2d603eef71c2018a8632b338da08a0a8cac23d1d5ec8
SimHash 85509d122fb3

Groups

*

Rule Path
Disallow /cpresources/
Disallow /vendor/
Disallow /.env
Disallow /cache/
Disallow /downloadable-content/
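
To illustrate how a crawler might evaluate the rules listed for the `*` group, here is a minimal sketch using Python's standard `urllib.robotparser`. The `ROBOTS_TXT` text is reconstructed from the parsed fields in this report, not copied from the live file, and the helper name `is_allowed` and the sample paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumption: reconstructed from the rule table in this report;
# the raw file on the server may differ in ordering or whitespace.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cpresources/
Disallow: /vendor/
Disallow: /.env
Disallow: /cache/
Disallow: /downloadable-content/
"""


def is_allowed(path, agent="*"):
    """Return True if `agent` may fetch `path` under the rules above."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, "https://www.genpact.com" + path)


if __name__ == "__main__":
    # Sample paths are illustrative only.
    for path in ("/about", "/vendor/autoload.php", "/.env", "/cache/page.html"):
        print(path, is_allowed(path))
```

The standard-library parser matches rules by path prefix, so `/vendor/` blocks everything beneath that directory but not a bare `/vendor` path without the trailing slash.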

Other Records

Field Value
sitemap https://www.genpact.com/sitemaps-1-sitemap.xml
sitemap https://www.genpact.com/de/sitemaps-1-sitemap.xml
sitemap https://www.genpact.com/jp/sitemaps-1-sitemap.xml
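
The sitemap records above could be pulled out of a robots.txt body with a few lines of Python. This is a sketch under assumptions: `SAMPLE` is reconstructed from the values in this report rather than taken from the live file, and `extract_sitemaps` is a hypothetical helper, not part of any library.

```python
def extract_sitemaps(robots_txt):
    """Collect the values of all `Sitemap:` records, in file order."""
    sitemaps = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        # Split only on the first colon so the URL's "https:" stays intact.
        field, value = line.split(":", 1)
        if field.strip().lower() == "sitemap":
            sitemaps.append(value.strip())
    return sitemaps


# Assumption: reconstructed from the "Other Records" table in this report.
SAMPLE = """\
Sitemap: https://www.genpact.com/sitemaps-1-sitemap.xml
Sitemap: https://www.genpact.com/de/sitemaps-1-sitemap.xml
Sitemap: https://www.genpact.com/jp/sitemaps-1-sitemap.xml
"""

if __name__ == "__main__":
    for url in extract_sitemaps(SAMPLE):
        print(url)
```

Field names are compared case-insensitively because robots.txt field names are not case-sensitive, which is consistent with this report listing the field as lowercase `sitemap`.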

Comments

  • robots.txt for https://www.genpact.com/
  • live - don't allow web crawlers to index cpresources/ or vendor/