in.pearson.com
robots.txt

Robots Exclusion Standard data for in.pearson.com

Resource Scan

Scan Details

Site Domain in.pearson.com
Base Domain pearson.com
Scan Status Ok
Last Scan 2025-11-29T11:24:39+00:00
Next Scan 2025-12-29T11:24:39+00:00

Last Scan

Scanned 2025-11-29T11:24:39+00:00
URL https://in.pearson.com/robots.txt
Domain IPs 23.41.19.159
Response IP 23.39.5.176
Found Yes
Hash 2af68ee838c9eebfb6f9d8a47555674c1c10a987daa82b4bd19bfc8efc23ca93
SimHash e0514445c593

Groups

*

Rule Path
Disallow /en/pdc-new-en

Other Records

Field Value
sitemap https://in.pearson.com/sitemap.xml
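
Example

The records above can be checked with Python's standard urllib.robotparser module. The sketch below is illustrative only: the robots.txt text is reconstructed from the scan fields (one group for *, one Disallow rule, one sitemap record), so its exact formatting is an assumption, and the two test URLs are hypothetical paths chosen to show a blocked and an allowed case.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan fields above. The live file's exact
# formatting (case, blank lines, comments) is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /en/pdc-new-en
Sitemap: https://in.pearson.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical URLs, used only to exercise the rule.
blocked_url = "https://in.pearson.com/en/pdc-new-en/page"
allowed_url = "https://in.pearson.com/en/"

blocked = parser.can_fetch("*", blocked_url)  # path starts with the disallowed prefix
allowed = parser.can_fetch("*", allowed_url)  # no rule matches this path
sitemaps = parser.site_maps()                 # requires Python 3.8 or later

print("blocked URL fetchable:", blocked)
print("allowed URL fetchable:", allowed)
print("sitemaps:", sitemaps)
```

Matching here is by path prefix, so any URL whose path begins with /en/pdc-new-en is treated as disallowed for every user agent, while all other paths remain fetchable.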