pearsonlongman.com
robots.txt

Robots Exclusion Standard data for pearsonlongman.com

Resource Scan

Scan Details

Site Domain pearsonlongman.com
Base Domain pearsonlongman.com
Scan Status Ok
Last Scan 2026-04-03T10:27:04+00:00
Next Scan 2026-04-10T10:27:04+00:00

Last Scan

Scanned 2026-04-03T10:27:04+00:00
URL https://pearsonlongman.com/robots.txt
Redirect https://www.ldoceonline.com/robots.txt
Redirect Domain www.ldoceonline.com
Redirect Base ldoceonline.com
Domain IPs 159.182.72.19
Redirect IPs 52.45.11.161
Response IP 52.45.11.161
Found Yes
Hash cc9939aae4177181ad183d19d170dfc4eab1d3225c26a9c653cdfb6e45327e33
SimHash 6b2e4894cbb3
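
The 64 hexadecimal characters in the Hash field are consistent with a SHA-256 digest of the fetched file, although the report does not name the algorithm. The following is a minimal sketch under that assumption. The function name `content_fingerprint` is hypothetical, and the sample body is illustrative, not the real file.

```python
import hashlib


def content_fingerprint(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


# Illustrative body only; the real file's bytes are not shown in this report.
sample_body = b"User-agent: *\nDisallow: /spellcheck/\n"
digest = content_fingerprint(sample_body)
print(len(digest))  # 64 hex characters, the same length as the Hash field
```

Comparing digests between scans is one way to detect that the file has changed. Whether this scanner works that way is not stated in the report.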

Groups

amazonadbot

Rule Path
Disallow /autocomplete/
Disallow /autocomplete

proximic

Rule Path
Disallow (empty; all paths allowed)

*

Rule Path
Disallow /spellcheck/
Disallow /autocomplete/
Disallow /spellcheck
Disallow /autocomplete

verity/1.1

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 3

sirdatabot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 3

Other Records

Field Value
sitemap https://www.ldoceonline.com/sitemap.xml
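
The groups above can be checked with Python's standard `urllib.robotparser`. This is a sketch only: the `ROBOTS_TXT` text is a reconstruction from the groups listed in this report, not the file as served, and the agent name `SomeBot` and the `/dictionary/example` path are made up for illustration. The `verity/1.1` group is included in the reconstruction but not queried, because parsers may treat the version suffix in the agent name differently.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the report's groups (an assumption, not the served file).
ROBOTS_TXT = """\
User-agent: amazonadbot
Disallow: /autocomplete/
Disallow: /autocomplete

User-agent: proximic
Disallow:

User-agent: *
Disallow: /spellcheck/
Disallow: /autocomplete/
Disallow: /spellcheck
Disallow: /autocomplete

User-agent: verity/1.1
Crawl-delay: 3

User-agent: sirdatabot
Crawl-delay: 3

Sitemap: https://www.ldoceonline.com/sitemap.xml
"""

BASE = "https://www.ldoceonline.com"

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# amazonadbot is blocked from autocomplete but has no spellcheck rule.
print(parser.can_fetch("amazonadbot", BASE + "/autocomplete/foo"))  # False
print(parser.can_fetch("amazonadbot", BASE + "/spellcheck/"))       # True

# proximic has an empty Disallow, which permits every path.
print(parser.can_fetch("proximic", BASE + "/autocomplete/"))        # True

# Any other agent (hypothetical "SomeBot") falls back to the * group.
print(parser.can_fetch("SomeBot", BASE + "/spellcheck/x"))          # False
print(parser.can_fetch("SomeBot", BASE + "/dictionary/example"))    # True

# sirdatabot has no path rules, only a crawl delay of 3 seconds.
print(parser.crawl_delay("sirdatabot"))                             # 3

# The sitemap record is not tied to any group (site_maps needs Python 3.8+).
print(parser.site_maps())
```

Because `amazonadbot` and `proximic` have their own groups, the `*` rules do not apply to them. This is why `amazonadbot` may fetch `/spellcheck/` while an unlisted agent may not.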