paiz.com.gt
robots.txt

Robots Exclusion Standard data for paiz.com.gt

Resource Scan

Scan Details

Site Domain paiz.com.gt
Base Domain paiz.com.gt
Scan Status Ok
Last Scan 2024-06-29T10:44:54+00:00
Next Scan 2024-07-06T10:44:54+00:00

Last Scan

Scanned 2024-06-29T10:44:54+00:00
URL https://paiz.com.gt/robots.txt
Redirect https://www.paiz.com.gt/robots.txt
Redirect Domain www.paiz.com.gt
Redirect Base paiz.com.gt
Domain IPs 192.124.249.75
Redirect IPs 108.157.254.2, 108.157.254.27, 108.157.254.59, 108.157.254.88, 2600:9000:2753:4a00:b:b847:af00:93a1, 2600:9000:2753:8000:b:b847:af00:93a1, 2600:9000:2753:8800:b:b847:af00:93a1, 2600:9000:2753:9200:b:b847:af00:93a1, 2600:9000:2753:a600:b:b847:af00:93a1, 2600:9000:2753:b600:b:b847:af00:93a1, 2600:9000:2753:ce00:b:b847:af00:93a1, 2600:9000:2753:dc00:b:b847:af00:93a1
Response IP 108.157.254.27
Found Yes
Hash 4aa1d0341c305bad9cb78bd7a302b6c5080d79072e1f5d2bc86a88f32c9b8a63
SimHash f410cd574dd0

Groups

*

Rule Path
Disallow /img/*
Disallow /account/*
Disallow /login/*
Disallow /checkout/*
Disallow /busca/*
Disallow /quick-view/*
Disallow /espiar/*
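
To see how the rules above apply to a concrete URL path, here is a minimal Python sketch of wildcard matching in the style of the Robots Exclusion Protocol. It assumes `*` matches any run of characters and a trailing `$` anchors the end of the path. The helper names (`pattern_to_regex`, `is_allowed`) and the sample paths are illustrative only and are not part of the scan data.

```python
import re

# Disallow patterns copied from the "*" group listed above.
DISALLOW = [
    "/img/*",
    "/account/*",
    "/login/*",
    "/checkout/*",
    "/busca/*",
    "/quick-view/*",
    "/espiar/*",
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Everything else is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, disallow=DISALLOW):
    """Return False if any Disallow pattern matches from the start of path."""
    # re.match anchors at the beginning, which gives prefix semantics.
    return not any(pattern_to_regex(p).match(path) for p in disallow)


# Hypothetical paths, chosen only to exercise the rules.
print(is_allowed("/img/logo.png"))   # False
print(is_allowed("/checkout/cart"))  # False
print(is_allowed("/ofertas"))        # True
```

Because this group contains no Allow rules, the sketch skips the longest-match precedence step that a full implementation would need. Note that `/img` without a trailing slash is not matched by `/img/*` and would therefore be allowed.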

Other Records

Field Value
sitemap https://www.walmart.com.gt/sitemap.xml

Comments

  • Disallow all crawlers access to certain pages.

Warnings

  • `noindex` is not a known field.
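
A warning like the one above could come from a check along these lines. This is a sketch under assumptions: the scanner's actual list of known fields is not given in the report, so `KNOWN_FIELDS` is a guess, and the `Noindex` line in `SAMPLE` is a hypothetical stand-in because the original line's value is not shown.

```python
# Assumed set of recognised field names; the scanner's real list is not shown.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap", "crawl-delay"}


def unknown_fields(robots_txt):
    """Return field names (lowercased, in first-seen order) not in KNOWN_FIELDS."""
    seen = []
    for line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        # Split on the first colon only, so URLs in the value stay intact.
        field = line.split(":", 1)[0].strip().lower()
        if field not in KNOWN_FIELDS and field not in seen:
            seen.append(field)
    return seen


# Hypothetical sample; only the Disallow and Sitemap values come from the report.
SAMPLE = """\
User-agent: *
Disallow: /img/*
Noindex: /example/
Sitemap: https://www.walmart.com.gt/sitemap.xml
"""

print(unknown_fields(SAMPLE))  # ['noindex']
```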