webagre.com
robots.txt
Robots Exclusion Standard data for webagre.com
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | webagre.com |
Base Domain | webagre.com |
Scan Status | Ok |
Last Scan | 2024-11-06T17:44:06+00:00 |
Next Scan | 2024-11-20T17:44:06+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2024-11-06T17:44:06+00:00 |
URL | https://webagre.com/robots.txt |
Domain IPs | 219.111.240.121 |
Response IP | 219.111.240.121 |
Found | Yes |
Hash | 25e9f9a93662c77ca9fc28064295c4aa1a9eb990c61698880c4cb980e9a3089c |
SimHash | c11cd978e793 |
Groups
User-agents: bingbot, adidxbot
Rule | Path |
---|---|
Allow | / |
Disallow | /job/login |
Disallow | /career/login |
Disallow | /job/apply/ |
Disallow | /career/apply/ |
Disallow | /job/preview/ |
Disallow | /career/preview/ |
Other Records
Field | Value |
---|---|
crawl-delay | 10 |
User-agent: * (any crawler not matched by a more specific group)
Rule | Path |
---|---|
Allow | / |
Disallow | /job/login |
Disallow | /career/login |
Disallow | /job/preview/ |
Disallow | /job/apply/ |
Disallow | /career/apply/ |
Disallow | /career/preview/ |
Other Records
Field | Value |
---|---|
crawl-delay | 5 |
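
Because each group lists `Allow | /` alongside narrower `Disallow` paths, the outcome for a given URL depends on how a crawler resolves overlapping rules. The following is a minimal sketch of the longest-match evaluation described in RFC 9309 (the most specific matching path wins, and a tie goes to allow), applied to the `*` group's rules. It assumes plain prefix matching, which is enough here because none of the listed paths use the `*` or `$` wildcards. The names `STAR_RULES` and `is_allowed` are illustrative and are not part of the scan data.

```python
from typing import List, Tuple

# Rules for the "*" group, copied from the table above as (rule, path) pairs.
STAR_RULES: List[Tuple[str, str]] = [
    ("allow", "/"),
    ("disallow", "/job/login"),
    ("disallow", "/career/login"),
    ("disallow", "/job/preview/"),
    ("disallow", "/job/apply/"),
    ("disallow", "/career/apply/"),
    ("disallow", "/career/preview/"),
]


def is_allowed(path: str, rules: List[Tuple[str, str]]) -> bool:
    """Return True if `path` may be crawled under longest-match evaluation.

    Assumption: rule paths are plain prefixes (no wildcard handling).
    """
    best_len = -1
    allowed = True  # a path that matches no rule is allowed
    for rule, prefix in rules:
        if not path.startswith(prefix):
            continue
        is_allow = rule == "allow"
        # A longer prefix is more specific; on equal length, allow wins.
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


if __name__ == "__main__":
    for candidate in ("/", "/job/list", "/job/login", "/job/apply", "/career/apply/42"):
        print(candidate, is_allowed(candidate, STAR_RULES))
```

Under this reading, `/job/apply` (no trailing slash) stays crawlable because the listed rule is `/job/apply/`, while anything beneath `/job/apply/` is blocked.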
Other Records
Field | Value |
---|---|
sitemap | https://webagre.com/sitemap.xml |
sitemap | https://webagre.com/storage/sitemap_job_list.xml |
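
The non-rule records above (`crawl-delay` per group, plus the two `sitemap` entries) can be read back with Python's standard `urllib.robotparser`. The sketch below is only an illustration: the `ROBOTS_TXT` text is reconstructed from the tables in this report, so its line order and layout are an assumption rather than the file as served, and `examplebot` is a hypothetical crawler name used to exercise the `*` group. `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the tables in this report. The exact layout and line
# order of the real file are an assumption.
ROBOTS_TXT = """\
User-agent: bingbot
User-agent: adidxbot
Allow: /
Disallow: /job/login
Disallow: /career/login
Disallow: /job/apply/
Disallow: /career/apply/
Disallow: /job/preview/
Disallow: /career/preview/
Crawl-delay: 10

User-agent: *
Allow: /
Disallow: /job/login
Disallow: /career/login
Disallow: /job/preview/
Disallow: /job/apply/
Disallow: /career/apply/
Disallow: /career/preview/
Crawl-delay: 5

Sitemap: https://webagre.com/sitemap.xml
Sitemap: https://webagre.com/storage/sitemap_job_list.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Delay in seconds requested for the named group and for the "*" fallback.
bing_delay = parser.crawl_delay("bingbot")
default_delay = parser.crawl_delay("examplebot")  # hypothetical crawler name

# Sitemap URLs declared anywhere in the file.
sitemaps = parser.site_maps()

if __name__ == "__main__":
    print(bing_delay, default_delay)
    print(sitemaps)
```

`crawl-delay` is not part of RFC 9309, so whether a given crawler honors these values is up to that crawler. Note also that this module's `can_fetch` evaluates rules in file order, so with `Allow: /` listed first its answers may differ from a longest-match evaluation.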