archive.globes.co.il
robots.txt

Robots Exclusion Standard data for archive.globes.co.il

Resource Scan

Scan Details

Site Domain archive.globes.co.il
Base Domain globes.co.il
Scan Status Ok
Last Scan 2024-04-29T03:45:20+00:00
Next Scan 2024-05-06T03:45:20+00:00

Last Scan

Scanned 2024-04-29T03:45:20+00:00
URL https://archive.globes.co.il/robots.txt
Domain IPs 2600:1413:b000:6::17d5:2bcf, 2600:1413:b000:6::17d5:2bdf, 96.17.96.18, 96.17.96.31
Response IP 23.59.168.98
Found Yes
Hash 6896e680cc0fafd1fb03cdfbf491cc9101e477ae20a0e0f86ac1e5b459557749
SimHash 8b3f19082b90

Groups

telegrambot (like twitterbot)

Rule Path
Disallow /

*

Rule Path
Disallow /cgi-bin/
Disallow /adstream/
Disallow /home_scripts
Disallow /7263/
Disallow %40CONTENT-REF
Disallow /news/undefined/
Disallow /en/undefined/
Disallow /bulletin/
Disallow /shared/
Disallow /apps/
Allow /bulletin/divors/nirim.html
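
The Allow record above only has an effect if the crawler prefers the most specific (longest) matching path over the shorter Disallow /bulletin/ prefix, which is the precedence described in RFC 9309. The sketch below illustrates that evaluation for the two groups listed in this scan. It is a minimal illustration, not the scanner's or any crawler's actual implementation: the names STAR_GROUP, TELEGRAM_GROUP and is_allowed are hypothetical, and it assumes plain prefix matching with no wildcard or percent-encoding handling.

```python
# Rules of the "*" group, copied from the scan above as (rule, path) pairs.
STAR_GROUP = [
    ("disallow", "/cgi-bin/"),
    ("disallow", "/adstream/"),
    ("disallow", "/home_scripts"),
    ("disallow", "/7263/"),
    ("disallow", "%40CONTENT-REF"),
    ("disallow", "/news/undefined/"),
    ("disallow", "/en/undefined/"),
    ("disallow", "/bulletin/"),
    ("disallow", "/shared/"),
    ("disallow", "/apps/"),
    ("allow", "/bulletin/divors/nirim.html"),
]

# Rules of the "telegrambot (like twitterbot)" group.
TELEGRAM_GROUP = [("disallow", "/")]


def is_allowed(path, rules):
    """Return True if `path` may be fetched under `rules`.

    Assumed precedence: the longest matching path prefix wins, and an
    Allow rule wins a tie. A path that matches no rule is allowed.
    """
    best_len = -1
    allowed = True
    for kind, prefix in rules:
        if not path.startswith(prefix):
            continue
        longer = len(prefix) > best_len
        tie_for_allow = len(prefix) == best_len and kind == "allow"
        if longer or tie_for_allow:
            best_len = len(prefix)
            allowed = kind == "allow"
    return allowed


if __name__ == "__main__":
    for p in ("/bulletin/divors/nirim.html", "/bulletin/other.html", "/news/item.html"):
        print(p, is_allowed(p, STAR_GROUP))
```

Parsers that apply the first matching rule in file order instead of the longest match would treat /bulletin/divors/nirim.html as disallowed, because Disallow /bulletin/ is listed before the Allow record.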

Other Records

Field Value
sitemap http://www.globes.co.il/data/webservices/google-maps.ashx
sitemap http://www.globes.co.il/data/webservices/google-maps.ashx?language=he
sitemap http://www.globes.co.il/data/webservices/google-maps.ashx?language=en

Comments

  • Robots.txt file
  • All robots will spider the domain