adzuna.it
robots.txt

Robots Exclusion Standard data for adzuna.it

Resource Scan

Scan Details

Site Domain adzuna.it
Base Domain adzuna.it
Scan Status Ok
Last Scan 2026-01-24T15:11:00+00:00
Next Scan 2026-01-31T15:11:00+00:00

Last Scan

Scanned 2026-01-24T15:11:00+00:00
URL https://adzuna.it/robots.txt
Redirect https://www.adzuna.it/robots.txt
Redirect Domain www.adzuna.it
Redirect Base adzuna.it
Domain IPs 176.34.146.176, 2a05:d018:1ac1:1500:4ea:2812:8aea:960f, 2a05:d018:1ac1:1501:3fa4:5f77:9e43:e34c, 2a05:d018:1ac1:1502:a2e1:d549:5be0:91f3, 34.241.118.182, 52.212.43.199
Redirect IPs 176.34.146.176, 2a05:d018:1ac1:1500:4ea:2812:8aea:960f, 2a05:d018:1ac1:1501:3fa4:5f77:9e43:e34c, 2a05:d018:1ac1:1502:a2e1:d549:5be0:91f3, 34.241.118.182, 52.212.43.199
Response IP 52.212.43.199
Found Yes
Hash a2b82ff1c3e0580f1a23e1606932cc9f3fae1b6cbf8d3f45040ae928408953df
SimHash 205009d0a4f6

Groups

amazonbot

Rule Path
Disallow /

*

Rule Path
Disallow /search?
Disallow /jobs/search?
Disallow /to-rent?
Disallow /for-sale?
Disallow /goto/ad/
Disallow /jobs/goto/ad/
Disallow /land/ad/
Disallow /jobs/land/ad/
Disallow /advanced-search?
Disallow /jobs/advanced-search?
Disallow /jobs/my-alerts?
Disallow /my-alerts?
Disallow /get_avg?
Disallow /get_stats?
Disallow /_app_count*
Disallow /app_complete*
Disallow /_create*
Disallow /*?error*
Disallow /authenticate*
Disallow /*?*loc=
Disallow /*%26loc%3D
Disallow /*?*cmp=
Disallow /*%26cm
Disallow /*?*c_id
Disallow /*%26c_i
Disallow /*?*cat
Disallow /*%26cat
Disallow /*?*q
Disallow /*%26q%3D
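
The wildcard rules above can be read as prefix patterns in which `*` stands for any run of characters. The following is a minimal Python sketch of that reading, assuming the matching semantics described in RFC 9309 (prefix match, `*` wildcard, optional trailing `$` anchor). The function names `rule_to_regex` and `is_disallowed` and the sample paths are hypothetical, chosen only for illustration, and the sketch handles Disallow rules only, since this group contains no Allow rules.

```python
import re


def rule_to_regex(path_pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: '*' matches any run of characters and a trailing '$'
    anchors the end of the path, as described in RFC 9309.
    """
    anchored = path_pattern.endswith("$")
    body = path_pattern[:-1] if anchored else path_pattern
    # Escape the literal parts and join them with '.*' for each wildcard.
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_disallowed(url_path: str, disallow_patterns: list[str]) -> bool:
    """Return True if any Disallow pattern matches from the start of the path."""
    return any(rule_to_regex(p).match(url_path) for p in disallow_patterns)


# A few of the patterns listed in the '*' group above.
RULES = ["/search?", "/*?*loc=", "/*?*q", "/authenticate*"]

# Hypothetical paths, not taken from the scan.
for path in ["/search?w=cuoco", "/jobs/milano?loc=123", "/jobs/milano"]:
    print(path, is_disallowed(path, RULES))
```

Under this reading, a short pattern such as `/*?*q` is broad: it matches any URL whose query string contains the letter `q` anywhere, not only a `q=` parameter.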

adsbot-google
adsbot-google-mobile

Rule Path
Disallow /create_notification

ccbot
gptbot
chatgpt-user
google-extended
bytespider
diffbot
facebookbot
omgili
applebot-extended
perplexitybot
amazonbot
claudebot
omgilibot
anthropic-ai
claude-web
imagesiftbot
youbot

Rule Path
Disallow /
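
To check how a given crawler is treated by groups like these, Python's standard `urllib.robotparser` can be used. The sketch below parses a small hand-written subset reconstructed from the groups listed above, not the fetched file itself, and the test URLs are assumptions for illustration. The wildcard rules of the `*` group are left out on purpose, because `urllib.robotparser` does plain prefix matching and does not interpret `*` inside paths.

```python
from urllib.robotparser import RobotFileParser

# Assumption: a reduced reconstruction of three groups from the scan,
# written in robots.txt syntax for illustration only.
ROBOTS_SUBSET = """\
User-agent: adsbot-google
User-agent: adsbot-google-mobile
Disallow: /create_notification

User-agent: gptbot
User-agent: claudebot
User-agent: ccbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_SUBSET.splitlines())

# AI crawlers in the second group are blocked from the whole site.
gptbot_home = parser.can_fetch("GPTBot", "https://www.adzuna.it/")

# AdsBot is blocked only from the notification endpoint.
adsbot_notify = parser.can_fetch(
    "AdsBot-Google", "https://www.adzuna.it/create_notification"
)
adsbot_other = parser.can_fetch("AdsBot-Google", "https://www.adzuna.it/jobs")

print(gptbot_home, adsbot_notify, adsbot_other)
```

User-agent matching in `urllib.robotparser` is case-insensitive, which is why `GPTBot` matches the lowercase `gptbot` token shown in the report.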

Other Records

Field Value
sitemap https://www.adzuna.it/sitemap_index.jobs_IT.xml
sitemap https://www.adzuna.it/sitemap_index_details.jobs_IT.xml
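
The sitemap records can be read programmatically as well. As a minimal sketch, assuming Python 3.8 or later (where `RobotFileParser.site_maps()` is available), the two `Sitemap` lines below are written out by hand from the records above rather than fetched from the site.

```python
from urllib.robotparser import RobotFileParser

# Assumption: the two sitemap records above, rewritten in robots.txt syntax.
SITEMAP_LINES = [
    "Sitemap: https://www.adzuna.it/sitemap_index.jobs_IT.xml",
    "Sitemap: https://www.adzuna.it/sitemap_index_details.jobs_IT.xml",
]

sitemap_parser = RobotFileParser()
sitemap_parser.parse(SITEMAP_LINES)

# site_maps() returns the listed URLs in file order, or None if there are none.
sitemaps = sitemap_parser.site_maps()
for url in sitemaps or []:
    print(url)
```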

Comments

  • JOB-3239: Block crawling of URLs with filtering query params to prevent SEO bloat
  • Disallow /create_notification endpoint from being accessed by the AdsBot
  • https://developers.google.com/search/docs/advanced/crawling/overview-google-crawlers
  • Sitemap links for our two primary sitemaps (core and details)
  • JOB-2438: disallow ChatGPT crawlers (see https://darkvisitors.com/agents)