iwmi.cgiar.org
robots.txt

Robots Exclusion Standard data for iwmi.cgiar.org

Resource Scan

Scan Details

Site Domain iwmi.cgiar.org
Base Domain cgiar.org
Scan Status Ok
Last Scan 2024-11-18T07:06:28+00:00
Next Scan 2024-12-18T07:06:28+00:00

Last Scan

Scanned 2024-11-18T07:06:28+00:00
URL https://iwmi.cgiar.org/robots.txt
Redirect https://www.iwmi.org/robots.txt
Redirect Domain www.iwmi.org
Redirect Base iwmi.org
Domain IPs 141.193.213.10, 141.193.213.11
Redirect IPs 141.193.213.10, 141.193.213.11
Response IP 141.193.213.10
Found Yes
Hash 8714ed6682a664d065ce4c918372720a9ac06210c8f6605edf1c072706988e49
SimHash 094adb11d6f5
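
The Hash value above is 64 hexadecimal digits, which is the length of a SHA-256 digest. The report does not name the algorithm, so the following is only a minimal sketch, assuming the scanner hashes the raw response body with SHA-256. The function names are illustrative and not part of any scanner API.

```python
import hashlib

# Hash value copied from the scan report above.
REPORTED_HASH = "8714ed6682a664d065ce4c918372720a9ac06210c8f6605edf1c072706988e49"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the raw bytes as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_report(body: bytes, expected: str = REPORTED_HASH) -> bool:
    """True if the body hashes to the value in the report.

    Assumption: the report's Hash is SHA-256 over the unmodified body.
    """
    return sha256_hex(body) == expected.lower()
```

A mismatch would not necessarily mean the file changed; it could also mean the scanner normalizes the body or uses a different algorithm.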

Groups

twitterbot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 600

googlebot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 600

googlebot-image

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 600

googlebot-video

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 600

googlebot-news

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 600

mediapartners-google

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 600

bingbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 600

bing

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 600

msnbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 600

msnbot-media

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 600

slurp

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 600

yahoo-blogs

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 600

yahoo-mmcrawler

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 600

duckduckbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 600

yandexbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 600

exabot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 600

facebot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 600

facebookexternalhit

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 600

ia_archiver

Rule Path
Allow /Publications/*/
Allow /wp-content/images/*/
Allow /wp-content/uploads/*/
Allow /wp-includes/*/
Allow /events/*
Disallow /wp-admin/
Disallow /wp-content/mu-plugins/
Disallow /wp-content/upgrade/
Disallow /search-results/
Disallow /search_gcse/
Disallow /wp-login.php
Disallow /wp-cron.php
Disallow /xmlrpc.php

Other Records

Field Value
crawl-delay 600
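
The ia_archiver group above can be exercised with Python's standard urllib.robotparser. This is a sketch under stated assumptions: the robots.txt text below is a reconstruction of this one group from the listed rules, not the original file, and the /about/ path is a hypothetical example. The standard-library parser is not guaranteed to interpret `*` wildcards the way major crawlers do, so only plain prefix rules are checked here.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the report's ia_archiver group (assumption: the
# original file may order or group these lines differently).
IA_ARCHIVER_GROUP = """\
User-agent: ia_archiver
Allow: /Publications/*/
Allow: /wp-content/images/*/
Allow: /wp-content/uploads/*/
Allow: /wp-includes/*/
Allow: /events/*
Disallow: /wp-admin/
Disallow: /wp-content/mu-plugins/
Disallow: /wp-content/upgrade/
Disallow: /search-results/
Disallow: /search_gcse/
Disallow: /wp-login.php
Disallow: /wp-cron.php
Disallow: /xmlrpc.php
Crawl-delay: 600
"""

parser = RobotFileParser()
parser.parse(IA_ARCHIVER_GROUP.splitlines())


def allowed(url: str, agent: str = "ia_archiver") -> bool:
    """Ask the parsed rules whether `agent` may fetch `url`."""
    return parser.can_fetch(agent, url)


# Plain prefix rules only; wildcard rules are deliberately not asserted.
admin_ok = allowed("https://www.iwmi.org/wp-admin/")
about_ok = allowed("https://www.iwmi.org/about/")  # hypothetical path
delay = parser.crawl_delay("ia_archiver")
```

Here admin_ok is False because /wp-admin/ is disallowed, about_ok is True because no rule matches that path, and delay is the 600 seconds listed under Other Records.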

Other Records

Field Value
sitemap https://www.iwmi.cgiar.org/post-sitemap.xml
sitemap https://www.iwmi.cgiar.org/post-sitemap2.xml
sitemap https://www.iwmi.cgiar.org/post-sitemap3.xml
sitemap https://www.iwmi.cgiar.org/page-sitemap.xml
sitemap https://www.iwmi.cgiar.org/event-sitemap.xml
sitemap https://www.iwmi.cgiar.org/people-sitemap.xml
sitemap https://www.iwmi.cgiar.org/category-sitemap.xml
sitemap https://www.iwmi.cgiar.org/post_tag-sitemap.xml
sitemap https://www.iwmi.cgiar.org/post_tag-sitemap2.xml
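
The sitemap records above all point at www.iwmi.cgiar.org, while the scan shows the robots.txt request itself redirecting to www.iwmi.org. A short sketch for pulling Sitemap lines out of a robots.txt body and listing the hosts they reference follows. The helper names and the two-line sample are illustrative assumptions, not the original file.

```python
from typing import List, Set
from urllib.parse import urlsplit


def sitemap_urls(robots_txt: str) -> List[str]:
    """Collect the values of all Sitemap records, in file order."""
    urls = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()      # drop trailing comments
        field, sep, value = line.partition(":")  # split at the first colon
        if sep and field.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


def sitemap_hosts(robots_txt: str) -> Set[str]:
    """Return the distinct hostnames referenced by Sitemap records."""
    return {urlsplit(u).hostname for u in sitemap_urls(robots_txt)}


# Two of the sitemap records from the report, written as robots.txt lines.
SAMPLE = """\
Sitemap: https://www.iwmi.cgiar.org/post-sitemap.xml
Sitemap: https://www.iwmi.cgiar.org/page-sitemap.xml
"""
```

Comparing sitemap_hosts(SAMPLE) with the Redirect Domain in the scan details is one way to spot sitemap entries that still use the pre-redirect hostname.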

Comments

  • allow all
  • allow the twitter bot
  • set the crawl delay
  • but allow only important bots
  • Index these directories
  • Do not index directories
  • Files

Warnings

  • 1 invalid line.