andalucia.com
robots.txt

Robots Exclusion Standard data for andalucia.com

Resource Scan

Scan Details

Site Domain andalucia.com
Base Domain andalucia.com
Scan Status Ok
Last Scan 2024-11-11T03:49:26+00:00
Next Scan 2024-11-18T03:49:26+00:00

Last Scan

Scanned 2024-11-11T03:49:26+00:00
URL https://andalucia.com/robots.txt
Domain IPs 104.21.91.148, 172.67.222.192, 2606:4700:3033::ac43:dec0, 2606:4700:3037::6815:5b94
Response IP 172.67.222.192
Found Yes
Hash 7f280d633bed3106b58aea90d6fe1209f0f12df94f2b0fa8ab6720cdb513fb99
SimHash 2d961d4bc570

Groups

mediapartners-google

Rule Path
Disallow

nutch

Rule Path
Disallow /

megaindex

Rule Path
Disallow /

gptbot

Rule Path
Disallow /
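The four named groups above can be checked with Python's standard `urllib.robotparser`. The following is a minimal sketch, assuming the groups are written back in robots.txt syntax as shown; the reconstructed text and the sample URL are illustrative assumptions, not the scanned file verbatim.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of the four named groups in robots.txt syntax.
NAMED_GROUPS = """\
User-agent: mediapartners-google
Disallow:

User-agent: nutch
Disallow: /

User-agent: megaindex
Disallow: /

User-agent: gptbot
Disallow: /
"""

# Hypothetical page URL, used only for illustration.
SAMPLE_URL = "https://www.andalucia.com/some-page"

named_parser = RobotFileParser()
named_parser.parse(NAMED_GROUPS.splitlines())

for agent in ("mediapartners-google", "nutch", "megaindex", "gptbot"):
    verdict = "allowed" if named_parser.can_fetch(agent, SAMPLE_URL) else "blocked"
    print(f"{agent}: {verdict}")
```

An empty `Disallow` value blocks nothing, so mediapartners-google may fetch any path, while `Disallow /` closes the whole site to the other three agents.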

*

Rule Path
Disallow /includes/
Disallow /misc/
Disallow /modules/
Disallow /profiles/
Disallow /scripts/
Disallow /themes/
Disallow /admin/
Disallow /andalucia.com/
Disallow /development/
Disallow /facebook/
Disallow /fbtest/
Disallow /files/
Disallow /flash/
Disallow /frames/
Disallow /links/
Disallow /data/
Disallow /parser/
Disallow /jobs/
Disallow /jobs_old/
Disallow /cgi-bin/
Disallow /homepages/
Disallow /Html/
Disallow /imageres/
Disallow /jm/
Disallow /logos/
Disallow /modules/
Disallow /newhomepage/
Disallow /newhomepage2/
Disallow /phpadsnew/
Disallow /quiztest/
Disallow /temp/
Disallow /tomclicks/
Disallow /CHANGELOG.txt
Disallow /cron.php
Disallow /INSTALL.mysql.txt
Disallow /INSTALL.pgsql.txt
Disallow /install.php
Disallow /INSTALL.txt
Disallow /LICENSE.txt
Disallow /MAINTAINERS.txt
Disallow /update.php
Disallow /UPGRADE.txt
Disallow /xmlrpc.php
Disallow /admin/
Disallow /comment/reply/
Disallow /contact/
Disallow /logout/
Disallow /node/add/
Disallow /search/
Disallow /user/register/
Disallow /user/password/
Disallow /user/login/
Disallow /user/
Disallow /?q=admin%2F
Disallow /?q=comment%2Freply%2F
Disallow /?q=contact%2F
Disallow /?q=logout%2F
Disallow /?q=node%2Fadd%2F
Disallow /?q=search%2F
Disallow /?q=user%2Fpassword%2F
Disallow /?q=user%2Fregister%2F
Disallow /?q=user%2Flogin%2F

Other Records

Field Value
crawl-delay 20
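A crawler that chooses to honor the `crawl-delay` record could read it and space out its requests roughly as sketched below. This assumes the record belongs to the `*` group; the fragment, the crawler name `ExampleBot`, and the helper `seconds_to_wait` are hypothetical. Crawl-delay is a non-standard extension, so not every crawler honors it.

```python
from urllib.robotparser import RobotFileParser

# Assumed fragment: one rule from the group plus its crawl-delay record.
DELAY_FRAGMENT = """\
User-agent: *
Disallow: /admin/
Crawl-delay: 20
"""

delay_parser = RobotFileParser()
delay_parser.parse(DELAY_FRAGMENT.splitlines())


def seconds_to_wait(last_fetch, now, delay):
    """Return how long to sleep so two fetches are at least `delay` seconds apart."""
    if delay is None:
        return 0.0
    return max(0.0, delay - (now - last_fetch))


delay = delay_parser.crawl_delay("ExampleBot")  # hypothetical crawler name
print(delay)
print(seconds_to_wait(last_fetch=100.0, now=105.0, delay=delay))
```

Timestamps are passed in explicitly so the helper stays easy to test; a real crawler would typically supply `time.monotonic()` values.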

*

Rule Path
Disallow /forums/admin/
Disallow /forums/db/
Disallow /forums/images/
Disallow /forums/includes/
Disallow /forums/language/
Disallow /forums/templates/
Disallow /forums/common.php
Disallow /forums/config.php
Disallow /forums/faq.php
Disallow /forums/groupcp.php
Disallow /forums/login.php
Disallow /forums/memberlist.php
Disallow /forums/modcp.php
Disallow /forums/posting.php
Disallow /forums/privmsg.php
Disallow /forums/profile.php
Disallow /forums/search.php
Disallow /forums/viewonline.php

*

Rule Path
Disallow /node/1
Disallow /node/31226
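The report lists more than one group for `*`. RFC 9309 says rules from groups that match the same user agent are to be combined, but some parsers keep only the first such group (CPython's `urllib.robotparser` is one example at the time of writing). The sketch below shows one way to merge repeated groups before evaluating them; `merge_groups` and the sample text are hypothetical, and the parsing is deliberately simplified.

```python
from collections import defaultdict


def merge_groups(text):
    """Collect (directive, path) rules per user agent, merging repeated groups."""
    rules = defaultdict(list)
    agents = []        # user agents of the group currently being read
    in_rules = False   # True once the current group has seen a rule line
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:                      # a rule came before: new group
                agents, in_rules = [], False
            agents.append(value.lower())
        elif field in ("allow", "disallow"):
            in_rules = True
            for agent in agents:
                rules[agent].append((field, value))
    return dict(rules)


# Assumed excerpt of two of the separate "*" groups.
REPEATED = """\
User-agent: *
Disallow: /forums/admin/

User-agent: *
Disallow: /node/1
Disallow: /node/31226
"""

merged = merge_groups(REPEATED)
print(merged["*"])
```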

*

Rule Path
Allow /core/*.css$
Allow /core/*.css?
Allow /core/*.js$
Allow /core/*.js?
Allow /core/*.gif
Allow /core/*.jpg
Allow /core/*.jpeg
Allow /core/*.png
Allow /core/*.svg
Allow /profiles/*.css$
Allow /profiles/*.css?
Allow /profiles/*.js$
Allow /profiles/*.js?
Allow /profiles/*.gif
Allow /profiles/*.jpg
Allow /profiles/*.jpeg
Allow /profiles/*.png
Allow /profiles/*.svg
Allow /sites/default/files/styles/*
Disallow /core/
Disallow /profiles/
Disallow /README.txt
Disallow /web.config
Disallow /admin/
Disallow /comment/reply/
Disallow /filter/tips
Disallow /node/add/
Disallow /search/
Disallow /user/register/
Disallow /user/password/
Disallow /user/login/
Disallow /user/logout/
Disallow /index.php/admin/
Disallow /index.php/comment/reply/
Disallow /index.php/filter/tips
Disallow /index.php/node/add/
Disallow /index.php/search/
Disallow /index.php/user/password/
Disallow /index.php/user/register/
Disallow /index.php/user/login/
Disallow /index.php/user/logout/
Disallow /hotel-viewer
Disallow /rentals-viewer
Disallow /tours-viewer
Disallow /taxonomy/term/
Disallow /feed
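This group mixes `Allow` patterns that use `*` and `$` with plain `Disallow` prefixes. Under RFC 9309 the most specific (longest) matching pattern wins, and `Allow` wins a tie. The following is a rough sketch of that evaluation, assuming a small subset of the rules above; `pattern_to_regex`, `is_allowed`, and the sample paths are hypothetical and percent-encoding is not handled.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern ('*' wildcard, trailing '$') to a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules):
    """Longest matching pattern wins; on equal length, allow beats disallow."""
    best = None  # (pattern length, is_allow)
    for directive, pattern in rules:
        if pattern and pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), directive == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


# Assumed subset of the group above.
RULES = [
    ("allow", "/core/*.css$"),
    ("allow", "/core/*.css?"),
    ("disallow", "/core/"),
    ("disallow", "/feed"),
]

# Hypothetical paths, for illustration only.
for path in ("/core/misc/drupal.css", "/core/install.php", "/feedback", "/about"):
    print(path, is_allowed(path, RULES))
```

Because rules are prefix matches, `Disallow /feed` also covers any path that merely starts with `/feed`, such as the hypothetical `/feedback`.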

Other Records

Field Value
sitemap https://www.andalucia.com/sitemap.xml
sitemap https://www.andalucia.com/blog/sitemap.xml
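The sitemap records can be read back with `RobotFileParser.site_maps()` (Python 3.8 or later). A minimal sketch, assuming the two records are written in robots.txt syntax; the commented-out line is included only to show that lines starting with `#` are not reported as sitemap records.

```python
from urllib.robotparser import RobotFileParser

# Assumed fragment: the two active sitemap records plus one commented-out line.
SITEMAP_FRAGMENT = """\
Sitemap: https://www.andalucia.com/sitemap.xml
Sitemap: https://www.andalucia.com/blog/sitemap.xml
# Sitemap: https://www.andalucia.com/hotels/sitemap.xml
"""

sitemap_parser = RobotFileParser()
sitemap_parser.parse(SITEMAP_FRAGMENT.splitlines())

sitemaps = sitemap_parser.site_maps() or []  # site_maps() returns None when empty
for url in sitemaps:
    print(url)
```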

Comments

  • $Id: robots.txt,v 1.9.2.2 2010/09/06 10:37:16 goba Exp $
  • CC 29/6/2013 CC14/05/2020
  • robots.txt
  • This file is to prevent the crawling and indexing of certain parts
  • of your site by web crawlers and spiders run by sites like Yahoo!
  • and Google. By telling these "robots" where not to go on your site,
  • you save bandwidth and server resources.
  • This file will be ignored unless it is at the root of your host:
  • Used: http://example.com/robots.txt
  • Ignored: http://example.com/site/robots.txt
  • For more information about the robots.txt standard, see:
  • http://www.robotstxt.org/wc/robots.html
  • For syntax checking, see:
  • http://www.sxw.org.uk/computing/robots/check.html
  • Below brought over from D6 and since updated
  • Directories
  • Disallow: /sites/
  • Files
  • Paths (clean URLs)
  • Paths (no clean URLs)
  • forum robots.txt
  • forum pages
  • Sitemaps
  • Sitemap: https://www.andalucia.com/hotels/sitemap.xml
  • Sitemap: https://www.andalucia.com/property/sitemap.xml
  • Sitemap: https://www.andalucia.com/tours/sitemap.xml
  • Sitemap: https://www.andalucia.com/rental/sitemap.xml
  • Below from D8 configuration
  • CSS, JS, Images
  • Twitter card
  • Directories
  • Files
  • Paths (clean URLs)
  • Paths (no clean URLs)
  • Pages created by Views
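
The comments above note that robots.txt is only honored at the root of the host. A small sketch of how a client might derive that location from any page URL follows; the helper name `robots_url` is hypothetical, and the URLs are the placeholder ones used in the comments.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the root-level robots.txt URL for the host serving `page_url`."""
    parts = urlsplit(page_url)
    # Keep scheme and host; replace path, query, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("http://example.com/site/page.html?x=1"))
```

A file at `http://example.com/site/robots.txt` is never consulted, because the lookup always resolves to the host root.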