chicbrides.com.mx
robots.txt

Robots Exclusion Standard data for chicbrides.com.mx

Resource Scan

Scan Details

Site Domain chicbrides.com.mx
Base Domain chicbrides.com.mx
Scan Status Ok
Last Scan 2024-11-16T12:15:07+00:00
Next Scan 2024-11-23T12:15:07+00:00

Last Scan

Scanned 2024-11-16T12:15:07+00:00
URL https://chicbrides.com.mx/robots.txt
Redirect https://www.chicmagazine.com.mx/robots.txt
Redirect Domain www.chicmagazine.com.mx
Redirect Base chicmagazine.com.mx
Domain IPs 18.155.68.16, 18.155.68.33, 18.155.68.59, 18.155.68.8
Redirect IPs 216.137.52.109, 216.137.52.11, 216.137.52.67, 216.137.52.95
Response IP 18.165.140.53
Found Yes
Hash 81b222b8ba530d9c56074c8bbd75fc009d5dbee2a0bbe3701f19dd689514ebab
SimHash b8161f032d65

Groups

*

Rule Path
Disallow /node/
Disallow /cdb/
Disallow /wp-content/
Disallow /sites/
Disallow /Topicos/
Disallow /7198/
Disallow /noticias/
Disallow /bbtstats/
Disallow /bbtfile/
Disallow /feed/
Disallow /rss7/
Disallow /rss10/
Disallow /MediaCenter/
Disallow /portal/

genio

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

scooperbot

Rule Path
Disallow /

rogerbot

Rule Path
Disallow /

flamingo_searchengine

Rule Path
Disallow /

facebot

Rule Path
Disallow /

luminatebot

Rule Path
Disallow /

vagabondo

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

seznambot

Rule Path
Disallow /

r6_commentreader

Rule Path
Disallow /

yeti

Rule Path
Disallow /

heritrix

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

showyoubot

Rule Path
Disallow /

gozaikbot

Rule Path
Disallow /

python-requests

Rule Path
Disallow /

queryseekerspider

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

yandeximages

Rule Path
Disallow /

apache-httpclient

Rule Path
Disallow /

piplbot

Rule Path
Disallow /

scrapy

Rule Path
Disallow /

buck

Rule Path
Disallow /

wikido

Rule Path
Disallow /

zoominfobot

Rule Path
Disallow /

sogou

Rule Path
Disallow /

zend_http_client

Rule Path
Disallow /

robots

Rule Path
Disallow /

arquivo-web-crawler

Rule Path
Disallow /

bidswitchbot

Rule Path
Disallow /

g-i-g-a-b-o-t

Rule Path
Disallow /

gigabot

Rule Path
Disallow /

garlikcrawler

Rule Path
Disallow /

caam

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

clickagy intelligence bot

Rule Path
Disallow /

jersey

Rule Path
Disallow /

libwww-perl

Rule Path
Disallow /

ltx71

Rule Path
Disallow /

omgili

Rule Path
Disallow /

piplbot

Rule Path
Disallow /

python-urllib

Rule Path
Disallow /

zoominfobot

Rule Path
Disallow /

grapeshot

Rule Path
Disallow
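
The groups above can be checked programmatically. The following is a minimal sketch using Python's standard urllib.robotparser, not the scanner's own method. It assumes a shortened excerpt of the rules (ROBOTS_EXCERPT holds only a few of the listed groups), and "examplebot" is a hypothetical crawler name used to exercise the wildcard group. Note that the empty Disallow in the grapeshot group places no restriction on that crawler.

```python
from urllib.robotparser import RobotFileParser

# Shortened excerpt of the groups listed above (assumption: not the full file).
ROBOTS_EXCERPT = """\
User-agent: *
Disallow: /node/
Disallow: /feed/

User-agent: ahrefsbot
Disallow: /

User-agent: grapeshot
Disallow:
"""

BASE = "https://www.chicmagazine.com.mx"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = build_parser(ROBOTS_EXCERPT)

# ahrefsbot has "Disallow: /", so every path is blocked.
blocked_everywhere = rules.can_fetch("ahrefsbot", BASE + "/")

# grapeshot has an empty Disallow, which blocks nothing.
grapeshot_allowed = rules.can_fetch("grapeshot", BASE + "/node/1")

# "examplebot" (hypothetical) has no group of its own and falls back to "*".
wildcard_node = rules.can_fetch("examplebot", BASE + "/node/1")
wildcard_root = rules.can_fetch("examplebot", BASE + "/")

print(blocked_everywhere, grapeshot_allowed, wildcard_node, wildcard_root)
```

Python's parser matches user-agent names by case-insensitive substring, so results for other crawlers may differ from parsers that follow stricter matching rules.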

Other Records

Field Value
sitemap https://www.chicmagazine.com.mx/sitemap/sitemap-articles-index.xml
sitemap https://www.chicmagazine.com.mx/sitemap/sitemap-google-news-index.xml
sitemap https://www.chicmagazine.com.mx/sitemap/sitemap-tags-index.xml
sitemap https://www.chicmagazine.com.mx/sitemap/sitemap-images-index.xml
sitemap https://www.chicmagazine.com.mx/sitemap/sitemap-videos-index.xml
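
As an illustration of how the sitemap records above can be read back out of a robots.txt file, here is a small sketch using urllib.robotparser. It assumes Python 3.8 or later, where RobotFileParser.site_maps() is available, and feeds the five records in as literal lines rather than fetching the live file.

```python
from urllib.robotparser import RobotFileParser

# The five sitemap records listed above, written as robots.txt lines.
SITEMAP_LINES = [
    "Sitemap: https://www.chicmagazine.com.mx/sitemap/sitemap-articles-index.xml",
    "Sitemap: https://www.chicmagazine.com.mx/sitemap/sitemap-google-news-index.xml",
    "Sitemap: https://www.chicmagazine.com.mx/sitemap/sitemap-tags-index.xml",
    "Sitemap: https://www.chicmagazine.com.mx/sitemap/sitemap-images-index.xml",
    "Sitemap: https://www.chicmagazine.com.mx/sitemap/sitemap-videos-index.xml",
]

sitemap_parser = RobotFileParser()
sitemap_parser.parse(SITEMAP_LINES)

# site_maps() returns None when no sitemap records exist, so fall back to [].
sitemaps = sitemap_parser.site_maps() or []

for url in sitemaps:
    print(url)
```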

Comments

  • robots.txt
  • This file is to prevent the crawling and indexing of certain parts
  • of your site by web crawlers and spiders run by sites like Yahoo!
  • and Google. By telling these "robots" where not to go on your site,
  • you save bandwidth and server resources.
  • This file will be ignored unless it is at the root of your host:
  • Used: http://example.com/robots.txt
  • Ignored: http://example.com/site/robots.txt
  • For more information about the robots.txt standard, see:
  • http://www.robotstxt.org/wc/robots.html
  • For syntax checking, see:
  • http://www.sxw.org.uk/computing/robots/check.html
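
The root-of-host rule described in the comments above can be sketched as follows. The helper name robots_url is hypothetical; it simply rebuilds the one location a crawler would consult for any URL on a host.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the host of page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host; replace the path and drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# A robots.txt placed in a subdirectory is ignored; only the root copy counts.
print(robots_url("http://example.com/site/robots.txt"))
print(robots_url("https://chicbrides.com.mx/"))
```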