chicmagazine.com.mx
robots.txt

Robots Exclusion Standard data for chicmagazine.com.mx

Resource Scan

Scan Details

Site Domain chicmagazine.com.mx
Base Domain chicmagazine.com.mx
Scan Status Ok
Last Scan 2024-06-13T15:26:02+00:00
Next Scan 2024-06-20T15:26:02+00:00

Last Scan

Scanned 2024-06-13T15:26:02+00:00
URL https://chicmagazine.com.mx/robots.txt
Redirect https://www.chicmagazine.com.mx/robots.txt
Redirect Domain www.chicmagazine.com.mx
Redirect Base chicmagazine.com.mx
Domain IPs 13.33.183.125, 13.33.183.28, 13.33.183.58, 13.33.183.78
Redirect IPs 13.227.74.37, 13.227.74.45, 13.227.74.49, 13.227.74.65
Response IP 18.165.171.76
Found Yes
Hash cc7d893ca80f734adc1544ee7491c6064835170ef236c6fcb7d2188fac6b00dd
SimHash b8161f032d65

Groups

*

Rule Path
Disallow /node/
Disallow /cdb/
Disallow /wp-content/
Disallow /sites/
Disallow /Topicos/
Disallow /7198/
Disallow /noticias/
Disallow /bbtstats/
Disallow /bbtfile/
Disallow /feed/
Disallow /rss7/
Disallow /rss10/
Disallow /MediaCenter/
Disallow /portal/

genio

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

scooperbot

Rule Path
Disallow /

rogerbot

Rule Path
Disallow /

flamingo_searchengine

Rule Path
Disallow /

facebot

Rule Path
Disallow /

luminatebot

Rule Path
Disallow /

vagabondo

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

seznambot

Rule Path
Disallow /

r6_commentreader

Rule Path
Disallow /

yeti

Rule Path
Disallow /

heritrix

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

showyoubot

Rule Path
Disallow /

gozaikbot

Rule Path
Disallow /

python-requests

Rule Path
Disallow /

queryseekerspider

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

yandeximages

Rule Path
Disallow /

apache-httpclient

Rule Path
Disallow /

piplbot

Rule Path
Disallow /

scrapy

Rule Path
Disallow /

buck

Rule Path
Disallow /

wikido

Rule Path
Disallow /

zoominfobot

Rule Path
Disallow /

sogou

Rule Path
Disallow /

zend_http_client

Rule Path
Disallow /

robots

Rule Path
Disallow /

arquivo-web-crawler

Rule Path
Disallow /

bidswitchbot

Rule Path
Disallow /

g-i-g-a-b-o-t

Rule Path
Disallow /

gigabot

Rule Path
Disallow /

garlikcrawler

Rule Path
Disallow /

caam

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

clickagy intelligence bot

Rule Path
Disallow /

jersey

Rule Path
Disallow /

libwww-perl

Rule Path
Disallow /

ltx71

Rule Path
Disallow /

omgili

Rule Path
Disallow /

piplbot

Rule Path
Disallow /

python-urllib

Rule Path
Disallow /

zoominfobot

Rule Path
Disallow /

grapeshot

Rule Path
Disallow (empty path)
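
To illustrate how a crawler might evaluate the groups listed above, here is a minimal sketch using Python's standard `urllib.robotparser`. The excerpt is reconstructed from a few of the groups in this report and is not the full file; the agent name `examplebot` and the path `/moda/` are hypothetical and serve only to exercise the `*` group.

```python
from urllib.robotparser import RobotFileParser

# Excerpt rebuilt from a few groups in this report (assumption: the
# original file uses standard "User-agent:" / "Disallow:" syntax).
ROBOTS_EXCERPT = """\
User-agent: *
Disallow: /node/
Disallow: /feed/

User-agent: ahrefsbot
Disallow: /

User-agent: grapeshot
Disallow:
"""

BASE = "https://www.chicmagazine.com.mx"

parser = RobotFileParser()
parser.parse(ROBOTS_EXCERPT.splitlines())

# "examplebot" is a hypothetical agent with no group of its own,
# so it falls back to the "*" group.
print(parser.can_fetch("examplebot", BASE + "/moda/"))  # not under a disallowed prefix
print(parser.can_fetch("examplebot", BASE + "/feed/"))  # blocked by "Disallow: /feed/"

# ahrefsbot has its own group that disallows the whole site.
print(parser.can_fetch("ahrefsbot", BASE + "/moda/"))

# grapeshot has an empty Disallow, which permits everything,
# including paths that the "*" group blocks.
print(parser.can_fetch("grapeshot", BASE + "/feed/"))
```

A named group replaces the `*` group for that agent rather than adding to it, which is why the empty `Disallow` for grapeshot lifts every restriction in this sketch.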

Other Records

Field Value
sitemap https://www.chicmagazine.com.mx/sitemap/sitemap-articles-index.xml
sitemap https://www.chicmagazine.com.mx/sitemap/sitemap-default-current.xml
sitemap https://www.chicmagazine.com.mx/sitemap/sitemap-google-news-index.xml
sitemap https://www.chicmagazine.com.mx/sitemap/sitemap-tags-index.xml
sitemap https://www.chicmagazine.com.mx/sitemap/sitemap-images-index.xml
sitemap https://www.chicmagazine.com.mx/sitemap/sitemap-videos-index.xml
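
The sitemap records above sit outside any user-agent group. As a rough sketch, they can be read with `RobotFileParser.site_maps()`, which is available in Python 3.8 and later. The excerpt below contains only two of the six sitemap URLs from this report, and its `Disallow` line is included only to make the sample look like a complete file.

```python
from urllib.robotparser import RobotFileParser

# Two of the sitemap records from this report, in standard syntax.
SITEMAP_EXCERPT = """\
User-agent: *
Disallow: /feed/

Sitemap: https://www.chicmagazine.com.mx/sitemap/sitemap-articles-index.xml
Sitemap: https://www.chicmagazine.com.mx/sitemap/sitemap-google-news-index.xml
"""

sitemap_parser = RobotFileParser()
sitemap_parser.parse(SITEMAP_EXCERPT.splitlines())

# site_maps() returns None when no Sitemap lines are present,
# so fall back to an empty list.
sitemaps = sitemap_parser.site_maps() or []
for url in sitemaps:
    print(url)
```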

Comments

  • robots.txt
  • This file is to prevent the crawling and indexing of certain parts
  • of your site by web crawlers and spiders run by sites like Yahoo!
  • and Google. By telling these "robots" where not to go on your site,
  • you save bandwidth and server resources.
  • This file will be ignored unless it is at the root of your host:
  • Used: http://example.com/robots.txt
  • Ignored: http://example.com/site/robots.txt
  • For more information about the robots.txt standard, see:
  • http://www.robotstxt.org/wc/robots.html
  • For syntax checking, see:
  • http://www.sxw.org.uk/computing/robots/check.html
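
The comments above note that the file is only honored at the root of the host. A small sketch of that rule: given any page URL, the robots.txt location is the same scheme and host with the path replaced by `/robots.txt`. The helper name `robots_url` is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url):
    """Return the root-level robots.txt URL for the host of page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host; drop the original path, query, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

# A page under /site/ still maps to the root-level file.
print(robots_url("http://example.com/site/page.html"))
```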