globalmedia.mx
robots.txt

Robots Exclusion Standard data for globalmedia.mx

Resource Scan

Scan Details

Site Domain globalmedia.mx
Base Domain globalmedia.mx
Scan Status Ok
Last Scan 2024-10-29T02:35:10+00:00
Next Scan 2024-11-28T02:35:10+00:00

Last Scan

Scanned 2024-10-29T02:35:10+00:00
URL https://globalmedia.mx/robots.txt
Domain IPs 34.201.80.84, 54.157.4.65, 54.196.16.164, 54.91.6.89
Response IP 54.196.16.164
Found Yes
Hash 27c7cc19a2123166d304e56cf6ec5641ea5a66c31ec3c8f96d1160c376a1b1ad
SimHash 38941f173ca6
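
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest of the fetched file, although the report does not name the algorithm. Assuming SHA-256 over the raw response body, a digest of the same shape could be computed as in the sketch below. The function name `sha256_hex` and the sample body are illustrative only and are not taken from the scanner.

```python
import hashlib


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of a response body as lowercase hex.

    Assumption: the report's Hash field is SHA-256 of the raw bytes.
    The report itself does not state which algorithm it uses.
    """
    return hashlib.sha256(body).hexdigest()


# Hypothetical body, NOT the real robots.txt content of the scanned site.
sample_body = b"User-agent: *\nDisallow: /angular/\nDisallow: /assets/\n"
digest = sha256_hex(sample_body)

# A SHA-256 hex digest always has 64 characters, like the Hash field above.
print(len(digest), digest)
```

Comparing such a digest between two scans would only show whether the file changed byte for byte. The SimHash field is a separate, scanner-specific value and is not reproduced here.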

Groups

*

Rule Path
Disallow /angular/
Disallow /assets/

genio

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

scooperbot

Rule Path
Disallow /

rogerbot

Rule Path
Disallow /

flamingo_searchengine

Rule Path
Disallow /

facebot

Rule Path
Disallow /

luminatebot

Rule Path
Disallow /

vagabondo

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

seznambot

Rule Path
Disallow /

r6_commentreader

Rule Path
Disallow /

yeti

Rule Path
Disallow /

heritrix

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

showyoubot

Rule Path
Disallow /

gozaikbot

Rule Path
Disallow /

python-requests

Rule Path
Disallow /

queryseekerspider

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

yandeximages

Rule Path
Disallow /

apache-httpclient

Rule Path
Disallow /

piplbot

Rule Path
Disallow /

scrapy

Rule Path
Disallow /

buck

Rule Path
Disallow /

wikido

Rule Path
Disallow /

zoominfobot

Rule Path
Disallow /

sogou

Rule Path
Disallow /

zend_http_client

Rule Path
Disallow /

robots

Rule Path
Disallow /

arquivo-web-crawler

Rule Path
Disallow /

bidswitchbot

Rule Path
Disallow /

g-i-g-a-b-o-t

Rule Path
Disallow /

gigabot

Rule Path
Disallow /

garlikcrawler

Rule Path
Disallow /

caam

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

clickagy intelligence bot

Rule Path
Disallow /

jersey

Rule Path
Disallow /

libwww-perl

Rule Path
Disallow /

ltx71

Rule Path
Disallow /

omgili

Rule Path
Disallow /

piplbot

Rule Path
Disallow /

python-urllib

Rule Path
Disallow /

zoominfobot

Rule Path
Disallow /
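
The groups above can be read as follows: a crawler whose name matches a listed group is blocked from the whole site, and every other crawler falls back to the `*` group, which blocks only /angular/ and /assets/. The sketch below checks this with Python's standard `urllib.robotparser`. The robots.txt text is an abridged reconstruction from this report (three groups plus the sitemap records), the agent name `ExampleBot` and the URL paths `/assets/logo.png` and `/noticias/` are hypothetical, and real crawlers may match user agents differently from Python's parser.

```python
from urllib.robotparser import RobotFileParser

# Abridged reconstruction from the report; the real file lists many more
# groups and may differ in ordering, casing, and whitespace.
ROBOTS_TXT = """\
User-agent: *
Disallow: /angular/
Disallow: /assets/

User-agent: ahrefsbot
Disallow: /

User-agent: ccbot
Disallow: /

Sitemap: https://www.globalmedia.mx/sitemap.xml
Sitemap: https://www.globalmedia.mx/google_news.xml
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Hypothetical agent names and URLs, used only to exercise the rules.
checks = [
    ("AhrefsBot", "https://globalmedia.mx/"),                  # named group
    ("ExampleBot", "https://globalmedia.mx/assets/logo.png"),  # "*" group
    ("ExampleBot", "https://globalmedia.mx/noticias/"),        # "*" group
]
for agent, url in checks:
    print(agent, url, parser.can_fetch(agent, url))

# site_maps() is available in Python 3.8 and later.
print(parser.site_maps())
```

Under Python's parser the first two checks are refused and the third is allowed, because no rule in the `*` group covers the third path.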

Other Records

Field Value
sitemap https://www.globalmedia.mx/sitemap.xml
sitemap https://www.globalmedia.mx/google_news.xml

Comments

  • See http://www.robotstxt.org/robotstxt.html for documentation on how to use the robots.txt file
  • To ban all spiders from the entire site uncomment the next two lines:
  • User-agent: *