cgmagonline.com
robots.txt

Robots Exclusion Standard data for cgmagonline.com

Resource Scan

Scan Details

Site Domain cgmagonline.com
Base Domain cgmagonline.com
Scan Status Ok
Last Scan 2024-11-14T10:57:42+00:00
Next Scan 2024-11-21T10:57:42+00:00

Last Scan

Scanned 2024-11-14T10:57:42+00:00
URL https://cgmagonline.com/robots.txt
Redirect https://www.cgmagonline.com/robots.txt
Redirect Domain www.cgmagonline.com
Redirect Base cgmagonline.com
Domain IPs 104.26.6.184, 104.26.7.184, 172.67.68.70, 2606:4700:20::681a:6b8, 2606:4700:20::681a:7b8, 2606:4700:20::ac43:4446
Redirect IPs 104.26.6.184, 104.26.7.184, 172.67.68.70, 2606:4700:20::681a:6b8, 2606:4700:20::681a:7b8, 2606:4700:20::ac43:4446
Response IP 104.26.7.184
Found Yes
Hash 3bd44516ced20a1e4a9ba845b95492f418d37e2b9d8e52a7d362e3c898aa8522
SimHash 267453a0c51b

Groups

*

Rule Path
Allow /wp-admin/admin-ajax.php
Allow /*/*.css
Allow /*/*.js
Disallow /wp-admin/
Disallow /wp-includes/
Disallow /readme.html
Disallow /license.txt
Disallow /xmlrpc.php
Disallow /wp-login.php
Disallow /wp-register.php
Disallow *?attachment_id=
Disallow /wp-json/
Disallow /?rest_route=
Disallow /search/
Disallow /?s=
Disallow *?s=*
Disallow *?p=*
Disallow *%26p%3D*
Disallow *%26preview%3D*
Disallow /trackback/
Disallow */comments$
Disallow */trackback
Disallow */trackback$
Disallow /wp-comments
Disallow /wp-trackback
Disallow */replytocom%3D
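
The rules in this group use the `*` wildcard and the `$` end anchor. As a rough illustration of how a crawler that follows RFC 9309 might evaluate them, here is a minimal sketch: the longest matching pattern wins, and Allow wins a tie. The helper names `pattern_to_regex` and `is_allowed` are hypothetical, only a few rules from the group are included, and percent-encoding normalization is left out.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_allowed(rules, path: str) -> bool:
    """rules is a list of (kind, pattern) with kind 'allow' or 'disallow'."""
    best_len, allowed = -1, True  # no matching rule means the path is allowed
    for kind, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            # Longest pattern wins; on equal length, prefer Allow.
            if length > best_len or (length == best_len and kind == "allow"):
                best_len, allowed = length, kind == "allow"
    return allowed


# A subset of the '*' group listed above (assumption: the order is irrelevant).
RULES = [
    ("allow", "/wp-admin/admin-ajax.php"),
    ("allow", "/*/*.css"),
    ("disallow", "/wp-admin/"),
    ("disallow", "/?s="),
    ("disallow", "*/comments$"),
]

for path in ("/wp-admin/admin-ajax.php", "/wp-admin/options.php",
             "/?s=zelda", "/news/post/comments", "/reviews/some-game/"):
    print(path, is_allowed(RULES, path))
```

Real crawlers differ in details, so this is only a way to reason about the listed rules, not a statement of how any particular bot behaves.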

ccbot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

chatgpt

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

omgili

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

googlebot

Rule Path
Allow /

googlebot-image

Rule Path
Allow /wp-content/uploads/

mediapartners-google

Rule Path
Allow /

adsbot-google

Rule Path
Allow /

adsbot-google-mobile

Rule Path
Allow /

bingbot

Rule Path
Allow /

msnbot

Rule Path
Allow /

msnbot-media

Rule Path
Allow /wp-content/uploads/

applebot

Rule Path
Allow /

yandex

Rule Path
Allow /
Allow /feed/

yandeximages

Rule Path
Allow /wp-content/uploads/

slurp

Rule Path
Allow /

duckduckbot

Rule Path
Allow /

qwantify

Rule Path
Allow /

baiduspider

Rule Path
Allow /

baiduspider/2.0

Rule Path
Allow /

baiduspider-video

Rule Path
Allow /

baiduspider-image

Rule Path
Allow /

sogou spider

Rule Path
Allow /

sogou web spider

Rule Path
Allow /

sosospider

Rule Path
Allow /

sosospider+

Rule Path
Allow /

sosospider/2.0

Rule Path
Allow /

yodao

Rule Path
Allow /

youdao

Rule Path
Allow /

youdaobot

Rule Path
Allow /

youdaobot/1.0

Rule Path
Allow /

naverbot

Rule Path
Allow /

newsnow

Rule Path
Disallow

seznambot

Rule Path
Allow /

facebook

Rule Path
Allow /

facebookplatform/1.0

Rule Path
Allow /

facebookexternalhit

Rule Path
Allow /

facebookexternalhit/1.0

Rule Path
Allow /

facebookexternalhit/1.1

Rule Path
Allow /

facebookscraper

Rule Path
Allow /

facebot/1.0

Rule Path
Allow /

visionutils/0.2

Rule Path
Allow /

datagnionbot/1.0

Rule Path
Allow /

instagrambot

Rule Path
Allow /

whatsapp bot

Rule Path
Allow /

telegrambot

Rule Path
Allow /

twitterbot

Rule Path
Allow /

linkedinbot

Rule Path
Allow /

linkedinbot/1.0

Rule Path
Allow /

pinterest bot

Rule Path
Allow /

pinterest/0.1

Rule Path
Allow /

pinterest/0.2

Rule Path
Allow /

discordbot

Rule Path
Allow /
Allow /*.webp$
Allow /*.jpg$
Allow /*.png$
Allow /*.gif$

dotbot

Rule Path
Disallow /

giftghostbot

Rule Path
Disallow /

seznam

Rule Path
Disallow /

paperlibot

Rule Path
Disallow /

genieo

Rule Path
Disallow /

dataprovider/6.101

Rule Path
Disallow /

dataprovidersiteexplorer

Rule Path
Disallow /
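
A crawler applies only the group that matches its own user agent and falls back to the `*` group when none matches. The sketch below illustrates this with Python's standard `urllib.robotparser`. The `SAMPLE` text is a hand-written three-group excerpt, not the site's actual file, and `SomeOtherBot` and the helper `allowed` are made-up names. This parser does simple prefix matching and does not interpret `*` or `$` inside paths, so it only illustrates group selection.

```python
from urllib.robotparser import RobotFileParser

# Hand-written excerpt modelled on three of the groups above (assumption).
SAMPLE = """\
User-agent: *
Disallow: /wp-admin/

User-agent: GPTBot
Disallow: /

User-agent: Googlebot
Allow: /
"""

parser = RobotFileParser()
parser.parse(SAMPLE.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Ask the parser whether `agent` may fetch `path` on the scanned host."""
    return parser.can_fetch(agent, "https://www.cgmagonline.com" + path)


for agent, path in (("GPTBot", "/reviews/"),
                    ("Googlebot", "/wp-admin/"),
                    ("SomeOtherBot", "/wp-admin/"),
                    ("SomeOtherBot", "/reviews/")):
    print(agent, path, allowed(agent, path))
```

In this excerpt the `Googlebot` group contains only `Allow: /`, so the `/wp-admin/` restriction from the `*` group does not apply to that agent.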

Other Records

Field Value
sitemap https://www.cgmagonline.com/sitemap_index.xml
sitemap https://www.cgmagonline.com/page-sitemap.xml
sitemap https://www.cgmagonline.com/sitemap-news.xml
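
The sitemap records sit outside any user-agent group and can be read independently of the rules. As a small sketch, assuming Python 3.8 or later (where `RobotFileParser.site_maps()` is available), the three URLs above could be extracted as follows. The `SAMPLE` text is a hand-written excerpt, not the fetched file.

```python
from urllib.robotparser import RobotFileParser

# Hand-written excerpt containing the three sitemap records (assumption).
SAMPLE = """\
User-agent: *
Disallow: /wp-admin/

Sitemap: https://www.cgmagonline.com/sitemap_index.xml
Sitemap: https://www.cgmagonline.com/page-sitemap.xml
Sitemap: https://www.cgmagonline.com/sitemap-news.xml
"""

parser = RobotFileParser()
parser.parse(SAMPLE.splitlines())

# site_maps() returns a list of URLs, or None when no Sitemap line exists.
sitemaps = parser.site_maps()
for url in sitemaps or []:
    print(url)
```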

Comments

  • CGMagazine Robots.txt - Crawling the best of Comics and Gaming
  • Advanced Wordpress
  • Prevent Crawling of WordPress JSON API Endpoints
  • Block Search URLs /search/ and /?s=
  • Block Parameters
  • Block Spam Directories
  • Block AI Spiders
  • Rankmath Sitemap Link
  • News Sitemap Link
  • Allow Google Bot
  • Allow Google Images Bot
  • Allow Google Media Partners Bot
  • Allow Google AdsBot Bot
  • Allow Google Mobile Bot
  • Allow Bing Bot
  • Allow MSN Bot
  • Allow MSNBot Media Bot
  • Allow Apple Bot
  • Source robots.txt:
  • Allow Yandex Images Bot
  • Allow Yahoo Search (Slurp bot)
  • Allow DuckDuckGo Bot
  • Allow Qwant Bot
  • Allow Baidu/Sogou/Soso/Youdao Bot
  • Allow Naver Bot
  • Allow NewsNow
  • Allow Seznam Bot
  • Allow Facebook Bot
  • Allow Instagram Bot
  • Allow Whatsapp Bot
  • Allow Telegram Bot
  • Allow Twitter Bot
  • Allow Linkedin Bot
  • Allow Pinterest Bot
  • Allow Discord Bot
  • Allow Webp Images
  • Allow Jpg Images
  • Allow Png Images
  • Allow Gif Images
  • Block Scrapper Bots