g17media.com
robots.txt

Robots Exclusion Standard data for g17media.com

Resource Scan

Scan Details

Site Domain g17media.com
Base Domain g17media.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2024-09-06T23:04:21+00:00
Next Scan 2024-10-06T23:04:21+00:00

Last Successful Scan

Scanned 2024-08-08T18:42:35+00:00
URL https://g17media.com/robots.txt
Redirect https://www.lcpdfr.com/robots.txt
Redirect Domain www.lcpdfr.com
Redirect Base lcpdfr.com
Domain IPs 198.251.90.186
Redirect IPs 198.251.90.186
Response IP 198.251.90.186
Found Yes
Hash f11887a7038af9ff013635289ec952163ffe95359e94e9f1d9b153ffd8bb8ef9
SimHash 2c7cd001cca7

Groups

mediapartners-google

Rule Path
Disallow

dotbot

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

piplbot

Rule Path
Disallow /

*

Rule Path
Disallow /downloads/*?*advanced_search_submitted=1*
Disallow /leaderboard/
Disallow /*?advancedSearchForm=
Disallow /discover/
Disallow /online/
Disallow /markallread/
Disallow /discover/unread/
Disallow /startTopic/
Disallow /search/
Disallow /login/
Disallow /register/
Disallow /lostpassword/
Disallow /tags/
Disallow /cookie/
Disallow /*/?do=download
Disallow /*?do=add
Disallow /*?do=email
Disallow /*?sortby=
Disallow /*?filter=
Disallow /*?tab=*
Disallow /*?do=*
Disallow /*ref%3D
Disallow /*?forumId*
Disallow /*?&controller=embed
Disallow /*?filter_tag*
Disallow /403error.php
Disallow /404error.php
Disallow /500error.php
Disallow /Credits.txt
Disallow /error.php
Disallow /upgrading.html
Disallow /cops/
Disallow /*.zip$
Disallow /*.rar$
Disallow /*.doc$
Allow /ads.txt
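
Several of the paths in the `*` group use the `*` wildcard and the `$` end anchor, which Python's standard `urllib.robotparser` does not interpret. The following is a minimal sketch of how such patterns could be evaluated, assuming the longest-match precedence described in RFC 9309 (the longest matching pattern wins, and a tie goes to Allow). The helper names `pattern_to_regex` and `is_allowed` are hypothetical, only a handful of the rules above are included, and percent-encoding normalization is not handled.

```python
import re

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex anchored at the start.

    '*' matches any run of characters; a trailing '$' anchors the end of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))

def is_allowed(rules, path):
    """Return True if `path` may be fetched under `rules`.

    `rules` is a list of (kind, pattern) pairs, kind being 'allow' or 'disallow'.
    Assumption: longest matching pattern wins, ties go to 'allow',
    and a path that matches no rule is allowed.
    """
    best_len = -1
    allowed = True
    for kind, pattern in rules:
        if not pattern:
            continue  # an empty pattern (as in the mediapartners-google group) matches nothing
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len = length
                allowed = (kind == "allow")
    return allowed

# A small subset of the '*' group listed above, for illustration only.
RULES = [
    ("disallow", "/search/"),
    ("disallow", "/*?do=*"),
    ("disallow", "/*.zip$"),
    ("allow", "/ads.txt"),
]

if __name__ == "__main__":
    for candidate in ["/search/foo", "/a/b.zip", "/a/b.zip?x=1", "/ads.txt"]:
        print(candidate, is_allowed(RULES, candidate))
```

Under these assumptions, `/a/b.zip` is blocked by `/*.zip$`, while `/a/b.zip?x=1` is not, because the `$` anchor requires the path to end in `.zip`.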

Other Records

Field Value
sitemap http://www.lcpdfr.com/sitemap.php

Comments

  • Robot requests to this site SHOULD:
  • - set a custom User-Agent, do not stick to default bot or framework User-Agents, as these may be throttled or blocked
  • - during times of attack, we have a system very similar to Cloudflare's Under Attack mode. Your bot should be able to accept Cookies provided by HTTP and be able to handle a 307 redirect.
  • - if you need to make more than 1 request per second, and you find your bot gets blocked or throttled, please e-mail management at lcpdfr dot com with your bot's use case. Bots can be whitelisted if they provide a benefit to us or our users.
  • Updated on 7th Oct 2017
  • dotbot is a true example of junk.
  • Dynamic places that may cause issue
  • Disallow: /profile/
  • URLs we don't want in index
  • Disallow: /api/
  • Disallow: /datastore/
  • Disallow: /system/
  • Disallow: /applications/ Comment out as some IPS assets rest in applications/core/interface
  • Disallow: /plugins/
  • Disallow: /admin/
  • Files we'd never want in index
  • Ads
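
The guidance in the comments above (a custom User-Agent, cookie support, handling a 307 redirect, and staying at or below 1 request per second) could be approximated with the Python standard library roughly as follows. This is a sketch, not the site's prescribed client: the User-Agent string, the `build_opener` wrapper, and the `RateLimiter` class are hypothetical. It relies on the default `urllib.request` opener following 307 redirects for GET requests, and on `HTTPCookieProcessor` storing and resending cookies.

```python
import time
import urllib.request
from http.cookiejar import CookieJar

# Hypothetical identifier; a real bot would use its own name and contact URL.
USER_AGENT = "example-crawler/0.1 (+https://example.com/bot-info)"

def build_opener(user_agent=USER_AGENT):
    """Build an opener that sends a custom User-Agent and keeps cookies.

    The default handler chain already includes HTTPRedirectHandler,
    which follows 307 redirects for GET and HEAD requests.
    """
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    opener.addheaders = [("User-Agent", user_agent)]
    return opener, jar

class RateLimiter:
    """Enforce a minimum interval between calls to wait()."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now

# Usage (performs network requests, so it is left as a comment):
#   opener, jar = build_opener()
#   limiter = RateLimiter(min_interval=1.0)
#   for url in urls:
#       limiter.wait()
#       with opener.open(url, timeout=10) as response:
#           body = response.read()
```

The clock and sleep functions are injectable so the pacing logic can be checked without real delays.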