beginlinux.com
robots.txt

Robots Exclusion Standard data for beginlinux.com

Resource Scan

Scan Details

Site Domain beginlinux.com
Base Domain beginlinux.com
Scan Status Ok
Last Scan 2025-10-07T21:14:29+00:00
Next Scan 2025-11-06T21:14:29+00:00

Last Scan

Scanned 2025-10-07T21:14:29+00:00
URL https://beginlinux.com/robots.txt
Domain IPs 195.26.247.133
Response IP 195.26.247.133
Found Yes
Hash 674e8cfced93f9de39d8e89f02643f1a321d3ec1983c78e43f77e5ef4ce36eaf
SimHash 6b14117463f5

Groups

duggmirror

Rule Path
Disallow /

googlebot-image

Rule Path
Disallow
Allow /*

mediapartners-google*

Rule Path
Disallow /
Allow

*

Rule Path
Disallow /cgi-bin
Disallow /media
Disallow /data
Disallow /blog/wp-admin
Disallow /blog/wp-includes
Disallow /blog/wp-content/plugins
Disallow /blog/wp-content/cache
Disallow /blog/wp-content/themes
Disallow /blog/wp-trackback
Disallow /blog/wp-feed
Disallow /blog/wp-comments
Disallow /blog/*/trackback
Disallow /blog/*/feed
Disallow /blog/*/comments

googlebot

Rule Path
Disallow /blog/*.php$
Disallow /blog/*.js$
Disallow /blog/*.inc$
Disallow /blog/*.css$
Disallow /blog/*.gz$
Disallow /blog/*.wmv$
Disallow /blog/*.cgi$
Disallow /blog/*.xhtml$

*

Rule Path
Disallow /administrator/
Disallow /cache/
Disallow /components/
Disallow /images/
Disallow /includes/
Disallow /installation/
Disallow /language/
Disallow /libraries/
Disallow /media/
Disallow /modules/
Disallow /plugins/
Disallow /templates/
Disallow /tmp/
Disallow /xmlrpc/
Allow /web/*.pdf$
Disallow /*.pdf$
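
The last two rules in this group overlap: a path such as /web/guide.pdf matches both Allow /web/*.pdf$ and Disallow /*.pdf$. The sketch below shows one way a crawler might resolve that, assuming the longest-match precedence described in RFC 9309 (the longest matching pattern wins, and Allow wins a tie). Individual crawlers may behave differently, and the names `pattern_to_regex`, `is_allowed`, and `PDF_RULES` are hypothetical, not part of the scan data.

```python
import re

def pattern_to_regex(path):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the
    pattern to the end of the URL path. Everything else is literal.
    """
    anchored = path.endswith("$")
    if anchored:
        path = path[:-1]
    # Escape the literal pieces, then join them with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in path.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))

def is_allowed(rules, url_path):
    """Decide whether url_path may be fetched under one rule group.

    rules is a list of (directive, path) pairs. Assumption: RFC 9309
    precedence, i.e. the longest matching pattern wins and Allow beats
    Disallow when two matching patterns have equal length.
    """
    best = None  # (pattern length, is_allow)
    for directive, path in rules:
        if not path:
            continue  # an empty path matches nothing
        # re.match anchors at the start of url_path.
        if pattern_to_regex(path).match(url_path):
            candidate = (len(path), directive.lower() == "allow")
            if best is None or candidate > best:
                best = candidate
    # With no matching rule, the path is allowed by default.
    return True if best is None else best[1]

# A subset of the rules from the group above.
PDF_RULES = [
    ("Disallow", "/tmp/"),
    ("Allow", "/web/*.pdf$"),
    ("Disallow", "/*.pdf$"),
]

print(is_allowed(PDF_RULES, "/web/guide.pdf"))   # longer Allow pattern wins
print(is_allowed(PDF_RULES, "/docs/guide.pdf"))  # only the Disallow matches
print(is_allowed(PDF_RULES, "/index.html"))      # no rule matches
```

Under these assumptions, PDFs below /web/ stay crawlable while every other path ending in .pdf is blocked, which appears to be the intent of listing the Allow rule alongside the broader Disallow.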

Other Records

Field Value
sitemap http://beginlinux.com/sitemap.xml

Comments

  • disable duggmirror
  • wordpress