int-ent.de
robots.txt

Robots Exclusion Standard data for int-ent.de

Resource Scan

Scan Details

Site Domain	int-ent.de
Base Domain	int-ent.de
Scan Status	Ok
Last Scan	2025-10-18T21:43:58+00:00
Next Scan	2025-10-25T21:43:58+00:00

Last Scan

Scanned	2025-10-18T21:43:58+00:00
URL	https://int-ent.de/robots.txt
Domain IPs	78.46.146.190
Response IP	78.46.146.190
Found	Yes
Hash	03bfd411563d01afbba01445289607eb81b88500baa3024187c850c1e9a3a39d
SimHash	b21a5349c6f6

Groups

*

Rule	Path
Allow	/wp-admin/admin-ajax.php
Disallow	/wp-admin
Disallow	/cgi-bin
Disallow	/wp-includes
Disallow	/wp-content/plugins
Disallow	/wp-content/cache
Disallow	/wp-content/themes
Disallow	/trackback
Disallow	/feed
Disallow	/comments
Disallow	/category/*/*
Disallow	*/trackback/
Disallow	*/feed/
Disallow	*/comments/
Allow	/wp-content/cache/*.css
Allow	/wp-content/cache/*.js
Allow	/wp-content/plugins/simple-share-buttons-adder/css/*.css
Allow	/wp-content/plugins/simple-share-buttons-adder/buttons/simple/*.png
Allow	/wp-includes/js/jquery/*.js
Allow	/wp-content/uploads/wordpress-popular-posts/*.jpg
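
As a rough illustration of how the Allow and Disallow rows in the * group interact, the Python sketch below evaluates a URL path against them. It assumes the longest-match precedence described in RFC 9309 (the longest matching pattern wins, and Allow wins a tie) and treats `*` as a wildcard. The names `pattern_to_regex` and `is_allowed` are hypothetical, and individual crawlers may resolve conflicts differently.

```python
import re

# Rules of the "*" group, in file order.
RULES = [
    ("allow", "/wp-admin/admin-ajax.php"),
    ("disallow", "/wp-admin"),
    ("disallow", "/cgi-bin"),
    ("disallow", "/wp-includes"),
    ("disallow", "/wp-content/plugins"),
    ("disallow", "/wp-content/cache"),
    ("disallow", "/wp-content/themes"),
    ("disallow", "/trackback"),
    ("disallow", "/feed"),
    ("disallow", "/comments"),
    ("disallow", "/category/*/*"),
    ("disallow", "*/trackback/"),
    ("disallow", "*/feed/"),
    ("disallow", "*/comments/"),
    ("allow", "/wp-content/cache/*.css"),
    ("allow", "/wp-content/cache/*.js"),
    ("allow", "/wp-content/plugins/simple-share-buttons-adder/css/*.css"),
    ("allow", "/wp-content/plugins/simple-share-buttons-adder/buttons/simple/*.png"),
    ("allow", "/wp-includes/js/jquery/*.js"),
    ("allow", "/wp-content/uploads/wordpress-popular-posts/*.jpg"),
]


def pattern_to_regex(pattern):
    """Compile a robots.txt path pattern: '*' is a wildcard, a trailing '$' anchors the end."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(url_path, rules=RULES):
    """Return True if url_path may be crawled under the assumed longest-match precedence."""
    best_len = -1
    allowed = True  # a path matched by no rule is allowed
    for kind, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        if pattern_to_regex(pattern).match(url_path):
            length = len(pattern)
            # Longer pattern wins; on equal length, Allow wins.
            if length > best_len or (length == best_len and kind == "allow"):
                best_len = length
                allowed = kind == "allow"
    return allowed
```

Under these assumptions, `/wp-admin/admin-ajax.php` stays crawlable while the rest of `/wp-admin` is blocked. Because patterns are matched as prefixes, `/feed` also covers a path such as `/feedback`.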

sitecheck.internetseer.com

Rule	Path
Disallow	/

zealbot

Rule	Path
Disallow	/

msiecrawler

Rule	Path
Disallow	/

sitesnagger

Rule	Path
Disallow	/

webstripper

Rule	Path
Disallow	/

webcopier

Rule	Path
Disallow	/

fetch

Rule	Path
Disallow	/

offline explorer

Rule	Path
Disallow	/

teleport

Rule	Path
Disallow	/

teleportpro

Rule	Path
Disallow	/

webzip

Rule	Path
Disallow	/

linko

Rule	Path
Disallow	/

httrack

Rule	Path
Disallow	/

microsoft.url.control

Rule	Path
Disallow	/

xenu

Rule	Path
Disallow	/

larbin

Rule	Path
Disallow	/

libwww

Rule	Path
Disallow	/

zyborg

Rule	Path
Disallow	/

download ninja

Rule	Path
Disallow	/

fast

Rule	Path
Disallow	/

wget

Rule	Path
Disallow	/

grub-client

Rule	Path
Disallow	/

k2spider

Rule	Path
Disallow	/

npbot

Rule	Path
Disallow	/

webreaper

Rule	Path
Disallow	/
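
The groups above apply per crawler, so a client first has to decide which group governs it. The Python sketch below shows one possible selection rule, assuming a case-insensitive substring match of the group name against the client's User-Agent string, with the longest matching name preferred and `*` as the fallback. The function name `select_group` and the sample User-Agent strings are illustrative assumptions, and real crawlers may match product tokens more strictly.

```python
# Group names as listed in this scan, "*" being the default group.
GROUPS = [
    "*", "sitecheck.internetseer.com", "zealbot", "msiecrawler",
    "sitesnagger", "webstripper", "webcopier", "fetch",
    "offline explorer", "teleport", "teleportpro", "webzip", "linko",
    "httrack", "microsoft.url.control", "xenu", "larbin", "libwww",
    "zyborg", "download ninja", "fast", "wget", "grub-client",
    "k2spider", "npbot", "webreaper",
]


def select_group(user_agent, groups=GROUPS):
    """Pick the group whose name occurs in the User-Agent string.

    Assumed behavior: case-insensitive substring match, longest name wins,
    and the "*" group applies when no named group matches.
    """
    ua = user_agent.lower()
    named = [g for g in groups if g != "*" and g in ua]
    if named:
        return max(named, key=len)
    return "*" if "*" in groups else None
```

With this rule, a client identifying as `TeleportPro/1.29` falls under teleportpro rather than teleport, and `Wget/1.21.4` falls under wget; both groups disallow the whole site. A client that matches no named group is governed by the `*` group.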

Other Records

Field	Value
sitemap	https://int-ent.de/sitemap-news.xml
sitemap	https://int-ent.de/sitemap.xml
sitemap	https://int-ent.de/news-sitemap.xml

Comments

XML Sitemap & Google News Feeds version 4.6.3 - http://status301.net/wordpress-plugins/xml-sitemap-feed/
Allow styles and js for rendering
Some bots are known to be trouble, particularly those designed to copy
entire sites. Please obey robots.txt.
Misbehaving: requests much too fast:
Sorry, wget in its recursive mode is a frequent problem.
Please read the man page and use it properly; there is a
--wait option you can use to set the delay between hits,
for instance.
The 'grub' distributed client has been *very* poorly behaved.
Doesn't follow robots.txt anyway, but...
Hits many times per second, not acceptable
http://www.nameprotect.com/botinfo.html
A capture bot, downloads gazillions of pages with no public benefit
http://www.webreaper.net/
