wikihow.com
robots.txt

Robots Exclusion Standard data for wikihow.com

Resource Scan

Scan Details

Site Domain	wikihow.com
Base Domain	wikihow.com
Scan Status	Ok
Last Scan	2024-11-13T21:24:00+00:00
Next Scan	2024-11-20T21:24:00+00:00

Last Scan

Scanned	2024-11-13T21:24:00+00:00
URL	https://wikihow.com/robots.txt
Redirect	https://www.wikihow.com/robots.txt
Redirect Domain	www.wikihow.com
Redirect Base	wikihow.com
Domain IPs	151.101.1.91, 151.101.129.91, 151.101.193.91, 151.101.65.91
Redirect IPs	151.101.1.91, 151.101.129.91, 151.101.193.91, 151.101.65.91
Response IP	199.232.45.91
Found	Yes
Hash	ab977165e81f385f520ed1492b7a1b7881a2202ef89ef22c4fcea8a26fb4c86a
SimHash	2c504189cdf7
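
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest. The report does not say which algorithm is used or exactly which bytes are hashed, so the following is only a sketch under that assumption. It shows how a scanner might detect that a robots.txt body changed between scans; the function names `sha256_hex` and `body_changed` are hypothetical.

```python
import hashlib


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of a response body as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def body_changed(previous_hash: str, body: bytes) -> bool:
    """Compare a stored digest with the digest of a freshly fetched body."""
    return sha256_hex(body) != previous_hash.lower()


if __name__ == "__main__":
    # Illustrative body only, not the real wikihow.com robots.txt.
    sample = b"User-agent: gptbot\nDisallow: /\n"
    stored = sha256_hex(sample)
    print(body_changed(stored, sample))
    print(body_changed(stored, sample + b"# edited\n"))
```

Comparing exact digests only signals that something changed. A similarity fingerprint such as the SimHash field would be needed to judge how much.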

Groups

anthropic-ai

Rule	Path
Disallow	/

archive.org

Rule	Path
Disallow	/api.php
Disallow	/index.php
Disallow	/Special%3A

ccbot

Rule	Path
Disallow	/

doc

Rule	Path
Disallow	/

download ninja

Rule	Path
Disallow	/

facebookbot

Rule	Path
Disallow	/

fetch

Rule	Path
Disallow	/

gptbot

Rule	Path
Disallow	/

hmse_robot

Rule	Path
Disallow	/

httrack

Rule	Path
Disallow	/

k2spider

Rule	Path
Disallow	/

larbin

Rule	Path
Disallow	/

libwww

Rule	Path
Disallow	/

linko

Rule	Path
Disallow	/

microsoft.url.control

Rule	Path
Disallow	/

msiecrawler

Rule	Path
Disallow	/

npbot

Rule	Path
Disallow	/

offline explorer

Rule	Path
Disallow	/

omigilibot

Rule	Path
Disallow	/

perplexitybot

Rule	Path
Disallow	/

sitecheck.internetseer.com

Rule	Path
Disallow	/

sitesnagger

Rule	Path
Disallow	/

teleport

Rule	Path
Disallow	/

teleportpro

Rule	Path
Disallow	/

ubicrawler

Rule	Path
Disallow	/

webcopier

Rule	Path
Disallow	/

webreaper

Rule	Path
Disallow	/

webstripper

Rule	Path
Disallow	/

webzip

Rule	Path
Disallow	/

wget

Rule	Path
Disallow	/

xenu

Rule	Path
Disallow	/

zao

Rule	Path
Disallow	/

zealbot

Rule	Path
Disallow	/

zyborg

Rule	Path
Disallow	/
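
The groups above each block one named crawler from the whole site with a single `Disallow /` rule. As a rough illustration of how a client might honor such groups, the sketch below feeds a two-group excerpt to Python's standard `urllib.robotparser`. The excerpt is hand-written from the tables above rather than copied from the live file, it omits the `*` group, and the helper name `build_parser` is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hand-written excerpt of two blanket-disallow groups listed above.
EXCERPT = """\
User-agent: gptbot
Disallow: /

User-agent: wget
Disallow: /
"""


def build_parser(text: str = EXCERPT) -> RobotFileParser:
    """Parse robots.txt text from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


if __name__ == "__main__":
    parser = build_parser()
    url = "https://www.wikihow.com/Main-Page"
    # User-agent names are compared case-insensitively.
    for agent in ("GPTBot", "Wget", "ExampleBot"):
        print(agent, parser.can_fetch(agent, url))
```

Because this excerpt contains no `*` group, an agent that is not named falls through to the parser's default of allowing the fetch. Note that `urllib.robotparser` does not interpret `*` wildcards inside paths, so it is not a good fit for the wildcard rules in the `*` group.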

adsbot-google

Rule	Path
Allow	/

mediapartners-google

Rule	Path
Allow	/

googlebot

Rule	Path
Allow	/Special%3ANewPages
Allow	/Special%3ASitemap
Allow	/Special%3ACategoryListing
Allow	/

*

Rule	Path
Allow	/Special%3ABlock
Allow	/Special%3ABlockList
Allow	/Special%3ACategorylisting
Allow	/Special%3ACategoryListing
Allow	/Special%3ACharity
Allow	/Special%3AEmailUser
Allow	/Special%3ALSearch
Allow	/Special%3ANewPages
Allow	/Special%3AQABox
Allow	/Special%3ASearchAd
Allow	/Special%3ASitemap
Allow	/Special%3AThankAuthors
Allow	/Special%3AUserLogin
Allow	/index.php?*action=credits
Allow	/index.php?*MathShowImage
Allow	/index.php?*printable
Disallow	/index.php
Disallow	/*feed%3Drss
Disallow	/*action%3Ddelete
Disallow	/*action%3Dhistory
Disallow	/Special%3A
Disallow	/*platform%3D
Disallow	/*variant%3D
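
The `*` group mixes Allow and Disallow rules whose paths overlap, for example `Allow /Special%3ASitemap` against `Disallow /Special%3A`. One common way to resolve such overlaps is the longest-match rule described in RFC 9309: the matching rule with the longest path wins, and Allow wins a tie. The sketch below applies that rule to a subset of the paths above. It is a simplification: it compares paths literally without normalizing percent-encoding, and the names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical. Individual crawlers may resolve conflicts differently.

```python
import re

# Subset of the "*" group above, as (rule, path) pairs.
RULES = [
    ("allow", "/Special%3ASitemap"),
    ("allow", "/Special%3ANewPages"),
    ("allow", "/index.php?*printable"),
    ("disallow", "/index.php"),
    ("disallow", "/Special%3A"),
    ("disallow", "/*action%3Dhistory"),
]


def pattern_to_regex(path):
    """Translate a robots.txt path into a regex anchored at the start.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = path.endswith("$")
    if anchored:
        path = path[:-1]
    body = ".*".join(re.escape(part) for part in path.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(url_path, rules=RULES):
    """Apply the longest-match rule; with no matching rule, allow."""
    best_len, best_rule = -1, "allow"
    for rule, path in rules:
        if pattern_to_regex(path).match(url_path):
            longer = len(path) > best_len
            tie_for_allow = len(path) == best_len and rule == "allow"
            if longer or tie_for_allow:
                best_len, best_rule = len(path), rule
    return best_rule == "allow"


if __name__ == "__main__":
    for p in ("/Special%3ASitemap", "/Special%3ARandom",
              "/index.php?title=Foo&printable=yes", "/index.php?title=Foo",
              "/Make-Tea"):
        print(p, is_allowed(p))
```

Under these assumptions the specific `Special%3A` pages and printable views stay crawlable, while other `index.php` and `Special%3A` URLs are blocked.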

Comments

robots.txt for https://www.wikihow.com
based on wikipedia.org's robots.txt

Crawlers that are kind enough to obey, but which we'd rather not have
unless they're feeding search engines.

Sitemap: https://www.wikihow.com/sitemap_index.xml

If your bot supports such a thing using the 'Crawl-delay' or another
instruction, please let us know. We can add it to our robots.txt.

Friendly, low-speed bots are welcome viewing article pages, but not
dynamically-generated pages please. Article pages contain our site's
real content.

Requests many pages per second
http://www.nameprotect.com/botinfo.html

Some bots are known to be trouble, particularly those designed to copy
entire sites. Please obey robots.txt.

wget in recursive mode uses too many resources for us.
Please read the man page and use it properly; there is a
--wait option you can use to set the delay between hits,
for instance. Please wait 3 seconds between each request.
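
The comments above ask clients to wait 3 seconds between requests. A minimal sketch of that pacing in Python follows. The function name `fetch_politely` and its `fetch` and `sleep` parameters are hypothetical; `fetch` stands in for whatever HTTP call a client actually uses, and the pause is injectable so the loop can be exercised without real waiting.

```python
import time


def fetch_politely(urls, fetch, delay_seconds=3.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing between consecutive requests.

    No pause is taken before the first request or after the last one.
    """
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)
        results.append(fetch(url))
    return results


if __name__ == "__main__":
    # Stand-in fetch that only echoes the URL; no network access here.
    pages = fetch_politely(
        ["https://www.wikihow.com/Main-Page"],
        fetch=lambda url: f"fetched {url}",
    )
    print(pages)
```

A fixed delay is the simplest reading of the request in the comments. A real crawler would also need to check the robots.txt rules before each fetch.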
