wageindicator.co.uk
robots.txt

Robots Exclusion Standard data for wageindicator.co.uk

Resource Scan

Scan Details

Site Domain wageindicator.co.uk
Base Domain wageindicator.co.uk
Scan Status Ok
Last Scan 2024-06-23T03:50:37+00:00
Next Scan 2024-06-30T03:50:37+00:00

Last Scan

Scanned 2024-06-23T03:50:37+00:00
URL https://wageindicator.co.uk/robots.txt
Domain IPs 139.162.181.223, 2a01:7e01::f03c:91ff:feca:3bd2
Response IP 139.162.181.223
Found Yes
Hash e2a9cb0a2e45eea85ef8b8821016168de11a55d91a5e463cb25d5d221a9c657c
SimHash ae11aa534c64

Groups

*

Rule Path
Disallow

Other Records

Field Value
crawl-delay 4

www.deadlinkchecker.com

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 1
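
Both of these groups leave every path crawlable and differ only in the crawl delay they request: 4 seconds for all agents, 1 second for www.deadlinkchecker.com. As a rough sketch, Python's standard urllib.robotparser can read those values; the snippet below parses an inline reconstruction of the two groups above rather than fetching the live file, and the agent name "MyCrawler" is only a placeholder.

  from urllib.robotparser import RobotFileParser

  # Inline reconstruction of the two groups listed above (not a byte-exact
  # copy of the live robots.txt); set_url() + read() would fetch it instead.
  robots_lines = [
      "User-agent: *",
      "Disallow:",
      "Crawl-delay: 4",
      "",
      "User-agent: www.deadlinkchecker.com",
      "Crawl-delay: 1",
  ]

  rp = RobotFileParser()
  rp.parse(robots_lines)

  print(rp.crawl_delay("MyCrawler"))                # 4 (falls back to the * group)
  print(rp.crawl_delay("www.deadlinkchecker.com"))  # 1
  print(rp.can_fetch("MyCrawler", "https://wageindicator.co.uk/salary"))  # True

A crawler honouring these records would wait at least that many seconds between successive requests to the site.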

googlebot

Rule Path
Disallow /*atct_album_view$
Disallow /*folder_factories$
Disallow /*folder_summary_view$
Disallow /*login_form$
Disallow /*mail_password_form$
Disallow /%40%40search
Disallow /*search_rss$
Disallow /*sendto_form$
Disallow /*summary_view$
Disallow /*thumbnail_view$
Disallow /*?job-id=*
Disallow /google-search-result?q=*
Disallow /*archive-no-index$

Other Records

Field Value
sitemap https://wageindicator.co.uk/sitemap.xml.gz
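
The googlebot rules rely on the '*' wildcard and '$' end-anchor extension which, as the comments below note, the site treats as Googlebot-specific syntax; Python's urllib.robotparser, for instance, matches rule paths only by literal prefix and would not apply them as intended. The sketch below illustrates that wildcard matching against a hypothetical subset of the rules above; it ignores Allow rules and longest-match precedence, so it is not a complete robots.txt matcher.

  import re

  # Illustrative subset of the googlebot Disallow patterns listed above.
  DISALLOW_PATTERNS = [
      "/*login_form$",
      "/*?job-id=*",
      "/google-search-result?q=*",
      "/*archive-no-index$",
  ]

  def robots_pattern_to_regex(pattern: str) -> re.Pattern:
      """Translate a robots.txt path pattern into a regex:
      '*' matches any run of characters, a trailing '$' anchors the end."""
      anchored = pattern.endswith("$")
      body = pattern[:-1] if anchored else pattern
      regex = "".join(".*" if ch == "*" else re.escape(ch) for ch in body)
      return re.compile("^" + regex + ("$" if anchored else ""))

  def is_disallowed(path_and_query: str) -> bool:
      return any(robots_pattern_to_regex(p).match(path_and_query)
                 for p in DISALLOW_PATTERNS)

  print(is_disallowed("/salary/minimum-wage"))             # False
  print(is_disallowed("/salary/minimum-wage?job-id=123"))  # True: matches /*?job-id=*
  print(is_disallowed("/en/login_form"))                   # True: matches /*login_form$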

Comments

  • Define access-restrictions for robots/spiders
  • http://www.robotstxt.org/wc/norobots.html
  • By default we allow robots to access all areas of our site
  • already accessible to anonymous users
  • Add Googlebot-specific syntax extension to exclude forms
  • that are repeated for each piece of content in the site
  • the wildcard is only supported by Googlebot
  • http://www.google.com/support/webmasters/bin/answer.py?answer=40367&ctx=sibling
  • we want pages like our landing pages to be indexed (?job-id=7412100000000)
  • Disallow: /*?
  • it probably can't hurt to index this "view" at the end of the URL?
  • Disallow: /*view$
  • Do not index archive folders with this ID