machineryplanet.io
robots.txt

Robots Exclusion Standard data for machineryplanet.io

Resource Scan

Scan Details

Site Domain machineryplanet.io
Base Domain machineryplanet.io
Scan Status Ok
Last Scan 2025-12-25T06:01:56+00:00
Next Scan 2026-01-01T06:01:56+00:00

Last Scan

Scanned 2025-12-25T06:01:56+00:00
URL https://machineryplanet.io/robots.txt
Domain IPs 104.21.59.176, 172.67.182.5, 2606:4700:3032::6815:3bb0, 2606:4700:3034::ac43:b605
Response IP 172.67.182.5
Found Yes
Hash 1596bed3ea3ad14de629e996876dd965d9d80da75d6a01ae98291d8e8fc4fdf4
SimHash 0708980374f4

Groups

*

Rule Path
Allow /
Disallow /admin
Disallow /admin/
Disallow /_next/
Disallow /api/
Allow /api/sitemap*.xml
Allow /api/health
Disallow /debug/
Disallow /test/
Disallow /staging/
Allow /search?categories=*
Allow /search?childCategories=*
Allow /search?make=*
Allow /search?model=*
Allow /search?productType=*
Allow /search$
Allow /search/$
Disallow /search?*sort=*
Disallow /search?*filter=*
Disallow /search?*page=*
Disallow /search?*limit=*
Disallow /search?*offset=*
Disallow /search?*view=*
Disallow /search?*&*&*&*
Allow /images/
Allow /icons/
Allow /fonts/
Allow /*.css
Allow /*.js
Allow /*.woff
Allow /*.woff2
Allow /*.jpg
Allow /*.jpeg
Allow /*.png
Allow /*.webp
Allow /*.avif
Allow /*.svg
Allow /*.gif
Allow /favicon.ico
Allow /robots.txt
Allow /sitemap*.xml
Disallow /private/
Disallow /temp/
Disallow /cache/
Disallow /.git/
Disallow /node_modules/
Disallow /.next/
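The `*` group above mixes Allow and Disallow patterns with `*` wildcards and `$` end anchors, and the robots.txt comments note that the Disallow rules "must come AFTER Allow rules". Under RFC 9309 (and Google's implementation), ordering actually matters less than specificity: the longest matching pattern wins, with Allow winning ties. A minimal sketch of that precedence logic, tested against a handful of the rules above (the `is_allowed` helper and its tie-break by raw pattern length are simplifications, not the scanner's own code):

```python
import re

def _pattern_to_regex(path_pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern ('*' wildcard, '$' end anchor)
    into an anchored regular expression."""
    anchored_end = path_pattern.endswith("$")
    if anchored_end:
        path_pattern = path_pattern[:-1]
    # Escape regex metacharacters, then restore '*' as '.*'
    regex = re.escape(path_pattern).replace(r"\*", ".*")
    return re.compile("^" + regex + ("$" if anchored_end else ""))

def is_allowed(rules, path: str) -> bool:
    """RFC 9309-style precedence: the longest matching pattern wins;
    on a tie between Allow and Disallow, Allow wins.
    `rules` is a list of (directive, pattern) tuples in file order."""
    best_len = -1
    best_allow = True  # no matching rule => allowed
    for directive, pattern in rules:
        if _pattern_to_regex(pattern).match(path):
            plen = len(pattern)
            if plen > best_len or (plen == best_len and directive == "Allow"):
                best_len = plen
                best_allow = directive == "Allow"
    return best_allow

# A few of the rules from the '*' group above
rules = [
    ("Allow", "/"),
    ("Disallow", "/api/"),
    ("Allow", "/api/sitemap*.xml"),
    ("Allow", "/search?make=*"),
    ("Disallow", "/search?*page=*"),
    ("Allow", "/search$"),
]

print(is_allowed(rules, "/api/sitemap-products.xml"))  # True: Allow pattern is longer than /api/
print(is_allowed(rules, "/search?make=cat&page=2"))    # False: the page= Disallow is more specific
```

Note that Python's standard `urllib.robotparser` does not implement `*`/`$` wildcard matching, which is why a file like this one needs a longest-match evaluator to be interpreted the way Google would interpret it.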

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

semrushbot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 5

ahrefsbot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 5

dotbot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 5

screaming frog seo spider

Rule Path
Allow /

mj12bot

Rule Path
Disallow /

semrushbot
ahrefsbot
baiduspider

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10
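The groups above allow SEO crawlers but ask for a `crawl-delay` of 5 or 10 seconds. `crawl-delay` is not part of RFC 9309 (Google ignores it), but bots that do honor it throttle requests per host. A minimal sketch of such a throttle, assuming a hypothetical `PoliteFetcher` wrapper rather than any particular crawler's implementation:

```python
import time

class PoliteFetcher:
    """Throttle requests to one host using the crawl-delay from robots.txt."""

    def __init__(self, crawl_delay: float):
        self.crawl_delay = crawl_delay
        self._last_request = 0.0  # monotonic timestamp of the previous request

    def wait_turn(self) -> float:
        """Sleep until crawl_delay seconds have elapsed since the last
        request; return the number of seconds actually slept."""
        now = time.monotonic()
        remaining = self.crawl_delay - (now - self._last_request)
        slept = 0.0
        if remaining > 0:
            time.sleep(remaining)
            slept = remaining
        self._last_request = time.monotonic()
        return slept

# With crawl-delay 10, back-to-back requests are spaced ~10 s apart;
# a short delay is used here only to keep the example fast.
fetcher = PoliteFetcher(crawl_delay=0.1)
fetcher.wait_turn()  # first request goes through immediately
```
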

Other Records

Field Value
sitemap https://www.machineryplanet.ae/api/sitemap-index.xml
sitemap https://www.machineryplanet.ae/api/sitemap.xml
sitemap https://www.machineryplanet.ae/api/sitemap-products.xml
sitemap https://www.machineryplanet.ae/api/sitemap-categories.xml
sitemap https://www.machineryplanet.ae/api/sitemap-blogs.xml
sitemap https://www.machineryplanet.ae/api/sitemap-images.xml
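The first sitemap record points at `/api/sitemap-index.xml`, a sitemap index whose `<sitemap><loc>` entries list the child sitemaps. A minimal sketch of extracting those child URLs with the standard library, run here against a hypothetical inline snippet (the example XML is illustrative, not the live file's contents):

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def sitemap_urls(index_xml: str) -> list[str]:
    """Extract child sitemap URLs from a sitemap-index document."""
    root = ET.fromstring(index_xml)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]

# Hypothetical excerpt of what a sitemap index might return
example = """<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://www.machineryplanet.ae/api/sitemap-products.xml</loc></sitemap>
  <sitemap><loc>https://www.machineryplanet.ae/api/sitemap-categories.xml</loc></sitemap>
</sitemapindex>"""

print(sitemap_urls(example))
# → ['https://www.machineryplanet.ae/api/sitemap-products.xml',
#    'https://www.machineryplanet.ae/api/sitemap-categories.xml']
```

The namespace prefix is required: sitemap documents declare the `sitemaps.org` XML namespace, so a bare `root.iter("loc")` would find nothing.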

Comments

  • ================================================================
  • Machinery Planet - Robots.txt (SEO Optimized)
  • Updated: 2025-11-26
  • Purpose: Allow search engines to crawl valuable content while
  • blocking duplicate/low-value pages
  • ================================================================
  • ====================
  • MAIN SEARCH ENGINES
  • ====================
  • Block admin and development paths
  • ====================
  • SEARCH & FILTERING
  • ====================
  • ✅ CRITICAL FIX: Allow category and brand pages but block filters/sorting
  • Allow valuable pages:
  • Block duplicate content parameters (must come AFTER Allow rules)
  • Block search with multiple filter combinations (low value)
  • ====================
  • STATIC ASSETS
  • ====================
  • Allow crawling of important assets for proper rendering
  • ====================
  • PRIVATE DIRECTORIES
  • ====================
  • ====================
  • SPECIAL USER AGENTS
  • ====================
  • Block AI crawlers (GPT, Claude, etc.)
  • ====================
  • SEO TOOL CRAWLERS
  • ====================
  • Allow but rate-limit aggressive SEO crawlers
  • ====================
  • BAD BOTS (Optional)
  • ====================
  • Block known bad bots/scrapers
  • ====================
  • SITEMAPS
  • ====================
  • ✅ FIXED: Only reference sitemaps for THIS domain
  • ====================
  • HOST PREFERENCE
  • ====================
  • Preferred domain (www version)
  • ====================
  • NOTES FOR DEVELOPERS
  • ====================
  • 1. This file allows Google to crawl 12,800+ product/category pages
  • 2. Blocks only duplicate/filtered versions to save crawl budget
  • 3. AI crawlers blocked to prevent content scraping
  • 4. SEO crawlers rate-limited but allowed for auditing
  • 5. All sitemaps reference THIS domain only (no cross-domain refs)
  • ================================================================

Warnings

  • `host` is not a known field — it is not defined in RFC 9309; it was a Yandex-specific directive (since deprecated) and most crawlers ignore it.