thecommonsjournal.org
robots.txt

Robots Exclusion Standard data for thecommonsjournal.org

Resource Scan

Scan Details

Site Domain	thecommonsjournal.org
Base Domain	thecommonsjournal.org
Scan Status	Failed
Failure Reason	Scan timed out.
Last Scan	2026-01-03T19:03:43+00:00
Next Scan	2026-02-02T19:03:43+00:00

Last Successful Scan

Scanned	2025-11-11T13:48:33+00:00
URL	https://thecommonsjournal.org/robots.txt
Domain IPs	34.147.4.31
Response IP	34.147.4.31
Found	Yes
Hash	b19b5d3e7340e8abfe53534bec81c5d325e8551d6bdbc1e72a544f19416b410c
SimHash	481dca40e5d3

Groups

googlebot

Rule	Path
Disallow	/print/*
Allow	/

bingbot

Rule	Path
Disallow	/print/*
Allow	/

duckduckbot

Rule	Path
Disallow	/print/*
Allow	/

applebot

Rule	Path
Disallow	/print/*
Allow	/

*

Rule	Path
Disallow	/
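
The groups above can be read as an allow-list: four named crawlers may fetch everything except paths under /print/, and every other crawler is blocked from the whole site. The following is a minimal sketch of how such rules might be evaluated, assuming the longest-match precedence and `*` / `$` wildcard semantics described in RFC 9309. The `GROUPS` table is transcribed from this report; `is_allowed` and `_pattern_to_regex` are hypothetical helper names, and matching the user agent by exact lowercased name is a simplification of how real crawlers select a group.

```python
import re

# Rules transcribed from the Groups tables in this report: (rule, path).
GROUPS = {
    "googlebot": [("disallow", "/print/*"), ("allow", "/")],
    "bingbot": [("disallow", "/print/*"), ("allow", "/")],
    "duckduckbot": [("disallow", "/print/*"), ("allow", "/")],
    "applebot": [("disallow", "/print/*"), ("allow", "/")],
    "*": [("disallow", "/")],
}


def _pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(user_agent, path):
    """Return True if `user_agent` may fetch `path` under GROUPS.

    Assumption: the group is chosen by exact lowercased name, falling
    back to the '*' group. The longest matching pattern wins, and an
    allow rule wins a tie.
    """
    rules = GROUPS.get(user_agent.lower(), GROUPS["*"])
    best_len, allowed = -1, True  # no matching rule means allowed
    for rule, pattern in rules:
        if _pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and rule == "allow"):
                best_len, allowed = length, rule == "allow"
    return allowed


print(is_allowed("Googlebot", "/print/2025/issue-4"))  # False
print(is_allowed("Googlebot", "/articles/"))           # True
print(is_allowed("SomeOtherBot", "/articles/"))        # False
```

Python's standard `urllib.robotparser` is not used here because it does not treat `*` inside a path as a wildcard, so it would not reproduce the `/print/*` rule as intended.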

Other Records

Field	Value
sitemap	undefined/sitemap.xml

Comments

Googlebot
Bingbot
DuckDuckBot
Applebot
All other bots
