newjerseyglobe.com
robots.txt

Robots Exclusion Standard data for newjerseyglobe.com

Resource Scan

Scan Details

Site Domain	newjerseyglobe.com
Base Domain	newjerseyglobe.com
Scan Status	Ok
Last Scan	2024-11-16T06:51:26+00:00
Next Scan	2024-11-23T06:51:26+00:00

Last Scan

Scanned	2024-11-16T06:51:26+00:00
URL	https://newjerseyglobe.com/robots.txt
Domain IPs	141.193.213.20, 141.193.213.21
Response IP	141.193.213.20
Found	Yes
Hash	af45362f6feb20ccb81b94fc414cb96936e6972e5abd893664f4c108b0b13051
SimHash	482c4840a0b3
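
The Hash value above is 64 hexadecimal characters, which is consistent with a SHA-256 digest, although the scan does not name the algorithm. As a minimal sketch under that assumption, a fetched copy of the file could be compared against the recorded value as follows. The function names are hypothetical, and whether the scanner hashes the raw response bytes or a normalized form is unknown.

```python
import hashlib

# Value copied from the "Hash" row of the scan. Treating it as SHA-256 is an
# assumption based only on its length (64 hex characters).
RECORDED_HASH = "af45362f6feb20ccb81b94fc414cb96936e6972e5abd893664f4c108b0b13051"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_recorded_hash(body: bytes, recorded: str = RECORDED_HASH) -> bool:
    """Check whether the bytes hash to the recorded value (hypothetical helper)."""
    return sha256_hex(body) == recorded.lower()
```

A mismatch would not necessarily mean the file changed; it could also mean the scanner uses a different algorithm or hashes a normalized copy of the file.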

Groups

*

Rule	Path
Disallow	(empty value: no paths are blocked)

anthropic-ai

Rule	Path
Disallow	/

ccbot

Rule	Path
Disallow	/

chatgpt-user

Rule	Path
Disallow	/

cohere-ai

Rule	Path
Disallow	/

google-extended

Rule	Path
Disallow	/

gptbot

Rule	Path
Disallow	/

ia_archiver

Rule	Path
Disallow	/
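
To illustrate how the groups above are evaluated, here is a minimal sketch that rebuilds an equivalent robots.txt from the listed rules and queries it with Python's standard `urllib.robotparser`. The reconstructed text is an assumption: the live file's exact ordering, casing, and comments are not shown in the scan. `BLOCKED_AGENTS`, `build_robots_txt`, `load_parser`, and the `examplebot` agent are hypothetical names used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# User agents that the scan lists with "Disallow: /".
BLOCKED_AGENTS = [
    "anthropic-ai", "ccbot", "chatgpt-user", "cohere-ai",
    "google-extended", "gptbot", "ia_archiver",
]


def build_robots_txt() -> str:
    """Rebuild an equivalent robots.txt from the scan tables (assumed layout)."""
    # The wildcard group has an empty Disallow, which blocks nothing.
    lines = ["User-agent: *", "Disallow:", ""]
    for agent in BLOCKED_AGENTS:
        lines += [f"User-agent: {agent}", "Disallow: /", ""]
    lines.append("Sitemap: https://newjerseyglobe.com/sitemap_index.xml")
    return "\n".join(lines)


def load_parser() -> RobotFileParser:
    """Parse the reconstructed rules without making a network request."""
    parser = RobotFileParser()
    parser.parse(build_robots_txt().splitlines())
    return parser


if __name__ == "__main__":
    parser = load_parser()
    url = "https://newjerseyglobe.com/"
    # "examplebot" stands in for any agent not named in the file.
    for agent in BLOCKED_AGENTS + ["examplebot"]:
        print(agent, parser.can_fetch(agent, url))
    print(parser.site_maps())
```

Under these assumptions, each named agent is refused for every path, while any agent that matches only the `*` group is allowed everywhere. `site_maps()` requires Python 3.8 or later.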

Other Records

Field	Value
sitemap	https://newjerseyglobe.com/sitemap_index.xml

Comments

START YOAST BLOCK
---------------------------
---------------------------
END YOAST BLOCK
