scialert.net
robots.txt

Robots Exclusion Standard data for scialert.net

Resource Scan

Scan Details

Site Domain scialert.net
Base Domain scialert.net
Scan Status Ok
Last Scan 2024-06-13T22:28:49+00:00
Next Scan 2024-06-20T22:28:49+00:00

Last Scan

Scanned 2024-06-13T22:28:49+00:00
URL https://scialert.net/robots.txt
Domain IPs 104.26.8.86, 104.26.9.86, 172.67.74.49, 2606:4700:20::681a:856, 2606:4700:20::681a:956, 2606:4700:20::ac43:4a31
Response IP 172.67.74.49
Found Yes
Hash f6f62042eee4bb9a87231822d036515e5acd015512aedf0d40bca9c45565a161
SimHash 50148b408132

Groups

*

Rule Path
Allow /
Allow /sitemap.html

petalbot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

sogou web spider

Rule Path
Disallow /

sogou inst spider

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

omgili

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

diffbot

Rule Path
Disallow /

imagesiftbot

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

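The groups above amount to a permissive default plus blanket per-agent blocks. As a rough illustration of how a compliant crawler would interpret them, the sketch below feeds an excerpt of the rules (reconstructed from the scan data, not the verbatim file; the remaining blocked agents follow the same Disallow pattern) into Python's standard urllib.robotparser:

```python
from urllib import robotparser

# Excerpt reconstructed from the scan data above, for illustration only;
# the other blocked user agents follow the same "Disallow: /" pattern.
rules = """\
User-agent: *
Allow: /
Allow: /sitemap.html

User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /
"""

parser = robotparser.RobotFileParser()
parser.parse(rules.splitlines())

# The wildcard group allows ordinary crawlers everywhere...
print(parser.can_fetch("Mozilla/5.0", "https://scialert.net/sitemap.html"))  # True
# ...while the named AI/scraper agents are disallowed site-wide.
print(parser.can_fetch("GPTBot", "https://scialert.net/"))  # False
print(parser.can_fetch("CCBot", "https://scialert.net/"))   # False
```
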
Other Records

Field Value
sitemap https://scialert.net/sitemaps.xml

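The Sitemap record is independent of any user-agent group, so standard parsers expose it separately from the allow/disallow rules. A minimal sketch using the same standard-library parser (site_maps() is available in Python 3.8+):

```python
from urllib import robotparser

# Parse just the Sitemap record listed above; no user-agent group is needed.
parser = robotparser.RobotFileParser()
parser.parse(["Sitemap: https://scialert.net/sitemaps.xml"])
print(parser.site_maps())  # ['https://scialert.net/sitemaps.xml']
```
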
Comments

  • Block problem bots
  • User-agent: Baiduspider
  • User-agent: 360Spider
  • User-agent: Yisouspider
  • User-agent: Amazonbot
  • Block OpenAI
  • Block Google Bard AI
  • User-agent: Google-Extended
  • Disallow: /
  • Block Common Crawl AI scraper
  • Block Perplexity AI
  • Block other misc AI scrapers

Warnings

  • `clean-param` is not a known field. `Clean-param` is a Yandex-specific extension to the Robots Exclusion Protocol, so most other parsers simply ignore the directive.