workinstartups.com
robots.txt

Robots Exclusion Standard data for workinstartups.com

Resource Scan

Scan Details

Site Domain	workinstartups.com
Base Domain	workinstartups.com
Scan Status	Ok
Last Scan	2026-02-16T13:52:10+00:00
Next Scan	2026-02-23T13:52:10+00:00

Last Scan

Scanned	2026-02-16T13:52:10+00:00
URL	https://workinstartups.com/robots.txt
Domain IPs	2a05:d018:1ac1:1500:5ffb:9e55:7a4:b868, 2a05:d018:1ac1:1501:44de:f183:522d:d688, 2a05:d018:1ac1:1502:f43:20af:e363:7985, 54.195.133.207, 54.216.122.80, 54.246.86.149
Response IP	54.195.133.207
Found	Yes
Hash	0f380c80d05846c452ebf48095cc21b281679ac96b7768d4200aaa08d036a2e9
SimHash	211089e58d74

Groups

amazonbot

Rule	Path
Disallow	/

*

Rule	Path
Allow	/
Disallow	/search?
Disallow	/jobs/search?
Disallow	/goto/ad/
Disallow	/jobs/goto/ad/
Disallow	/land/ad/
Disallow	/jobs/land/ad/
Disallow	/advanced-search?
Disallow	/jobs/advanced-search?
Disallow	/jobs/my-alerts?
Disallow	/my-alerts?
Disallow	/jobiak/
Disallow	/get_avg?
Disallow	/get_stats?
Disallow	/_app_count*
Disallow	/app_complete*
Disallow	/_create*
Disallow	/*?error*
Disallow	/authenticate*
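
The rules in this group mix a blanket Allow with narrower Disallow paths, so the outcome for a URL depends on which rule wins. A minimal sketch of longest-match evaluation, assuming the crawler follows the precedence described in RFC 9309 (the most specific matching pattern wins, Allow wins ties, `*` matches any run of characters, a trailing `$` anchors the end). The names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical, and `RULES` holds only a subset of the table above:

```python
import re

# A subset of the "*" group rules listed above (not the full table).
RULES = [
    ("allow", "/"),
    ("disallow", "/search?"),
    ("disallow", "/goto/ad/"),
    ("disallow", "/jobiak/"),
    ("disallow", "/_app_count*"),
    ("disallow", "/authenticate*"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex anchored at the start."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with ".*" for each "*" wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if the longest matching rule is an Allow, or if nothing matches."""
    best_len, allowed = -1, True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            longer = len(pattern) > best_len
            tie_for_allow = len(pattern) == best_len and kind == "allow"
            if longer or tie_for_allow:
                best_len, allowed = len(pattern), kind == "allow"
    return allowed


for candidate in ("/jobs/123", "/search?q=python", "/search", "/authenticate/login"):
    print(candidate, is_allowed(candidate))
```

Under these assumptions `/search?q=python` is blocked while `/search` without a query string is not, because the pattern `/search?` includes the literal question mark. Python's standard `urllib.robotparser` applies rules in file order rather than by longest match, so it may give a different answer for this group.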

adsbot-google
adsbot-google-mobile

Rule	Path
Disallow	/create_notification
Disallow	/jobs/create_notification

ccbot
gptbot
chatgpt-user
google-extended
bytespider
diffbot
facebookbot
omgili
applebot-extended
perplexitybot
amazonbot
claudebot
omgilibot
anthropic-ai
claude-web
imagesiftbot
youbot

Rule	Path
Disallow	/
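
Since `amazonbot` is named both in its own group and in this one, it may help to see how a crawler could pick its rules. A rough sketch, assuming RFC 9309 behavior: the product token is compared case-insensitively, rules from every group naming the token are combined, and the `*` group applies only when no group names it. `GROUPS`, `STAR_RULES`, and `rules_for` are hypothetical names, and `STAR_RULES` is abbreviated to a single entry:

```python
# Abbreviated stand-in for the "*" group; the full rule list is in the table earlier.
STAR_RULES = [("allow", "/")]

# Named groups as listed in this report: (user agents, rules).
GROUPS = [
    ({"amazonbot"}, [("disallow", "/")]),
    (
        {"adsbot-google", "adsbot-google-mobile"},
        [("disallow", "/create_notification"),
         ("disallow", "/jobs/create_notification")],
    ),
    (
        {"ccbot", "gptbot", "chatgpt-user", "google-extended", "bytespider",
         "diffbot", "facebookbot", "omgili", "applebot-extended",
         "perplexitybot", "amazonbot", "claudebot", "omgilibot",
         "anthropic-ai", "claude-web", "imagesiftbot", "youbot"},
        [("disallow", "/")],
    ),
]


def rules_for(product_token):
    """Collect the rules that apply to a crawler's product token."""
    token = product_token.lower()
    merged = []
    for agents, rules in GROUPS:
        if token in agents:
            # Combine rules from every group naming this agent, skipping duplicates.
            for rule in rules:
                if rule not in merged:
                    merged.append(rule)
    # Fall back to the "*" group only when no named group matched.
    return merged if merged else list(STAR_RULES)


for agent in ("GPTBot", "Amazonbot", "AdsBot-Google", "bingbot"):
    print(agent, rules_for(agent))
```

With this data both `amazonbot` groups contribute the same `Disallow /`, so the duplicate listing does not change the result.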

Other Records

Field	Value
sitemap	https://workinstartups.com/sitemap_index.jobs_WIS.xml

Comments

Disallow /create_notification endpoint from being accessed by the AdsBot
https://developers.google.com/search/docs/advanced/crawling/overview-google-crawlers
Sitemap links for core sitemaps (Also, details were included, but removed in JOB-2857)
JOB-2438: disallow ChatGPT crawlers (see https://darkvisitors.com/agents)
