admin.dushi.singtao.ca
robots.txt

Robots Exclusion Standard data for admin.dushi.singtao.ca

Resource Scan

Scan Details

Site Domain admin.dushi.singtao.ca
Base Domain singtao.ca
Scan Status Ok
Last Scan 2025-08-04T11:21:58+00:00
Next Scan 2025-09-03T11:21:58+00:00

Last Scan

Scanned 2025-08-04T11:21:58+00:00
URL https://admin.dushi.singtao.ca/robots.txt
Domain IPs 52.205.78.79
Response IP 52.205.78.79
Found Yes
Hash 4da587051b5479df2acf7be45f8f41dfe6937c129c4845bc20e1c6ca7317bb53
SimHash 5f7eca11c4d6
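
The Hash value above is 64 hexadecimal digits long, which is consistent with a SHA-256 digest, although the report does not name the algorithm or say exactly which bytes were hashed. A minimal sketch, assuming SHA-256 over the raw response body, of how a fetched copy could be compared against the reported value (the function name is hypothetical):

```python
import hashlib

# Hash value copied from the scan details above.
REPORTED_HASH = "4da587051b5479df2acf7be45f8f41dfe6937c129c4845bc20e1c6ca7317bb53"


def matches_reported_hash(body: bytes, reported: str = REPORTED_HASH) -> bool:
    """Return True if the SHA-256 hex digest of `body` equals `reported`.

    Assumption: the scanner hashes the raw robots.txt bytes with SHA-256.
    The report itself does not confirm this.
    """
    return hashlib.sha256(body).hexdigest() == reported.lower()
```

A mismatch would not prove the file changed; it could equally mean the scanner normalizes the body or uses a different algorithm.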

Groups

*

Rule Path
Allow /

Other Records

Field Value
crawl-delay 60

mediapartners-google

Rule Path
Allow /

Other Records

Field Value
crawl-delay 2

googlebot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 300

googlebot-news

Rule Path
Allow /

Other Records

Field Value
crawl-delay 300

bingbot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 300

msnbot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 300

slurp

Rule Path
Allow /

Other Records

Field Value
crawl-delay 300

yandex

Rule Path
Allow /

Other Records

Field Value
crawl-delay 300

baiduspider

Rule Path
Allow /

Other Records

Field Value
crawl-delay 300

alexabot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 300

ia_archiver

Rule Path
Allow /

Other Records

Field Value
crawl-delay 300
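
The groups above differ only in their crawl-delay values. A minimal sketch of how a client might look up the delay that applies to it, using Python's standard `urllib.robotparser`. The robots.txt text here is a hypothetical reconstruction of three of the groups from this report, not the verbatim file, and `SomeOtherBot` is an invented agent name used to exercise the `*` group:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical reconstruction of three groups listed in the report;
# the real file's exact wording and ordering are not shown in the scan.
ROBOTS_TXT = """\
User-agent: *
Allow: /
Crawl-delay: 60

User-agent: Mediapartners-Google
Allow: /
Crawl-delay: 2

User-agent: Googlebot
Allow: /
Crawl-delay: 300
"""


def delays(robots_txt, agents):
    """Map each agent name to the crawl-delay the parser resolves for it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.crawl_delay(agent) for agent in agents}


result = delays(ROBOTS_TXT, ["Mediapartners-Google", "Googlebot", "SomeOtherBot"])
print(result)
```

An agent with no group of its own falls back to the `*` group, so `SomeOtherBot` resolves to the default delay of 60. Note that `crawl-delay` is a non-standard field and some crawlers ignore it.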

ai2bot
ai2bot-dolma
aihitbot
amazonbot
andibot
anthropic-ai
applebot
applebot-extended
awario
bedrockbot
brightbot 1.0
bytespider
ccbot
chatgpt-user
claude-searchbot
claude-user
claude-web
claudebot
cohere-ai
cohere-training-data-crawler
cotoyogi
crawlspace
datenbank crawler
devin
diffbot
duckassistbot
echobot bot
echoboxbot
facebookbot
facebookexternalhit
factset_spyderbot
firecrawlagent
friendlycrawler
gemini-deep-research
google-cloudvertexbot
google-extended
googleother
googleother-image
googleother-video
gptbot
iaskspider/2.0
icc-crawler
imagesiftbot
img2dataset
isscyberriskcrawler
kangaroo bot
meta-externalagent
meta-externalagent
meta-externalfetcher
meta-externalfetcher
mistralai-user
mistralai-user/1.0
mycentralaiscraperbot
netestate imprint crawler
novaact
oai-searchbot
omgili
omgilibot
operator
pangubot
panscient
panscient.com
perplexity-user
perplexitybot
perplexity-user
petalbot
phindbot
poseidon research crawler
qualifiedbot
quillbot
quillbot.com
sbintuitionsbot
scrapy
semrushbot-ocob
semrushbot-swa
sidetrade indexer bot
summalybot
thinkbot
tiktokspider
timpibot
velenpublicwebcrawler
wardbot
webzio-extended
wpbot
yandexadditional
yandexadditionalbot
youbot
piplbot
sosospider

Rule Path
Disallow /
Disallow /toronto/wp-admin/
Disallow /toronto/wp-includes/
Disallow /toronto/wp-content/plugins/
Disallow /toronto/wp-content/upgrade/
Disallow /toronto/wp-content/wflogs/
Disallow /toronto/wp-content/rsspi-log/
Disallow /toronto/wp-content/languages/
Disallow /toronto/uncategorized/
Disallow /vancouver/wp-admin/
Disallow /vancouver/wp-includes/
Disallow /vancouver/wp-content/plugins/
Disallow /vancouver/wp-content/upgrade/
Disallow /vancouver/wp-content/wflogs/
Disallow /vancouver/wp-content/rsspi-log/
Disallow /vancouver/wp-content/languages/
Disallow /vancouver/uncategorized/

Other Records

Field Value
sitemap https://dushi.singtao.ca/toronto/gen-sitemap/post/today/
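
As the report groups them, the path-specific Disallow rules belong to the blocked-agent group, which already carries `Disallow /`. A minimal sketch, again using `urllib.robotparser` on a hypothetical, shortened reconstruction of the file (two of the listed agents, two of the listed paths), of what that grouping would mean in practice. `SomeOtherBot` is an invented agent name, and `site_maps()` requires Python 3.8 or later:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical, shortened reconstruction; not the verbatim file.
ROBOTS_TXT = """\
User-agent: *
Allow: /

User-agent: GPTBot
User-agent: CCBot
Disallow: /
Disallow: /toronto/wp-admin/
Disallow: /vancouver/wp-admin/

Sitemap: https://dushi.singtao.ca/toronto/gen-sitemap/post/today/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

BASE = "https://admin.dushi.singtao.ca"

# A listed agent is blocked everywhere by "Disallow: /".
blocked = parser.can_fetch("GPTBot", BASE + "/toronto/")

# An unlisted agent falls back to the "*" group, so the path rules
# attached to the blocked group do not apply to it.
other = parser.can_fetch("SomeOtherBot", BASE + "/toronto/wp-admin/")

# Sitemap records are global and not tied to any group.
sitemaps = parser.site_maps()

print(blocked, other, sitemaps)
```

Under this reading, the wp-admin and similar paths are restricted only for agents that are already fully blocked; whether that matches the site owner's intent cannot be determined from the scan alone.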

Comments

  • 1. Default Rules (All Bots)
  • 2. Google Ad Manager (FASTER Crawl)
  • 3. Other Google & Bing Bots (Standard Crawl)
  • 4. Blocked Bots (AI, Scrapers, Spam)
  • 5. Disallowed Paths (Security & SEO)
  • 6. Sitemaps