Robots Exclusion Standard data for sport.de

Resource Scan

Scan Details

Site Domain sport.de
Base Domain sport.de
Scan Status Ok
Last Scan 2024-06-26T04:08:16+00:00
Next Scan 2024-07-03T04:08:16+00:00

Last Scan

Scanned 2024-06-26T04:08:16+00:00
URL https://sport.de/robots.txt
Redirect https://www.sport.de/robots.txt
Redirect Domain www.sport.de
Redirect Base sport.de
Domain IPs 23.48.107.50, 23.48.107.67
Redirect IPs 23.64.122.136, 23.64.122.147
Response IP 23.48.107.67
Found Yes
Hash 478a7e2776f8b788f8a77837d920d64f6382f4c4db70248025381a42a5e0b8a8
SimHash 5a18cb40373b

Groups

aihitbot
blexbot
careerbot
ccbot
dotbot
google-extended
gptbot
grapeshot
icjobs
linkdexbot
magpie-crawler
megaindex
mj12bot
proximic
queryseekerspider
scrapy
scrapybot
semrushbot
sentibot
seokicks-robot
tkbot
trendkite-akashic-crawler
vagabondo
wbsearchbot

Rule Path
Disallow /
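
As a rough illustration of what this group means for a crawler, the sketch below rebuilds a shortened version of it and checks it with Python's standard urllib.robotparser. The three agents are a subset picked for brevity, the robots.txt lines are reconstructed from this scan rather than copied from the live file, and ExampleCrawler is a made-up name for an unlisted bot.

```python
from urllib.robotparser import RobotFileParser

# Subset of the blocked agents listed above (assumption: shortened for brevity).
BLOCKED_AGENTS = ["gptbot", "ccbot", "google-extended"]

# Reconstructed group: several User-agent lines sharing one "Disallow: /" rule.
lines = [f"User-agent: {agent}" for agent in BLOCKED_AGENTS] + ["Disallow: /"]

parser = RobotFileParser()
parser.parse(lines)

def is_blocked(agent: str, url: str = "https://www.sport.de/") -> bool:
    """Return True when the reconstructed rules forbid `agent` from fetching `url`."""
    return not parser.can_fetch(agent, url)

print(is_blocked("GPTBot"))          # listed agent, whole site disallowed
print(is_blocked("ExampleCrawler"))  # hypothetical agent that is not in this group
```

Because only this one group is loaded, the check says nothing about how the other groups in the file treat an unlisted crawler.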

ahrefsbot
baiduspider
yandexbot
yahoo! slurp

Rule Path
Disallow /*?

Other Records

Field Value
crawl-delay 10
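
The crawl-delay record is a non-standard extension that not every crawler honors. A minimal sketch of how a polite client might read it, assuming the group is laid out as reconstructed below (the exact line order in the live file is not shown by this scan, and ExampleCrawler is a hypothetical name):

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan: three of the listed agents, one rule, one delay.
lines = [
    "User-agent: ahrefsbot",
    "User-agent: baiduspider",
    "User-agent: yandexbot",
    "Disallow: /*?",
    "Crawl-delay: 10",
]

parser = RobotFileParser()
parser.parse(lines)

# crawl_delay() returns the delay in seconds for a matching group, or None.
delay = parser.crawl_delay("AhrefsBot/7.0")
print(delay)
print(parser.crawl_delay("ExampleCrawler"))
```

A client that respects the value would wait that many seconds between successive requests, for example with time.sleep(delay). Note that urllib.robotparser does not interpret the * wildcard in the Disallow path, so this sketch only reads the delay.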

*

Rule Path
Disallow /person_stats_content-matchstats
Disallow /widget_video_videolist-content
Disallow /webview
Disallow /widget_news_archiv-content
Disallow /json/matches-by-ids
Disallow /json/teams-by-ids
Disallow /tabellenrechner
Disallow /widget_images_imagelist-content
Disallow /widget_statistics_teamperson_goals
Disallow /widget_*
Disallow /*?
Disallow /webview
Disallow /genius*
Disallow /template-sportwetten
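
Several of these paths use the * wildcard, which matches any run of characters, so /*? covers every URL that carries a query string. The sketch below is one possible way to test a path against such a pattern by translating it to a regular expression. The helper names rule_to_regex and matches are illustrative, and the sample paths are invented.

```python
import re

def rule_to_regex(path_pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Without '$' the pattern behaves as a prefix match.
    """
    anchored = path_pattern.endswith("$")
    if anchored:
        path_pattern = path_pattern[:-1]
    # Escape the literal pieces and join them with '.*' where '*' stood.
    body = ".*".join(re.escape(part) for part in path_pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def matches(path_pattern: str, path: str) -> bool:
    """Return True when `path` (path plus optional query) falls under the pattern."""
    return rule_to_regex(path_pattern).match(path) is not None

print(matches("/*?", "/news?page=2"))   # query string present
print(matches("/*?", "/news"))          # no query string
print(matches("/widget_*", "/widget_video_videolist-content"))
```

Under this reading, the more specific /widget_... entries in the list are already covered by /widget_*.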

facebookexternalhit

Rule Path
Allow /*?
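
Under RFC 9309 a crawler obeys only the group that matches its product token and falls back to the * group when none matches, so facebookexternalhit would follow its own Allow rule rather than the * group's Disallow list. A minimal sketch of that selection step, using an abbreviated, hand-copied subset of the groups above (GROUPS and select_group are illustrative names, and ExampleCrawler is hypothetical):

```python
# Abbreviated rule groups copied from the scan (assumption: the "*" group is shortened).
GROUPS = {
    "facebookexternalhit": [("allow", "/*?")],
    "ahrefsbot": [("disallow", "/*?")],
    "*": [("disallow", "/widget_*"), ("disallow", "/*?")],
}

def select_group(user_agent: str) -> str:
    """Return the key of the group a crawler with this User-Agent should obey."""
    # The product token is the part before the first '/', compared case-insensitively.
    product = user_agent.split("/")[0].strip().lower()
    if product != "*" and product in GROUPS:
        return product
    return "*"

print(select_group("facebookexternalhit/1.1"))
print(select_group("ExampleCrawler/1.0"))
```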