filmfetish.com
robots.txt

Robots Exclusion Standard data for filmfetish.com

Resource Scan

Scan Details

Site Domain filmfetish.com
Base Domain filmfetish.com
Scan Status Ok
Last Scan 2024-11-15T01:25:55+00:00
Next Scan 2024-12-15T01:25:55+00:00

Last Scan

Scanned 2024-11-15T01:25:55+00:00
URL https://filmfetish.com/robots.txt
Domain IPs 194.164.70.230
Response IP 194.164.70.230
Found Yes
Hash ccd24e57f4aeedd330b580e20de35585fad09469be5707781402ba47f04afbb5
SimHash 4fa56bd5c1a1
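
The Hash value above is 64 hexadecimal digits, which matches the length of a SHA-256 digest. The report does not name the algorithm or say which bytes were hashed, so the following is only a sketch under the assumption that the hash is SHA-256 over the raw response body. The function name `content_hash` is hypothetical.

```python
import hashlib


def content_hash(body: bytes) -> str:
    """Return the SHA-256 hex digest of a fetched robots.txt body.

    Assumption: the scanner hashes the raw bytes with SHA-256; the
    report itself does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


# Comparing digests from two scans shows whether the file changed.
previous = content_hash(b"User-agent: GPTBot\nDisallow: /\n")
current = content_hash(b"User-agent: GPTBot\nDisallow: /\n")
changed = previous != current
```

A digest like this only detects that the file changed at all; a near-duplicate measure such as the SimHash field would be needed to judge how much it changed.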

Groups

gptbot

Rule Path
Disallow /
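
As a rough illustration of what this group means for a crawler, the sketch below rebuilds the group as robots.txt text and checks it with Python's standard `urllib.robotparser`. The fragment's exact casing and spacing are assumptions, since the report shows only the normalized group, and the helper name `is_allowed` is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the group above; the original file's literal
# text (casing, spacing, comments) is assumed, not quoted.
ROBOTS_FRAGMENT = """\
User-agent: GPTBot
Disallow: /
"""


def is_allowed(agent: str, url: str, robots_text: str = ROBOTS_FRAGMENT) -> bool:
    """Check whether `agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(agent, url)


gptbot_ok = is_allowed("GPTBot", "https://filmfetish.com/fact/")
other_ok = is_allowed("bingbot", "https://filmfetish.com/fact/")
```

With only this fragment loaded, `gptbot_ok` is false because every path begins with `/`, while an agent that no group names falls through to the default of allowed.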

applebot
googlebot-news
feedfetcher-google
bingbot
slurp
duckduckbot

Rule Path
Allow /artcats/
Allow /calendar-dates/
Allow /char/
Allow /company/
Allow /cpeople/
Allow /cshow/
Allow /event/
Allow /facility/
Allow /fact/
Allow /games-attractions/
Allow /prds/
Allow /publication/
Allow /sport-type/

applebot
storebot-google
bingbot
slurp
duckduckbot

Rule Path
Allow /market/
Allow /ffmkt/
Allow /artcats/
Allow /prds/

*
applebot
gptbot
googlebot
adsbot-google-mobile
adsbot-google
apis-google
feedfetcher-google
googlebot-image
googlebot-news
googlebot-video
google-inspectiontool
mediapartners-google
storebot-google
googleother
bingbot
msnbot
msnbot-media
adidxbot
desktop
mobile
baidu union
baidu favorites
image search
news search
video search
yandexbot
yandexmobilebot
yandex
yandexdirect
yandexdirectdyn
yandexmedia
yandeximages
yadirectfetcher
yandexblogs
yandexnews
yandexpagechecker
yandexmetrika
yandexcalendar
yandexscreenshotbot
yandexfavicons
yandexwebmaster
yandeximageresizer
yandexsitelinks
yandexantivirus
yandexvertis
slurp
duckduckbot
ia_archiver
aolbuild
teoma

Rule Path
Disallow /art/
Disallow /assets_local/
Disallow /cgi-bin/
Disallow /email/
Disallow /hit.pictures/
Disallow /img/
Disallow /prime_sites/
Disallow /redirect_sites/
Disallow /wp-admin/
Disallow /wp-content/
Disallow /wp-includes/
Disallow /csong/
Disallow /elements/
Disallow /locations-africa-mideast/
Disallow /locations-asia/
Disallow /locs-carib-cen-so-america/
Disallow /locations-africa-mideast/
Disallow /locations-no-america/
Disallow /number/
Disallow /sets/
Disallow /str/
Disallow /strtwo/
Disallow /title/
Disallow /util/

applebot
adsbot-google-mobile
adsbot-google
apis-google
googlebot
googlebot-image
googlebot-video
google-inspectiontool
mediapartners-google
storebot-google
googleother
msnbot
msnbot-media
adidxbot
desktop
mobile
baidu union
baidu favorites
image search
news search
video search
yandexbot
yandexmobilebot
yandex
yandexdirect
yandexdirectdyn
yandexmedia
yandeximages
yadirectfetcher
yandexblogs
yandexnews
yandexpagechecker
yandexmetrika
yandexcalendar
yandexscreenshotbot
yandexfavicons
yandexwebmaster
yandeximageresizer
yandexsitelinks
yandexantivirus
yandexvertis
ia_archiver
aolbuild
teoma

Rule Path
Disallow /facility/
Disallow /artcats/
Disallow /calendar-dates/
Disallow /char/
Disallow /company/
Disallow /event/
Disallow /games-attractions/
Disallow /monthsdays/
Disallow /cpeople/
Disallow /prds/
Disallow /publication/
Disallow /cshow/
Disallow /sport-type/
Disallow /yrs/
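
Several agents appear in more than one group above, so the effective rules depend on how a crawler combines groups. The sketch below assumes RFC 9309 behavior: all groups naming the agent are merged, the `*` group is used only when no group names the agent, the longest matching path wins, and Allow wins a tie. Individual crawlers may differ. `GROUPS` is an abbreviated subset of the listing, and the names `rules_for` and `is_allowed` are hypothetical.

```python
# Abbreviated subset of the groups listed above (agents and rules trimmed).
GROUPS = [
    ({"gptbot"}, [("disallow", "/")]),
    ({"applebot", "googlebot-news", "bingbot"},
     [("allow", "/artcats/"), ("allow", "/prds/")]),
    ({"*", "applebot", "gptbot", "googlebot", "bingbot"},
     [("disallow", "/art/"), ("disallow", "/wp-admin/")]),
    ({"applebot", "googlebot"},
     [("disallow", "/artcats/"), ("disallow", "/prds/")]),
]


def rules_for(agent):
    """Merge the rules of every group naming `agent`; else use '*' groups."""
    agent = agent.lower()
    named = [rules for agents, rules in GROUPS if agent in agents]
    chosen = named or [rules for agents, rules in GROUPS if "*" in agents]
    return [rule for rules in chosen for rule in rules]


def is_allowed(agent, path):
    """Longest matching prefix wins; Allow wins ties; no match means allowed."""
    best_len, best_allow = -1, True
    for kind, prefix in rules_for(agent):
        if path.startswith(prefix):
            allow = kind == "allow"
            if len(prefix) > best_len or (len(prefix) == best_len and allow):
                best_len, best_allow = len(prefix), allow
    return best_allow
```

Under these assumptions `is_allowed("bingbot", "/artcats/x")` is true and `is_allowed("googlebot", "/artcats/x")` is false, while `applebot` is allowed on the same path because its Allow and Disallow rules for `/artcats/` tie in length. Note that `/art/` does not match `/artcats/`, since matching is by prefix including the trailing slash.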

Other Records

Field Value
sitemap https://www.filmfetish.com/sitemap.xml

Comments

  • Block from entire website, including OpenAI ChatGPT
  • Make sure news bots go to news and select archives
  • Make sure store and product bots go to product pages
  • Block all from select directories
  • Block MOST from select taxonomies and other content
  • How-to: developers.google.com/search/docs/crawling-indexing/robots/create-robots-txt
  • OpenAI: foxnews.com/tech/openai-releases-webcrawler-gptbot-block
  • Post types: cal /fact/, crush /ffmkt/
  • List of search bots: webnots.com/user-agents-list-for-google-bing-baidu-and-yandex-search-engines/