Robots Exclusion Standard data for tfo.org

Resource Scan

Scan Details

Site Domain tfo.org
Base Domain tfo.org
Scan Status Ok
Last Scan 2024-09-21T07:20:24+00:00
Next Scan 2024-10-05T07:20:24+00:00

Last Scan

Scanned 2024-09-21T07:20:24+00:00
URL https://tfo.org/robots.txt
Redirect https://www.tfo.org/robots.txt
Redirect Domain www.tfo.org
Redirect Base tfo.org
Domain IPs 13.107.246.35, 2620:1ec:bdf::35
Redirect IPs 13.107.246.59, 2620:1ec:bdf::59
Response IP 13.107.246.59
Found Yes
Hash a1de3b1e075cb12c6f3b0165c5a3b3880098c029ec54906c5c2be8a6dd59a13d
SimHash 2d527c710717

Groups

*

Rule Path
Allow /
Allow /explorer$
Allow /explore$
Disallow /explorer?q=
Disallow /explore?q=
Allow /recherche
Allow /search$
Disallow /recherche?q=
Disallow /search?q=
Allow /grille-horaire$
Disallow /grille-horaire/
Disallow /*?jw_start=%7Bseek_to_second_number%7D

Comments

  • Allow crawling of the entire website
  • Allow crawling of the explore page
  • Disallow crawling of the search results
  • Allow crawling of the explore page
  • Disallow crawling of the search results
  • Allow crawling of tv schedule
  • Disallow crawling of tv schedule days via the calendar (search engine trap)
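
The rules in the `*` group combine prefix matches, `$` end anchors and a `*` wildcard, so the order in which they are listed does not by itself decide the outcome. The following is a minimal sketch of how a crawler might evaluate them, assuming the longest-match precedence described in RFC 9309 (the most specific matching pattern wins, and Allow wins a tie). The names `RULES`, `pattern_to_regex` and `is_allowed` are hypothetical, the sample paths are invented for illustration, and percent-encoding normalization is deliberately left out. It is not the matcher used by this scan or by any particular search engine.

```python
import re

# Rules copied from the "*" group above, in file order.
RULES = [
    ("allow", "/"),
    ("allow", "/explorer$"),
    ("allow", "/explore$"),
    ("disallow", "/explorer?q="),
    ("disallow", "/explore?q="),
    ("allow", "/recherche"),
    ("allow", "/search$"),
    ("disallow", "/recherche?q="),
    ("disallow", "/search?q="),
    ("allow", "/grille-horaire$"),
    ("disallow", "/grille-horaire/"),
    ("disallow", "/*?jw_start=%7Bseek_to_second_number%7D"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end
    of the path. Everything else is matched literally as a prefix.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` (including any query string) may be crawled.

    Assumption: the longest matching pattern wins, and Allow wins when
    an Allow and a Disallow pattern of equal length both match.
    """
    best_len = -1
    best_allow = True  # no matching rule means the path is allowed
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            longer = len(pattern) > best_len
            tie_for_allow = len(pattern) == best_len and allow
            if longer or tie_for_allow:
                best_len, best_allow = len(pattern), allow
    return best_allow


if __name__ == "__main__":
    # Hypothetical sample paths, chosen to exercise each rule family.
    for sample in [
        "/explorer",
        "/explorer?q=chat",
        "/recherche?q=chat",
        "/grille-horaire",
        "/grille-horaire/2024-09-21",
    ]:
        print(sample, is_allowed(sample))
```

Under these assumptions, `/explorer` and `/grille-horaire` are allowed, while `/explorer?q=chat`, `/recherche?q=chat` and `/grille-horaire/2024-09-21` are disallowed. The last rule matches only URLs that contain the literal, percent-encoded placeholder `{seek_to_second_number}`, so in this sketch a path such as `/video?jw_start=30` is still allowed.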