Robots Exclusion Standard data for yp.com

Resource Scan

Scan Details

Site Domain yp.com
Base Domain yp.com
Scan Status Ok
Last Scan 2024-05-14T10:39:13+00:00
Next Scan 2024-05-21T10:39:13+00:00
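
The two timestamps above are ISO 8601 values with a UTC offset. The short sketch below parses them and computes the gap between scans. It only restates the arithmetic; whether the scanner always uses a fixed interval is an assumption the report does not confirm.

```python
from datetime import datetime, timedelta

# Timestamps copied from the Scan Details fields above.
last_scan = datetime.fromisoformat("2024-05-14T10:39:13+00:00")
next_scan = datetime.fromisoformat("2024-05-21T10:39:13+00:00")

# Gap between the last scan and the scheduled next scan.
interval = next_scan - last_scan
print(interval)  # 7 days, 0:00:00
```

Both values carry an explicit offset, so the subtraction is timezone-aware and needs no manual conversion.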

Last Scan

Scanned 2024-05-14T10:39:13+00:00
URL https://yp.com/robots.txt
Redirect https://www.yellowpages.com/robots.txt?re=yp
Redirect Domain www.yellowpages.com
Redirect Base yellowpages.com
Domain IPs 151.138.15.18, 208.93.105.116
Redirect IPs 151.138.15.18
Response IP 208.93.105.116
Found Yes
Hash 3b520b54c0e85b7466b36df839600e6c9f017105043c3056eb09265ff81e4d8a
SimHash 59097b0043b0
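
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest of the fetched file. The report does not name the algorithm, so the sketch below is an assumption; `body_digest` and `unchanged_since_scan` are hypothetical helper names used only for illustration.

```python
import hashlib

# Hash field copied from the Last Scan details above.
REPORTED_HASH = "3b520b54c0e85b7466b36df839600e6c9f017105043c3056eb09265ff81e4d8a"

def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()

def unchanged_since_scan(body: bytes) -> bool:
    """True if the body hashes to the value reported by the scan."""
    return body_digest(body) == REPORTED_HASH
```

If the assumption holds, passing the raw response bytes from https://www.yellowpages.com/robots.txt?re=yp to `unchanged_since_scan` would indicate whether the file has changed since the scan. The SimHash field is a separate similarity fingerprint and is not covered here.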

Groups

*

Rule Path
Disallow /*images/li.gif
Disallow /*images/logging_requests.gif
Disallow /relevance_feedback
Disallow /listings/
Disallow /listing_feedback/
Disallow */report_abuse
Disallow /gallery/*/copyright
Disallow /gallery/*/flag
Disallow /contribute/
Disallow /reservations/
Disallow */print_ad?*
Disallow */audio_ad?*
Disallow */map_locations
Disallow /reviews/*/up
Disallow /reviews/*/down
Disallow /reviews/*/follow
Disallow /reviews/*/unfollow
Disallow */no-internet-heading-assigned
Disallow */no-internet-heading-assisted
Disallow /login
Disallow /register
Disallow /user/
Disallow /ypu/js/compiled/tripadvisor*
Disallow /ypu/apps/ypm-core/ypm/javascripts/bundle_tripadvisor*
Disallow /undefined/
Disallow /improve_listing/*
Disallow /search*
Disallow /lwes/
Disallow /route?*

scrapy

Rule Path
Disallow /

piplbot

Rule Path
Disallow /

twitterbot

Rule Path
Allow *
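
Many of the paths in the groups above use `*` wildcards, so a plain prefix comparison is not enough to evaluate them. The sketch below shows one way to apply the rules, assuming the common interpretation in which `*` matches any run of characters, a trailing `$` anchors the end of the path, the longest matching pattern wins, and Allow wins ties. Only a subset of the `*` group is reproduced, and `rule_to_regex`, `is_allowed`, and `rules_for` are hypothetical names. Matching a crawler to its group by exact lowercase name is a simplification.

```python
import re

# Subset of the groups listed above, as (rule, path pattern) pairs.
GROUPS = {
    "*": [
        ("Disallow", "/listings/"),
        ("Disallow", "*/report_abuse"),
        ("Disallow", "/gallery/*/flag"),
        ("Disallow", "/search*"),
        ("Disallow", "/route?*"),
    ],
    "scrapy": [("Disallow", "/")],
    "piplbot": [("Disallow", "/")],
    "twitterbot": [("Allow", "*")],
}

def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal parts and let each '*' match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def rules_for(agent: str):
    """Pick the group for a crawler name, falling back to the '*' group."""
    return GROUPS.get(agent.lower(), GROUPS["*"])

def is_allowed(agent: str, path: str) -> bool:
    """Longest matching pattern decides; Allow wins ties; no match means allowed."""
    best = None
    for kind, pattern in rules_for(agent):
        if rule_to_regex(pattern).match(path):
            key = (len(pattern), kind == "Allow")
            if best is None or key > best:
                best = key
    return True if best is None else best[1]

print(is_allowed("examplebot", "/search?terms=pizza"))  # False
print(is_allowed("twitterbot", "/search?terms=pizza"))  # True
```

Under these assumptions, a crawler without its own group is blocked from `/search` URLs, while `twitterbot` is allowed everywhere and `scrapy` and `piplbot` are blocked from the whole site.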

Warnings

  • 2 invalid lines.
  • `host` is not a known field.