glassdoor.com.br
robots.txt

Robots Exclusion Standard data for glassdoor.com.br

Resource Scan

Scan Details

Site Domain glassdoor.com.br
Base Domain glassdoor.com.br
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2024-03-30T08:15:59+00:00
Next Scan 2024-06-28T08:15:59+00:00

Last Successful Scan

Scanned 2023-03-06T23:51:33+00:00
URL https://glassdoor.com.br/robots.txt
Redirect https://www.glassdoor.com.br/robots.txt
Redirect Domain www.glassdoor.com.br
Redirect Base glassdoor.com.br
Domain IPs 104.18.140.83, 104.18.141.83
Redirect IPs 104.18.140.83, 104.18.141.83
Response IP 104.18.140.83
Found Yes
Hash 591be1de88593ead00bbf6339ab1c0908054df9a80ed25532cf7cf50b1326f6b
SimHash 1291f9c845d3

Groups

*

Rule Path
Disallow /*?*hostSite=*
Disallow /1347171559/
Disallow /about/board/
Disallow /about/contact/
Disallow /about/faq/
Disallow /about/forCareerCenters/
Disallow /about/forLibraries/
Disallow /about/forStudents/
Disallow /about/guidelines/
Disallow /about/index/
Disallow /about/jobs/
Disallow /about/learn/
Disallow /about/overview/
Disallow /about/privacy/
Disallow /about/privacy/
Disallow /about/syndicationCenter/
Disallow /about/team/
Disallow /about/terms/
Disallow /about/widgetTerms/
Disallow /ajax/
Disallow /abtest
Disallow /browse/
Disallow /employerinfo/
Disallow /employerInfo/
Disallow /Explorar/pesquisar-empresas
Disallow /getAdSlotContentsAjax.htm
Disallow /home/
Disallow /integrations/facebook/glassdoor/eep
Disallow /jobview/
Disallow /legal/
Disallow /lists/
Disallow /more/
Disallow /partner/
Disallow /partner-center/
Disallow /partners/company/
Disallow /partners/insights/
Disallow /partners/jobs/
Disallow /partners/reports/
Disallow /partners/resumeView
Disallow /partners/settings/
Disallow /parts
Disallow /profile/
Allow /profile/login_input.htm
Allow /profile/joinNow_input.htm
Disallow /Resume/user-profile/
Disallow /rss/*
Disallow /Rungs/
Disallow /search/
Disallow /Search/
Disallow /survey/
Disallow /surveys
Disallow /util/
Disallow /getAdSlotContentsAjax.htm
Disallow /developer/widget/builder/
Disallow /hammer/
Disallow /mz-survey/
Disallow /user-activation/
Disallow /member/
Disallow /resume/build/
Disallow /userprofile/
Disallow /sourcing$
Disallow /searchsuggest$
Disallow /knowyourworth/
Disallow /Avalia%C3%A7%C3%B5es/index.htm?
Disallow */lib$
Disallow */lib/
Disallow */globalize/
Disallow */globalize$
Disallow */ASCIISumThreshold$
Disallow */LogClient$
Disallow */MsgBuilder$
Disallow */UserAgent$
Disallow */Constants$
Disallow */init/
Disallow */init$
Disallow */LogServer$
Disallow */GDLogger$
Disallow */gd-perf$
Disallow */gd-site-hdr-dropdown$
Disallow */bundles$
Disallow */wait$
Disallow */extend$
Disallow */strings$
Disallow */strings/
Disallow */document$
Disallow */*Ajax.htm
Disallow */json$
Disallow */json/
Disallow /Compara/escolher
Disallow /employers/ec
Disallow /slink.htm
Disallow /*encryptedUserId
Disallow /*followId
Disallow /*userValidationKey
Disallow */trackClickAsync.htm
Disallow /track
Disallow /job-listing/details.htm?*
Disallow /job-listing/*_IE*.htm
Disallow /job-listing/JV.htm?*
Disallow /Vaga/*_IP*
Disallow /Vagas/*_P*.htm*
Disallow /Vagas/*_IP*.htm*
Disallow /Avalia%C3%A7%C3%B5es/*_P*.htm*
Disallow /Avalia%C3%A7%C3%B5es/*_IP*.htm*
Allow /Avalia%C3%A7%C3%B5es/*-avalia%C3%A7%C3%B5es-SRCH_*_IP2.htm*
Disallow /Entrevista/*_P*.htm*
Disallow /Entrevista/*_IP*.htm*
Disallow /Benef%C3%ADcios/*_IP*.htm*
Disallow /Sal%C3%A1rios/*_IP*.htm*
Allow /Sal%C3%A1rios/*_IP2.htm*
Allow /Sal%C3%A1rios/*_IP3.htm*
Allow /Sal%C3%A1rios/*_IP4.htm*
Allow /Sal%C3%A1rios/*_IP5.htm*
Disallow /1060761/*
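
The paths in this group use two wildcards: "*" matches any run of characters and a trailing "$" anchors the end of the URL. The sketch below shows one way a crawler might evaluate such rules. It assumes the longest-match precedence described in RFC 9309 (the most specific matching pattern wins, and Allow wins a tie). The names pattern_to_regex, is_allowed and RULES are hypothetical, and RULES holds only a handful of the rules listed above.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex matched from the path start."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(rules, path):
    """Return True if `path` may be crawled under `rules`.

    `rules` is a list of (directive, pattern) pairs. The longest matching
    pattern wins, Allow wins a tie, and a path that matches nothing is allowed.
    """
    best_len, allowed = -1, True
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            is_allow = directive.lower() == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed


# A small subset of the "*" group, copied from the listing above.
RULES = [
    ("Disallow", "/profile/"),
    ("Allow", "/profile/login_input.htm"),
    ("Disallow", "/sourcing$"),
    ("Disallow", "/*?*hostSite=*"),
    ("Disallow", "/Sal%C3%A1rios/*_IP*.htm*"),
    ("Allow", "/Sal%C3%A1rios/*_IP2.htm*"),
]

if __name__ == "__main__":
    for p in ("/profile/settings.htm", "/profile/login_input.htm", "/sourcing"):
        print(p, is_allowed(RULES, p))
```

Under these assumptions the Allow for /profile/login_input.htm overrides the shorter Disallow for /profile/, and the _IP2 salary pages stay crawlable because the Allow pattern ties the Disallow pattern in length. Real crawlers may differ in details such as percent-encoding normalization, which this sketch does not attempt.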

ia_archiver

Rule Path
Disallow /
Allow */index.htm

omniexplorer_bot

Rule Path
Disallow /

mediapartners-google

Rule Path
Allow /

baiduspider

Rule Path
Disallow /
Allow */index.htm
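
Each group above applies to one user agent, with "*" as the fallback. The sketch below shows how a crawler might pick its group. It assumes the RFC 9309 behaviour that a crawler obeys only the group naming its product token (compared case-insensitively) and uses the "*" group only when no named group matches. The substring comparison, the GROUPS table and the select_group function are illustrative assumptions, and the "*" entry is abbreviated to two of its rules.

```python
# Groups from the listing above; the "*" group is abbreviated for illustration.
GROUPS = {
    "ia_archiver": [("Disallow", "/"), ("Allow", "*/index.htm")],
    "omniexplorer_bot": [("Disallow", "/")],
    "mediapartners-google": [("Allow", "/")],
    "baiduspider": [("Disallow", "/"), ("Allow", "*/index.htm")],
    "*": [("Disallow", "/ajax/"), ("Disallow", "/search/")],
}


def select_group(user_agent):
    """Return (group_name, rules) for the group that applies to `user_agent`.

    A named group replaces the "*" group entirely; the two are not merged.
    """
    ua = user_agent.lower()
    for name, rules in GROUPS.items():
        # Assumption: a simple case-insensitive substring test on the UA string.
        if name != "*" and name in ua:
            return name, rules
    return "*", GROUPS["*"]


if __name__ == "__main__":
    for ua in ("Mozilla/5.0 (compatible; Baiduspider/2.0)", "ExampleBot/1.0"):
        print(ua, "->", select_group(ua)[0])
```

Because a named group replaces the fallback, ia_archiver and baiduspider are limited to paths matching */index.htm, omniexplorer_bot is blocked entirely, and mediapartners-google is not bound by any of the Disallow rules in the "*" group.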

Comments

  • France
  • logging related
  • Blocking track urls (ACQ-2468)
  • Blocking non standard job view and job search URLs, and paginated job SERP URLs (TRFC-2831)
  • Blocking pagination on employer infosite
  • Blocking bots from crawling DoubleClick for Publisher and Google Analytics related URL's (which aren't real URL's)