glassdoor.be
robots.txt

Robots Exclusion Standard data for glassdoor.be

Resource Scan

Scan Details

Site Domain glassdoor.be
Base Domain glassdoor.be
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2024-09-22T18:17:56+00:00
Next Scan 2024-12-21T18:17:56+00:00

Last Successful Scan

Scanned 2023-03-03T08:44:42+00:00
URL https://glassdoor.be/robots.txt
Redirect https://nl.glassdoor.be/robots.txt
Redirect Domain nl.glassdoor.be
Redirect Base glassdoor.be
Domain IPs 104.18.147.58, 104.18.148.58
Redirect IPs 104.18.147.58, 104.18.148.58
Response IP 104.18.147.58
Found Yes
Hash f4da5711d932e6360fe2eeed0cc5b5f9516eee789aadfbccc35f7c709a7f370a
SimHash 129bfbc845db

Groups

*

Rule Path
Disallow /*?*hostSite=*
Disallow /1347171559/
Disallow /about/board/
Disallow /about/contact/
Disallow /about/faq/
Disallow /about/forCareerCenters/
Disallow /about/forLibraries/
Disallow /about/forStudents/
Disallow /about/guidelines/
Disallow /about/index/
Disallow /about/jobs/
Disallow /about/learn/
Disallow /about/overview/
Disallow /about/privacy/
Disallow /about/syndicationCenter/
Disallow /about/team/
Disallow /about/terms/
Disallow /about/widgetTerms/
Disallow /ajax/
Disallow /abtest
Disallow /browse/
Disallow /employerinfo/
Disallow /employerInfo/
Disallow /Ontdekken/bedrijven-zoeken
Disallow /getAdSlotContentsAjax.htm
Disallow /home/
Disallow /integrations/facebook/glassdoor/eep
Disallow /jobview/
Disallow /legal/
Disallow /lists/
Disallow /more/
Disallow /partners/
Allow /partners/account/maken.htm
Disallow /partner/
Disallow /partner-center/
Disallow /partners/company/
Disallow /partners/insights/
Disallow /partners/jobs/
Disallow /partners/reports/
Disallow /partners/resumeView
Disallow /partners/settings/
Disallow /parts
Disallow /profile/
Allow /profile/login_input.htm
Allow /profile/joinNow_input.htm
Disallow /Resume/user-profile/
Disallow /rss/*
Disallow /Rungs/
Disallow /search/
Disallow /Search/
Disallow /survey/
Disallow /surveys
Disallow /util/
Disallow /developer/widget/builder/
Disallow /hammer/
Disallow /mz-survey/
Disallow /user-activation/
Disallow /member/
Disallow /resume/build/
Disallow /userprofile/
Disallow /sourcing$
Disallow /searchsuggest$
Disallow /knowyourworth/
Disallow /Reviews/index.htm?
Disallow */lib$
Disallow */lib/
Disallow */globalize/
Disallow */globalize$
Disallow */ASCIISumThreshold$
Disallow */LogClient$
Disallow */MsgBuilder$
Disallow */UserAgent$
Disallow */Constants$
Disallow */init/
Disallow */init$
Disallow */LogServer$
Disallow */GDLogger$
Disallow */gd-perf$
Disallow */gd-site-hdr-dropdown$
Disallow */bundles$
Disallow */wait$
Disallow */extend$
Disallow */strings$
Disallow */strings/
Disallow */document$
Disallow */*Ajax.htm
Disallow */json$
Disallow */json/
Disallow /Vergelijk/kiezen
Disallow /employers/ec
Disallow /slink.htm
Disallow /*encryptedUserId
Disallow /*followId
Disallow /*userValidationKey
Disallow */trackClickAsync.htm
Disallow /track
Disallow /job-listing/details.htm?*
Disallow /job-listing/*_IE*.htm
Disallow /job-listing/JV.htm?*
Disallow /Vacature/*_IP*
Disallow /Vacatures/*_P*.htm*
Disallow /Vacatures/*_IP*.htm*
Disallow /Reviews/*_P*.htm*
Disallow /Reviews/*_IP*.htm*
Allow /Reviews/*-reviews-SRCH_*_IP2.htm*
Disallow /Sollicitatiegesprek/*_P*.htm*
Disallow /Sollicitatiegesprek/*_IP*.htm*
Disallow /Arbeidsvoorwaarden/*_IP*.htm*
Disallow /Salarissen/*-salarissen-SRCH*_IP*.htm*
Allow /Salarissen/*-salarissen-SRCH*_IP2.htm*
Allow /Salarissen/*-salarissen-SRCH*_IP3.htm*
Allow /Salarissen/*-salarissen-SRCH*_IP4.htm*
Allow /Salarissen/*-salarissen-SRCH*_IP5.htm*
Disallow /1060761/*
Disallow /Reviews/Barbizon-scam-*
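
The * group above mixes Disallow and Allow rules that use the * wildcard and the $ end anchor. As a rough illustration of how a crawler following RFC 9309 might evaluate them (the longest matching pattern wins, and Allow wins a tie), here is a minimal Python sketch. The names pattern_to_regex and is_allowed and the sample paths are hypothetical, only a handful of the rules listed above are copied in, and percent-encoding normalization is left out.

```python
import re
from typing import List, Tuple

Rule = Tuple[str, str]  # (rule type, path pattern)

# A small subset of the "*" group, copied from the listing above.
RULES: List[Rule] = [
    ("Disallow", "/Salarissen/*-salarissen-SRCH*_IP*.htm*"),
    ("Allow", "/Salarissen/*-salarissen-SRCH*_IP2.htm*"),
    ("Disallow", "/profile/"),
    ("Allow", "/profile/login_input.htm"),
    ("Disallow", "*/lib$"),
    ("Disallow", "/sourcing$"),
]


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(rules: List[Rule], path: str) -> bool:
    """Return True if `path` may be fetched under `rules`.

    Assumption: longest pattern wins, Allow beats Disallow on a tie,
    and a path that matches no rule is allowed.
    """
    best = None  # (pattern length, is_allow)
    for kind, pattern in rules:
        # re.match anchors at the start of the path, like a prefix match.
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), kind == "Allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


if __name__ == "__main__":
    # Hypothetical paths, chosen only to exercise the rules.
    for p in ("/profile/login_input.htm",
              "/profile/settings",
              "/Salarissen/x-salarissen-SRCH_IL_IP2.htm",
              "/Salarissen/x-salarissen-SRCH_IL_IP7.htm"):
        print(p, is_allowed(RULES, p))
```

Note that the Disallow and Allow patterns for /Salarissen/ pages have the same length, so under this sketch the second results page is fetchable only because of the tie-break in favour of Allow. Parsers that resolve ties differently could reach another verdict.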

ia_archiver

Rule Path
Disallow /
Allow */index.htm

omniexplorer_bot

Rule Path
Disallow /

mediapartners-google

Rule Path
Allow /

baiduspider

Rule Path
Disallow /
Allow */index.htm
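
The listing above has one default group (*) and four groups named after specific crawlers. As a hedged sketch of how a crawler might pick the group that applies to it, the Python below compares its product token case-insensitively against the group names and falls back to * when nothing matches. The function name select_group is hypothetical, the * group is abbreviated to a single rule, and real parsers may match tokens more loosely (for example by substring).

```python
from typing import Dict, List, Optional, Tuple

Rule = Tuple[str, str]  # (rule type, path pattern)

# Groups as listed above; the "*" group is abbreviated for brevity.
GROUPS: Dict[str, List[Rule]] = {
    "*": [("Disallow", "/search/")],
    "ia_archiver": [("Disallow", "/"), ("Allow", "*/index.htm")],
    "omniexplorer_bot": [("Disallow", "/")],
    "mediapartners-google": [("Allow", "/")],
    "baiduspider": [("Disallow", "/"), ("Allow", "*/index.htm")],
}


def select_group(groups: Dict[str, List[Rule]],
                 product_token: str) -> Optional[List[Rule]]:
    """Return the rules for `product_token`, or the '*' rules if no
    named group matches. Returns None when there is no '*' group."""
    token = product_token.lower()
    for name, rules in groups.items():
        if name != "*" and name.lower() == token:
            return rules
    return groups.get("*")


if __name__ == "__main__":
    print(select_group(GROUPS, "Baiduspider"))  # named group
    print(select_group(GROUPS, "Googlebot"))    # falls back to "*"
```

A crawler that matches a named group uses only that group's rules, so for baiduspider the long * rule list does not apply at all.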

Comments

  • België (Nederlands)
  • logging related
  • Blocking track urls (ACQ-2468)
  • Blocking non standard job view and job search URLs, and paginated job SERP URLs (TRFC-2831)
  • Blocking pagination on employer infosite
  • Blocking bots from crawling DoubleClick for Publisher and Google Analytics related URL's (which aren't real URL's)
  • TRFC-4037 Block page from being indexed