Site Reliability Engineer Team Lead

WhiteHat Security·Belfast


Who we are:

In July 2019, WhiteHat was acquired by NTT Ltd., a leading global technology services company. This immediately enabled WhiteHat to bring solutions and professional services to clients in the more than 100 countries where NTT Ltd. operates. With major wins in 2019 across the Americas, Europe, Australia and Japan, WhiteHat Security continues to leverage channel partners, direct sales teams, and the global presence of NTT Ltd. to serve customers worldwide. In May 2020, we were named a Leader in the 2020 Gartner Magic Quadrant for Application Security Testing (AST). This Leader position is based on ability to execute and completeness of vision. This is WhiteHat’s fifth time being named a Leader in this report.

What we are looking for:

WhiteHat is looking for a Site Reliability Engineer Team Lead to join our Technical Operations team in our Belfast office. We are looking for a hands-on lead with experience operating large-scale SaaS infrastructure in both private and public cloud. The TechOps team runs the infrastructure supporting the WhiteHat Sentinel product as well as the development environments. This position will lead a small team and cover a wide tech stack, with the opportunity to work with many new technologies. We are looking for a well-balanced people and technology lead with an automation mindset. This role will work cross-functionally with multiple teams within the company.

What you will be doing:

  • Lead a small team located in Belfast to cover EU business hours
  • Troubleshoot production issues in a 24/7 environment as part of a global team
  • Operationally manage all infrastructure in the tech stack including data center, storage, hardware, compute, shared services, and public cloud
  • Automate processes for efficiency and consistency, and manage infrastructure as code
  • Secure the infrastructure and product with industry-standard best practices
  • Plan capacity, reliability, and security while optimizing costs
  • Work cross-functionally with other teams in multiple time zones to support and grow the products and business
  • Research and test new technologies to build for the future and speed the development cycle

What we value:

  • 5+ years' experience supporting a 24/7 production environment for a SaaS product and supporting software engineering teams
  • Experience having direct reports in a TechOps or Engineering role
  • Strong knowledge of Linux system administration
  • Experience managing services in at least one public cloud provider
  • Familiarity with virtualization technologies and how to make services highly available
  • Experience working with storage solutions for NAS, SAN, backup and local storage
  • Administration of common Internet services: DNS, NTP, SMTP, HTTP, SSL, NFS, etc.
  • Fluency in at least one scripting language and experience with configuration management tools
  • Availability to be on-call to support a 24/7 environment and experience working with a global company and team

Nice to have:

  • Experience working with a global team, specifically across US time zones
  • Management of containers and container orchestration
  • Managing budgets and working with external vendors
  • Knowledge of networking protocols, firewalls, load-balancers, switches, routers, VPN
  • Experience managing server hardware remotely
  • Project management experience and certifications
  • Degree in Computer Science or equivalent
  • ITIL or equivalent certifications

How to Apply

Please apply via the careers page on the company website.
