Lead Site Reliability Engineer
About Glean:
Founded in 2019, Glean is an innovative AI-powered knowledge management platform designed to help organizations quickly find, organize, and share information across their teams. By integrating seamlessly with tools like Google Drive, Slack, and Microsoft Teams, Glean ensures employees can access the right knowledge at the right time, boosting productivity and collaboration. The company’s cutting-edge AI technology simplifies knowledge discovery, making it faster and more efficient for teams to leverage their collective intelligence.
Glean was born from Founder & CEO Arvind Jain’s deep understanding of the challenges employees face in finding and understanding information at work. Seeing firsthand how fragmented knowledge and sprawling SaaS tools made it difficult to stay productive, he set out to build a better way - an AI-powered enterprise search platform that helps people quickly and intuitively access the information they need. Since then, Glean has evolved into the leading Work AI platform, combining enterprise-grade search, an AI assistant, and powerful application- and agent-building capabilities to fundamentally redefine how employees work.
About the Role:
Glean is seeking a Site Reliability Engineering Lead to foster a culture of engineering excellence, drive technical strategy, and develop a high-performing, collaborative team. Your role is pivotal in ensuring our services meet stringent Service Level Objectives (SLOs) and in building resilient, automated production environments in the cloud. You'll lead a team and be responsible for products globally, providing technical leadership to key projects and empowering your team to do the same.
Much of our software development focuses on building infrastructure to scale our operations in a hybrid cloud environment and eliminating work through automation. On the SRE team, you’ll have the opportunity to manage the complex challenges of scale and fast growth which are unique to Glean, while using your expertise in coding, algorithms, problem-solving, and SRE practices. We keep Glean applications up and running, ensuring our customers have the best and most reliable experience possible.
You will:
- Technical Leadership and Mentorship: Play a key role in driving technical excellence and fostering a culture of reliability across engineering teams. You will lead by example, setting best practices for incident management, performance optimization, and automation. Influence best practices, drive cross-team collaborations, and contribute to the execution of key objectives in alignment with engineering leadership and cross-functional partners. Establish strong technical credibility, shaping architectural decisions and ensuring the delivery of high-quality, reliable systems.
- Ensure High Availability: Implement and maintain resilient cloud architectures, monitor system performance, and proactively identify and resolve potential bottlenecks or points of failure.
- Incident Management: Participate in the primary on-call rotation; cultivate technical curiosity, a growth mindset, and a blameless postmortem culture within the team. Continuously optimize the on-call process for sustainability and efficiency.
- Automation and Tooling: Develop and maintain automation scripts, tools, and processes to streamline system deployment, monitoring, and management tasks. Your contributions will be vital in efficiently scaling cloud operations.
- Performance Optimization: Optimize cloud infrastructure and applications for performance, scalability, and cost-effectiveness.
- Security and Compliance: Collaborate with security engineers to implement best practices and ensure compliance with security standards and policies.
- Monitoring and Alerting: Design and configure advanced monitoring systems to gain insights into system behavior, set up alerts, and respond proactively to potential issues. Create and maintain comprehensive dashboards and playbooks for production on-call.
- Software Development Consultation: Engage actively in the entire software development lifecycle. Participate in system design reviews and provide valuable SRE insights during launch reviews, influencing and enhancing system architecture.
About you:
- Bachelor’s degree in Computer Science, a related field, or equivalent practical experience.
- 8+ years of experience in a senior-level Site Reliability Engineering or similar role, particularly in managing cloud-based services and infrastructure.
- 5+ years of experience with software development in one or more programming languages.
- 3+ years of experience managing people or teams, leading projects, and designing, analyzing, and troubleshooting distributed systems running in the cloud.
- Strong knowledge of cloud platforms such as Google Cloud Platform, AWS, or Azure.
- Practical experience with containerization technologies, including Docker and Kubernetes. Familiarity with infrastructure as code tools like Terraform is essential.
- Solid understanding of networking, security principles, and best SRE and security practices.
- Proficiency in using monitoring and alerting tools to detect and respond to potential issues effectively.
Location:
- This role is hybrid (4 days a week in our Palo Alto office).
Compensation & Benefits:
The standard base salary range for this position is $200,000 - $260,000 annually. Compensation offered will be determined by factors such as location, level, job-related knowledge, skills, and experience. Certain roles may be eligible for variable compensation, equity, and benefits.
We offer a comprehensive benefits package including competitive compensation, medical, vision, and dental coverage, a generous time-off policy, and the opportunity to contribute to your 401(k) plan to support your long-term goals. When you join, you'll receive a home office improvement stipend, as well as annual education and wellness stipends to support your growth and wellbeing. We foster a vibrant company culture through regular events, and provide healthy lunches daily to keep you fueled and focused.
We are a diverse bunch of people and we want to continue to attract and retain a diverse range of people into our organization. We're committed to an inclusive and diverse company. We do not discriminate based on gender, ethnicity, sexual orientation, religion, civil or family status, age, disability, or race.