Senior Site Reliability Engineer (GCP / Kubernetes)
About Us
Hippocratic AI has developed a safety-focused Large Language Model (LLM) for healthcare. The company believes that a safe LLM can dramatically improve healthcare accessibility and health outcomes in the world by bringing deep healthcare expertise to every human. No other technology has the potential to have this level of global impact on health.
Why Join Our Team
Innovative Mission: We are developing a safe, healthcare-focused large language model (LLM) designed to revolutionize health outcomes on a global scale.
Visionary Leadership: Hippocratic AI was co-founded by CEO Munjal Shah, alongside a group of physicians, hospital administrators, healthcare professionals, and artificial intelligence researchers from leading institutions, including El Camino Health, Johns Hopkins, Stanford, Microsoft, Google, and NVIDIA.
Strategic Investors: We have raised a total of $278 million in funding, backed by top investors such as Andreessen Horowitz, General Catalyst, Kleiner Perkins, NVIDIA’s NVentures, Premji Invest, SV Angel, and six health systems.
World-Class Team: Our team is composed of leading experts in healthcare and artificial intelligence, ensuring our technology is safe, effective, and capable of delivering meaningful improvements to healthcare delivery and outcomes.
For more information, visit .
We value in-person teamwork and believe the best ideas happen together. Our team is expected to be in the office five days a week in Palo Alto, CA, unless explicitly noted otherwise in the job description.
About the Role
We are seeking a highly skilled Senior Site Reliability Engineer to join our team. In this role, you will design and implement infrastructure automation and continuous integration and delivery pipelines, and monitor and scale the infrastructure that powers our healthcare AI platform. You will work closely with software engineers, research scientists, and other cross-functional teams to develop and maintain reliable and scalable infrastructure that enables rapid iteration and deployment of our products.
Key Responsibilities
Design and implement infrastructure automation and deployment pipelines using tools such as Terraform, Ansible, and Jenkins
Implement and maintain monitoring and logging systems to ensure the reliability and performance of our healthcare AI platform
Work closely with software engineers to design and deploy scalable, fault-tolerant, and secure production systems on cloud platforms such as AWS, GCP, or Azure
Develop and maintain security and compliance policies and procedures for our healthcare AI platform
Collaborate with cross-functional teams to troubleshoot and resolve complex issues related to infrastructure, deployment, and operations
Implement and maintain disaster recovery and business continuity plans
Develop and maintain documentation related to infrastructure, deployment, and operations
Mentor and provide technical guidance to junior engineers
Qualifications
Bachelor's or Master's degree in Computer Science, Computer Engineering, or a related field
At least 5 years of professional experience in DevOps engineering or a related field
Expertise in infrastructure automation and deployment tools such as Terraform, Ansible, Jenkins, or GitLab CI/CD
Experience with cloud platforms such as AWS, GCP, or Azure
Strong knowledge of containerization technologies such as Docker and Kubernetes
Experience with monitoring and logging tools such as ELK, Grafana, or Datadog
Familiarity with security and compliance best practices and tools such as HashiCorp Vault, AWS KMS, or Azure Key Vault
Strong problem-solving skills and ability to work independently and collaboratively in a team environment
Excellent communication and interpersonal skills
Experience implementing HIPAA and SOC 2 compliance is a plus
Experience working in an HPC environment is a plus