Senior Site Reliability Engineer, GPU Infrastructure

Genmo
San Francisco, CA

We are Genmo, a research lab dedicated to building open, state-of-the-art models for video generation towards unlocking the right brain of AGI. Join us in shaping the future of AI and pushing the boundaries of what's possible in video generation.

What You’ll Do

  • Own the design and day‑to‑day operation of GPU clusters that train and serve frontier generative models.

  • Lead production Kubernetes operations: GPU scheduling, cluster upgrades, multi‑cluster federation.

  • Define and implement Infrastructure‑as‑Code (Terraform, Helm, Ansible) and GitOps workflows with Argo CD or Flux.

  • Build CI/CD pipelines, automated testing, and rollout strategies for infra changes.

  • Develop an observability stack (Prometheus, Grafana, OpenTelemetry, eBPF) plus GPU telemetry with NVIDIA DCGM.

  • Optimize high‑performance networking (InfiniBand/RDMA) and debug performance bottlenecks.

  • Run and continuously improve the 24×7 on‑call rotation; lead post‑incident reviews.

  • Partner with researchers and engineers, communicate crisply, and ship with a high‑ownership mindset.

Minimum Qualifications

  • BS/MS/PhD in CS, EE, or related field.

  • 3+ yrs SRE/DevOps in production; 2+ yrs managing large Kubernetes fleets.

  • Expert‑level Kubernetes experience.

  • Proficient in Python, Bash, and IaC tools (Terraform, Helm, Ansible).

  • Track record of shipping and operating large‑scale infrastructure with high reliability and clear communication.

Nice to Have

  • Multi‑cluster / multi‑cloud (AWS, GCP, Azure, bare‑metal) production experience.

  • Hands‑on with containerized GPU stacks (nvidia‑container‑toolkit, GPU Operator).

  • GPU schedulers such as Slurm or Kueue.

  • Familiarity with CI/CD tooling (GitHub Actions, BuildKit).

  • Prior work with distributed training, model‑serving patterns, or other ML/GPU workloads.

Machine‑learning depth is a plus—not a prerequisite. We’ll help you level up if needed.

Genmo is an Equal Opportunity Employer. Candidates are evaluated without regard to age, race, color, religion, sex, disability, national origin, sexual orientation, veteran status, or any other characteristic protected by federal or state law. Genmo, Inc. is an E-Verify company and you may review the Notice of E-Verify Participation and the Right to Work posters in English and Spanish.

Posted 2025-09-22
