Staff Software Engineer, ML Infrastructure

Decagon

San Francisco, CA

About Decagon

Decagon is the leading conversational AI platform empowering every brand to deliver concierge customer experiences.

Our technology enables industry-defining enterprises like Avis Budget Group, Block’s Cash App and Square, Chime, Oura Health, and Hunter Douglas to deploy AI agents that power personalized, deeply satisfying interactions across voice, chat, email, SMS, and every other channel.

We’re building a future where customer experience moves beyond support tickets and hold music to faster resolutions, richer conversations, and deeper relationships. We’re proud to be backed by world-class investors who share that vision, including a16z, Accel, Bain Capital Ventures, Coatue, and Index Ventures, along with many others.

We’re an in-office company, driven by a shared commitment to excellence and velocity. Our values — Just Get It Done, Invent What Customers Want, Winner’s Mindset, and The Polymath Principle — shape how we work and grow as a team.

About the Team

The ML Infrastructure team builds the systems that power every stage of Decagon's model lifecycle. We own the platforms for model training, the infrastructure for model evaluation and experimentation, and the routing layer that manages inference across multiple providers.

We work at the intersection of research and production: translating cutting-edge ML techniques into reliable, scalable systems that run in customer environments. We collaborate closely with Research, Infrastructure, and Product teams to ensure models train efficiently, serve reliably, and deliver exceptional user experiences.

The team values technical rigor, pragmatic decision-making, and building systems that others love to use.

About the Role

We're hiring a Staff ML Infrastructure Engineer to own the platforms powering Decagon's model training and inference. You'll build distributed training systems, design inference architecture across multiple providers, and create the frameworks that let our Research and Product teams ship faster.

This role is for someone who thrives on technical depth, can lead multi-quarter initiatives, and wants to shape the long-term architecture of our ML stack.

In this role, you will

Design and build distributed training platforms for LLM and multimodal fine-tuning and post-training at scale
Implement and integrate state-of-the-art training algorithms into production pipelines
Own inference architecture and multi-provider routing, including failover and optimization
Research and implement inference optimizations including quantization, speculative decoding, and batching strategies
Lead initiatives to improve latency and cost efficiency across the training and serving stack
Build evaluation and experimentation infrastructure that enables rapid, reliable iteration
Drive technical direction, mentor engineers, and establish best practices for ML infrastructure

Your background looks something like this

8+ years building ML infrastructure or production systems at scale
Deep experience with distributed training: multi-node GPU clusters, fault tolerance, and optimization
Strong understanding of LLM inference: latency optimization, provider tradeoffs, and serving architecture
Proficiency in Python and modern ML frameworks (PyTorch, JAX, or TensorFlow)
Proven track record leading complex, multi-quarter technical projects

Benefits

Medical, dental, and vision benefits
Take-what-you-need vacation policy
Daily lunches, dinners, and snacks in the office to keep you at your best

Compensation

$300K – $430K + equity

Posted 2026-02-28
