Staff Software Engineer, Ads ML Inference Infrastructure
The Ads ML Inference Infra team owns the online inference and feature serving systems that power real-time model scoring and delivery for all Ads models at Pinterest. The team is looking for a staff engineer with strong hands-on experience in large-scale ML inference systems, as well as capabilities in solving ambiguous technical problems and driving strategic, cross-functional efforts.
What you’ll do:
- Lead and drive efforts to build next-generation model inference and feature serving systems that power up to 100x larger models and directly uplevel Pinterest’s monetization business.
- Design and optimize low-latency, high-throughput inference pipelines to meet strict SLOs while improving performance, efficiency, and cost.
- Partner with Ads ML and product teams to productionize new model architectures (including LLMs and multi-stage ranking models) and scale them reliably to global traffic.
- Evolve the online feature platform (feature computation, caching, and retrieval) to improve coverage, freshness, and consistency for Ads models.
- Evaluate and integrate new technologies (e.g., GPU acceleration, model compression, Triton, vLLM, Dynamo) to advance our inference stack.
- Build strong partnerships with other infra and ML teams to improve end-to-end reliability, observability, and developer velocity for Ads ML.
- Mentor and coach other engineers, guiding them through technical decisions, system design, and career development.
What we’re looking for:
- BS (or higher) degree in Computer Science or a related field.
- 8+ years of relevant industry experience designing and operating large-scale, production ML or distributed infra systems.
- Deep knowledge of at least one programming language (Java, C++, Python).
- Deep experience with distributed systems or recommendation / ads serving infrastructure (e.g., request routing, online storage, caching, feature serving, APIs).
- Hands-on experience with at least one deep learning framework (PyTorch or TensorFlow) and bringing models from offline experimentation to production.
- [Preferred] Experience with model / hardware accelerator libraries (e.g., CUDA, quantization, distillation, low-precision inference).
- [Preferred] Experience with inference optimization and serving frameworks such as Triton, vLLM, or Dynamo.
- Proven track record of leading complex projects, setting technical direction, and collaborating across functions and orgs; experience mentoring and coaching other engineers.
In-Office Requirement Statement:
- We let the type of work you do guide the collaboration style. That means we're not always working in an office, but we continue to gather for key moments of collaboration and connection.
- This role requires in-person collaboration in the office 1-2 times per week, so candidates must be within commutable distance of one of the following offices: Palo Alto, CA; San Francisco, CA; Seattle, WA.
Relocation Statement:
- This position is not eligible for relocation assistance. Visit our PinFlex page to learn more about our working model.