ML Infrastructure Engineer

Phizenix
Menlo Park, CA

Menlo Park, CA | On-Site | Full-Time/Direct Hire


Client Opportunity | Through Phizenix

Phizenix, a certified minority and women-led recruiting firm, is hiring on behalf of an AI startup pioneering diffusion-based large language models—built for faster generation, multimodal integration, and scalable enterprise deployment.

We’re looking for an ML Infrastructure Engineer to help build the infrastructure that powers large-scale model training and real-time inference. You’ll collaborate with world-class researchers and engineers to design high-performance, distributed systems that bring advanced LLMs into production.

Responsibilities

  • Design and manage distributed infrastructure for ML training at scale
  • Optimize model serving systems for low-latency inference
  • Build automated pipelines for data processing, model training, and deployment
  • Implement observability tools to monitor performance in production
  • Maximize resource utilization across GPU clusters and cloud environments
  • Translate research requirements into robust, scalable system designs

Must-Haves

  • PhD in Computer Science, Engineering, or a related field (or equivalent experience)
  • Strong foundation in software engineering, systems design, and distributed systems
  • Experience with cloud platforms (AWS, GCP, or Azure)
  • Proficient in Python and at least one systems-level language (C++/Rust/Go)
  • Hands-on experience with Docker, Kubernetes, and CI/CD workflows
  • Familiarity with ML frameworks like PyTorch or TensorFlow from a systems perspective
  • Understanding of GPU programming and high-performance infrastructure

Nice-to-Haves

  • Experience with large-scale ML training clusters and GPU orchestration
  • Knowledge of LLM-serving tools (vLLM, TensorRT, ONNX Runtime)
  • Experience with distributed training strategies (e.g., data/model/pipeline parallelism)
  • Familiarity with orchestration tools like Kubeflow or Airflow
  • Background in performance tuning, system profiling, and MLOps best practices

At Phizenix, we’re committed to supporting diverse and inclusive teams. This is your chance to shape the systems that power the next generation of AI innovation. Let’s build the future together.

California Pay Range

$180,000 - $200,000 USD

Posted 2026-02-07
