ML Infrastructure Engineer

Phizenix

Menlo Park, CA

Menlo Park, CA | On-Site | Full-Time/Direct Hire

Client Opportunity | Through Phizenix

Phizenix, a certified minority and women-led recruiting firm, is hiring on behalf of an AI startup pioneering diffusion-based large language models—built for faster generation, multimodal integration, and scalable enterprise deployment.

We’re looking for an ML Infrastructure Engineer to help build the infrastructure that powers large-scale model training and real-time inference. You’ll collaborate with world-class researchers and engineers to design high-performance, distributed systems that bring advanced LLMs into production.

Responsibilities

Design and manage distributed infrastructure for ML training at scale

Optimize model serving systems for low-latency inference

Build automated pipelines for data processing, model training, and deployment

Implement observability tools to monitor performance in production

Maximize resource utilization across GPU clusters and cloud environments

Translate research requirements into robust, scalable system designs

Must-Haves

PhD in Computer Science, Engineering, or a related field (or equivalent experience)

Strong foundation in software engineering, systems design, and distributed systems

Experience with cloud platforms (AWS, GCP, or Azure)

Proficient in Python and at least one systems-level language (C++/Rust/Go)

Hands-on experience with Docker, Kubernetes, and CI/CD workflows

Familiarity with ML frameworks like PyTorch or TensorFlow from a systems perspective

Understanding of GPU programming and high-performance infrastructure

Nice-to-Haves

Experience with large-scale ML training clusters and GPU orchestration

Knowledge of LLM-serving tools (vLLM, TensorRT, ONNX Runtime)

Experience with distributed training strategies (e.g., data/model/pipeline parallelism)

Familiarity with orchestration tools like Kubeflow or Airflow

Background in performance tuning, system profiling, and MLOps best practices

At Phizenix, we’re committed to supporting diverse and inclusive teams. This is your chance to shape the systems that power the next generation of AI innovation. Let’s build the future together.

California Pay Range

$180,000 - $200,000 USD

Posted 2025-09-22
