Senior Inference Platform Engineer - Data Center

San Francisco, CA

Join a stealth-mode hyperscale data center startup building an AI and cloud platform powered by thousands of H100, H200, and B200 GPUs, available for experimentation, full-scale model training, and inference.

Our client operates high-performance GPU clusters powering some of the most advanced AI workloads worldwide. They’re now building a serverless inference platform, beginning with cost-efficient batch inference and expanding into low-latency, real-time inference and custom model hosting. This is a unique chance to join at an early stage and help define the architecture, scalability, and technical direction of that platform.

If this opportunity interests you, get in touch!

Key Responsibilities

Take ownership of the inference platform architecture, from batch to low-latency workloads.
Design, build, and optimise distributed inference systems to maximise GPU utilisation and minimise cold starts.
Integrate, tune, and operate inference engines such as vLLM, SGLang, and TensorRT-LLM across multiple model types.
Develop APIs, orchestration layers, and autoscaling logic to support both multi-tenant and dedicated deployments.
Collaborate with cross-functional teams to translate business and customer needs into robust technical solutions.
Stay up to date with the latest models, serving frameworks, and optimisation techniques, applying best practices in performance and efficiency.
Implement monitoring, alerting, and observability workflows for production systems.

Requirements

5+ years’ experience building large-scale, fault-tolerant distributed systems (ML inference, HPC, or similar).
Proficiency in Python, Go, Rust, or a comparable language.
Strong understanding of GPU software stacks (CUDA, Triton, NCCL) and Kubernetes orchestration.
Practical experience with model-serving frameworks such as vLLM, SGLang, TensorRT-LLM, or custom PyTorch deployments.
Knowledge of performance optimisation techniques, including batching, speculative decoding, quantisation, and caching.
Familiarity with Infrastructure-as-Code tools (Terraform, Helm) and low-level OS performance tuning.

Nice to Have

Experience with event-driven or serverless architectures.
Exposure to hybrid cloud or multi-cluster environments.
Contributions to open-source ML or inference systems projects.
Proven track record of cost optimisation in high-performance compute environments.

Benefits:

Equity

Salary:

$300,000 gross per year

Posted 2025-11-21
