Machine Learning Engineer - Model Performance

Inference

San Francisco, CA

Inference.net is seeking a Machine Learning Engineer to join our team and optimize the performance of our cutting-edge AI inference systems. You will work with state-of-the-art large language models, deploy them at scale, and optimize them to run efficiently, increase throughput, and enable new features. This position offers the chance to collaborate closely with our engineering team and make significant contributions to open source projects such as SGLang and vLLM.

About Inference.net

We are building a distributed LLM inference network that combines idle GPU capacity from around the world into a single cohesive plane of compute for running large language models like DeepSeek and Llama 4. At any given moment, we have over 5,000 GPUs and hundreds of terabytes of VRAM connected to the network.

We are a small, well-funded team working on difficult, high-impact problems at the intersection of AI and distributed systems. We primarily work in person from our office in downtown San Francisco. Our investors include a16z CSX and Multicoin. We are high-agency, adaptable, and collaborative. We value creativity alongside technical prowess and humility. We work hard and deeply enjoy the work that we do.

Responsibilities

Design and implement optimization techniques to increase model throughput and reduce latency across our suite of models
Deploy and maintain large language models at scale in production environments
Deploy new models as they are released by frontier labs
Implement techniques like quantization, speculative decoding, and KV cache reuse
Contribute regularly to open source projects such as SGLang and vLLM
Dive deep into the underlying codebases of TensorRT, PyTorch, TensorRT-LLM, vLLM, SGLang, CUDA, and other libraries to debug ML performance issues
Collaborate with the engineering team to bring new features and capabilities to our inference platform
Develop robust and scalable infrastructure for AI model serving
Create and maintain technical documentation for inference systems

Requirements

3+ years of experience writing high-performance, production-quality code
Strong proficiency with Python and deep learning frameworks, particularly PyTorch
Demonstrated experience with LLM inference optimization techniques
Hands-on experience with SGLang and vLLM, with contributions to these projects strongly preferred
Familiarity with Docker and Kubernetes for containerized deployments
Experience with CUDA programming and GPU optimization
Strong understanding of distributed systems and scalability challenges
Proven track record of optimizing AI models for production environments

Nice to Have

Familiarity with TensorRT and TensorRT-LLM
Knowledge of vision models and multimodal AI systems
Experience implementing techniques like quantization and speculative decoding
Contributions to open source machine learning projects
Experience with large-scale distributed computing

Compensation

The base salary range for this role is $180,000 - $250,000, plus competitive equity in a high-growth startup and comprehensive benefits, including:

Full healthcare coverage
Quarterly offsites
Flexible PTO

Equal Opportunity

Inference.net is an equal opportunity employer. We welcome applicants from all backgrounds and don't discriminate based on race, color, religion, gender, sexual orientation, national origin, genetics, disability, age, or veteran status.

If you're passionate about building the next generation of high-performance systems that push the boundaries of what's possible with large language models, we want to hear from you!

Posted 2025-09-22
