Backend Software Engineer (ML Infra)

Rockstar
San Francisco, CA

Rockstar is recruiting for a mobile-first digital product studio that turns ideas into extraordinary experiences. They are a team of dynamic and savvy professionals who know how to create killer digital products. Their lean structure and remote team mean they can move fast while still delivering top-notch technology and design.

Our client is building the AI backbone for the next generation of intelligent products. They help fast-growing AI startups design, fine-tune, evaluate, deploy, and maintain specialized models across text, vision, and embeddings.

Think of them as “AWS for AI models”—not data or raw compute, but a full-stack backend for fine-tuning, reinforcement learning, inference, and long-term model maintenance.

Their customers are Series A–C AI companies building enterprise-grade products. Their promise is simple: they make your AI system better.

They are hiring a Backend Software Engineer (ML Infrastructure) to help design, build, and scale the core systems that power large-scale model training and deployment.

The candidate will work on distributed training pipelines, cloud-native infrastructure, and internal developer platforms that support fine-tuning, reinforcement learning, and inference at scale. This role sits at the intersection of backend engineering and ML systems—the candidate will collaborate closely with ML engineers while owning production-grade infrastructure.

This is an ideal role for an early-career engineer who wants to work on real distributed systems, GPU workloads, and modern ML infrastructure—not dashboards or CRUD apps.

What You’ll Do

Build & Scale Core Infrastructure

- Design and implement backend systems that support large-scale ML workloads, including fine-tuning and reinforcement learning.

- Build distributed training and inference pipelines that are efficient, fault-tolerant, and observable.

- Develop internal developer tools and platforms that make it easier for ML engineers to train, evaluate, and deploy models.

Cloud & Systems Engineering

- Work on cloud-native systems using containers and orchestration (e.g., Kubernetes).

- Optimize systems for performance, reliability, and cost efficiency, especially for GPU-heavy workloads.

- Implement monitoring, logging, and observability for long-running training jobs and production services.

Collaborate with ML Engineers

- Partner closely with ML engineers to support evolving model architectures, training workflows, and evaluation needs.

- Translate ML requirements into scalable backend and infrastructure solutions.

Who You Are

Required

- 1–3 years of backend engineering experience, ideally working on production systems.

- Strong fundamentals in distributed systems, networking, and backend architecture.

- Experience building systems that scale under real load.

- Comfortable working in Python and/or Go (or similar backend languages).

- Excited to work on-site in San Francisco with a fast-moving early-stage team.

Strongly Preferred

- Experience with or exposure to ML infrastructure or ML platforms.

- Familiarity with GPU workloads, training pipelines, or inference systems.

- Experience with containerization and orchestration (Docker, Kubernetes).

- Contributions to or deep familiarity with ML infrastructure libraries such as Ray, vLLM, SGLang, or similar distributed ML systems.

Bonus

- Computer science background from a top-tier program or equivalent demonstrated excellence.

- Open-source contributions, research projects, or side projects in systems or ML infrastructure.

- A track record of high ownership and technical curiosity.

Posted 2026-02-04
