Senior / Staff Site Reliability Engineer, Compute (San Francisco)

Fluidstack
San Francisco, CA

About Fluidstack

Fluidstack is building GPU supercomputers for top AI labs, governments, and enterprises. Our customers include Mistral, Poolside, Black Forest Labs, Meta, and more.

Our team is small, highly motivated, and focused on providing a world-class supercomputing experience. We put our customers first in everything we do, working hard not just to win the sale, but to win repeat business and customer referrals.

We hold ourselves and each other to high standards. We expect you to care deeply about the work you do, the products you build, and the experience our customers have in every interaction with us.

You must work hard, take ownership from inception to delivery, and approach every problem with an open mind and a positive attitude. We value effectiveness, competence, and a growth mindset.

About the Role

Our Senior / Staff Site Reliability Engineers (Compute) are the backbone of Fluidstack's platform. You'll utilise deep systems expertise and software engineering to keep our bare-metal and virtualised compute fleet fast, reliable and cost-efficient at petabyte scale.

Focus

  • Super-charge virtualisation. Tune hypervisors (KVM/QEMU), kernel subsystems and NUMA layouts to squeeze microseconds off tail latency for AI & HPC jobs.

  • Deploy & optimise at scale. Roll out new CPU/GPU/DPU nodes, validate SmartNIC and BlueField off-loads and harden workload isolation.

  • Automate observability. Build kernel-to-orchestrator telemetry, incident-response bots and performance dashboards.

  • Root-cause the gnarly stuff. Lead crash-dump and kexec/kdump analyses and investigate performance regressions; turn insights into upstream patches and config templates.

  • Drive kernel & hardware collaboration. Pair with silicon and Linux teams to debug drivers, accelerate I/O paths and integrate emerging compute hardware (TPUs, DPUs).

  • Continuously improve. Inject chaos, run game-days and codify post-mortem learnings into SLIs/SLOs that matter to customers.

About you

  • 5+ years in compute-heavy SRE, kernel or virtualisation engineering.

  • Mastery of Linux internals (scheduler, memory, drivers) and system-level debugging.

  • Production experience with KVM, Xen, QEMU, VMware or similar hypervisors.

  • Fluency in C, Go or Rust; solid Infra-as-Code & CI/CD chops.

  • Familiarity with SmartNICs/DPUs and kernel-bypass networking.

  • Proven track record scaling high-throughput compute or HPC platforms.

Benefits

  • Competitive total compensation package (cash + equity).

  • Retirement or pension plan, in line with local norms.

  • Health, dental, and vision insurance.

  • Generous PTO policy, in line with local norms.

Posted 2025-08-10
