Senior / Staff Site Reliability Engineer, Compute (San Francisco)
About Fluidstack
Fluidstack is building GPU supercomputers for top AI labs, governments, and enterprises. Our customers include Mistral, Poolside, Black Forest Labs, Meta, and more.
Our team is small, highly motivated, and focused on providing a world-class supercomputing experience. We put our customers first in everything we do, working hard not just to win the sale, but to win repeat business and customer referrals.
We hold ourselves and each other to high standards. We expect you to care deeply about the work you do, the products you build, and the experience our customers have in every interaction with us.
You must work hard, take ownership from inception to delivery, and approach every problem with an open mind and a positive attitude. We value effectiveness, competence, and a growth mindset.
About the Role
Our Senior / Staff Site Reliability Engineers (Compute) are the backbone of Fluidstack's platform. You'll use deep systems expertise and software engineering to keep our bare-metal and virtualised compute fleet fast, reliable and cost-efficient at petabyte scale.
Focus
Supercharge virtualisation. Tune hypervisors (KVM/QEMU), kernel subsystems and NUMA layouts to squeeze microseconds off tail latency for AI & HPC jobs.
Deploy & optimise at scale. Roll out new CPU/GPU/DPU nodes, validate SmartNIC and BlueField off-loads and harden workload isolation.
Automate observability. Build kernel-to-orchestrator telemetry, incident-response bots and performance dashboards.
Root-cause the gnarly stuff. Lead crash-dump and kexec/kdump analyses and performance-regression investigations; turn insights into upstream patches and config templates.
Drive kernel & hardware collaboration. Pair with silicon and Linux teams to debug drivers, accelerate I/O paths and integrate emerging compute hardware (TPUs, DPUs).
Continuously improve. Inject chaos, run game-days and codify post-mortem learnings into SLIs/SLOs that matter to customers.
About You
5+ years in compute-heavy SRE, kernel or virtualisation engineering.
Mastery of Linux internals (scheduler, memory, drivers) and system-level debugging.
Production experience with KVM, Xen, QEMU, VMware or similar hypervisors.
Fluency in C, Go or Rust; solid Infra-as-Code & CI/CD chops.
Familiarity with SmartNICs/DPUs and kernel-bypass networking.
Proven track record scaling high-throughput compute or HPC platforms.
Benefits
Competitive total compensation package (cash + equity).
Retirement or pension plan, in line with local norms.
Health, dental, and vision insurance.
Generous PTO policy, in line with local norms.