Product Manager (Lighthouse)
About Fluidstack
Fluidstack is the AI Cloud Platform. We build GPU supercomputers for top AI labs, governments, and enterprises. Our customers include Mistral, Poolside, Black Forest Labs, Meta, and more.
Our team is small, highly motivated, and focused on providing a world-class supercomputing experience. We put our customers first in everything we do, working hard not just to win the sale but to win repeat business and customer referrals.
We hold ourselves and each other to high standards. We expect you to care deeply about the work you do, the products you build, and the experience our customers have in every interaction with us.
You must work hard, take ownership from inception to delivery, and approach every problem with an open mind and a positive attitude. We value effectiveness, competence, and a growth mindset.
About the Role
We're looking for a Product Manager to lead Lighthouse, our MLOps and observability platform. You'll own the complete product lifecycle—from strategy and roadmap to execution and customer success.
You will work directly with our engineering and infrastructure teams and collaborate closely with customers to ensure that we're providing ML developers with the metrics that matter. You will have the opportunity to partner with top-tier AI labs to increase their utilization and performance, and to scale our infrastructure to hundreds of thousands of GPUs.
Focus
Build and execute the roadmap for Lighthouse.
Partner with engineering to translate customer requirements into technical specifications and guide implementation.
Create alerting rules for GPU cluster health, job failures, and resource bottlenecks.
Design dashboards for ML-specific KPIs (training loss curves, inference latency, batch processing metrics).
Collaborate with sales and customer success teams to drive adoption, gather feedback, and ensure customer satisfaction.
Engage directly with AI labs and enterprises to understand their observability challenges and shape the product roadmap accordingly.
About You
3-5+ years of experience building developer tools or cloud infrastructure, ideally in the observability space.
Deep experience with the LGTM stack (Loki, Grafana, Tempo, Mimir), Alertmanager, or proprietary observability tools such as Datadog.
An understanding of the metrics that matter to an AI/ML customer, including infrastructure availability, performance, and utilization, as well as application-level metrics such as Model FLOPs Utilization (MFU).
Understanding of GPU monitoring tools (DCGM, nvidia-smi, GPU exporters for Prometheus).
Knowledge of Infrastructure-as-Code (IaC) tools (e.g. Terraform, Pulumi) to standardize and simplify the deployment of the observability stack.
Comfortable writing SQL queries.
Understanding of SLA and SLO frameworks and error budget management.
Experience with ML-specific monitoring tools (Weights & Biases, ClearML, etc.).
Benefits
Competitive total compensation package (salary + equity).
Retirement or pension plan, in line with local norms.
Health, dental, and vision insurance.
Generous PTO policy, in line with local norms.