Senior Product Manager - AI Observability
About Clockwork Systems
Clockwork.io – Software Driven Fabrics to increase GPU cluster utilization
Clockwork Systems was founded by Stanford researchers and veteran systems engineers who share a vision for redefining the foundations of distributed computing. As AI workloads grow increasingly complex, traditional infrastructure struggles to meet the demands of performance, reliability, and precise coordination. Clockwork is pioneering a software-driven approach to AI fabrics by delivering cross-stack observability to catch and quickly resolve problems, workload fault tolerance to keep jobs running through failures, and performance acceleration that dynamically routes and paces traffic to avoid congestion.
About the Role
As Senior Product Manager for AI Observability, you will lead the product strategy and execution for Clockwork’s cross-stack observability solution, which helps customers detect slow or failing workloads and precisely correlate them with underlying infrastructure issues. You’ll work at the forefront of the emerging AI market, bringing world-first observability technologies to life.
What You Will Do
- Define and drive product strategy and roadmap for Clockwork’s AI Observability portfolio, covering Fleet Audit (pre-flight validation), Fleet Observability (to uncover and solve fabric issues in real time), and AI Workload Observability (to identify workload issues and correlate them to the underlying infrastructure).
- Develop a deep understanding of customer pain points and workflows by working directly with customers, and translate them crisply into compelling, differentiated product requirements.
- Drive rapid end-to-end execution: write PRDs, set priorities, unblock teams, make tradeoffs, and ensure high-quality releases.
- Partner cross-functionally with engineering, sales, and marketing to shape the product, ship reliably, and communicate clear value to technical customers.
- Be the voice of the product internally.
What We’re Looking For
- 7+ years of Product Management experience, with at least some time working in the observability space.
- Strong experience with modern observability stacks: metrics, logs, traces, OpenTelemetry, Prometheus/Grafana. Familiarity with GPU observability tooling (e.g., NVIDIA DCGM, Nsight) and experience with MLOps and LLMOps ecosystems are a plus.
- Strong technical depth in Kubernetes, SLURM, AI training and related components (e.g., PyTorch, NCCL), GPU clusters, and RDMA networking (InfiniBand and RoCE).
- Excellent product leadership: clear writing, crisp tradeoffs, strong prioritization, and the ability to collaborate effectively with highly technical engineering teams.
- Customer empathy and discovery strength: able to identify high-impact pain points and convert them into compelling product strategy and execution.
- A builder mindset that is energized by early-stage products, rapid iteration, customer closeness, and shipping market-changing solutions.
Enjoy
- Challenging projects.
- A friendly and inclusive workplace culture.
- Competitive compensation.
- A great benefits package.
- Catered lunch.
Clockwork Systems is an equal opportunity employer. We are committed to building world-class teams by welcoming bright, passionate individuals from all backgrounds. All qualified applicants will receive consideration for employment without regard to race, color, ancestry, religion, age, sex, sexual orientation, gender identity or expression, national origin, disability, or protected veteran status. We believe diversity drives innovation, and we grow stronger together.