Machine Learning Engineer, Training Infrastructure

Hedra

San Francisco, CA

About Hedra

Hedra is a pioneering generative media company backed by top investors at Index, A16Z, and Abstract Ventures. We're building Hedra Studio, a multimodal creation platform capable of control, emotion, and creative intelligence.

At the core of Hedra Studio is our Character-3 foundation model, the first omnimodal model in production. Character-3 jointly reasons across image, text, and audio for more intelligent video generation — it’s the next evolution of AI-driven content creation.

At Hedra, we’re a team of hard-working, passionate individuals seeking to fundamentally change content creation and build a generational company together. We value startup energy, initiative, and the ability to turn bold ideas into real products. Our team is fully in-person in SF/NY with a shared love for whiteboard problem-solving.

Overview

We are looking for an ML Engineer with 3+ years of experience in high-performance computing systems to manage and optimize our computational infrastructure for training and deploying our machine learning models. The ideal candidate has broad experience managing ML workloads at scale and will support our 3DVAE and video diffusion models. We encourage you to apply even if you don't meet every requirement. We value curiosity, creativity, and the drive to solve hard problems.

Responsibilities

Design, implement, and maintain scalable computing solutions for training and deploying ML models, ensuring infrastructure can handle large video datasets.
Manage and optimize the performance of our computing clusters or cloud instances, such as AWS or Google Cloud, to support distributed training.
Ensure that our infrastructure can handle the resource-intensive tasks associated with training large generative models.
Monitor system performance and implement improvements to maximize efficiency and utilization, using tools like Airflow for orchestration.
Collaborate across research teams to understand their computational needs and provide appropriate solutions, facilitating seamless model deployment.

Qualifications

Bachelor’s degree in Computer Science, Information Technology, or a related field, with a focus on system administration.
Experience with cloud computing platforms such as Amazon Web Services, Google Cloud, or Microsoft Azure, essential for managing large-scale ML workloads.
Values sound engineering processes, including version control and CI/CD.
Knowledge of containerization and orchestration technologies such as Docker and Kubernetes, required for deployments at scale.
Understanding of distributed training techniques and how to scale models across multi-node clusters to meet video generation needs.
Strong problem-solving and communication skills, given the need to collaborate with diverse teams.

This role is vital for ensuring that the computational backbone supports the company’s ML efforts, with a focus on deployment and scalability.

Benefits

Competitive compensation + equity
401k (no match)
Healthcare (Silver PPO Medical, Vision, Dental)
Lunch and snacks at the office

Posted 2025-09-22
