Machine Learning Engineer, Training Infrastructure

Hedra
San Francisco, CA

About Hedra

Hedra is a pioneering generative media company backed by top investors at Index, A16Z, and Abstract Ventures. We're building Hedra Studio, a multimodal creation platform capable of control, emotion, and creative intelligence.

At the core of Hedra Studio is our Character-3 foundation model, the first omnimodal model in production. Character-3 jointly reasons across image, text, and audio for more intelligent video generation — it’s the next evolution of AI-driven content creation.

At Hedra, we’re a team of hard-working, passionate individuals seeking to fundamentally change content creation and build a generational company together. We value startup energy, initiative, and the ability to turn bold ideas into real products. Our team is fully in-person in SF/NY with a shared love for whiteboard problem-solving.

Overview

We are looking for an ML Engineer with 3+ years of experience in high-performance computing systems to manage and optimize our computational infrastructure for training and deploying our machine learning models. The ideal candidate has diverse experience managing ML workloads at scale and will support our 3DVAE and video diffusion models. We encourage you to apply even if you don't meet every requirement; we value curiosity, creativity, and the drive to solve hard problems.

Responsibilities

  • Design, implement, and maintain scalable computing solutions for training and deploying ML models, ensuring infrastructure can handle large video datasets.

  • Manage and optimize the performance of our computing clusters or cloud instances, such as AWS or Google Cloud, to support distributed training.

  • Ensure that our infrastructure can handle the resource-intensive tasks associated with training large generative models.

  • Monitor system performance and implement improvements to maximize efficiency and utilization, using tools like Airflow for orchestration.

  • Collaborate across research teams to understand their computational needs and provide appropriate solutions, facilitating seamless model deployment.

Qualifications

  • Bachelor’s degree in Computer Science, Information Technology, or a related field, with a focus on system administration.

  • Experience with cloud computing platforms such as Amazon Web Services, Google Cloud, or Microsoft Azure, essential for managing large-scale ML workloads.

  • Values sound engineering processes, including version control and CI/CD.

  • Knowledge of containerization and orchestration technologies like Docker and Kubernetes, required for deployments at scale.

  • Understanding of distributed training techniques and how to scale models across multi-node clusters to meet video generation needs.

  • Strong problem-solving and communication skills, given the need to collaborate with diverse teams.

This role is vital to ensuring that the computational backbone supports the company’s ML efforts, with a focus on deployment and scalability.

Benefits

  • Competitive compensation + equity

  • 401k (no match)

  • Healthcare (Silver PPO Medical, Vision, Dental)

  • Lunch and snacks at the office

Posted 2025-09-22
