Staff Machine Learning Data Engineer

Backflip
San Francisco, CA

About Backflip

Mechanical design, the work done in CAD, is the rate-limiter for progress in the physical world. Yet only 2-4 million people on Earth know how to use CAD. What if hundreds of millions could? What if creating something in the real world were as easy as imagining the use case, or sketching it on paper?

Backflip is building a foundation model for mechanical design: unifying the world’s scattered engineering knowledge into an intelligent, end-to-end design environment. Our goal is to enable anyone to imagine a solution and hit “print.”

Founded by a second-time CEO in the same space (first company: Markforged), Backflip combines deep industry insight with breakthrough AI research. Backed by a16z and NEA, we raised a $30M Series A and built a deeply technical, mission-driven team.

We’re building the AI foundation that tomorrow’s space elevators, nanobots, and spaceships will be built in.

If you’re excited to define the next generation of hard tech, come build it with us.

The Role

We’re looking for a Staff Machine Learning Data Engineer to lead and build the data pipelines powering Backflip’s foundation model for manufacturing and CAD.

You’ll design the systems, tools, and strategies that turn the world’s engineering knowledge - text, geometry, and design intent - into high-quality training data.

This is a core leadership role within the AI team, driving the data architecture, augmentation, and evaluation that underpin our model’s performance and evolution.

You’ll collaborate with Machine Learning Engineers to run data-driven experiments, analyze results, and deliver AI products that shape the future of the physical world.

What You’ll Do

  • Architect and own Backflip’s ML data pipeline, from ingestion to processing to evaluation.

  • Define data strategy: establish best practices for data augmentation, filtering, and sampling at scale.

  • Design scalable data systems for multimodal training (text, geometry, CAD, and more).

  • Develop and automate data collection, curation, and validation workflows.

  • Collaborate with MLEs to design and execute experiments that measure and improve model performance.

  • Build tools and metrics for dataset analysis, monitoring, and quality assurance.

  • Contribute to model development through insights grounded in data, shaping what, how, and when we train.

Who You Are

  • You’ve built and maintained ML data pipelines at scale, ideally for foundation or generative models, that shipped into production in the real world.

  • You have deep experience with data engineering for ML, including distributed systems, ETL (data extraction, transformation, and loading), and large-scale data processing (e.g. PySpark, Beam, Ray, or similar).

  • You’re fluent in Python and experienced with ML frameworks and data formats (Parquet, TFRecord, HuggingFace datasets, etc.).

  • You’ve developed data augmentation, sampling, or curation strategies that improved model performance.

  • You think like both an engineer and an experimentalist: curious, analytical, and grounded in evidence.

  • You collaborate well across AI development, infra, and product, and enjoy building the data systems that make great models possible.

  • You care deeply about data quality, reproducibility, and scalability.

  • You’re excited to help shape the future of AI for physical design.

Bonus points if:

  • You are comfortable working with a variety of complex data formats, e.g. for 3D geometry kernels or rendering engines.

  • You have an interest in math, geometry, topology, rendering, or computational geometry.

  • You’ve worked in 3D printing, CAD, or computer graphics domains.

Why Backflip

This is a rare opportunity to own the data backbone of a frontier foundation model, and help define how AI learns to design the physical world.

You’ll join a world-class, mission-driven team operating at the intersection of research, engineering, and deep product sense, building systems that let people design the physical world as easily as they imagine it.

Your work will directly shape the performance, capability, and impact of Backflip’s foundation model, the core of how the world will build in the future.

Let’s build the tools the future will be made in.

Posted 2025-11-28
