Staff Software Engineer, ML Performance & Systems

Fal
San Francisco, CA

Key Responsibilities:

  • Help fal maintain its frontier position on model performance for generative media models.
  • Design and implement novel approaches to model serving architecture on top of our in-house inference engine, focusing on maximizing throughput while minimizing latency and resource usage.
  • Develop performance monitoring and profiling tools to identify bottlenecks and optimization opportunities.
  • Work closely with our Applied ML team and customers (frontier labs in the media space) to make sure their workloads benefit from our accelerator.

Requirements:

  • Strong foundation in systems programming, with expertise in identifying and fixing bottlenecks.
  • Deep understanding of the cutting-edge ML infrastructure stack (anything from PyTorch, TensorRT, and TransformerEngine to Nsight), including model compilation, quantization, and serving architectures. Ideally, you closely follow developments in these systems as they happen.
  • A fundamental understanding of the underlying hardware (NVIDIA-based systems at the moment) and, when necessary, the ability to go deeper into the stack to fix bottlenecks (for example, custom GEMM kernels with CUTLASS for common shapes).
  • Proficiency in Triton, or willingness to learn it backed by comparable experience in lower-level accelerator programming.
  • New frontier: multi-dimensional model parallelism (combining multiple parallelism techniques, such as tensor parallelism (TP) with context parallelism or sequence parallelism).
  • Familiarity with the internals of Ring Attention, FlashAttention-3 (FA3), and FusedMLP implementations.

What we offer at fal:

  • Interesting and challenging work
  • Competitive salary and equity
  • Employee-friendly equity terms (early exercise, extended exercise)
  • Many learning and growth opportunities
  • Visa sponsorship and help relocating to San Francisco
  • Health, dental, and vision insurance (US)
  • Regular team events and offsites

Compensation:

  • $180,000 - $250,000 + equity + comprehensive benefits package

Location:

  • We are currently hiring in downtown San Francisco.

Posted 2025-12-19
