Software Engineer (AI Performance)

Gimlet Labs
San Francisco, CA

Gimlet Labs is building the foundation for the next generation of AI applications. As generative AI workloads rapidly scale, inference efficiency is becoming the critical bottleneck. Gimlet is redefining AI inference from the ground up, combining cutting-edge research with an integrated hardware-software stack that delivers breakthrough performance, efficiency, and model quality. Gimlet pairs its inference stack with a seamless developer experience, allowing users to deploy, manage, and monitor AI workloads from frameworks like PyTorch and LangChain at production scale in seconds.

Gimlet was spun out of a Stanford research project under Professors Zain Asgar and Sachin Katti. The founding team has deep experience across AI, distributed systems, and hardware, along with previous successful exits.

Gimlet Labs is seeking a Software Engineer focused on AI Performance. You will research and implement techniques, such as quantization, KV caching, and FlashAttention, that improve the performance, quality, and inference efficiency of the latest AI models. You will design parallelism strategies to distribute data and workloads across compute nodes at production scale, and you will dive deep into GPU code and kernel optimizations to accelerate AI workloads.

Responsibilities:

  • Evaluating and implementing cutting-edge AI research for model performance and efficiency

  • Architecting infrastructure for distributed AI workloads across both the software stack and GPU kernel layers

  • Profiling, benchmarking, and analyzing system performance, identifying bottlenecks and optimization opportunities in execution runtimes targeting various hardware systems

Qualifications:

  • Bachelor’s degree in computer science, engineering, applied mathematics, or a comparable area of study

  • Experience with performance optimization

Preferred Qualifications:

  • Graduate degree in computer science, engineering, applied mathematics, or a comparable area of study

  • Familiarity with compilers and compiler frameworks such as MLIR

  • Experience with PyTorch, TensorFlow, vLLM, ONNX, and other AI frameworks

  • Software development experience with Python, C++, and CUDA

Posted 2025-11-25
