Software Engineer (AI Performance)

Gimlet Labs
San Francisco, CA

Gimlet Labs is building the foundation for the next generation of AI applications. As generative AI workloads rapidly scale, inference efficiency is becoming the critical bottleneck. Gimlet is redefining AI inference from the ground up, combining cutting-edge research with an integrated hardware-software stack that delivers breakthrough performance, efficiency, and model quality. Gimlet pairs its inference stack with a seamless developer experience, allowing users to deploy, manage, and monitor AI workloads from frameworks like PyTorch and LangChain at production scale in seconds.

Gimlet is spun out of a Stanford research project under Professors Zain Asgar and Sachin Katti. The founding team has deep experience across AI, distributed systems, and hardware, with previous successful exits.

Gimlet Labs is seeking a Software Engineer focused on AI Performance. You will research and implement techniques, such as quantization, KV caching, and FlashAttention, that improve the performance, quality, and inference efficiency of the latest AI models. You will design parallelism strategies to distribute data and workloads across compute nodes at production scale. You will dive deep into GPU code and kernel optimizations to accelerate AI workloads.

Responsibilities:

  • Evaluating and implementing cutting-edge AI research for model performance and efficiency

  • Architecting infrastructure for distributed AI workloads across both the software stack and GPU kernel layers

  • Profiling, benchmarking, and analyzing system performance, identifying bottlenecks and optimization opportunities in execution runtimes targeting various hardware systems

Qualifications:

  • Bachelor’s degree in computer science, engineering, applied mathematics, or a comparable area of study

  • Experience with performance optimization

Preferred Qualifications:

  • Graduate degree in computer science, engineering, applied mathematics, or a comparable area of study

  • Familiarity with compilers and compiler frameworks such as MLIR

  • Experience with PyTorch, TensorFlow, vLLM, ONNX, and other AI frameworks

  • Software development experience with Python, C++, and CUDA

Posted 2025-10-31
