Software Engineer (AI Performance)
Gimlet Labs is building the foundation for the next generation of AI applications. As generative AI workloads rapidly scale, inference efficiency is becoming the critical bottleneck. Gimlet is redefining AI inference from the ground up, combining cutting-edge research with an integrated hardware-software stack that delivers breakthrough performance, efficiency, and model quality. Gimlet pairs its inference stack with a seamless developer experience, allowing users to deploy, manage, and monitor AI workloads from frameworks like PyTorch and LangChain at production scale in seconds.
Gimlet was spun out of a Stanford research project under Professors Zain Asgar and Sachin Katti. The founding team has deep experience across AI, distributed systems, and hardware, along with previous successful exits.
Gimlet Labs is seeking a Software Engineer focused on AI Performance. You will be researching and implementing techniques to drive performance and quality optimizations across the latest AI models. You will implement techniques such as quantization, KV caching, and FlashAttention to enable inference efficiency. You will design parallelism strategies to distribute data and workloads across compute nodes at production scale. You will dive deep into GPU code and kernel optimizations to accelerate AI workloads.
Responsibilities:
Evaluating and implementing cutting-edge AI research for model performance and efficiency
Architecting infrastructure for distributed AI workloads across both the software stack and GPU kernel layers
Profiling, benchmarking, and analyzing system performance, identifying bottlenecks and optimization opportunities in execution runtimes targeting various hardware systems
Qualifications:
Bachelor’s degree in computer science, engineering, applied mathematics, or a comparable area of study
Experience with performance optimization
Preferred Qualifications:
Graduate degree in computer science, engineering, applied mathematics, or a comparable area of study
Familiarity with compilers and compiler frameworks such as MLIR
Experience with PyTorch, TensorFlow, vLLM, ONNX and other AI frameworks
Software development experience with Python, C++, and CUDA