GPU / ML Engineer
Company:
Solving AI inference economics through intelligent orchestration, real-time telemetry & automatic runtime optimization.
Description:
The company is looking for an engineer to support model optimization and inference for large language models, working mainly with Python and NVIDIA GPUs (CUDA).
Tasks:
- Work with NVIDIA GPUs (CUDA) to run and optimize ML workloads
- Apply quantization techniques to LLMs using existing libraries (e.g., GPTQ)
- Integrate and run off-the-shelf tools for model optimization and inference
- Optimize performance of models on modern GPU architectures (e.g., Hopper, Blackwell)
- Collaborate with the team to validate approaches and results
- Quickly prototype and validate technical solutions
Must-have:
- 5+ years of experience in software engineering / ML / GPU-related roles
- Strong hands-on experience with NVIDIA GPUs and CUDA
- Solid Python skills
- Experience working with ML frameworks and running models in production or near-production environments
- Ability to work independently
- Basic educational background in applied mathematics
Nice-to-have:
- Experience with LLM optimization and inference pipelines
- Familiarity with modern GPU architectures (Hopper, Blackwell)
- Experience with quantization techniques (e.g., GPTQ or similar)
- English skills
- Embedded systems or low-level optimization
Benefits:
- Remote, flexible engagement
- Opportunity to expand into a larger role if collaboration is successful
- Work on modern AI / LLM optimization problems
Interview process:
- Intro call with Toughbyte
- First interview with the architect
- Follow-up interview with the company executives (if needed)