CUDA Kernel Optimizer ML Engineer
1) Role Overview
Mercor is engaging advanced CUDA experts who specialize in GPU kernel optimization, performance profiling, and numerical efficiency. These professionals possess a deep mental model of how modern GPU architectures execute deep learning workloads. They are comfortable translating algorithmic concepts into finely tuned kernels that maximize throughput while maintaining correctness and reproducibility.
2) Key Responsibilities
- Develop, tune, and benchmark CUDA kernels for tensor and operator workloads.
- Optimize for occupancy, memory coalescing, instruction-level parallelism, and warp scheduling.
- Profile and diagnose performance bottlenecks using Nsight Systems, Nsight Compute, and comparable tools.
- Report performance metrics, analyze speedups, and propose architectural improvements.
- Collaborate asynchronously with PyTorch Operator Specialists to integrate kernels into production frameworks.
- Produce well-documented, reproducible benchmarks and performance write-ups.
3) Ideal Qualifications
- Deep expertise in CUDA programming, GPU architecture, and memory optimization.
- Proven ability to achieve quantifiable performance improvements across hardware generations.
- Proficiency with mixed precision, Tensor Core usage, and low-level numerical stability considerations.
- Familiarity with frameworks like PyTorch, TensorFlow, or Triton (not required but beneficial).
- Strong communication skills and independent problem-solving ability.
- Demonstrated open-source, research, or performance benchmarking contributions.
4) More About the Opportunity
- Ideal for independent contractors who thrive in performance-critical, systems-level work.
- Engagements focus on measurable, high-impact kernel optimizations and scalability studies.
- Work is fully remote and asynchronous; deliverables are outcome-driven.
- Access to shared benchmarking infrastructure and reproducibility tooling via Mercor support resources.
5) Compensation & Contract Terms
- Typical range: $120-$250/hour, depending on scope, specialization, and results achieved. Payment is based on accepted task output rather than a flat hourly rate.
- Structured as a contract-based engagement, not an employment relationship.
- Compensation tied to measurable deliverables or agreed milestones.
- Confidentiality, IP, and NDA terms as defined per engagement.
6) Application Process
- Submit a brief overview of prior CUDA optimization experience, profiling results, or performance reports.
- Include links to relevant GitHub repos, papers, or benchmarks if available.
- Indicate your hourly rate, time availability, and preferred engagement length.
- Selected experts may complete a small paid pilot kernel optimization project.
7) About Mercor
- Mercor connects domain experts with top AI research and technology organizations through project-based contracts.
- Contractors operate independently with full flexibility over methods, timelines, and tools.
- Our mission is to help top engineers and researchers access frontier technical work without rigid employment structures.