GPU Kernel Engineer
Sciforium is an AI infrastructure company developing next-generation multimodal AI models and a proprietary, high-efficiency serving platform. Backed by multi-million-dollar funding and direct sponsorship from AMD, with hands-on support from AMD engineers, the team is scaling rapidly to build the full stack powering frontier AI models and real-time applications.
About the role
We are seeking a highly skilled GPU Kernel Engineer who is passionate about pushing the limits of performance on modern accelerators. In this role, you will design and optimize custom GPU kernels that power next-generation large-scale AI systems. You will work across the hardware–software stack, from low-level kernel development to integrating optimized ops into high-level ML frameworks used for large-scale training and inference.
Key Responsibilities
- Design, implement, and optimize custom GPU kernels using C++, PTX, CUDA, ROCm, Triton, and/or JAX Pallas.
- Profile and optimize end-to-end performance of ML operations, with a focus on large-scale LLM training and inference.
- Integrate low-level GPU kernels into frameworks such as PyTorch, JAX, and custom internal runtimes.
- Develop performance models, identify bottlenecks, and deliver kernel-level improvements that significantly accelerate AI workloads.
- Collaborate with ML researchers, distributed systems engineers, and model-serving teams to optimize compute performance across the stack.
- Work closely with hardware vendors (NVIDIA/AMD) and stay current on the latest GPU architecture capabilities and compiler/toolchain improvements.
- Contribute to tooling, documentation, benchmarking suites, and testing frameworks to ensure correctness and performance reproducibility.
Must‑Haves
- 5+ years of industry or research experience in GPU kernel development or high-performance computing.
- Bachelor’s, Master’s, or PhD in Computer Science, Computer Engineering, Electrical Engineering, Applied Mathematics, or a related field.
- Strong programming skills in C++ and Python, and familiarity with ML frameworks.
- Deep expertise in CUDA/ROCm, GPU memory models, and performance optimization strategies.
- Hands‑on experience with Triton and/or JAX Pallas for custom kernel development.
- Strong understanding of PTX, GPU ASM, and low-level GPU execution.
- Extensive experience writing and optimizing custom GPU kernels in C++ and PTX.
- Proven ability to integrate low-level kernels into PyTorch, JAX, or similar frameworks.
- Experience working with large-scale LLM workloads (training or inference).
Nice‑to-Haves
- Experience with AMD GPUs and ROCm optimization.
- Familiarity with JAX FFI and custom ML operator development.
- Experience with efficient model serving frameworks (e.g., vLLM, TensorRT).
- Experience with TPUs, XLA, or similar accelerator programming environments.
- Contributions to open‑source ML systems, compilers, or GPU kernels.
Benefits include
- Medical, dental, and vision insurance
- 401(k) plan
- Daily lunch, snacks, and beverages
- Flexible time off
- Competitive salary and equity
Equal opportunity
Sciforium is an equal opportunity employer. All applicants will be considered for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, veteran status, or disability status.