Senior Software Engineer, Machine Learning Infrastructure
- Build a model serving platform for efficient large-scale simulations and reinforcement learning (RL) training.
- Collaborate with ML researchers to optimize large distributed processes like PyTorch ML training and latency-critical processes like an on-edge ML runtime.
- Maintain observability and monitoring for critical services like ML training, data dumping, and deployment.
- Implement tools to track the model development lifecycle for an efficient deployment and evaluation process.
- 2+ years of relevant work experience, or equivalent experience from a Master's/PhD program with 2+ years of relevant experience.
- Strong coding skills in Python or C++.
- You have relevant exposure to the ML development life cycle and ML models.
- You have experience in building a cloud-based distributed training platform.
- You have experience profiling and optimizing performance bottlenecks for deep learning models.
- Experience with model- and data-parallel training frameworks like PyTorch FSDP.
- Experience with ML optimization techniques like quantization, distillation, or pruning.
- Experience with ML compiler frameworks like XLA, PyTorch compiler, or MLIR.
- Experience with on-device ML deployment frameworks like ONNX, TensorRT, or NVIDIA DriveOS.