Machine Learning Engineer, Training Infrastructure
Job Title: Machine Learning Engineer, Training Infrastructure
Position Type: Full time
Location: San Francisco, CA, USA
Salary Range: $150,000 - $250,000 (USD)
Job ID#: 158135
We are looking for an ML Engineer with 3+ years of experience in high-performance computing systems to manage and optimize our computational infrastructure for training and deploying our machine learning models. The ideal candidate has diverse experience managing ML workloads at scale and will support our 3DVAE and video diffusion models. We encourage you to apply even if you don't meet every requirement: we value curiosity, creativity, and the drive to solve hard problems.
Responsibilities
Design, implement, and maintain scalable computing solutions for training and deploying ML models, ensuring infrastructure can handle large video datasets.
Manage and optimize the performance of our computing clusters or cloud instances (for example, on AWS or Google Cloud) to support distributed training.
Ensure that our infrastructure can handle the resource-intensive tasks associated with training large generative models.
Monitor system performance and implement improvements to maximize efficiency and utilization, using tools like Airflow for orchestration.
Collaborate across research teams to understand their computational needs and provide appropriate solutions, facilitating seamless model deployment.
Qualifications
Bachelor’s degree in Computer Science, Information Technology, or a related field, with a focus on system administration.
Experience with cloud computing platforms such as Amazon Web Services, Google Cloud, or Microsoft Azure, essential for managing large-scale ML workloads.
This role is vital for ensuring the computational backbone supports the company’s ML efforts, focusing on deployment and scalability.
Values sound engineering processes, including version control and CI/CD.
Knowledge of containerization and orchestration technologies such as Docker and Kubernetes, required for deployments at scale.
Understanding of distributed training techniques and how to scale models across multi-node clusters, in line with video generation needs.
Strong problem-solving and communication skills, given the need to collaborate with diverse teams.
Founded in 2009, IntelliPro is a global leader in talent acquisition and HR solutions. Our commitment to delivering unparalleled service to clients, fostering employee growth, and building enduring partnerships sets us apart. We continue leading global talent solutions with a dynamic presence in over 160 countries, including the USA, China, Canada, Singapore, Japan, Philippines, UK, India, Netherlands, and the EU.
IntelliPro, a global leader connecting individuals with rewarding employment opportunities, is dedicated to understanding your career aspirations. As an Equal Opportunity Employer, IntelliPro values diversity and does not discriminate based on race, color, religion, sex, sexual orientation, gender identity, national origin, age, genetic information, disability, or any other legally protected group status. Moreover, our Inclusivity Commitment emphasizes embracing candidates of all abilities and ensures that our hiring and interview processes accommodate the needs of all applicants. Learn more about our commitment to diversity and inclusivity at .
Compensation: The pay offered to a successful candidate will be determined by various factors, including education, work experience, location, job responsibilities, certifications, and more. Additionally, IntelliPro provides a comprehensive benefits package, all subject to eligibility.