Engineering Manager, Fleet Clusters

OpenAI
San Francisco, CA

About the Team

Our team runs the GPU fleet that serves the models backing ChatGPT and the API. We build automation to provision and manage one of the largest cutting-edge GPU inference fleets in the world, exposing it as a single platform on which other OpenAI teams can seamlessly run production applied AI workloads.

We seek to learn from deployment and distribute the benefits of AI, while ensuring that this powerful tool is used responsibly and safely. Safety is more important to us than unfettered growth.

About the Role

We are looking for an experienced engineering manager to help lead our Fleet Clusters team. You’ll be responsible for building, scaling, and operating the massive GPU fleet clusters that power AI inference and general-purpose training at OpenAI. This role focuses on designing and managing large-scale, high-availability GPU clusters across multiple environments, ensuring reliability, scalability, and efficiency. You will partner closely with product, research, and infrastructure teams to rapidly ship and support advanced AI products at global scale.

In this role, you will:

  • Manage and build a team of high-performing infrastructure engineers

  • Guide the automation roadmap for a fleet that can grow by an order of magnitude or more

  • Build a world-class, secure compute fleet that serves users at scale

  • Set technical direction on evolving our compute and abstractions to support a growing business

  • Collaborate closely with a broad set of stakeholders, including product engineering, inference, security, research and finance

  • Work with external partners to unlock bleeding-edge compute and make it available as a turnkey resource for scheduling workloads

  • Coach and nurture engineers to accelerate their growth and learning

You might thrive in this role if you have:

  • 10+ years of experience in infrastructure software engineering, including 5+ years in engineering management.

  • Proven track record of building high-performance computing infrastructure teams at scale.

  • Hands-on experience provisioning bare-metal server data centers interconnected across WANs.

  • Experience designing and operating hybrid-cloud platforms.

  • Ownership mentality: willing to pick up new skills and knowledge to solve problems end-to-end. Comfortable being hands-on when needed to help debug systems and support the team.

  • Ability to operate effectively in fast-paced environments with loosely defined priorities and competing deadlines.

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity. 

We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic.

For additional information, please see OpenAI’s Affirmative Action and Equal Employment Opportunity Policy Statement.

Qualified applicants with arrest or conviction records will be considered for employment in accordance with applicable law, including the San Francisco Fair Chance Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act. For unincorporated Los Angeles County workers: we reasonably believe that criminal history may have a direct, adverse and negative relationship with the following job duties, potentially resulting in the withdrawal of a conditional offer of employment: protect computer hardware entrusted to you from theft, loss or damage; return all computer hardware in your possession (including the data contained therein) upon termination of employment or end of assignment; and maintain the confidentiality of proprietary, confidential, and non-public information. In addition, job duties require access to secure and protected information technology systems and related data security obligations.

To notify OpenAI that you believe this job posting is non-compliant, please submit a report through this form. No response will be provided to inquiries unrelated to job posting compliance.

We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.

At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.

Compensation

$300K – $450K + equity

Posted 2025-07-31
