Staff Software Engineer

Crusoe
San Francisco, CA

Crusoe is on a mission to accelerate the abundance of energy and intelligence. As the only vertically integrated AI infrastructure company built from the ground up, we own and operate each layer of the stack, from electrons to tokens, to power the world's most ambitious AI workloads. When you join Crusoe, you join a team that is building the future, faster.

We're in the midst of the greatest industrial revolution of our time. The demand for AI compute is boundless, and power is a bottleneck. We're solving that — with an energy-first approach that makes AI infrastructure better for the world and faster for the people innovating with AI.

We're looking for problem-solving, opportunity-finding teammates with a sense of urgency, who believe in the scale of our ambition and thrive on a path not fully paved — people who want to grow their careers alongside a team of experts across energy, manufacturing, data center construction, and cloud services.

If you want to do the most meaningful work of your career, help our customers and partners advance their AI strategies, and be part of a high-performing team that believes in each other, come build with us at Crusoe.

Crusoe’s Data Center Infrastructure Engineering (DCIE) team is fundamental to our mission of providing AI hardware and infrastructure as a service. The team provides infrastructure for Crusoe’s fleet of GPUs and data centers, and sits at the nexus of high-performance computing and AI infrastructure as a service.

The DCIE team owns deployment, maintenance, observability, the critical environment, and automation. The team builds and maintains GPU clusters and develops automation for logical and physical maintenance, provisioning systems, and observability tooling.

About the Role:

We are seeking a highly skilled and motivated Software Engineer to join Crusoe’s Data Center Infrastructure Engineering team. This position is focused on developing software to manage a fleet of GPU servers as well as the data centers that house those systems. The role focuses on developing and implementing advanced diagnostic, observability, automation, and repair tooling for high-performance GPU compute clusters.

The ideal new team member will be a hands-on problem solver who is comfortable working independently. The new team member will play a critical role in maintaining the health and scalability of Crusoe’s rapidly growing GPU fleet.

What You’ll Be Doing:

  • Developing and implementing deep-level diagnostics and troubleshooting of hardware faults within GPU racks and high-density compute systems.

  • Developing troubleshooting and automation tooling for GPU platforms including NVIDIA A100, H200, GB200, B200 and AMD MI350X / MI355X.

  • Developing automation and AI agents for executing component-level diagnosis and remediation for failed or degraded hardware.

  • Developing, in conjunction with data center operations, innovative tooling and AI agents for managing the critical environment.

  • Developing tooling for post-repair validation and testing, using tools such as burn-in, PyTorch, and NVIDIA NCCL, to ensure system stability and performance.

  • Owning the deployment, monitoring, and operational support of developed tooling, ensuring solutions maximize GPU fleet availability and performance to drive customer success.

  • Developing automation and operational tooling for facilities management, power, and direct liquid cooling hardware systems.

What You’ll Bring to the Team:

  • Software engineering experience.

  • The ability to identify a problem, rapidly develop a scalable solution and ship it.

  • Ability to lean in and assist team members working on critical or complex technical initiatives.

  • Ability to set the technical direction for a specific project and execute.

  • Expertise in distributed systems, reliability, and cloud platforms (Kubernetes, IaC, GCP, etc.).

  • Strength in at least one programming language, such as Go, Python, Java, or Rust.

  • Strong analytical and problem-solving skills.

  • Excellent communication and collaboration skills.

  • Ability to work independently and within a team.

Nice to Have:

  • Experience with Temporal and Kubernetes.

  • Experience working directly with hardware vendors.

  • Background in large-scale GPU fleet operations or hyperscale data center environments.

Benefits:

  • Industry-competitive pay

  • Restricted Stock Units in a fast-growing, well-funded technology company

  • Health insurance package options that include HDHP and PPO, vision, and dental for you and your dependents

  • Employer contributions to HSA accounts

  • Paid Parental Leave

  • Paid life insurance, short-term and long-term disability

  • Teladoc

  • 401(k) with a 100% match up to 4% of salary

  • Generous paid time off and holiday schedule

  • Cell phone reimbursement

  • Tuition reimbursement

  • Subscription to the Calm app

  • MetLife Legal

  • Company-paid commuter benefit of $300 per month

Compensation Range

Compensation will be paid in the range of $208,000 - $253,000 + Bonus. Restricted Stock Units are included in all offers. Compensation will be determined by the applicant’s knowledge, education, and abilities, as well as internal equity and alignment with market data.

Crusoe is an Equal Opportunity Employer. Employment decisions are made without regard to race, color, religion, disability, genetic information, pregnancy, citizenship, marital status, sex/gender, sexual preference/orientation, gender identity, age, veteran status, national origin, or any other status protected by law or regulation.

Posted 2026-03-25
