Senior Software Engineer, Reliability

Box
Redwood City, CA

We design high-performance, low-latency, high-throughput services, promote best practices, and engage in architectural design to embed reliability into every layer of our products. We seek your expertise in distributed systems, resilience engineering, and large-scale production operations to identify gaps, design and build solutions, and guide product teams towards building highly available and resilient services. Your work will directly strengthen our SRE strategy, operational excellence, system performance, and reliability culture.

We are seeking innovative problem-solvers passionate about large-scale distributed systems and eager to grow their skills in modern SRE practices. As a small team tackling complex challenges at scale, we offer the opportunity to make significant technical contributions while driving observability culture across the organization.

5+ years of working experience designing, developing, and operating large-scale, customer-facing products or services
Experience coding in higher-level languages (e.g., Java, Scala, Go, Python) is preferred
A strong interest in solving challenging problems using innovative and data-driven approaches
An SRE-centric mindset: you build and manage systems with reliability, scalability, availability, and security as core principles
Experience designing complex systems and frameworks using proven system design principles, such as NALSD (Non-Abstract Large System Design) methodologies
Experience troubleshooting issues across distributed Linux environments, with comfort tracing problems across applications, systems, and networks
Proficient with modern cloud technologies such as GCP, AWS, and Kubernetes
Experienced in service observability practices and tools (e.g., Prometheus, OpenTelemetry, SignalFx, or similar)
Comfortable learning new software, frameworks, and APIs quickly and effectively
Natural collaborator who inspires others, mentors junior engineers, and drives technical excellence
Bonus: Familiarity with PHP/JavaScript/NodeJS

You will be constantly developing automations, frameworks, and tools for better platform reliability, resilience, and availability
You will collaborate with other engineers on the team as well as cross-functionally to foster solid software engineering principles and represent our engineering values
You will participate in various POCs on new projects and frameworks being evaluated for the product and platforms
You will improve our observability as both a developer and maintainer of systems and frameworks, and a mentor to our product development teams
You will work with modern cloud-native technologies including container orchestration (Kubernetes, Docker), service mesh solutions (Istio, Linkerd), and cloud platforms (AWS, GCP)
You will participate in product design reviews and architectural discussions to ensure reliability is considered early in the development lifecycle of products and services
You will participate in a team on-call rotation

Posted 2025-11-28
