Senior Site Reliability Engineer (SRE)
About favorited
At favorited, we believe that digital communities should be more than just spaces to watch content. Our platform is a place to connect, engage, and play, and it empowers creators by enhancing audience participation and fostering deeper connections.
Our work culture is intense and isn’t for everyone. But if you’re a self-starter eager to shape the future of social interaction with a team that holds itself to the highest standards, this is the place for you. We value open yet respectful communication and real-time feedback to help each other grow quickly. If you’re passionate about gaming and have a knack for gamifying everyday life, you’ll thrive in our fast-moving, collaborative environment.
About the Role
We are looking for a Senior Site Reliability Engineer to help ensure the reliability, scalability, and performance of the infrastructure that powers favorited’s real-time platform. You will play a key role in building and maintaining systems that support high-traffic applications used by a rapidly growing global audience.
This role is ideal for someone who enjoys solving complex infrastructure challenges, improving system reliability, and building automation that allows engineering teams to move quickly and confidently.
Responsibilities
Design, implement, and maintain highly reliable and scalable infrastructure supporting real-time applications.
Build automation and tooling to improve system reliability, deployment processes, and operational efficiency.
Develop and maintain monitoring, logging, and alerting systems to ensure high availability and rapid incident response.
Partner closely with engineering teams to improve service reliability, performance, and observability.
Support incident response, root cause analysis, and postmortems, ensuring learnings are incorporated into system improvements.
Optimize infrastructure for performance, cost efficiency, and scalability.
Manage and scale containerized environments using Docker, Kubernetes, and related orchestration technologies.
Help define and enforce reliability standards, SLOs, and operational best practices across engineering teams.
Continuously evaluate new infrastructure tools and practices to improve system resilience and developer productivity.
What We’re Looking For
6+ years of experience in Site Reliability Engineering, DevOps, or Infrastructure Engineering roles.
Experience managing infrastructure for large-scale systems supporting millions of users.
Strong expertise with cloud infrastructure, ideally Google Cloud Platform (GCP).
Hands-on experience with Kubernetes, container orchestration, and distributed systems.
Experience implementing monitoring and observability systems (Prometheus, Grafana, Datadog, or similar).
Strong scripting or programming experience in languages such as Python, Go, or TypeScript.
Deep understanding of reliability engineering practices including SLOs, SLIs, and incident management.
Strong collaboration skills and ability to work cross-functionally with engineering teams.
Nice to Have:
Experience supporting real-time streaming, gaming, or large-scale consumer applications.
Familiarity with event-driven architectures and large-scale data processing systems.
Experience optimizing infrastructure costs in high-growth environments.
Salary & Benefits
Compensation: $150k - $200k base salary + options.
Benefits Include:
Unlimited PTO to prioritize work-life balance.
401(k) plan to invest in your future.
Comprehensive health insurance to support your well-being.
Paid company holidays for time to recharge.
Competitive salary that values your expertise and contributions.
Where You’ll Work: This is a full-time, on-site position in Santa Monica.