Senior Platform Engineer, Observability and AIOps

Synopsys Inc

Sunnyvale, CA

We Are

Synopsys is the leader in engineering solutions from silicon to systems, enabling customers to rapidly innovate AI-powered products. We deliver industry-leading silicon design, IP, simulation and analysis solutions, and design services. We partner closely with our customers across a wide range of industries to maximize their R&D capability and productivity, powering innovation today that ignites the ingenuity of tomorrow.

You Are

You are a strong platform engineer with a passion for building platforms and services that improve how complex infrastructure is observed, understood, and operated. You bring experience developing solutions for observability, automation, and operational intelligence in large-scale enterprise environments. You are comfortable working across software, systems, and operations domains, and you enjoy solving difficult technical problems at scale.

In this role, you will design and develop software and platform capabilities that advance observability, operational analytics, and intelligent automation across Synopsys' infrastructure ecosystem. You will help improve service visibility, accelerate incident response, reduce operational complexity, and increase infrastructure reliability across environments that support critical engineering workloads.

What You'll Be Doing

Design, develop, and enhance software solutions that support observability, operational analytics, and intelligent automation across infrastructure and platform services.
Build scalable, reliable, high-performance systems and services for telemetry collection, processing, searching, correlation, analysis, and visualization.
Develop tools, APIs, and integrations that enhance monitoring, alerting, incident management, and operational workflow automation.
Create software capabilities that improve visibility across operating systems, orchestration platforms, compute infrastructure, storage, networking, cloud services, and business-critical enterprise platforms.
Partner with infrastructure, SRE, platform engineering, and operations teams to identify observability gaps and implement scalable solutions.
Apply Infrastructure as Code practices to deploy, configure, and maintain observability components in a consistent and repeatable way.
Apply data-driven techniques, AI-assisted methods, or intelligent analytics to improve signal quality, anomaly detection, alert prioritization, and root cause analysis.
Document technical designs, implementation patterns, and operating procedures to improve team productivity and efficiency across the organization.

The Impact You Will Have

Enable faster incident response and resolution across a global hybrid-cloud infrastructure environment that supports mission-critical engineering and business workflows.
Reduce operational complexity and alert fatigue by building intelligent systems that surface actionable signals instead of noise.
Improve infrastructure reliability and uptime by making it easier for teams to see, understand, and act on what is happening in real time.
Accelerate troubleshooting and root cause analysis by correlating telemetry across compute, storage, networking, and cloud platforms.
Increase operational efficiency by automating repetitive triage, escalation, and remediation workflows.
Empower SRE and platform teams with better tooling, better visibility, and better data so they can focus on high-value work instead of firefighting.
Contribute to a culture of operational excellence where observability and intelligent automation are first-class engineering priorities.

What You'll Need

8-10 years of experience in software engineering, platform engineering, site reliability engineering, or infrastructure engineering, including substantial experience building observability capabilities.
Proven experience working in large-scale infrastructure environments with thousands of high-performance compute nodes and/or petabyte-scale storage.
Strong hands-on experience designing, implementing, and operating observability platforms using technologies such as Elastic, Grafana, Kafka, Logstash, OpenTelemetry, and Prometheus.
Strong scripting and programming skills in Python, Ruby, or Bash for custom tool and ETL process development.
Solid working knowledge of Linux systems, Kubernetes, and containerized application environments.
Experience with Infrastructure as Code and configuration management tools such as Ansible. Experience with incident management platforms such as ServiceNow, Rootly, or PagerDuty is a plus.
Practical knowledge of AI technologies, including machine learning, generative AI, LLM-based tools, or intelligent analytics, with experience applying them to observability, incident response, automation workflows, or operational decision-making.
Bachelor's or Master's degree in Computer Science, Information Technology, or a related engineering field.

Who You Are

You can walk into a room of SREs drowning in alerts and leave with a tooling plan that changes how they understand the problem, not just how they respond to it.
You do not wait for perfect requirements or a fully defined roadmap; you work with what you have, ask the right questions, and start building.
You think about the person on call when you design a system: if your alerting logic wakes someone up at 3 AM, it had better be for something they can actually fix.
You are comfortable working across domains: you can debug a Kafka pipeline, tune a Grafana query, write a Python exporter, and still understand the operational workflow it all supports.
You have a point of view on what good observability looks like and you push back when a solution is too noisy, too fragile, or too hard to maintain.
You care about making your work reusable and understandable: you document your decisions, write maintainable code, and leave things better than you found them.

The Team You'll Be Part Of

You will join the Observability Platform Services team in the Enterprise Intelligence and Cloud Platform (EI&CP) organization, a team focused on building scalable observability, operational intelligence, and automation capabilities across Synopsys' global infrastructure. This team works closely with infrastructure, SRE, platform engineering, and operations teams to improve service visibility, reduce operational complexity, and increase reliability across hybrid cloud environments, GPU-enabled compute farms, enterprise networking, storage platforms, and business-critical systems.

Posted 2026-06-15
