Software Engineer, Observability (Backend)

Anyscale
San Francisco, CA

About Anyscale:

At Anyscale, we're on a mission to democratize distributed computing and make it accessible to software developers of all skill levels. We’re commercializing Ray, a popular open-source project that's creating an ecosystem of libraries for scalable machine learning. Companies like OpenAI, Uber, Spotify, Instacart, Cruise, and many more have Ray in their tech stacks to accelerate the progress of AI applications out into the real world.

With Anyscale, we’re building the best place to run Ray, so that any developer or data scientist can scale an ML application from their laptop to the cluster without needing to be a distributed systems expert.

Proud to be backed by Andreessen Horowitz, NEA, and Addition with $250+ million raised to date.

About the role

We are seeking a Backend Software Engineer to join our team focused on building user-facing application features for the Anyscale AI platform. The role involves interacting with users, understanding their requirements, designing and implementing features, and finally maintaining and improving these features over time. The backend of the platform generally deals with implementing the core business logic of these features.

About the team

The Workspace & Observability Team is dedicated to empowering clients to create robust AI applications using our powerful platform built on Ray. We are a collaborative group of experts committed to providing bespoke monitoring tools and integrations that enhance the development lifecycle. In particular, these tools accelerate the process of writing, debugging, deploying, and monitoring AI applications.

Observability in a distributed cluster means dealing with a ton of data, and there are many interesting problems to solve around how to ingest, aggregate, format, and ultimately present that data to our users in a digestible way. With Ray and Anyscale, we have the opportunity to provide great tools out of the box for our users. Join us in shaping the future of AI application development!

A snapshot of projects you may work on
  • The Ray Dashboard observability tool, which gives users insight into their Ray application, including what code is running on which machine, how much data is being moved between machines, and the hardware utilization of each machine.
  • Library-specific observability tools like the Ray Train dashboard or Ray Serve dashboard, which accelerate our users' ability to develop distributed training or model serving applications.
  • Unified log viewer, a tool that ingests logs across a Ray cluster and lets users query those logs in meaningful ways, such as by function name, log level, timestamp, or machine.
  • Anomaly detection: the ability for the Anyscale platform to automatically detect performance bottlenecks or bugs in our users' workloads and suggest fixes or automatically resolve these issues.
  • Work with a team of leading distributed systems and machine learning experts.
  • Communicate your work to a broader audience through talks, tutorials, and blog posts.
  • Help us to build and shape a world class company.

We'd love to hear from you if you have
  • Proficiency in backend or full stack development, including experience with web API frameworks and databases.
  • Proficiency in Python or an ability to quickly learn new programming languages.
  • Good understanding of AI and machine learning concepts.
  • Experience with observability tools and monitoring solutions (e.g., Datadog, Splunk, AWS CloudWatch).
  • Familiarity with Ray or similar distributed systems frameworks.
  • Solid background in debugging, architecture design, and coding.
  • Excellent problem-solving skills and a collaborative mindset.
  • Passion for building tools that enhance user experience and optimize workflows.

Compensation
  • At Anyscale, we take a market-based approach to compensation. We are data-driven, transparent, and consistent. The target salary for this role is $202,000 ~ $237,000. As the market data changes over time, the target salary for this role may be adjusted.

This role is also eligible to participate in Anyscale's Equity and Benefits offerings, including the following:

  • Stock Options
  • Healthcare plans, with premiums covered by Anyscale at 99%
  • 401k Retirement Plan
  • Wellness stipend
  • Education stipend
  • Paid Parental Leave
  • Fertility Benefits
  • Flexible Time Off
  • Commute reimbursement
  • 100% of in-office meals covered

Anyscale Inc. is an Equal Opportunity Employer. Candidates are evaluated without regard to age, race, color, religion, sex, disability, national origin, sexual orientation, veteran status, or any other characteristic protected by federal or state law.

Anyscale Inc. is an E-Verify company, and you may review the Notice of E-Verify Participation and the Right to Work posters in English and Spanish.

Posted 2025-08-20
