Senior Software Engineer, Observability
About The Role
Together AI is building the AI Acceleration Cloud, an end-to-end platform for the full generative AI lifecycle, combining the fastest LLM inference engine with state-of-the-art AI cloud infrastructure.
The AI Infrastructure team at Together AI is at the forefront of building and scaling the foundational systems that power our generative AI platform. The storage and observability team designs, implements, and maintains robust distributed storage solutions, ensuring seamless data access and management. The team also develops comprehensive observability platforms that provide critical insights into system performance and GPU utilization, and proactively identifies and resolves issues.
Requirements
- 5+ years of demonstrated experience in building large-scale, fault-tolerant distributed systems and API microservices
- Experience designing, analyzing and improving efficiency, scalability, and stability of various system resources
- Excellent communication skills – able to write clear design docs and work effectively with both technical and non-technical team members
- Demonstrated experience with building and operating high-performance and/or globally distributed microservice architectures across one or more cloud providers (AWS, Azure, GCP)
Responsibilities
- Identify, design, and develop foundational backend services that power Together’s cloud platform
- Analyze and improve the robustness and scalability of existing distributed systems, APIs, databases, and infrastructure
- Partner with product teams to understand functional requirements and deliver solutions that meet business needs
- Write clear, well-tested, and maintainable software and IaC for both new and existing systems
- Conduct design and code reviews, create developer documentation, and develop testing strategies for robustness and fault tolerance
- Participate in an on-call rotation to address critical incidents when necessary
About Together AI
Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers on our journey to build the next generation of AI infrastructure.
Compensation
We offer competitive compensation, startup equity, health insurance, and other benefits, as well as flexibility in terms of remote work. The US base salary range for this full-time position is: $160,000 - $260,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.
Equal Opportunity
Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.
Please see our privacy policy at