Software Engineer - E5 (Kubernetes)
Who are we?
Founded in 2014 by Khadim Batti and Vara Kumar, Whatfix is a leading global B2B SaaS provider and the largest pure-play enterprise digital adoption platform (DAP). Whatfix empowers companies to maximize the ROI of their digital investments across the application lifecycle, from ideation to training to the deployment of software, by driving user productivity, ensuring process compliance, and improving the user experience of internal and customer-facing applications.
Spearheading the category with serial innovation and unmatched customer-centricity, Whatfix is the only DAP innovating beyond the category, positioning itself as a comprehensive suite for GenAI-powered digital adoption, analytics, and application simulation.
The Whatfix product suite consists of three products: DAP, Product Analytics, and Mirror. The suite helps businesses accelerate ROI on digital investments by streamlining application deployment across the application lifecycle.
Whatfix has seven offices across the US, India, UK, Germany, Singapore, and Australia and a presence across 40+ countries.
Customers: 700+ enterprise customers, including over 80 Fortune 500 companies such as Shell, Microsoft, Schneider Electric, and UPS Supply Chain Solutions.
Investors: Whatfix has raised a total of ~$270 million, most recently a Series E round of $125 million led by Warburg Pincus with participation from existing investor SoftBank Vision Fund 2. Other investors include Cisco Investments, Eight Roads Ventures (a division of Fidelity Investments), Dragoneer Investments, Peak XV Partners, and Stellaris Venture Partners.
- With over 45% YoY sustainable annual recurring revenue (ARR) growth, Whatfix is among the “Top 50 Indian Software Companies” as per G2 Best Software Awards.
- Recognized as a “Leader” in the digital adoption platforms (DAP) category for the past 4+ years by leading analyst firms like Gartner, Forrester, IDC, and Everest Group.
- The only vendor recognized as a Customers’ Choice in the 2024 Gartner® Voice of the Customer for Digital Adoption Platforms, Whatfix earned the Customers’ Choice distinction again in 2025. We also hold a star rating of 4.6 on G2 Crowd and 4.5 on Gartner Peer Insights, along with a CSAT of 99.8%.
- Highest-Ranking DAP on 2023 Deloitte Technology Fast 500™ North America for Fourth Consecutive Year
- Won the Silver for Stevie's Employer of the Year 2023 – Computer Software category and also recognized as Great Place to Work 2022-2023
- Only DAP to be among the top 35% of companies worldwide in sustainability excellence with EcoVadis Bronze Medal
- On the G2 peer review platform, Whatfix has received 77 Leader badges across all market segments, including Small, Medium, and Enterprise, in 2024, among numerous other industry recognitions.
Position Overview: We are looking for a highly skilled and experienced Senior Software Engineer (E5) to join our Site Reliability Engineering team and take end-to-end ownership of large, business-critical features. You’ll design, build, ship, and operate reliable, scalable services; break complex work into actionable tasks for yourself and other engineers; set the technical bar through thoughtful design and rigorous reviews; and mentor teammates while partnering with product, platform, and customer-facing groups to keep our systems fast, observable, and always-on.
Responsibilities:
Scope & Impact:
This role is critical to enhancing the reliability, availability, and overall resilience of Whatfix’s software products. The role owns these non-functional areas and builds automated mechanisms to close gaps in them. These mechanisms should scale to the point where other engineering teams can build their own pipelines to ensure the reliability of the services they own. The role should build a framework that democratizes the approach to improving the observability, recoverability, and self-healing capabilities of products in the Whatfix ecosystem, and that gives other engineering teams visibility into the performance of their microservices.
Ownership:
- Designs and ships scalable platform code that bakes in reliability, fault tolerance, and self-healing for all Whatfix products.
- Owns, designs, and develops frameworks, processes, and architecture that enhance the availability and reliability of the system, eliminating or significantly reducing manual effort (e.g., through self-healing and auto-scaling systems, and platformization).
- Serves as a first responder for critical software issues within the team’s domain.
- Prioritizes and takes ownership of unowned or complex tasks that enable the team to move faster.
- Ensures that customer issues are not just fixed but that effective long-term solutions are implemented to prevent recurrence.
Technical Execution:
- Own task breakdown from stories/features, ensuring each task is feasible within five days
- Write detailed design documents for the features being worked on
- Implement well-tested and documented code based on engineering standards and best practices
- Own and support the team’s features to ensure high availability and compliance
- Review designs and code written by peers and other teams for testability, maintainability, reliability, security, and cost.
- Work with other teams to improve the developer experience by enhancing developer tools, and suggest and implement AI workflows for observability, availability, and reliability
- Demonstrate expertise in one or more technical areas and contribute to the overall technical direction of the team.
Skillset:
Observability and Alertability of Infrastructure:
The candidate should have proven experience in:
- Increasing the observability of software systems
- Managing infrastructure in an automated manner (using automated CI/CD pipelines and IaC frameworks)
- Identifying gaps in monitoring and observability and fixing them in a sustainable, scalable, and automated manner.
- Defining SLAs for systems, continuously tracking them, and improving them over time
- Resilience engineering practices: driving blameless post-incident RCAs and converting findings into code, tests, and platform improvements
Collaboration & Guidance:
The candidate should have experience in:
- Working with other teams to help enhance the observability and recoverability (such as through self-healing) of those teams’ features
- Conducting training sessions or workshops on observability and reliability practices.
- Providing guidance on best practices for monitoring, alerting, and logging.
Required Technical Skills and Qualifications:
Candidates should have experience in the following technologies:
- Strong experience in Java.
- Working experience in Kubernetes, Helm, ArgoCD
- Ability to work with Java and Python based applications and identify gaps that could result in failures.
- Familiarity with CI/CD pipelines and infrastructure as code (IaC) practices.
Preferred Skills:
- Familiarity with log aggregation tools (e.g., ELK Stack).
- Knowledge of Chaos Engineering principles.
Soft Skills:
- Strong problem-solving and troubleshooting abilities.
- Excellent communication and collaboration skills.
- Ability to mentor and guide cross-functional teams.
Perks / Benefits
- Uncapped incentives
- Equity plan
- Mac shop, work with the newest technologies
- Unlimited PTO policy
- Paid maternity/paternity leave
- Monthly cell phone stipend
- Daily paid Uber Eats lunches
- Medical, Dental, and Vision coverage (Whatfix pays 80% of the premium for individuals and their families; for the HSA, Whatfix contributes $1,000 for individuals and $2,000 for a family)
- Team and company outings
- Learning and Development benefits
At Whatfix, we value collaboration, innovation, and human connection. We believe that working together in the office five days a week fosters open communication, strengthens our community, and drives innovation, helping us achieve our goals more effectively.
To facilitate global collaboration, our US teams start and end early, while our India teams start and end late. US teams do not have any evening meetings. Relocation and Sponsorship offered.
We strive to live and breathe our Cultural Principles and encourage employees to demonstrate some of these core values - Customer First; Empathy; Transparency; Fail Fast and scale Fast; No Hierarchies for Communication; Deep Dive and innovate; Trust is the foundation; and Do it as you own it.
Whatfix is an Equal Opportunity Employer and an E-Verify participant. All activities must comply with our Equal Opportunity Laws, ADA, and other regulations, as appropriate.
We are an equal opportunity employer and value diverse people because of, not in spite of, their differences. We do not discriminate on the basis of race, religion, color, national origin, ethnicity, gender, sexual orientation, age, marital status, veteran status, or disability status.
The salary range for this position is 130K-170K OTE. Compensation will be determined by factors such as level, job-related knowledge, skills, and experience.
Due to our company's global nature and our hiring committee's span of different time zones, the interviews for this role will be recorded for those not in attendance to review.