Principal Data Engineer

Sanas
Palo Alto, CA

We're looking for an experienced and forward-thinking Principal Data Engineer to lead the design and implementation of our end-to-end data infrastructure for industry-leading Voice AI products. This is a high-impact role in which you will shape the technical vision, own strategic architecture decisions, and mentor a growing team of data engineers focused on delivering reliable, scalable data systems for machine learning.

You'll work cross-functionally with AI research scientists, infrastructure, and product teams to ensure that data, from raw audio to training-ready features, is consistently accessible, compliant, and optimized for speed and scale. You'll help push the boundaries of real-time Voice AI!

Key Responsibilities:

  • Architect and lead the development of large-scale data pipelines and data lakes to ingest, transform, and serve high-quality data for AI model training, product telemetry, and analytics.
  • Drive long-term data infrastructure strategy across streaming and batch, feature store extensions, Iceberg/Delta Lake choices, metadata management, and lakehouse evolution.
  • Drive platform and infrastructure decisions, optimizing compute fleets (e.g., Ray, Spark clusters), orchestration tooling (Airflow, Dagster), and streaming stacks (Kafka, Flink).
  • Collaborate with AI research scientists, engineering leads, product, finance, marketing, and legal to align data architecture with business and regulatory requirements.
  • Advocate best practices in data governance, lineage, observability, testing, tooling, and disaster recovery across pipelines and data stores.
  • Act as a mentor and technical leader: review designs and code, share patterns, elevate team capability, and support recruitment and hiring.
  • Drive build-vs-buy decisions for the data quality and observability tooling needed to keep data quality high.

Qualifications:

  • 10+ years of experience in Data Engineering, Infrastructure, or ML Systems, with at least 2 years in a technical leadership capacity.
  • Expertise in building distributed batch and real-time data systems.
  • Expertise in databases (like Postgres) and data lakes (like Snowflake, Databricks, and ClickHouse).
  • Experience using data processing frameworks like Spark, Flink, and Ray.
  • Deep experience with cloud platforms (AWS, GCP), object storage (e.g., S3), and orchestrators like Airflow and Dagster.
  • Strong knowledge of data lifecycle management, including privacy, security, compliance, and reproducibility.
  • Comfortable working in a fast-paced startup environment.
  • Strategic mindset and proven ability to collaborate across engineering, ML and product teams to deliver infrastructure that scales with the business.

Nice to Have:

  • Familiarity with audio data and its unique challenges, such as large file sizes, time-series features, and metadata handling.
  • Experience with Voice AI models like ASR, TTS, and speaker verification.
  • Familiarity with real-time data processing frameworks like Kafka, Flink, Druid, and Pinot.
  • Familiarity with ML workflows, including MLOps, feature engineering, model training, and inference.
  • Experience with labeling tools, audio annotation platforms, or human-in-the-loop annotation pipelines.

Posted 2026-01-13
