Sr. Data Engineer, New Venture
At Sanity.io, we’re building the future of AI-powered Content Operations. Our AI Content Operating System gives teams the freedom to model, create, and automate content the way their business works, accelerating digital development and supercharging content operations efficiency. Companies like SKIMS, Figma, Riot Games, Anthropic, COMPLEX, Nordstrom, and Morningbrew are using Sanity to power and automate their content operations.
As part of our new venture, your work will center on addressing one of AI’s toughest problems: how to help machines truly understand and use human-created content. You’ll build systems that structure and enrich large volumes of information to enable AI agents and LLMs to access the right context at the right time. This means designing and developing tools and pipelines that shape, structure, and connect information and content in innovative ways, and creating new methods to ensure AIs reflect the most accurate, authentic, and up-to-date representation of a business, its brand, products, and knowledge base.
As a Senior Data Engineer you'll architect and optimize the data infrastructure that powers our next generation of AI capabilities. You'll be the engine behind our AI systems, building scalable, efficient data pipelines that process massive volumes of content while maintaining low latency and managing costs intelligently. Your work will directly enable AI agents and LLMs to access the right data at the right time. You'll join a small, cross-functional team where your expertise in data engineering and ML infrastructure will be critical to turning ambitious AI concepts into production-ready systems. If you're passionate about building robust data systems that power cutting-edge AI, obsess over performance optimization, and love solving complex scaling challenges, we'd love to have you on the team.
What you will do:
Design, build, and optimize scalable data pipelines for AI and ML workloads, handling large volumes of structured and unstructured content data.
Architect data processing systems that transform, enrich, and prepare content for LLM consumption, with a focus on latency optimization and cost efficiency.
Build ETL/ELT workflows that extract, transform, and load data from diverse sources to support real-time and batch AI operations.
Implement data quality monitoring and observability systems to ensure pipeline reliability and data accuracy for AI models.
Collaborate with engineers and product teams to understand data requirements and design optimal data architectures that support AI features.
Optimize data storage strategies across data lakes, warehouses, and vector databases to balance performance, cost, and scalability.
Build automated data validation and testing frameworks to maintain data integrity throughout the pipeline.
Stay at the forefront of LLM research, understanding model behaviors, limitations, and capabilities to inform system design decisions.
Monitor and optimize pipeline performance, identifying bottlenecks and implementing solutions to improve throughput and reduce latency.
Create clear documentation of data architectures, pipeline logic, and operational procedures.
About you:
Based in the San Francisco Bay Area and able to work at least 2 days per week in our San Francisco office.
5+ years of data engineering experience, with at least 2 years focused on AI/ML data pipelines or supporting machine learning workloads.
High level of proficiency in Python and SQL.
Strong experience with distributed data processing frameworks like Apache Spark, Dask, or Ray.
Proficiency with GCP and its data services.
Experience with real-time data streaming technologies like Kafka, Redpanda or NATS.
Familiarity with vector databases (e.g., Milvus, ElasticSearch, Vespa) and their role in AI applications.
Experience with data modeling, schema design, and working with both relational and NoSQL databases (PostgreSQL, MongoDB, Cassandra).
Strong focus on performance optimization, cost management, and building systems that scale efficiently.
Experience implementing data observability and monitoring solutions (e.g., Prometheus, ClickHouse).
Ability to write clean, well-documented, maintainable code with proper testing practices.
Excellent problem-solving skills and a data-driven approach to decision making.
Strong communication skills and ability to collaborate effectively with cross-functional teams.
Comfortable with ambiguity and excited about working on undefined problems that require creative solutions.
Nice to have: familiarity with data pipeline orchestration tools such as Airflow, Dagster, Prefect, or similar frameworks.
What we can offer:
A highly skilled, inspiring, and supportive team.
Positive, flexible, and trust-based work environment that encourages long-term professional and personal growth.
A global, multicultural team of colleagues and customers.
Comprehensive health plans and perks.
A healthy work-life balance that accommodates individual and family needs.
Competitive salary and stock options program.
Base Salary Range: $210,000 - $265,000 annually. Final compensation within this range will be determined based on the candidate’s experience and skill set.
Sanity.io is a modern, flexible content operating system that replaces rigid legacy content management systems. One of our big differentiators is treating content as data so that it can be stored in a single source of truth, but seamlessly adapted and personalized for any channel without extra effort. Forward-thinking companies choose Sanity because they can create tailored content authoring experiences, customized workflows, and content models that reflect their business.
Sanity recently raised an $85m Series C led by GP Bullhound and is also backed by leading investors like ICONIQ Growth, Threshold Ventures, Heavybit and Shopify, as well as founders of companies like Vercel, WPEngine, Twitter, Mux, Netlify and Heroku. This funding round has put Sanity in a strong position for accelerated growth in the coming years.
You can only build a great company with a great culture. Sanity is a 200+ person company with highly committed and ambitious people. We are pioneers, we exist for our customers, we are hel ved, and we love type two fun! Read more about our values here!
Sanity.io pledges to be an organization that reflects the globally diverse audience that our product serves. We believe that in addition to hiring the best talent, a diversity of perspectives, ideas, and cultures leads to the creation of better products and services. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, or gender identity.