Data Engineer (Founding Team)

Fabrion
San Francisco, CA

Data/ETL Engineer (Founding Team)

Location: San Francisco Bay Area

Type: Full-Time

Compensation: Competitive salary + early-stage equity

Backed by 8VC, we're building a world-class team to tackle one of the industry’s most critical infrastructure problems.

About the Role

We’re building a multi-tenant, AI-native platform where enterprise data becomes actionable through semantic enrichment, intelligent agents, and governed interoperability. At the heart of this architecture lies our Data Fabric — an intelligent, governed layer that turns fragmented and siloed data into a connected ontology ready for model training, vector search, and insight-to-action workflows.

We're looking for engineers who enjoy hard data problems at scale: messy unstructured data, schema drift, multi-source joins, security models, and AI-ready semantic enrichment. You’ll build the backend systems, data pipelines, connector frameworks, and graph-based knowledge models that fuel agentic applications.

If you've worked on streaming pipelines for unstructured data, built connectors into ugly legacy systems, or mapped knowledge graphs that scale, this role will feel like home.

Responsibilities

  • Build highly reliable, scalable data ingestion and transformation pipelines across structured, semi-structured, and unstructured data sources

  • Develop and maintain a connector framework for ingesting from enterprise systems (ERPs, PLMs, CRMs, legacy data stores, email, Excel, docs, etc.)

  • Design and maintain the data fabric layer — including a knowledge graph (Neo4j or Puppygraph) enriched with ontologies, metadata, and relationships

  • Normalize and vectorize data for downstream AI/LLM workflows — enabling retrieval-augmented generation (RAG), summarization, and alerting

  • Create and manage data contracts, access layers, lineage, and governance mechanisms

  • Build and expose secure APIs for downstream services, agents, and users to query enriched semantic data

  • Collaborate with ML/LLM teams to feed high-quality enterprise data into model training and tuning pipelines

What We’re Looking For

Core Experience:

  • 5+ years building large-scale data infrastructure in production environments

  • Deep experience with ingestion frameworks (Kafka, Airbyte, Meltano, Fivetran) and data pipeline orchestration (Airflow, Dagster, Prefect)

  • Comfortable processing unstructured data formats: PDFs, Excel, emails, logs, CSVs, web APIs

  • Experience working with columnar stores, object storage, and lakehouse formats (Iceberg, Delta, Parquet)

  • Strong background in knowledge graphs or semantic modeling (e.g. Neo4j, RDF, Gremlin, Puppygraph)

  • Familiarity with GraphQL, RESTful APIs, and designing developer-friendly data access layers

  • Experience implementing data governance: RBAC, ABAC, data contracts, lineage, data quality checks

Mindset & Culture Fit:

  • You’re a systems thinker: you want to model the real world, not just process it

  • Comfortable navigating ambiguous data models and building from scratch

  • Passionate about enabling AI systems with real-world, messy enterprise data

  • Pragmatic about scalability, observability, and schema evolution

  • Value autonomy, high trust, and meaningful ownership over infrastructure

Bonus Skills

  • Prior work with vector DBs (e.g. Weaviate, Qdrant, Pinecone) and embedding pipelines

  • Experience building or contributing to enterprise connector ecosystems

  • Knowledge of ontology versioning, graph diffing, or semantic schema alignment

  • Familiarity with data fabric patterns (e.g. Palantir Ontology, Linked Data, W3C standards)

  • Familiarity with fine-tuning LLMs or enabling RAG pipelines using enterprise knowledge

  • Experience enforcing data access policy with tools like OPA, Keycloak, or Snowflake row-level security

Why This Role Matters

Agents are only as smart as the data they operate on. This role builds the foundation — the semantic, governed, connected substrate — that makes autonomous decision-making and agent action possible. From factory ERP records to geopolitical news alerts, the data fabric unifies it all.

If you're excited to tame complexity, unify chaos, and power intelligent systems with trusted data — we’d love to hear from you.

Posted 2026-02-22
