Inference Software Engineer

Etched
San Jose, CA

About Etched

Etched is building AI chips that are hard-coded for individual model architectures. Our first product (Sohu) only supports transformers, but has an order of magnitude more throughput and lower latency than a B200. With Etched ASICs, you can build products that would be impossible with GPUs, like real-time video generation models and extremely deep & parallel chain-of-thought reasoning agents.

Job Summary

Etched’s Inference SW team enables optimal mapping of models to Sohu’s dataflow architecture and serving requests across multiple chips, hosts and racks. We are seeking a highly skilled and motivated engineer to join our team as we work towards enabling Mixture-of-Experts (MoE) architectures on Sohu systems. You’ll build SW enabling frontier inference performance to satisfy exponentially growing serving demand.

This role is for a general contributor, who will be expected to contribute to all parts of our stack. We also have more specialized roles on this team posted on the site.

Key responsibilities

  • Support porting state-of-the-art models to our architecture. Help build programming abstractions and testing capabilities to rapidly iterate on model porting

  • Scale and enhance Sohu’s runtime, including multi-node inference, intra-node execution, state management, and robust error handling

  • Optimize routing and communication layers using Sohu’s collectives

  • Develop tools for performance profiling and debugging, identifying bottlenecks and correctness issues

You may be a good fit if you have

  • Proficiency in Rust and/or C++

  • Good familiarity with PyTorch and/or JAX

  • Good familiarity with transformer architectures

  • Experience porting applications to non-standard or accelerator hardware platforms

  • Solid systems knowledge, including Linux internals, accelerator architectures (e.g., GPUs, TPUs), and high-speed interconnects (e.g., NVLink, InfiniBand)

Strong candidates may also have

  • Experience developing low-latency, high-performance applications using both kernel-level and user-space networking stacks

  • Deep understanding of distributed systems concepts, algorithms, and challenges, including consensus protocols, consistency models, and communication patterns

  • Solid grasp of large language model architectures, particularly Mixture-of-Experts (MoE)

  • Experience analyzing performance traces and logs from distributed systems and ML workloads

  • Experience building applications with extensive SIMD (Single Instruction, Multiple Data) optimizations for performance-critical paths

  • Familiarity with cluster orchestration tools (e.g., Kubernetes, Slurm) and ML platforms (e.g., Ray, Kubeflow)

  • Experience designing and implementing CI/CD pipelines for MLOps workflows

Benefits

  • Full medical, dental, and vision packages, with generous premium coverage

  • Housing subsidy of $2,000/month for those living within walking distance of the office

  • Daily lunch and dinner in our office

  • Relocation support for those moving to West San Jose

Compensation Range

  • $175,000 - $275,000

How we’re different

Etched believes in the Bitter Lesson. We think most of the progress in the AI field has come from using more FLOPs to train and run models, and the best way to get more FLOPs is to build model-specific hardware. Larger and larger training runs encourage companies to consolidate around fewer model architectures, which creates a market for single-model ASICs.

We are a fully in-person team in West San Jose, and greatly value engineering skills. We do not have boundaries between engineering and research, and we expect all of our technical staff to contribute to both as needed.

Posted 2025-11-19
