Technical Research Engineer (AI Safety & Evaluation) (Santa Clara)
Most public safety benchmarks for frontier models test single-turn refusals. We focus on pushing frontier models in multi-turn, high-pressure conversations. A model that cleanly refuses to write an insecure script on Turn 1 will often fail by Turn 8 under sustained pressure from a frustrated senior developer persona. We call this alignment drift, and as the industry shifts from stateless chatbots to long-horizon autonomous agents, it's one of the most consequential open problems in the field.
What we're building
Atella builds the empirical infrastructure to test AI character and stability under pressure. The company was co-founded by Dr. Roy Perlis (Chair of Psychiatry at Harvard/MGH, Editor of JAMA AI) alongside a team of ML researchers. We build multi-turn, persona-driven adversarial simulation harnesses.
Rather than just prompting models for bad output, we use clinical behavioral science to construct adversarial agents that apply specific psychological pressure over 20+ turns. We then mathematically map the point where a model's safety guardrails collapse, tracking signals like response-length decay, persona sensitivity, and failure-cascade rates.
We run the industry's leading dynamic leaderboards for AI Safety and Code Security, and our data is actively used by safety teams at the frontier labs.
The role
We're hiring a Technical Research Engineer to help scale STELLA, our multi-turn evaluation engine. The work sits at the intersection of ML research, automated red teaming, and serious software engineering.
What you'll do:
- Scale the harness. Build and optimize the infrastructure that runs LLM-driven adversarial personas against frontier models for thousands of turns concurrently.
- Design adaptive attacks. Implement novel automated red-teaming strategies from recent literature — tree-based search, multi-agent debate, dynamic prompt generation — to surface failure modes more efficiently.
- Extract signal from noise. Build analysis pipelines over thousands of raw transcripts: failure-cascade probabilities, behavioral-drift metrics, persona sensitivity scores.
- Publish and open-source. Co-author methodology papers (in the lineage of our recent medRxiv preprints) and ship open-source tooling for the broader AI safety community.
Who you are:
- A strong software engineer. You write clean, scalable Python and are fluent with LLM APIs, async programming, and data pipelines.
- You can read a paper on Constitutional AI or persona modeling, extract the core math or architecture, and have a working implementation shortly after.
- You're intellectually aggressive about breaking things. You care deeply about AI safety but prefer empirical, transcript-level evidence over abstract alignment debates.
- Bonus: experience with RLHF, automated red teaming, or evaluation of long-horizon agentic workflows.
What you get:
- You get an unusually direct look at the failure modes of the world's most advanced AI systems.
- You'll work alongside top clinical scientists from Harvard/MGH and collaborate closely with the safety and red teams at the frontier labs.
Compensation: $250,000–$300,000 base + 0.5%–1% equity