Applied Scientist / Research Engineer — Speech AI

GTN Technical Staffing
San Francisco, CA

HIGHLIGHTS

Location: San Francisco, CA or Remote

Position Type: Direct Hire

Salary: Based on Experience

Residency Status: US Citizen or Green Card Holder Only

We are looking for a senior technical contributor to help develop the next generation of real-time speech and conversational AI systems. This person will work across applied research, model development, training infrastructure, evaluation, and production deployment.

The role is ideal for someone who has deep experience with modern machine learning for audio, speech, and language, and who enjoys moving beyond prototypes into systems that must perform reliably in live environments. You will work closely with engineering and product teams to improve model quality, speed, reliability, and scalability for speech-driven user experiences.

This is a hands-on position. You should be comfortable experimenting with new model architectures, training and evaluating large models, improving inference performance, and translating research results into production-ready capabilities.

Responsibilities

Develop High-Quality Speech Generation Systems

  • Build and improve machine learning models for natural, expressive speech generation.
  • Work on controllability, speaker consistency, pacing, tone, and conversational timing.
  • Explore model architectures that improve output quality while keeping latency low.
  • Improve performance for real-time use cases where responsiveness and reliability matter.
  • Partner with infrastructure and product teams to move successful approaches into production.

Improve Speech Understanding and Recognition

  • Train, adapt, and evaluate models that convert speech into accurate, usable text.
  • Improve recognition quality across varied speakers, accents, noisy conditions, phone-quality audio, interruptions, and mixed-language conversations.
  • Use large-scale audio data, weak labels, self-supervised methods, and targeted fine-tuning strategies.
  • Improve downstream usefulness of transcripts for conversation analysis, structured output, and intent understanding.

Advance Audio Representation and Compression Methods

  • Research and implement model components that represent speech efficiently and preserve perceptual quality.
  • Explore learned audio representations that support generation, recognition, and efficient deployment.
  • Evaluate different approaches for balancing quality, speed, compute cost, and scalability.
  • Build systems that can support both experimentation and production use.

Build Training and Evaluation Workflows

  • Create reliable pipelines for collecting, cleaning, processing, and evaluating speech data.
  • Design evaluation methods that combine automated metrics, model diagnostics, and human quality review.
  • Support large-scale training jobs across modern accelerator infrastructure.
  • Improve throughput, reproducibility, monitoring, and cost efficiency of model development workflows.

Run Rigorous Experiments

  • Design controlled experiments to test model, data, and training improvements.
  • Compare approaches using clear benchmarks and production-relevant quality measures.
  • Move quickly from hypothesis to implementation, measurement, and iteration.
  • Communicate results clearly to research, engineering, and product stakeholders.

What We’re Looking For

Machine Learning Depth

  • Strong background in modern machine learning, especially speech, audio, language, generative modeling, multimodal systems, or large-scale model training.
  • Ability to implement new model ideas efficiently and evaluate them with technical rigor.
  • Strong understanding of current techniques used in speech and language systems.

Speech and Audio Experience

  • Practical experience building or improving systems for speech generation, speech recognition, audio modeling, or related areas.
  • Experience working with large and varied audio datasets.
  • Strong judgment around speech quality, naturalness, latency, robustness, and user-facing model behavior.
  • Familiarity with real-world audio issues such as background noise, channel quality, interruptions, speaker variation, and conversational dynamics.

Production Awareness

  • Experience training, deploying, or serving large models on modern compute infrastructure.
  • Understanding of practical inference constraints, including latency, memory use, throughput, quantization, and serving efficiency.
  • Comfort working with systems that need to operate reliably in live, low-latency environments.

Experimental Discipline

  • Experience designing benchmarks, ablation studies, data experiments, and quality evaluations.
  • Ability to use both offline metrics and live product signals to guide model decisions.
  • Strong technical judgment when deciding which ideas are worth scaling and which should be abandoned.

Ownership and Execution

  • Comfortable working in a fast-moving technical environment with ambiguous problems.
  • Able to own projects from early exploration through deployment.
  • Strong collaboration skills across research, engineering, infrastructure, and product.
  • High standards for model quality, reliability, and operational performance.

Ideal Background

The strongest candidate will have experience building speech or audio AI systems that are both technically advanced and practical to deploy. You should be motivated by improving natural conversation, real-time responsiveness, model robustness, and scalable production performance.

You may come from a research lab, AI startup, speech technology company, communications platform, infrastructure team, or another environment where speech models were trained, evaluated, deployed, and improved at scale.

Nice to Have

  • Experience with distributed training for large models.
  • Publications, patents, or open-source work in speech, audio, language, or machine learning.
  • Experience with real-time communication, streaming audio, contact-center technology, or other latency-sensitive speech products.
  • PhD in Machine Learning, Computer Science, Electrical Engineering, Artificial Intelligence, or a related field; equivalent research or industry impact is also acceptable.

Benefits

  • Competitive health, dental, and vision coverage.
  • Equity participation.
  • Access to modern compute, tooling, and research resources.
  • Opportunity to work on advanced speech AI systems used in production environments.

"We are GTN – The Go To Network"

Posted 2026-06-27
