Audio Data Engineer, Speech Cleaning & Pipeline Automation (TTS)
About Us:
Hippocratic AI has developed a safety-focused Large Language Model (LLM) for healthcare. The company believes that a safe LLM can dramatically improve healthcare accessibility and health outcomes worldwide by bringing deep healthcare expertise to every human. No other technology has the potential to have this level of global impact on health.
About the Role
Hippocratic AI is seeking a skilled Audio Data Engineer to help us scale and improve our speech datasets for use in Text-to-Speech (TTS) and speech synthesis systems. In this role, you will clean and enhance real-world audio data, build automation pipelines for processing, and ensure our voice models are trained on the highest quality inputs. This work will directly shape the clarity and expressiveness of the voices used in healthcare AI applications.
Responsibilities
Clean, denoise, and enhance large volumes of recorded speech data for use in TTS and voice synthesis pipelines.
Build and maintain automated audio preprocessing pipelines using scripting tools and open-source libraries.
Apply techniques such as background noise removal, silence trimming, gain normalization, and sample rate conversion.
Integrate tools like ffmpeg, sox, or Python-based scripts (pydub, torchaudio, librosa) into scalable workflows.
Collaborate with ML researchers and speech scientists to deliver high-quality, ready-to-train datasets.
Evaluate audio quality using perceptual and quantitative metrics, and maintain audio QA checklists.
Required Qualifications
Strong experience with speech/audio cleaning using tools such as iZotope RX, Audacity, Adobe Audition, or SoX.
Proficiency in Python and audio-related scripting for automation and batch processing.
Familiarity with digital audio principles, including sample rates, bit depth, frequency bands, and compression artifacts.
Experience designing or operating scalable, automated workflows for handling audio at volume.
Meticulous attention to detail in audio quality control and error spotting.
Nice to Have
Experience working on TTS model pipelines (e.g., Tacotron, VITS, FastSpeech) or speech synthesis datasets.
Background in audio engineering, phonetics, or signal processing.
Familiarity with real-time or low-latency audio processing constraints.
Experience with cloud platforms and tools for automation (e.g., AWS, Airflow, or containerized audio workflows).
Why Join Our Team:
Innovative Mission: We are developing a safe, healthcare-focused large language model (LLM) designed to revolutionize health outcomes on a global scale.
Visionary Leadership: Hippocratic AI was co-founded by CEO Munjal Shah, alongside a group of physicians, hospital administrators, healthcare professionals, and artificial intelligence researchers from leading institutions, including El Camino Health, Johns Hopkins, Stanford, Microsoft, Google, and NVIDIA.
Strategic Investors: We have raised a total of $278 million in funding, backed by top investors such as Andreessen Horowitz, General Catalyst, Kleiner Perkins, NVIDIA’s NVentures, Premji Invest, SV Angel, and six health systems.
World-Class Team: Our team is composed of leading experts in healthcare and artificial intelligence, ensuring our technology is safe, effective, and capable of delivering meaningful improvements to healthcare delivery and outcomes.
For more information, visit .
Our team values in-person collaboration, with on-site presence expected five days a week in Palo Alto, CA.