Staff Machine Learning Engineer, LLM Fine Tuning (Verilog/RTL Applications)
HIGHLIGHTS
Location: San Jose, CA (Onsite/Hybrid)
Schedule: Full Time
Position Type: Contract
Hourly: BOE
Overview:
Our client is building privacy-preserving LLM capabilities that help hardware design teams reason over Verilog/SystemVerilog and RTL artifacts: code generation, refactoring, lint explanation, constraint translation, and spec-to-RTL assistance. Our client is looking for a Staff-level engineer to technically lead a small, high-leverage team that fine-tunes and productizes LLMs for these workflows in a strict enterprise data-privacy environment.
You don't need to be a Verilog/RTL expert to start; curiosity, drive, and deep LLM craftsmanship matter most. Any HDL/EDA fluency is a strong plus.
What you'll do (Responsibilities)
Own the technical roadmap for Verilog/RTL-focused LLM capabilities, from model selection and adaptation to evaluation, deployment, and continuous improvement.
Lead a hands-on team of applied scientists/engineers: set direction, unblock technically, review designs/code, and raise the bar on experimentation velocity and reliability.
Fine-tune and customize models using state-of-the-art techniques (LoRA/QLoRA, PEFT, instruction tuning, preference optimization/RLAIF) with robust HDL-specific evals:
Compile-/lint-/simulate-based pass rates, pass@k for code generation, constrained decoding to enforce syntax, and "does-it-synthesize" checks.
Design privacy-first ML pipelines on AWS:
Training/customization and hosting using Amazon Bedrock (including Anthropic models) where appropriate; SageMaker (or EKS + KServe/Triton/DJL) for bespoke training needs.
Artifacts in S3 with KMS CMKs; isolated VPC subnets & PrivateLink (including Bedrock VPC endpoints), IAM least-privilege, CloudTrail auditing, and Secrets Manager for credentials.
Enforce encryption in transit and at rest, data minimization, and no public egress for customer/RTL corpora.
Stand up dependable model serving: Bedrock model invocation where it fits, and/or low-latency self-hosted inference (vLLM/TensorRT-LLM), autoscaling, and canary/blue-green rollouts.
Build an evaluation culture: automatic regression suites that run HDL compilers/simulators, measure behavioral fidelity, and detect hallucinations/constraint violations; model cards and experiment tracking (MLflow/Weights & Biases).
Partner deeply with hardware design, CAD/EDA, Security, and Legal to source/prepare datasets (anonymization, redaction, licensing), define acceptance gates, and meet compliance requirements.
Drive productization: integrate LLMs with internal developer tools (IDEs/plug-ins, code review bots, CI), retrieval (RAG) over internal HDL repos/specs, and safe tool-use/function-calling.
Mentor & uplevel: coach ICs on LLM best practices, reproducible training, critical paper reading, and building secure-by-default systems.
What you'll bring (Minimum qualifications)
10+ years total engineering experience with 5+ years in ML/AI or large-scale distributed systems; 3+ years working directly with transformers/LLMs.
Proven track record shipping LLM-powered features in production and leading ambiguous, cross-functional initiatives at Staff level.
Deep hands-on skill with PyTorch, Hugging Face Transformers/PEFT/TRL, distributed training (DeepSpeed/FSDP), quantization-aware fine-tuning (LoRA/QLoRA), and constrained/grammar-guided decoding.
AWS expertise to design and defend secure enterprise deployments, including:
Amazon Bedrock (model selection, Anthropic model usage, model customization, Guardrails, Knowledge Bases, Bedrock runtime APIs, VPC endpoints)
SageMaker (Training, Inference, Pipelines), S3, EC2/EKS/ECR, VPC/Subnets/Security Groups, IAM, KMS, PrivateLink, CloudWatch/CloudTrail, Step Functions, Batch, Secrets Manager.
Strong software engineering fundamentals: testing, CI/CD, observability, performance tuning; Python a must (bonus for Go/Java/C++).
Demonstrated ability to set technical vision and influence across teams; excellent written and verbal communication for execs and engineers.
Nice to have (Preferred qualifications)
Familiarity with Verilog/SystemVerilog/RTL workflows: lint, synthesis, timing closure, simulation, formal, test benches, and EDA tools (Synopsys/Cadence/Mentor).
Experience integrating static analysis/AST-aware tokenization for code models or grammar-constrained decoding.
RAG at scale over code/specs (vector stores, chunking strategies), tool-use/function-calling for code transformation.
Inference optimization: TensorRT-LLM, KV-cache optimization, speculative decoding; throughput/latency trade-offs at batch and token levels.
Model governance/safety in the enterprise: model cards, red-teaming, secure eval data handling; exposure to SOC2/ISO 27001/NIST frameworks.
Data anonymization, DLP scanning, and code de-identification to protect IP.
What success looks like
90 days
Baseline an HDL-aware eval harness that compiles/simulates; establish secure AWS training & serving environments (VPC-only, KMS-backed, no public egress).
Ship an initial fine-tuned/customized model with measurable gains vs. base (e.g., +X% compile-pass rate, -Y% lint findings per K LOC generated).
180 days
Expand customization/training coverage (Bedrock for managed FMs including Anthropic; SageMaker/EKS for bespoke/open models).
Add constrained decoding + retrieval over internal design specs; productionize inference with SLOs (p95 latency, availability) and audited rollout to pilot hardware teams.
12 months
Demonstrably reduce review/iteration cycles for RTL tasks with clear metrics (defect reduction, time-to-lint-clean, % auto-fix suggestions accepted), and a stable MLOps path for continuous improvement.
Security & privacy by design
Customer and internal design data remain within private AWS VPCs; access via IAM roles and audited by CloudTrail; all artifacts encrypted with KMS.
No public internet calls for sensitive workloads; Bedrock access via VPC interface endpoints/PrivateLink with endpoint policies; SageMaker and/or EKS run in private subnets.
Data pipelines enforce minimization, tagging, retention windows, and reproducibility; DLP scanning and redaction are first-class steps.
We produce model cards, data lineage, and evaluation artifacts for every release.
Tech you'll touch
Modeling: PyTorch, HF Transformers/PEFT/TRL, DeepSpeed/FSDP, vLLM, TensorRT-LLM
AWS & MLOps: Amazon Bedrock (Anthropic and other FMs, Guardrails, Knowledge Bases, Runtime APIs), SageMaker (Training/Inference/Pipelines), MLflow/W&B, ECR, EKS/KServe/Triton, Step Functions
Platform/Security: S3 + KMS, IAM, VPC/PrivateLink (incl. Bedrock), CloudWatch/CloudTrail, Secrets Manager
Tooling (nice to have):
HDL toolchains for compile/simulate/lint, vector stores (pgvector/OpenSearch), GitHub/GitLab CI
"We are GTN - The Go To Network"