Member of Technical Staff (MTS) - Multimodal Foundation Models
Focus
Multimodal Foundation Models · Representation Learning · Method Innovation
We are looking for strong technical builders and researchers who deeply understand foundation models and representation learning beyond simply applying existing frameworks.
Ideal candidates should have:
- Strong experimental rigor
- Solid systems and modeling intuition
- Hands-on engineering ability
- Interest in scalable multimodal AI systems for real-world autonomy
We value people who can bridge research and production, and who care about robustness, scalability, efficiency, and practical deployment in large-scale autonomous driving systems.
Responsibilities
1. Large-Scale Foundation Model Pretraining
- Develop scalable pretraining pipelines for large-scale multimodal driving data
- Design and optimize training strategies for:
  - Vision-language-action models
  - Video foundation models
  - Long-context temporal modeling
  - Multimodal representation alignment
- Improve:
  - Training stability
  - Data efficiency
  - Scaling efficiency
  - Representation robustness
- Work on distributed training systems and large-scale model optimization using frameworks such as:
  - PyTorch Distributed
  - DeepSpeed
  - Megatron-LM
2. Representation Learning & Method Innovation
- Design and improve self-supervised and multimodal learning methods for real-world autonomous driving systems
- Conduct architecture-level research on:
  - Vision Transformers (ViT)
  - Video / temporal architectures
  - Multimodal fusion and alignment
  - Embedding and retrieval systems
  - Long-context and memory-efficient architectures
- Explore and improve:
  - Pretraining objectives
  - Loss functions
  - Training paradigms
  - Generalization and robustness
- Analyze model behavior through:
  - Rigorous ablation studies
  - Failure case analysis
  - Representation probing and evaluation
3. Efficient Foundation Models & Scalable Deployment
- Improve the efficiency, scalability, and deployability of large multimodal foundation models for real-world autonomous driving systems
- Work on areas such as:
  - Model quantization
  - Knowledge distillation
  - Efficient attention mechanisms
  - Sparse architectures and Mixture-of-Experts (MoE)
  - Long-context and memory-efficient modeling
  - Inference acceleration and serving optimization
  - Training and inference system efficiency
- Optimize model throughput, latency, memory usage, and deployment performance for large-scale production environments
Requirements
- MS or PhD in:
  - Computer Vision
  - Machine Learning
  - Robotics
  - Computer Science
  - Related fields
- Strong understanding of:
  - Foundation models
  - Self-supervised learning
  - Representation learning
  - Multimodal learning
  - Large-scale pretraining
- Hands-on experience with methods such as:
  - CLIP
  - DINO / DINOv2
  - MAE
  - Contrastive learning
  - Masked modeling
  - MoE or scalable transformer architectures
- Experience with one or more of the following is highly valued:
  - Video foundation models
  - Long-context modeling
  - Retrieval systems
  - Efficient inference
  - Distributed training
  - Model compression and deployment optimization
- Strong publication record in top-tier venues is preferred:
  - CVPR
  - ICCV
  - ECCV
  - NeurIPS
  - ICLR
  - ICML