Software Engineer, Data Infrastructure
Who we are
EvolutionaryScale’s mission is to develop artificial intelligence to understand biology for the benefit of human health and society, through open, safe, and responsible research, and in partnership with the scientific community. Over the next ten years AI will transform biological design, making molecules and entire cells programmable. We will develop the foundation models for biology that enable this.
The EvolutionaryScale team is based in San Francisco and New York. We believe in flexibility around work schedules and locations, but expect team members to work from one of our offices at least half of the days in most weeks.
What you’ll do
As a Data Infrastructure Engineer, you will work closely with bioinformatics and research teams to ensure our data jobs are reliable, efficient, and scalable. You'll implement best practices for handling large-scale data processing, select and integrate the right technologies, and drive continuous improvements in the performance and quality of our datasets.
The role
- Design, develop, and maintain large-scale batch processing pipelines for acquiring biology datasets, using tools like Spark and Ray.
- Manage data infrastructure components to ensure robust and fault-tolerant operations.
- Optimize data ingestion, storage, and retrieval processes for acquiring large and growing biology datasets, and for efficient pre- and post-training data ingestion.
- Create systems for easy and reproducible data evaluation and experiments.
- Integrate modern ML-based data curation technologies with data processing pipelines.
- Work with researchers and other engineering teams to understand data needs and create solutions that meet modeling requirements.
Preferred qualifications
Apply even if you don’t meet all of these!
- Staff-level engineers with 5+ years of experience are highly preferred.
- Proven experience with large-scale data processing systems using technologies such as Hadoop, Spark, or Ray.
- Knowledge of streaming data frameworks like Kafka Streams, Spark Streaming, or Flink.
- Understanding of data processing principles and best practices.
- Strong problem-solving skills, including the ability to research, debug, and resolve complex technical problems.
- Experience with major cloud providers (AWS, GCP, or Azure), including familiarity with data warehousing tools, is a plus.
- Knowledge of biology and biology datasets is a big plus but not required.
- Experience with large-scale distributed systems or machine learning is also a plus but not required.