Data Engineer, Research

Together AI

Data Science
San Francisco, CA, USA
Posted on Friday, May 17, 2024


As a data engineer on the Research team, you will develop both the data infrastructure and the datasets that will fuel the next generation of open models created by Together and the community. You will work closely with the modeling team to uncover the recipe for creating state-of-the-art foundation models. You will be responsible for creatively finding data sources that can improve model quality and safety, rigorously assessing their quality, and preparing and processing data at petabyte scale. You will also work with customers to help them train, use, and improve their AI applications built on open models. Your research skills will be vital for staying up to date with the latest advancements in data processing for foundation models, ensuring that we stay at the cutting edge of open model innovation.


Requirements

  • Strong background in data engineering
  • Experience building state-of-the-art models at large scale and/or understanding data quality and data mixtures for ML models; experience that combines both is a big plus
  • Experience processing data at large scale, and familiarity with at least one popular data processing framework
  • Passion for contributing to the open model ecosystem and pushing the frontier of open models
  • Excellent problem-solving and analytical skills
  • Bachelor's, Master's, or Ph.D. degree in Computer Science, Electrical Engineering, or a related field


Responsibilities

  • Develop data infrastructure and datasets for foundation model training
  • Creatively identify high-quality data sources and data transformations that can push the frontier of open model quality
  • Understand and improve the full lifecycle of building open models; release and publish your insights (blogs, academic papers, etc.)
  • Collaborate with cross-functional teams to deploy your models and make them available to a wider community and customer base
  • Stay up-to-date with the latest advancements in data engineering for foundation models

About Together AI

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers on our journey to build the next generation of AI infrastructure.


We offer competitive compensation, startup equity, health insurance, and other competitive benefits. The US base salary range for this full-time position is $160,000 - $220,000 + equity + benefits. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.

Please see our privacy policy at https://www.together.ai/privacy