Job Description
Liquid AI, an MIT spin-off, is a foundation model company headquartered in Boston, Massachusetts. Our mission is to build capable and efficient general-purpose AI systems at every scale.
Our goal at Liquid is to build the most capable AI systems to solve problems at every scale, so that users can build, access, and control their AI solutions. This ensures that AI is meaningfully, reliably, and efficiently integrated across all enterprises. In the long term, Liquid will create and deploy frontier-AI-powered solutions that are available to everyone.
Key Responsibilities
- Curate, clean, and validate large-scale real-world datasets
- Design and implement comprehensive data processing strategies for foundation model training
- Create advanced data augmentation and transformation pipelines
- Develop data generation techniques that enhance model performance and diversity
- Ensure data quality, address ethical considerations, and mitigate bias in data generation
- Develop tools and frameworks for analyzing and filtering large-scale data sources
- Monitor and assess the impact of data selection on model performance
Preferred Skills
- Experience with large language models or multimodal foundation models
- Knowledge of differential privacy and data anonymization techniques
- Experience with data ethics and bias detection
- Publications or research in synthetic data generation
- Understanding of scalable data processing architectures
Required Qualifications
- Design and implement comprehensive data processing strategies for foundation model training
- Develop data cleaning, filtering, augmentation, and generation techniques that enhance model performance and diversity
- Curate, clean, and validate large-scale real-world datasets
- Create advanced data transformation pipelines
- Ensure data quality, address ethical considerations, and mitigate bias in data generation
- Develop tools and frameworks for reproducible and scalable data processing
- Monitor and assess the impact of data selection on model performance