Job Description
sync. is a team of artists, engineers, and scientists building foundation models to edit and modify people in video. Founded by the creators of Wav2lip and backed by legendary investors including YC, GV, and visionaries Nat Friedman and Daniel Gross, we've raised a $5.5M seed round to evolve how we create and consume media.
Within months of launch, our flagship lipsync API scaled to millions in revenue and now powers video translation, dubbing, and dialogue replacement workflows for thousands of editors, developers, and businesses around the world.
That's only the beginning: we're building a creative suite to give anyone Photoshop-like control over humans in video, with zero-shot understanding and fine-grained editing of expressions, gestures, movement, identity, and more.
Everyone has a story to tell, but not everyone's a storyteller – yet. We're looking for talented and driven individuals from all backgrounds to build inspired tools that amplify human creativity.
About the role
We're seeking an exceptional ML Engineer to expand the boundaries of what's possible with AI video editing. You'll work with the creators of Wav2lip to build and extend computer vision pipelines that give users unprecedented control over humans in video.
What you'll do
- Create novel CV features that unlock new forms of video manipulation
- Build ML pipelines that understand and modify humans in video
- Transform research breakthroughs into production capabilities
- Design systems that make complex AI feel like magic to users
- Pioneer new approaches to fine-grained video control
What you'll need
- 5+ years implementing computer vision and ML systems that users love
- Deep expertise in PyTorch and video processing pipelines
- Track record of shipping novel ML features from concept to production
- Ability to bridge cutting-edge research with practical applications
- Strong collaboration skills across research and engineering teams
Preferred qualifications
- Experience with face/human detection and tracking
- Background in generative AI or video understanding
- A history of working with large-scale video datasets
- Open source contributions to CV/ML projects
Our goal is to keep the team lean, hungry, and shipping fast.
These are the qualities we embody and look for:
[1] Raw intelligence: we tackle complex problems and push the boundaries of what's possible.
[2] Boundless curiosity: we're always learning, exploring new technologies, and questioning assumptions.
[3] Exceptional resolve: we persevere through challenges and never lose sight of our goals.
[4] High agency: we take ownership of our work and drive initiatives forward autonomously.
[5] Outlier hustle: we work smart and hard, going above and beyond to achieve extraordinary results.
[6] Obsessively data-driven: we base our decisions on solid data and measurable outcomes.
[7] Radical candor: we communicate openly and honestly, providing direct feedback to help each other grow.
We're a team of artists, engineers, and researchers building controllable AI video editing tools to unlock human creative potential. Our research team builds AI video models that understand and effect fine-grained, controllable edits over any human in any video. Our product team makes these models accessible to editors, animators, developers, and businesses so they can edit and repurpose any video for any audience.

Our technology is used to automate lip-dubbing in entertainment localization pipelines, create dynamic marketing campaigns personalized to individuals or communities, bring new characters to life in minutes instead of days, make word-level edits in studio-grade videos to fix mistakes in post-production without rerecording entire scenes, and more. Our models are used by everyday people, prosumers, developers, and businesses large and small to tell outstanding stories.

In just the last year we graduated at the top of our YC batch (W24), raised a $5.5M seed backed by GV, won the AI Grant from Nat Friedman and Daniel Gross, and scaled to millions in revenue. And this is only the beginning.