Jobs at DatologyAI
Discover open positions, company details, and resources to help you research and prepare for a career at DatologyAI. Browse roles, learn about the team, and get interview-ready.
About DatologyAI

Rising Star
Companies want to train their own large models on their own data. The current industry standard is to train on a random sample of your data, which is inefficient at best and actively harmful to model quality at worst. There is compelling research showing that smarter data selection can train better models faster—we know because we did much of this research. Given the high costs of training, this presents a huge market opportunity. We founded DatologyAI to translate this research into tools that enable enterprise customers to identify the right data on which to train, resulting in better models for cheaper. Our team has pioneered deep learning data research, built startups, and created tools for enterprise ML. For more details, check out our recent blog posts sharing our high-level results for text models and image-text models.
We've raised over $57M in funding from top investors like Radical Ventures, Amplify Partners, Felicis, Microsoft, Amazon, and notable angels like Jeff Dean, Geoff Hinton, Yann LeCun and Elad Gil. We're rapidly scaling our team and computing resources to revolutionize data curation across modalities.
Why It's a Rocketship
As more teams move from prototyping with third-party APIs to deploying self-hosted or fine-tuned models, the limiting factor has shifted from model architecture to data quality. DatologyAI is building the infrastructure to address this. Its fully automated data curation platform helps AI teams prune, deduplicate, and sequence large, unstructured datasets—improving performance while reducing training cost and compute footprint. In an ecosystem where trillion-token training runs are the norm, this is quickly becoming a critical need.
The founding team brings rare technical credibility to this challenge. CEO Ari Morcos is one of the field’s leading researchers in data pruning and efficiency, with stints at FAIR and DeepMind. He’s joined by Matthew Leavitt, who led data research at MosaicML, and Bogdan Gaza, who built language infrastructure at Twitter. Their shared belief—that better data beats more data—is now gaining traction across the LLM ecosystem. Early conversations with top AI teams suggest strong demand for Datology’s tooling, especially among groups looking to reduce reliance on foundation model APIs and improve fine-tuning quality on proprietary datasets.
Datology is still early but well positioned. It is the first platform purpose-built for large-scale, unlabeled, generative AI data. With increasing pressure on AI teams to differentiate through domain-specific training and control their own model performance curves, Datology has the potential to become the standard layer for data curation in the modern AI stack.
Written as of July 2025
Company Details
Founded: 2023
Headquarters: Redwood City, CA
Team Size: 1-50 employees
Industry: Infrastructure
Total Funding: $58M
Latest Round: Series A
Website: www.datologyai.com