Highlights:
- Some AI builders now use datasets with more than 10 billion visual elements, so manual quality control is no longer practical.
- Automated quality checks fix image labels, remove duplicates, find anomalies, and more.
Data quality startup Visual Layer Inc. announced that it has closed a USD 7 million seed round led by Madrona Venture Group and Insight Partners. With the funds, Visual Layer plans to expand its managed service for curating large-scale image collections used to train computer vision models.
Gathering photos for training datasets is a problem researchers know all too well. The quality of an artificial intelligence model is directly tied to the quality of the data it is trained on. The most effective computer vision models are built on datasets of billions of images, but sheer size alone does not guarantee a better model.
According to Visual Layer, up to 30% of the photos and videos used in training datasets can be “messy.” These flawed images lead to biased AI models being deployed in real products and services, which creates problems with AI bias and costs companies business opportunities.
Visual Layer uses “messy” to describe photos and videos that are incorrectly labeled, missing, broken, or duplicated, and it claims that all of these flaws lower the quality of the AI models trained on them. Manual quality control is no longer practical, since some AI builders now use datasets with more than 10 billion visual elements.
This is where Visual Layer steps in. It has developed a tool called Fastdup, based on an open-source project, that helps data scientists clean their datasets before model training. It automates quality checks to fix image labels, remove duplicates, find anomalies, and more.
When the tool discovers an incorrect or unclear image label, it either corrects the label or removes the image entirely. In this way, Visual Layer cleans the dataset efficiently and helps increase the overall accuracy of the model that will be trained on it.
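To make the idea of automated cleanup concrete, here is a minimal Python sketch of two of the checks described above: flagging broken files and grouping exact duplicates. It is not Fastdup's API or method. The function names `is_broken` and `find_exact_duplicates` are hypothetical, the header check covers only a few common formats, and byte-level hashing will miss copies that were resized or re-encoded.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Leading "magic" bytes of a few common image formats (illustrative, not exhaustive).
MAGIC_PREFIXES = (b"\xff\xd8\xff", b"\x89PNG\r\n\x1a\n", b"GIF87a", b"GIF89a")


def is_broken(path):
    """Hypothetical check: flag a file that is missing, empty, or lacks a known image header."""
    p = Path(path)
    if not p.is_file():
        return True
    with p.open("rb") as f:
        head = f.read(8)
    # An empty file yields b"", which matches none of the prefixes.
    return not head.startswith(MAGIC_PREFIXES)


def find_exact_duplicates(paths):
    """Hypothetical check: group readable files by the SHA-256 of their bytes.

    Returns only the groups that contain more than one file.
    """
    groups = defaultdict(list)
    for path in paths:
        if is_broken(path):
            continue  # broken files are reported separately, not hashed
        digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
        groups[digest].append(str(path))
    return [sorted(group) for group in groups.values() if len(group) > 1]
```

Calling `find_exact_duplicates` on a list of image paths returns lists of files with identical contents, so all but one file in each list could be dropped before training. At the scale the article describes, a production tool would need far more than this, including detection of visually similar but non-identical images.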
Visual data can be one of the most complicated kinds of data to handle, according to Danny Bickson, chief executive and co-founder of Visual Layer, but understanding, managing, and curating this content is essential for creating valuable AI-based services. “Companies are struggling with those huge amounts of data; they often have no clue where their data is and what is inside it. They develop homegrown tools since there is no infrastructure or common standards,” Bickson added.
Visual Layer is only now coming out of stealth mode, but its open-source Fastdup package has already gathered a community of more than 200,000 early users. Among them is the Indian social commerce network Meesho Inc., which hosts more than 13 million resellers. Srinvassa Rao Jami, Meesho’s lead computer vision manager, said, “Meesho is using Fastdup to improve the quality of our image gallery of 200 million products and automatically detect and fix data quality issues.”
Jon Turow, a partner at Madrona Venture Group, said he believes Visual Layer is part of a larger movement among AI developers who demand higher-quality data rather than simply more data. Earlier this year, he also reported on the ample opportunities available for AI-driven services.