Training an AI system is time consuming, but this startup says it has a solution – Morning Brew

Posted: March 17, 2022 at 3:08 am

In northeast England, halfway between Norfolk and Yorkshire, an AI-powered robot spends its days looking at strawberries. It's not as easy as it sounds.

A human farmer can gauge a strawberry's ripeness level by sight and weight, but the process involves putting each strawberry on a scale, which can be destructive and time-consuming. The robot can do the same job for up to 4 million strawberries a day by performing a simple scan of the fruit, undisturbed.

FruitCast, the agricultural AI startup behind the robots, taught its bots how to do their jobs with data from V7 Labs, a London-based startup that helps AI companies automate the training-data process for models. Training can be one of the most labor-intensive parts of getting an AI system off the ground, since it often calls for not only time and resources, but also vetted and relevant data.

"The robots are kind of stupid until you put the intelligence on them," Raymond Tunstill, CTO of FruitCast, which was spun off from the University of Lincoln's food-tech institute, told Emerging Tech Brew. He added, "It's all about taking examples from the real world (is it a ripe strawberry, or is it unripe?) and showing that to our neural networks so that the neural networks can, essentially, learn. And without V7, we never would've been able to classify [them]."

Since its 2018 debut, V7 has used its computer vision platform to train AI models to identify everything from lame cows to grapevine bunches, depending on the client's needs. In 2020, V7 raised a seed round totaling $10 million, and so far, its clients include more than 300 AI companies, as well as academic institutions like Stanford, MIT, and Harvard.

"The secret behind V7 is this system that we call AutoAnnotate," the startup's CEO, Alberto Rizzoli, told us. He and his cofounder, Simon Edwardsson, thought it up based on obstacles encountered in their previous business venture: Aipoly, a computer-vision startup that allowed blind users to identify objects using their phone cameras. Though the software worked decently well, Rizzoli recalled, training data was the really difficult part to create.

So they created AutoAnnotate, a general-purpose AI model for computer vision. When a client comes to V7 with training data (images or videos they'd like an AI model to learn from), V7 detects the objects' boundaries in each frame (like strawberries, for instance), and then uses AutoAnnotate to label them. According to the company's internal measurements, labeling a high-quality piece of training data could take a human up to 2 minutes, said Rizzoli, compared to about 2.5 seconds for AutoAnnotate.
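
As a rough back-of-the-envelope check on those figures, the sketch below compares the two quoted per-label times. It assumes the human upper bound of 2 minutes and the approximate 2.5 seconds for AutoAnnotate; the batch size of 10,000 items is purely illustrative, and the calculation ignores the human verification step described later in the article.

```python
# Per-label times as quoted by Rizzoli (human figure is an upper bound).
HUMAN_SECONDS_PER_LABEL = 2 * 60
AUTO_SECONDS_PER_LABEL = 2.5

# Hypothetical batch size, chosen only for illustration.
BATCH_SIZE = 10_000


def speedup(human_seconds, auto_seconds):
    """How many times faster the automated labeler is per item."""
    return human_seconds / auto_seconds


def labeling_hours(n_items, seconds_per_item):
    """Total labeling time in hours for n_items at a fixed per-item cost."""
    return n_items * seconds_per_item / 3600


if __name__ == "__main__":
    print("speedup:", speedup(HUMAN_SECONDS_PER_LABEL, AUTO_SECONDS_PER_LABEL))
    print("human hours:", round(labeling_hours(BATCH_SIZE, HUMAN_SECONDS_PER_LABEL), 1))
    print("auto hours:", round(labeling_hours(BATCH_SIZE, AUTO_SECONDS_PER_LABEL), 1))
```

On the quoted numbers, that works out to a per-label speedup of at most 48 times, before any time spent on review.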

To create that training data, V7's model starts off with a continual learning approach. That could begin with subject matter experts in, say, horticulture, drawing boxes around fruit in images and classifying each one by ripeness level (e.g., a level-3 strawberry). They then either accept or correct each of the model's attempts to do the same.

After about 100 human-guided examples, a model is able to make relatively confident classifications, so it transitions into what Rizzoli calls a co-pilot approach: for any given choice, the AI provides its confidence score and the human makes corrections.

"Because it's training data, we always have a human verify it, but it becomes a faster process," Rizzoli said. Later, he added, "When they find something that is low-confidence, they fix it, otherwise it can go into the knowledge of the model, of the training set."
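
The co-pilot step described above can be sketched as a simple routing rule: every prediction still goes to a human, but low-confidence ones are flagged for correction first. This is a minimal illustration, not V7's actual system; the `Prediction` class, the `route_predictions` function, and the 0.9 threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    item_id: str
    label: str
    confidence: float  # model's confidence score, 0.0 to 1.0


def route_predictions(predictions, threshold=0.9):
    """Split predictions into a quick-verify queue and a needs-correction queue.

    Both queues are still reviewed by a human; the threshold (an assumed
    value) only decides which items are likely to need fixing.
    """
    quick_verify, needs_correction = [], []
    for p in predictions:
        if p.confidence >= threshold:
            quick_verify.append(p)
        else:
            needs_correction.append(p)
    return quick_verify, needs_correction


if __name__ == "__main__":
    batch = [
        Prediction("img-001", "level-3 strawberry", 0.97),
        Prediction("img-002", "level-1 strawberry", 0.55),
        Prediction("img-003", "level-4 strawberry", 0.91),
    ]
    quick, fix = route_predictions(batch)
    print("quick verify:", [p.item_id for p in quick])
    print("needs correction:", [p.item_id for p in fix])
```

Items a reviewer accepts or corrects would then be added to the training set, which is where the speedup over labeling from scratch comes from.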

The company finds human experts via a network of business process outsourcing companies, agencies, and consultants, which Rizzoli claims can find a group of labelers on most topics within 48 hours.

Think of it like sending your pup to dog training camp and still having responsibilities upon its return. When a customer develops their fully-trained model through V7, they'll still need to keep an eye on it and correct any glaring mistakes, but it should, in theory, be much more capable than before. For example, a newly-trained model may be well-equipped to detect strawberry ripeness levels, but if it's somehow presented with a photo of a strawberry keychain, it won't know how to proceed.

Even if a model does become an expert in its domain, it's risky to use it for tasks besides what it's specifically trained for, since results could be unpredictable.

"If you have a car that is trained on data from the United States, it's able to have certain weather conditions, it's not able to do certain road signs, and to figure out whether it can actually drive on snow or desert, you need to test it; you need to run it on a data set of desert-driving footage and check the accuracy," Rizzoli said. "Believe it or not, this sounds pretty straightforward, but there are almost no tools for doing this. And very few people are actually doing benchmarking on training data, because it's a new thing."
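
The kind of check Rizzoli describes, running a model on footage from a new setting and measuring its accuracy, can be sketched as a per-domain accuracy report. Everything here is hypothetical and for illustration only: the `accuracy_by_domain` helper, the toy `toy_predict` stand-in for a trained model, and the four made-up examples.

```python
from collections import defaultdict


def accuracy_by_domain(examples, predict):
    """Compute accuracy separately for each domain.

    examples: iterable of (domain, model_input, true_label) tuples.
    predict:  callable mapping a model_input to a predicted label.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for domain, model_input, true_label in examples:
        total[domain] += 1
        if predict(model_input) == true_label:
            correct[domain] += 1
    return {domain: correct[domain] / total[domain] for domain in total}


def toy_predict(model_input):
    """Stand-in for a model that only knows what it saw during training."""
    known = {"us_stop_sign": "stop", "us_yield_sign": "yield"}
    return known.get(model_input, "unknown")


if __name__ == "__main__":
    examples = [
        ("us_roads", "us_stop_sign", "stop"),
        ("us_roads", "us_yield_sign", "yield"),
        ("desert", "us_stop_sign", "stop"),
        ("desert", "sand_drift", "hazard"),
    ]
    for domain, acc in accuracy_by_domain(examples, toy_predict).items():
        print(f"{domain}: {acc:.0%}")
```

A gap between the in-domain and out-of-domain scores is the signal that the model needs more training data from the new setting before it is used there.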
