A machine-learning model for image classification thats trained using synthetic data can rival one trained on the real thing, a study shows.
Huge amounts of data are needed to train machine-learning models to perform image classification tasks, such as identifying damage in satellite photos following a natural disaster. However, these data are not always easy to come by. Datasets may cost millions of dollars to generate, if usable data exist in the first place, and even the best datasets often contain biases that negatively impact a models performance.
To circumvent some of the problems presented by datasets, MIT researchers developed a method for training a machine learning model that, rather than using a dataset, uses a special type of machine-learning model to generate extremely realistic synthetic data that can train another model for downstream vision tasks.
Their results show that a contrastive representation learning model trained using only these synthetic data is able to learn visual representations that rival or even outperform those learned from real data.
MIT researchers have demonstrated the use of a generative machine-learning model to create synthetic data, based on real data, that can be used to train another model for image classification. This image shows examples of the generative models transformation methods. Credit: Courtesy of the researchers
This special machine-learning model, known as a generative model, requires far less memory to store or share than a dataset. Using synthetic data also has the potential to sidestep some concerns around privacy and usage rights that limit how some real data can be distributed. A generative model could also be edited to remove certain attributes, like race or gender, which could address some biases that exist in traditional datasets.
We knew that this method should eventually work; we just needed to wait for these generative models to get better and better. But we were especially pleased when we showed that this method sometimes does even better than the real thing, says Ali Jahanian, a research scientist in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and lead author of the paper.
Jahanian wrote the paper with CSAIL grad students Xavier Puig and Yonglong Tian, and senior author Phillip Isola, an assistant professor in the Department of Electrical Engineering and Computer Science. The research will be presented at the International Conference on Learning Representations.
Once a generative model has been trained on real data, it can generate synthetic data that are so realistic they are nearly indistinguishable from the real thing. The training process involves showing the generative model millions of images that contain objects in a particular class (like cars or cats), and then it learns what a car or cat looks like so it can generate similar objects.
Essentially by flipping a switch, researchers can use a pretrained generative model to output a steady stream of unique, realistic images that are based on those in the models training dataset, Jahanian says.
But generative models are even more useful because they learn how to transform the underlying data on which they are trained, he says. If the model is trained on images of cars, it can imagine how a car would look in different situations situations it did not see during training and then output images that show the car in unique poses, colors, or sizes.
Having multiple views of the same image is important for a technique called contrastive learning, where a machine-learning model is shown many unlabeled images to learn which pairs are similar or different.
The researchers connected a pretrained generative model to a contrastive learning model in a way that allowed the two models to work together automatically. The contrastive learner could tell the generative model to produce different views of an object, and then learn to identify that object from multiple angles, Jahanian explains.
This was like connecting two building blocks. Because the generative model can give us different views of the same thing, it can help the contrastive method to learn better representations, he says.
The researchers compared their method to several other image classification models that were trained using real data and found that their method performed as well, and sometimes better, than the other models.
One advantage of using a generative model is that it can, in theory, create an infinite number of samples. So, the researchers also studied how the number of samples influenced the models performance. They found that, in some instances, generating larger numbers of unique samples led to additional improvements.
The cool thing about these generative models is that someone else trained them for you. You can find them in online repositories, so everyone can use them. And you dont need to intervene in the model to get good representations, Jahanian says.
But he cautions that there are some limitations to using generative models. In some cases, these models can reveal source data, which can pose privacy risks, and they could amplify biases in the datasets they are trained on if they arent properly audited.
He and his collaborators plan to address those limitations in future work. Another area they want to explore is using this technique to generate corner cases that could improve machine learning models. Corner cases often cant be learned from real data. For instance, if researchers are training a computer vision model for a self-driving car, real data wouldnt contain examples of a dog and his owner running down a highway, so the model would never learn what to do in this situation. Generating that corner case data synthetically could improve the performance of machine learning models in some high-stakes situations.
The researchers also want to continue improving generative models so they can compose images that are even more sophisticated, he says.
Reference: Generative Models as a Data Source for Multiview Representation Learning by Ali Jahanian, Xavier Puig, Yonglong Tian and Phillip Isola.PDF
This research was supported, in part, by the MIT-IBM Watson AI Lab, the United States Air Force Research Laboratory, and the United States Air Force Artificial Intelligence Accelerator.
Read more from the original source:
When It Comes to AI, Can We Ditch the Datasets? Using Synthetic Data for Training Machine-Learning Models - SciTechDaily
- Are We Overly Infatuated With Deep Learning? - Forbes [Last Updated On: August 18th, 2024] [Originally Added On: December 28th, 2019]
- CMSWire's Top 10 AI and Machine Learning Articles of 2019 - CMSWire [Last Updated On: August 18th, 2024] [Originally Added On: December 28th, 2019]
- Can machine learning take over the role of investors? - TechHQ [Last Updated On: August 18th, 2024] [Originally Added On: December 28th, 2019]
- Pear Therapeutics Expands Pipeline with Machine Learning, Digital Therapeutic and Digital Biomarker Technologies - Business Wire [Last Updated On: August 18th, 2024] [Originally Added On: January 11th, 2020]
- Dell's Latitude 9510 shakes up corporate laptops with 5G, machine learning, and thin bezels - PCWorld [Last Updated On: August 18th, 2024] [Originally Added On: January 11th, 2020]
- Limits of machine learning - Deccan Herald [Last Updated On: August 18th, 2024] [Originally Added On: January 11th, 2020]
- Forget Machine Learning, Constraint Solvers are What the Enterprise Needs - - RTInsights [Last Updated On: August 18th, 2024] [Originally Added On: January 11th, 2020]
- Tiny Machine Learning On The Attiny85 - Hackaday [Last Updated On: August 18th, 2024] [Originally Added On: January 11th, 2020]
- Finally, a good use for AI: Machine-learning tool guesstimates how well your code will run on a CPU core - The Register [Last Updated On: August 18th, 2024] [Originally Added On: January 11th, 2020]
- How Will Your Hotel Property Use Machine Learning in 2020 and Beyond? | - Hotel Technology News [Last Updated On: August 18th, 2024] [Originally Added On: January 11th, 2020]
- Technology Trends to Keep an Eye on in 2020 - Built In Chicago [Last Updated On: August 18th, 2024] [Originally Added On: January 11th, 2020]
- AI and machine learning trends to look toward in 2020 - Healthcare IT News [Last Updated On: August 18th, 2024] [Originally Added On: January 11th, 2020]
- The 4 Hottest Trends in Data Science for 2020 - Machine Learning Times - machine learning & data science news - The Predictive Analytics Times [Last Updated On: August 18th, 2024] [Originally Added On: January 11th, 2020]
- The Problem with Hiring Algorithms - Machine Learning Times - machine learning & data science news - The Predictive Analytics Times [Last Updated On: August 18th, 2024] [Originally Added On: January 11th, 2020]
- Going Beyond Machine Learning To Machine Reasoning - Forbes [Last Updated On: August 18th, 2024] [Originally Added On: January 11th, 2020]
- Doctor's Hospital focused on incorporation of AI and machine learning - EyeWitness News [Last Updated On: August 18th, 2024] [Originally Added On: January 19th, 2020]
- Being human in the age of Artificial Intelligence - Deccan Herald [Last Updated On: August 18th, 2024] [Originally Added On: January 19th, 2020]
- Raleys Drive To Be Different Gets an Assist From Machine Learning - Winsight Grocery Business [Last Updated On: August 18th, 2024] [Originally Added On: January 19th, 2020]
- Break into the field of AI and Machine Learning with the help of this training - Boing Boing [Last Updated On: August 18th, 2024] [Originally Added On: January 19th, 2020]
- BlackBerry combines AI and machine learning to create connected fleet security solution - Fleet Owner [Last Updated On: August 18th, 2024] [Originally Added On: January 19th, 2020]
- What is the role of machine learning in industry? - Engineer Live [Last Updated On: August 18th, 2024] [Originally Added On: January 19th, 2020]
- Seton Hall Announces New Courses in Text Mining and Machine Learning - Seton Hall University News & Events [Last Updated On: August 18th, 2024] [Originally Added On: January 19th, 2020]
- Christiana Care offers tips to 'personalize the black box' of machine learning - Healthcare IT News [Last Updated On: August 18th, 2024] [Originally Added On: January 19th, 2020]
- Leveraging AI and Machine Learning to Advance Interoperability in Healthcare - - HIT Consultant [Last Updated On: August 18th, 2024] [Originally Added On: January 19th, 2020]
- Essential AI & Machine Learning Certification Training Bundle Is Available For A Limited Time 93% Discount Offer Avail Now - Wccftech [Last Updated On: August 18th, 2024] [Originally Added On: January 19th, 2020]
- Educate Yourself on Machine Learning at this Las Vegas Event - Small Business Trends [Last Updated On: August 18th, 2024] [Originally Added On: January 19th, 2020]
- 2020: The year of seeing clearly on AI and machine learning - ZDNet [Last Updated On: August 18th, 2024] [Originally Added On: January 19th, 2020]
- How machine learning and automation can modernize the network edge - SiliconANGLE [Last Updated On: August 18th, 2024] [Originally Added On: January 19th, 2020]
- Five Reasons to Go to Machine Learning Week 2020 - Machine Learning Times - machine learning & data science news - The Predictive Analytics Times [Last Updated On: August 18th, 2024] [Originally Added On: January 19th, 2020]
- Don't want a robot stealing your job? Take a course on AI and machine learning. - Mashable [Last Updated On: August 18th, 2024] [Originally Added On: January 19th, 2020]
- Adventures With Artificial Intelligence and Machine Learning - Toolbox [Last Updated On: August 18th, 2024] [Originally Added On: January 19th, 2020]
- Optimising Utilisation Forecasting with AI and Machine Learning - Gigabit Magazine - Technology News, Magazine and Website [Last Updated On: August 18th, 2024] [Originally Added On: January 19th, 2020]
- Machine Learning: Higher Performance Analytics for Lower ... [Last Updated On: August 18th, 2024] [Originally Added On: January 19th, 2020]
- Machine Learning Definition [Last Updated On: August 18th, 2024] [Originally Added On: January 19th, 2020]
- Machine Learning Market Size Worth $96.7 Billion by 2025 ... [Last Updated On: August 18th, 2024] [Originally Added On: January 19th, 2020]
- Difference between AI, Machine Learning and Deep Learning [Last Updated On: August 18th, 2024] [Originally Added On: January 19th, 2020]
- Machine Learning in Human Resources Applications and ... [Last Updated On: August 18th, 2024] [Originally Added On: January 19th, 2020]
- Pricing - Machine Learning | Microsoft Azure [Last Updated On: August 18th, 2024] [Originally Added On: January 19th, 2020]
- Looking at the most significant benefits of machine learning for software testing - The Burn-In [Last Updated On: August 18th, 2024] [Originally Added On: January 22nd, 2020]
- New York Institute of Finance and Google Cloud Launch A Machine Learning for Trading Specialization on Coursera - PR Web [Last Updated On: August 18th, 2024] [Originally Added On: January 22nd, 2020]
- Uncover the Possibilities of AI and Machine Learning With This Bundle - Interesting Engineering [Last Updated On: August 18th, 2024] [Originally Added On: January 22nd, 2020]
- Red Hat Survey Shows Hybrid Cloud, AI and Machine Learning are the Focus of Enterprises - Computer Business Review [Last Updated On: August 18th, 2024] [Originally Added On: January 22nd, 2020]
- Machine learning - Wikipedia [Last Updated On: August 18th, 2024] [Originally Added On: January 22nd, 2020]
- Vectorspace AI Datasets are Now Available to Power Machine Learning (ML) and Artificial Intelligence (AI) Systems in Collaboration with Elastic -... [Last Updated On: August 18th, 2024] [Originally Added On: January 22nd, 2020]
- Learning that Targets Millennial and Generation Z - HR Exchange Network [Last Updated On: August 18th, 2024] [Originally Added On: January 23rd, 2020]
- Machine learning and eco-consciousness key business trends in 2020 - Finfeed [Last Updated On: August 18th, 2024] [Originally Added On: January 24th, 2020]
- Jenkins Creator Launches Startup To Speed Software Testing with Machine Learning -- ADTmag - ADT Magazine [Last Updated On: August 18th, 2024] [Originally Added On: January 24th, 2020]
- Research report investigates the Global Machine Learning In Finance Market 2019-2025 - WhaTech Technology and Markets News [Last Updated On: August 18th, 2024] [Originally Added On: January 25th, 2020]
- Expert: Don't overlook security in rush to adopt AI - The Winchester Star [Last Updated On: August 18th, 2024] [Originally Added On: January 25th, 2020]
- Federated machine learning is coming - here's the questions we should be asking - Diginomica [Last Updated On: August 18th, 2024] [Originally Added On: January 25th, 2020]
- I Know Some Algorithms Are Biased--because I Created One - Scientific American [Last Updated On: August 18th, 2024] [Originally Added On: February 1st, 2020]
- Iguazio Deployed by Payoneer to Prevent Fraud with Real-time Machine Learning - Business Wire [Last Updated On: August 18th, 2024] [Originally Added On: February 1st, 2020]
- Want To Be AI-First? You Need To Be Data-First. - Forbes [Last Updated On: August 18th, 2024] [Originally Added On: February 1st, 2020]
- How Machine Learning Will Lead to Better Maps - Popular Mechanics [Last Updated On: August 18th, 2024] [Originally Added On: February 1st, 2020]
- Technologies of the future, but where are AI and ML headed to? - YourStory [Last Updated On: August 18th, 2024] [Originally Added On: February 1st, 2020]
- In Coronavirus Response, AI is Becoming a Useful Tool in a Global Outbreak - Machine Learning Times - machine learning & data science news - The... [Last Updated On: August 18th, 2024] [Originally Added On: February 1st, 2020]
- This tech firm used AI & machine learning to predict Coronavirus outbreak; warned people about danger zones - Economic Times [Last Updated On: August 18th, 2024] [Originally Added On: February 1st, 2020]
- 3 books to get started on data science and machine learning - TechTalks [Last Updated On: August 18th, 2024] [Originally Added On: February 1st, 2020]
- JP Morgan expands dive into machine learning with new London research centre - The TRADE News [Last Updated On: August 18th, 2024] [Originally Added On: February 1st, 2020]
- Euro machine learning startup plans NYC rental platform, the punch list goes digital & other proptech news - The Real Deal [Last Updated On: August 18th, 2024] [Originally Added On: February 1st, 2020]
- The ML Times Is Growing A Letter from the New Editor in Chief - Machine Learning Times - machine learning & data science news - The Predictive... [Last Updated On: August 18th, 2024] [Originally Added On: February 1st, 2020]
- Top Machine Learning Services in the Cloud - Datamation [Last Updated On: August 18th, 2024] [Originally Added On: February 1st, 2020]
- Combating the coronavirus with Twitter, data mining, and machine learning - TechRepublic [Last Updated On: August 18th, 2024] [Originally Added On: February 1st, 2020]
- Itiviti Partners With AI Innovator Imandra to Integrate Machine Learning Into Client Onboarding and Testing Tools - PRNewswire [Last Updated On: August 18th, 2024] [Originally Added On: February 2nd, 2020]
- Iguazio Deployed by Payoneer to Prevent Fraud with Real-time Machine Learning - Yahoo Finance [Last Updated On: August 18th, 2024] [Originally Added On: February 2nd, 2020]
- ScoreSense Leverages Machine Learning to Take Its Customer Experience to the Next Level - Yahoo Finance [Last Updated On: August 18th, 2024] [Originally Added On: February 2nd, 2020]
- How Machine Learning Is Changing The Future Of Fiber Optics - DesignNews [Last Updated On: August 18th, 2024] [Originally Added On: February 2nd, 2020]
- How to handle the unexpected in conversational AI - ITProPortal [Last Updated On: August 18th, 2024] [Originally Added On: February 5th, 2020]
- SwRI, SMU fund SPARKS program to explore collaborative research and apply machine learning to industry problems - TechStartups.com [Last Updated On: August 18th, 2024] [Originally Added On: February 5th, 2020]
- Reinforcement Learning (RL) Market Report & Framework, 2020: An Introduction to the Technology - Yahoo Finance [Last Updated On: August 18th, 2024] [Originally Added On: February 5th, 2020]
- ValleyML Is Launching a Series of 3 Unique AI Expo Events Focused on Hardware, Enterprise and Robotics in Silicon Valley - AiThority [Last Updated On: August 18th, 2024] [Originally Added On: February 5th, 2020]
- REPLY: European Central Bank Explores the Possibilities of Machine Learning With a Coding Marathon Organised by Reply - Business Wire [Last Updated On: August 18th, 2024] [Originally Added On: February 5th, 2020]
- VUniverse Named One of Five Finalists for SXSW Innovation Awards: AI & Machine Learning Category - PRNewswire [Last Updated On: August 18th, 2024] [Originally Added On: February 5th, 2020]
- AI, machine learning, robots, and marketing tech coming to a store near you - TechRepublic [Last Updated On: August 18th, 2024] [Originally Added On: February 5th, 2020]
- Putting the Humanity Back Into Technology: 10 Skills to Future Proof Your Career - HR Technologist [Last Updated On: August 18th, 2024] [Originally Added On: February 6th, 2020]
- Twitter says AI tweet recommendations helped it add millions of users - The Verge [Last Updated On: August 18th, 2024] [Originally Added On: February 6th, 2020]
- Artnome Wants to Predict the Price of a Masterpiece. The Problem? There's Only One. - Built In [Last Updated On: August 18th, 2024] [Originally Added On: February 6th, 2020]
- Machine Learning Patentability in 2019: 5 Cases Analyzed and Lessons Learned Part 1 - Lexology [Last Updated On: August 18th, 2024] [Originally Added On: February 6th, 2020]
- The 17 Best AI and Machine Learning TED Talks for Practitioners - Solutions Review [Last Updated On: August 18th, 2024] [Originally Added On: February 6th, 2020]
- Overview of causal inference in machine learning - Ericsson [Last Updated On: August 18th, 2024] [Originally Added On: February 6th, 2020]