Microcontrollers, miniature computers that can run simple commands, are the basis for billions of connected devices, from internet-of-things (IoT) devices to sensors in automobiles. But cheap, low-power microcontrollers have extremely limited memory and no operating system, making it challenging to train artificial intelligence models on edge devices that work independently from central computing resources.
Training a machine-learning model on an intelligent edge device allows it to adapt to new data and make better predictions. For instance, training a model on a smart keyboard could enable the keyboard to continually learn from the user's writing. However, the training process requires so much memory that it is typically done using powerful computers at a data center, before the model is deployed on a device. This is more costly and raises privacy issues, since user data must be sent to a central server.
To address this problem, researchers at MIT and the MIT-IBM Watson AI Lab developed a new technique that enables on-device training using less than a quarter of a megabyte of memory. Other training solutions designed for connected devices can use more than 500 megabytes of memory, greatly exceeding the 256-kilobyte capacity of most microcontrollers (there are 1,024 kilobytes in one megabyte).
The intelligent algorithms and framework the researchers developed reduce the amount of computation required to train a model, which makes the process faster and more memory efficient. Their technique can be used to train a machine-learning model on a microcontroller in a matter of minutes.
This technique also preserves privacy by keeping data on the device, which could be especially beneficial when data are sensitive, such as in medical applications. It also could enable customization of a model based on the needs of users. Moreover, the framework preserves or improves the accuracy of the model when compared to other training approaches.
"Our study enables IoT devices to not only perform inference but also continuously update the AI models to newly collected data, paving the way for lifelong on-device learning. The low resource utilization makes deep learning more accessible and can have a broader reach, especially for low-power edge devices," says Song Han, an associate professor in the Department of Electrical Engineering and Computer Science (EECS), a member of the MIT-IBM Watson AI Lab, and senior author of the paper describing this innovation.
Joining Han on the paper are co-lead authors and EECS PhD students Ji Lin and Ligeng Zhu, as well as MIT postdocs Wei-Ming Chen and Wei-Chen Wang, and Chuang Gan, a principal research staff member at the MIT-IBM Watson AI Lab. The research will be presented at the Conference on Neural Information Processing Systems.
Han and his team previously addressed the memory and computational bottlenecks that exist when trying to run machine-learning models on tiny edge devices, as part of their TinyML initiative.
Lightweight training
A common type of machine-learning model is known as a neural network. Loosely based on the human brain, these models contain layers of interconnected nodes, or neurons, that process data to complete a task, such as recognizing people in photos. The model must be trained first, which involves showing it millions of examples so it can learn the task. As it learns, the model increases or decreases the strength of the connections between neurons, which are known as weights.
The model may undergo hundreds of updates as it learns, and the intermediate activations must be stored during each round. "In a neural network, activation is the middle layer's intermediate results. Because there may be millions of weights and activations, training a model requires much more memory than running a pre-trained model," Han explains.
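To make that memory pressure concrete, here is a back-of-the-envelope estimate in Python for a small convolutional network; the layer shapes are invented for illustration and are not from the paper.

```python
# Rough memory estimate for one training step of a small conv net on a
# 128x128 RGB image. All shapes are hypothetical.

layers = [
    # (name, weight parameters, activation elements)
    ("conv1", 3 * 16 * 3 * 3,  16 * 64 * 64),
    ("conv2", 16 * 32 * 3 * 3, 32 * 32 * 32),
    ("conv3", 32 * 64 * 3 * 3, 64 * 16 * 16),
    ("fc",    64 * 10,         10),  # after global average pooling
]

BYTES = 4  # 32-bit floats

weight_bytes = sum(w for _, w, _ in layers) * BYTES
# Inference can discard each activation once the next layer has consumed it;
# training must keep every intermediate activation for backpropagation
# (and also store gradients and optimizer state, omitted here).
activation_bytes = sum(a for _, _, a in layers) * BYTES

print(f"weights:     {weight_bytes / 1024:6.1f} KB")
print(f"activations: {activation_bytes / 1024:6.1f} KB, all kept for backprop")
```

Even on these made-up shapes, the stored activations alone (roughly 448 kilobytes) overflow a 256-kilobyte microcontroller, while the weights fit comfortably.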
Han and his collaborators employed two algorithmic solutions to make the training process more efficient and less memory-intensive. The first, known as sparse update, uses an algorithm that identifies the most important weights to update at each round of training. The algorithm freezes the weights one at a time until it sees the accuracy dip to a set threshold, then it stops. The remaining weights are updated, while the activations corresponding to the frozen weights don't need to be stored in memory.
"Updating the whole model is very expensive because there are a lot of activations, so people tend to update only the last layer, but as you can imagine, this hurts the accuracy. For our method, we selectively update those important weights and make sure the accuracy is fully preserved," Han says.
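A minimal PyTorch sketch of the sparse-update idea, assuming the important layers are already known: the two layers re-enabled below are hypothetical picks, whereas the paper's algorithm chooses them by watching the accuracy drop as weights are frozen.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 10),
)

# Freeze everything, then re-enable only the "important" layers.
for p in model.parameters():
    p.requires_grad = False
for layer in (model[2], model[6]):  # hypothetical choice of layers
    for p in layer.parameters():
        p.requires_grad = True

trainable = [p for p in model.parameters() if p.requires_grad]
opt = torch.optim.SGD(trainable, lr=0.01)

x = torch.randn(8, 3, 32, 32)              # dummy batch
y = torch.randint(0, 10, (8,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()                            # gradients reach only `trainable`
opt.step()
```

Because a frozen layer never needs a weight gradient, the activations saved solely for that gradient can be dropped, which is where the memory savings come from.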
Their second solution involves quantized training and simplifying the weights, which are typically 32 bits. An algorithm rounds the weights so they are only eight bits, through a process known as quantization, which cuts the amount of memory for both training and inference. Inference is the process of applying a model to a dataset and generating a prediction. Then the algorithm applies a technique called quantization-aware scaling (QAS), which acts like a multiplier to adjust the ratio between weight and gradient, to avoid any drop in accuracy that may come from quantized training.
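The arithmetic can be sketched in a few lines of NumPy. The derivation below follows the chain rule for a single per-tensor scale s; it is a simplified stand-in for the paper's exact QAS rule, not a reproduction of it.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(scale=0.02, size=256).astype(np.float32)   # fp32 weights
g = rng.normal(scale=0.001, size=256).astype(np.float32)  # their gradient

# Symmetric per-tensor quantization: w ~= s * w_q, with w_q in [-127, 127].
s = np.abs(w).max() / 127.0
w_q = np.clip(np.round(w / s), -127, 127).astype(np.int8)
print(f"fp32: {w.nbytes} bytes -> int8: {w_q.nbytes} bytes")  # 1024 -> 256

# Training directly on w_q distorts the weight-to-gradient ratio:
# w_q = w / s and, by the chain rule, g_q = s * g, so
# ||w_q|| / ||g_q|| = (||w|| / ||g||) / s**2  -- updates become far too small.
g_q = s * g
# A QAS-like correction rescales the gradient to restore the fp32 ratio.
g_q_corrected = g_q / s**2
```

Since s is tiny for small-magnitude weights, the uncorrected updates barely move the int8-domain weights; rescaling by the square of the quantization scale restores roughly full-precision training dynamics without extra tuning.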
The researchers developed a system, called a tiny training engine, that can run these algorithmic innovations on a simple microcontroller that lacks an operating system. This system changes the order of steps in the training process so more work is completed in the compilation stage, before the model is deployed on the edge device.
"We push a lot of the computation, such as auto-differentiation and graph optimization, to compile time. We also aggressively prune the redundant operators to support sparse updates. Once at runtime, we have much less workload to do on the device," Han explains.
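As a toy illustration of that compile-time pruning, the sketch below builds a naive training schedule for a four-layer model and then deletes the backward operators that frozen layers make dead; the operator names and graph format are invented and far simpler than the actual tiny training engine.

```python
FROZEN = {"conv1", "conv2"}                 # hypothetical sparse-update plan
LAYERS = ["conv1", "conv2", "conv3", "fc"]

forward = [f"forward:{l}" for l in LAYERS]

# Naive backward pass: every layer gets input-gradient, weight-gradient,
# and weight-update operators.
backward = []
for l in reversed(LAYERS):
    backward.append(f"grad_input:{l}")
    if l not in FROZEN:
        backward.append(f"grad_weight:{l}")
        backward.append(f"update:{l}")

# An input-gradient op only feeds the backward pass of earlier layers, so
# at or below the earliest trainable layer it is dead code: prune it.
first_trainable = min(i for i, l in enumerate(LAYERS) if l not in FROZEN)
backward = [
    op for op in backward
    if not (op.startswith("grad_input:")
            and LAYERS.index(op.split(":")[1]) <= first_trainable)
]

print("\n".join(forward + backward))  # fixed schedule shipped to the device
```

The device then simply executes this flat, pre-pruned schedule, with no runtime auto-differentiation at all.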
A successful speedup
Their optimization required only 157 kilobytes of memory to train a machine-learning model on a microcontroller, whereas other techniques designed for lightweight training would still need between 300 and 600 megabytes.
They tested their framework by training a computer vision model to detect people in images. After only 10 minutes of training, it learned to complete the task successfully. Their method was able to train a model more than 20 times faster than other approaches.
Now that they have demonstrated the success of these techniques for computer vision models, the researchers want to apply them to language models and different types of data, such as time-series data. At the same time, they want to use what they've learned to shrink the size of larger models without sacrificing accuracy, which could help reduce the carbon footprint of training large-scale machine-learning models.
"AI model adaptation/training on a device, especially on embedded controllers, is an open challenge. This research from MIT has not only successfully demonstrated the capabilities, but also opened up new possibilities for privacy-preserving device personalization in real-time," says Nilesh Jain, a principal engineer at Intel who was not involved with this work. "Innovations in the publication have broader applicability and will ignite new systems-algorithm co-design research."
"On-device learning is the next major advance we are working toward for the connected intelligent edge. Professor Song Han's group has shown great progress in demonstrating the effectiveness of edge devices for training," adds Jilei Hou, vice president and head of AI research at Qualcomm. "Qualcomm has awarded his team an Innovation Fellowship for further innovation and advancement in this area."
This work is funded by the National Science Foundation, the MIT-IBM Watson AI Lab, the MIT AI Hardware Program, Amazon, Intel, Qualcomm, Ford Motor Company, and Google.
More: Learning on the edge | MIT News | Massachusetts Institute of Technology