August 1, 2023 | A group of scientists at the Wyss Institute for Biologically Inspired Engineering at Harvard University and MIT are convinced that automated machine learning (autoML) is going to revolutionize biology by removing many of the technical barriers to using computational models to answer fundamental questions about sequences of nucleic acids, peptides, and glycans. "Machine learning can be complicated, but it doesn't have to be, and sometimes simpler is better," according to graduate student Jackie Valeri, a big believer in the power of autoML to solve real-world problems.
AutoML is a machine learning concept that helps users transfer data to training algorithms and automatically search for the best ML architecture for a given problem, lowering the demand for expert-level computational knowledge, which currently outpaces the supply. It can also be pretty competitive with even the best manually designed ML models, which can take months if not years to develop, says Valeri, as she and her colleagues recently demonstrated in a paper published in Cell Systems (DOI: 10.1016/j.cels.2023.05.007).
The article showcased the potential of their novel BioAutoMATED platform, which, unlike other autoML tools, accommodates more than one type of ML model and is designed to accept biological sequences. Its intended users are systems and synthetic biologists with little or no ML experience, says Valeri, who works in the lab of Jim Collins, Ph.D., at the Wyss Institute.
The all-in-one BioAutoMATED platform modifies three existing autoML tools (AutoKeras, which searches for optimal neural networks; DeepSwarm, which looks for convolutional neural networks; and TPOT, which hunts for a variety of other, simpler modeling techniques such as linear regression and random forest classifiers) to come up with the most appropriate model for a user's dataset, she explains. Standardized output results are presented as a set of folders, each associated with one of those search techniques, revealing the best-performing model in graphic and text file format.
The tool is "very meta," says Valeri, in that it is "learning on the learning." Model selection is often the part of a research project that requires a lot of computational expertise that biologists generally do not possess, and the task can't easily be passed to an ML specialist, even if one can be found, because domain knowledge is needed in the model-building process.
Overall, biological researchers are excited about using machine learning but until now have been stymied by the amount of coding needed to get started, she says, noting that it is not uncommon for ML models to have a codebase of over 750 lines. The installation of packages alone can be a huge barrier.
Interest in ML has skyrocketed over the past year thanks largely to the introduction of ChatGPT with its user-friendly interface, but people have also quickly discovered they can't trust everything the large language model has to offer, says Valeri. Similarly, BioAutoMATED is useful but not a magic bullet that erases data problems and, like ML in general, should be approached with a healthy amount of skepticism to ensure it is learning what's intended.
BioAutoMATED will in the future likely be used together with ChatGPT, predicts Wyss postdoctoral fellow Luis Soenksen, Ph.D., co-lead author on the Cell Systems paper. Researchers will simply articulate what they want to do and be presented with the best questions, required data, and ML models to get the job done.
When put to the test, BioAutoMATED outperformed not only other autoML tools but also some of the models created by a professional ML expert, and did it in under 30 minutes using only 10 lines of input code from the user. The required coding is for the basics, says Valeri: to specify the target folder for results, the file name where input data can be found, the column name where sequences can be found within that file, and run times for these extensions.
Users are instructed to first install Docker on their computer, if they have not done so already, and are walked through the process of doing that, she adds. The open software platform sets up its own environment for running applications, requiring only two lines of code to access the Jupyter notebooks preloaded on BioAutoMATED that contain everything needed to run the autoML tool. "It's a quick start for most people accustomed to using a computer."
With a bit more coding, users can access some of the embedded extras, says Valeri. These include the outputs from scrambled control tests, where BioAutoMATED generates sequences by shuffling the order of nucleotides, answering the frequently asked question of whether models are picking up on real order- and sequence-specific biology.
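The idea behind a scrambled control can be sketched in a few lines of Python. This is only an illustration of the general technique, not BioAutoMATED's own code: the function names `scramble_sequence` and `scrambled_controls` and the toy sequences are hypothetical. Each control keeps the nucleotide composition of its source sequence but destroys the order, so a model that scores just as well on the controls is probably not learning order-specific biology.

```python
import random
from collections import Counter


def scramble_sequence(seq: str, rng: random.Random) -> str:
    """Return seq with its characters in random order (composition preserved)."""
    chars = list(seq)
    rng.shuffle(chars)
    return "".join(chars)


def scrambled_controls(seqs, seed=0):
    """Build one shuffled control per input sequence, reproducibly."""
    rng = random.Random(seed)
    return [scramble_sequence(s, rng) for s in seqs]


# Hypothetical toy input; real datasets would hold thousands of sequences.
sequences = ["ATGCGTAC", "GGGTTTAA", "ACACACGT"]
controls = scrambled_controls(sequences, seed=42)

for original, control in zip(sequences, controls):
    # Same letters, same length, but (usually) a different order.
    print(original, "->", control, Counter(original) == Counter(control))
```

Training the same model on the controls and comparing its performance with the model trained on the real sequences gives the comparison the scrambled test is meant to provide.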
"Half of the battle in biological research is knowing how to ask the right questions," says Soenksen. The platform helps users do that and also provides insights leading to new questions, hypotheses, models, and experiments.
Users can also opt for data saturation tests, where BioAutoMATED sequentially reduces the dataset size to see the effect on model performance, Valeri says. "If you can say the models do great with 20,000 sequences, maybe you don't have to go to the effort of collecting 50,000 or 100,000 sequences, which is a real impactful finding for a biologist actually doing the experiments."
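A data saturation test of this kind can be approximated as follows. This is a minimal sketch under stated assumptions: the toy model (a least-squares line on GC content), the synthetic data, and the names `saturation_curve` and `fit_and_score` are all invented for illustration and are not part of BioAutoMATED. The held-out test set stays fixed while the training set shrinks, so changes in the score reflect training-set size alone.

```python
import random


def gc_fraction(seq):
    """Fraction of G and C characters in a sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def fit_and_score(train, test):
    """Toy stand-in model: fit y = a * gc + b by least squares, return test MSE."""
    xs = [gc_fraction(s) for s, _ in train]
    ys = [y for _, y in train]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    var = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var if var > 0 else 0.0
    b = my - a * mx
    return sum((a * gc_fraction(s) + b - y) ** 2 for s, y in test) / len(test)


def saturation_curve(data, fractions, score_fn, test_fraction=0.2, seed=0):
    """Score the model on a fixed test set while the training set shrinks."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    test, train = shuffled[:n_test], shuffled[n_test:]
    results = []
    for frac in fractions:
        n = max(2, int(len(train) * frac))  # smaller sets are subsets of larger ones
        results.append((n, score_fn(train[:n], test)))
    return results


# Synthetic (sequence, measurement) pairs; the signal is GC content plus noise.
data_rng = random.Random(1)
data = []
for _ in range(500):
    seq = "".join(data_rng.choice("ACGT") for _ in range(30))
    data.append((seq, 2.0 * gc_fraction(seq) + data_rng.gauss(0, 0.1)))

curve = saturation_curve(data, [1.0, 0.5, 0.25, 0.1], fit_and_score)
for n_train, mse in curve:
    print(f"{n_train:4d} training sequences -> test MSE {mse:.4f}")
```

If the error barely moves between the largest and the smaller training sets, the model has likely saturated and collecting more sequences may not be worth the effort.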
Two of the most exciting outputs from the tool, in Valeri's mind, are the interpretation and design results. Interpretation results indicate what a model is learning (e.g., nucleotides of elevated importance), including sequence logos in which the larger the letter in the sequence, the more important it is to whatever function of interest is being examined. Sequence logos of the raw data can also be generated to facilitate comparisons across ML tools.
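For a sense of how letter sizes in a raw-data sequence logo can be derived, the sketch below uses the classic information-content scaling, where each letter's height is its frequency at a position multiplied by that position's information in bits. This is an assumption about the general technique, not a description of how BioAutoMATED computes its logos; model-derived importance scores may be scaled differently. The function name `logo_heights` and the aligned toy sequences are hypothetical, and no small-sample correction is applied.

```python
import math
from collections import Counter


def logo_heights(seqs, alphabet="ACGT"):
    """Per-position letter heights in bits: frequency * (log2|alphabet| - entropy)."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    max_bits = math.log2(len(alphabet))
    heights = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        total = sum(counts[a] for a in alphabet)
        if total == 0:
            raise ValueError(f"no alphabet characters at position {i}")
        freqs = {a: counts[a] / total for a in alphabet}
        entropy = -sum(p * math.log2(p) for p in freqs.values() if p > 0)
        info = max_bits - entropy  # 0 bits = uninformative, 2 bits = fully conserved DNA
        heights.append({a: freqs[a] * info for a in alphabet})
    return heights


# Hypothetical aligned sequences of equal length.
aligned = ["AGGAGG", "AGGAGA", "AGGTGG", "AGGCGG"]
heights = logo_heights(aligned)

for pos, column in enumerate(heights):
    tallest = max(column, key=column.get)
    print(pos, tallest, round(column[tallest], 3))
```

Fully conserved positions get the tallest letters, while variable positions shrink toward zero, which is what makes recurring motifs stand out visually.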
"Biologists using BioAutoMATED in this way can expect some actionable outputs," says Valeri. They might want to pay more attention to a motif that pops up through all these sequence logos, for example, or do a deep mutational scanning of a targeted region of the sequence that appears to be most important.
The other key output is a list of de novo design sequences that are optimized for whatever function the model has been trained on, she says. For the newly published study, this focused on the downstream efficiency of a ribosome binding site to translate RNA into protein in E. coli bacteria.
BioAutoMATED was also used to identify areas of the sequence most important in determining translation efficiency, and to design new sequences that could be tested experimentally. Further, the platform generated highly accurate information about the amino acids in a peptide sequence most critical in determining an antibody's ability to bind to the drug ranibizumab (Lucentis), and classified different types of glycans into immunogenic and non-immunogenic groups based on their sequences.
Finally, the team had the platform optimize the sequences of RNA-based toehold switches. This informed the design of new toehold switches for experimental testing with minimal input coding required.
The time it takes to obtain results from BioAutoMATED depends on several factors, including the question being asked and the size of the dataset for model training, says Valeri. "We've found the length of the sequence is a really big factor... and the compute resources you have available."
The maximum user-allowed time for obtaining results is another important consideration, adds Soenksen. The platform can search for hours or days, as circumstances dictate. Time constraints are routinely employed when training ML models as a matter of practicality.
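A time-limited model search of the kind described here can be sketched generically. The example below is a plain random search with a wall-clock budget; it is not how AutoKeras, DeepSwarm, or TPOT implement their searches, and the names `timed_search`, `sample_candidate`, and `evaluate` as well as the toy scoring function are hypothetical.

```python
import random
import time


def timed_search(evaluate, sample_candidate, max_seconds, seed=0):
    """Random search that stops once the wall-clock budget is spent.

    Always evaluates at least one candidate; returns (best, best_score, n_tried).
    """
    rng = random.Random(seed)
    deadline = time.monotonic() + max_seconds
    best, best_score, n_tried = None, float("-inf"), 0
    while True:
        candidate = sample_candidate(rng)
        score = evaluate(candidate)
        n_tried += 1
        if score > best_score:
            best, best_score = candidate, score
        if time.monotonic() >= deadline:
            break
    return best, best_score, n_tried


def sample_candidate(rng):
    """Hypothetical search space: a single integer hyperparameter."""
    return {"depth": rng.randint(1, 10)}


def evaluate(candidate):
    """Toy score that peaks at depth 4; stands in for training and validating a model."""
    return -((candidate["depth"] - 4) ** 2)


best, best_score, n_tried = timed_search(evaluate, sample_candidate, max_seconds=0.05)
print(best, best_score, n_tried)
```

A longer budget lets the search try more candidates, which is the trade-off between run time and model quality that the user controls.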
Soenksen and Valeri both use BioAutoMATED as a benchmark for their own custom-built models, and friends that have tested the platform on different machines are enthusiastic about its potential, they say. In the manuscript, the platform also had good performance on many different datasets, including ones specific to sequence lengths and types.
"I have personally used it for some quick paper explorations, trying to see what data are available... [without] having to take the time to code up my own machine learning models," says Valeri. Although it is too soon to know how the tool will be used by biologists elsewhere, it is already being used regularly by a handful of scientists at Harvard investigating short DNA, RNA, peptide, and glycan sequences.
BioAutoMATED is available to download from GitHub. "If we get a lot of traction [with it], and I think we will, our team will probably put more resources into the user interface," notes Soenksen, a serial entrepreneur in the science and technology space. The long-term goal is to make the tool usable by clicking buttons to further lower barriers to access.
"If you're a machine learning expert, you'll probably be able to beat the output of BioAutoMATED," adds Valeri. "We are just trying to make it easy for people with limited machine learning expertise to [quickly] get to a pretty good model."
Complicated neural networks and big language models, which have a lot of parameters and require large amounts of data, are not always best, she says. The simple-model techniques identified by TPOT can be quite well suited to the often-limited datasets biologists have available and can perform as well as if not better than systems with more advanced ML architecture.
Continue reading here:
Platform Reduces Barriers Biologists Face In Accessing Machine ... - Bio-IT World