HMUSA 2022 Conference: Using Machine Learning to Understand Production – From Tools to Finished Parts – Today’s Medical Developments

About the presentation

Modern data science techniques are used across many industries, but the manufacturing industry has been slow to adopt advanced analytics to optimize and accelerate operations. There are many reasons for this lack of adoption, including a shortage of data science and analytics talent and an attitude of "if it's not broke, don't fix it." However, forward-thinking manufacturers are starting to embrace the power of big data and analytics to improve their operations and create a nimbler work environment. Several companies have stepped into the analytics gap, offering software and hardware solutions that provide more data and actionable insights into everything from toolpath optimization to overall production intelligence in real time. With the amount of information produced by CNC machines and advanced toolpath optimization solutions, there's a big opportunity to use machine learning and advanced analytics to develop deeper insights that span from tooling to the machine, and all the way to overall factory performance. In this session, Greg McHale, CTO and co-founder of Datanomix, and Rob Caron, founder and CEO of Caron Engineering, discuss the types of data that are available from these manufacturing optimization solutions and the possibilities for driving deep insights into overall factory performance using advanced data science and machine learning.

Meet your presenter

Greg McHale founded Datanomix on the premise that the 4th industrial revolution would require turnkey products that integrate seamlessly with how manufacturers work today. He brings enterprise data skills to a market ripe for innovation. McHale held engineering leadership positions at several venture-backed companies and is a graduate of Worcester Polytechnic Institute.

In 1986, after working in the CNC machining industry for many years, company President Rob Caron, P.E., identified a gap in the market when it came to CNC tool monitoring and machine control. He decided to start his own company in the basement of his Wells, Maine-based home to pursue the development of these conceptual technologies. Today, Caron Engineering has a new 12,000 ft² facility with more than 35 employees and a dynamic product line of smart manufacturing solutions that are unmatched in the industry.

About the companies

Founded in 2016 in New Hampshire, Datanomix offers automated production intelligence for precision manufacturers. When we started Datanomix, we met with dozens of manufacturers who were trying to use data from their equipment to optimize operations. Not one company was getting what they wanted out of their existing monitoring systems: information was either too complicated and cumbersome, or too simple and not insightful. The user interfaces on those systems made it look like those companies didn't understand manufacturers. Based on their input, we built a system designed with a few key principles:
- The system would require no human input for the data to be useful
- The information provided by the system would be actionable right now
- The system should be a member of your team, capable of providing answers and insights in the way you think about your business

At Caron Engineering, our mission is to transcend the industry standard by developing advanced sensor and monitoring technology to optimize performance, productivity, and profitability. As an employee-owned entity (ESOP), we work together to bring the best possible service and quality to our customers. It's our goal to provide leading smart manufacturing solutions to reduce cycle times, promote unattended operation, drive down tooling costs, and minimize expensive damage to machines and work-holding. Our people are in the business of developing adaptive solutions for the future of manufacturing through strong leadership, foresight, and diligence. At Caron Engineering, innovation is what drives us.

View original post here:
HMUSA 2022 Conference: Using Machine Learning to Understand Production - From Tools to Finished Parts - Today's Medical Developments

Autonomous Experiments in the Age of Computing, Machine Learning and Automation: Progress and Challenges – Argonne National Laboratory

Abstract: Machine learning has by now become a widely used tool within materials science, spawning entirely new fields such as materials informatics that seek to accelerate the discovery and optimization of material systems through both experiments and computational studies. Similarly, the increasing use of robotic systems has led to the emergence of autonomous systems ranging from chemical synthesis to personal vehicles, which has spurred the scientific community to investigate these directions for their own tasks. This begs the question: when will mainstay scientific synthesis and characterization tools, such as electron and scanning probe microscopes, start to perform experiments autonomously?

In this talk, I will discuss the history of how machine learning, automation, and the availability of compute have led to nascent autonomous microscopy platforms at the Center for Nanophase Materials Sciences. I will illustrate the challenges of making autonomous experiments happen, as well as the necessity for data, computation, and abstractions to fully realize the potential these systems can offer for scientific discovery. I will then focus on our work on reinforcement learning as a tool that can be leveraged to facilitate autonomous decision making to optimize material characterization (and material properties) on the fly, on a scanning probe microscope. Finally, some workflow and data infrastructure issues will also be discussed. This research was conducted at and supported by the Center for Nanophase Materials Sciences, a US DOE Office of Science User Facility.

Original post:
Autonomous Experiments in the Age of Computing, Machine Learning and Automation: Progress and Challenges - Argonne National Laboratory

Deep learning pioneer Geoffrey Hinton receives prestigious Royal Medal from the Royal Society – University of Toronto

The University of Toronto's Geoffrey Hinton has been honoured with the Royal Society's prestigious Royal Medal for his pioneering work in deep learning, a field of artificial intelligence that mimics the way humans acquire certain types of knowledge.

The U.K.'s national academy of sciences said it is recognizing Hinton, a University Professor Emeritus in the department of computer science in the Faculty of Arts & Science, for "pioneering work on algorithms that learn distributed representations in artificial neural networks and their application to speech and vision, leading to a transformation of the international information technology industry."

It's the latest in a long list of accolades for Hinton, who is also chief scientific adviser at the Vector Institute for Artificial Intelligence and a vice-president and engineering fellow at Google. Others include the Association for Computing Machinery's A. M. Turing Award, widely considered the Nobel Prize of computing.

"It is a great honour to receive the Royal Medal, a medal previously awarded to intellectual giants like Darwin, Faraday, Boole and G.I. Taylor," Hinton says.

"But unlike them, my success was the result of recruiting and nurturing an extraordinarily talented set of graduate students and post-docs who were responsible for many of the breakthroughs in deep learning that revolutionized artificial intelligence over the last 15 years."

Royal Medals have been awarded annually since 1826 for advancements in the physical and biological sciences. A third medal, for applied sciences, has been awarded since 1965.

Previous U of T winners of the Royal Medal include Anthony Pawson and Nobel Prize-winner John Polanyi.

Hinton, meanwhile, has been a Fellow of the Royal Society since 1998 and a Fellow of the Royal Society of Canada since 1996.

"The Royal Medal is one of the most significant acknowledgements of an individual's research and career," says Melanie Woodin, dean of the Faculty of Arts & Science. "And Professor Hinton is truly deserving of the distinction for his foundational research and for the exceptional contribution he's made toward shaping the modern world and the future. I am thrilled to congratulate him on this award."

"I want to congratulate Geoff on this spectacular achievement," adds Eyal de Lara, chair of the department of computer science. "We are very proud of the seminal contributions he has made to the field of computer science, which are fundamentally reshaping our discipline and impacting society at large."

Deep learning is a type of machine learning that relies on a neural network modelled on the network of neurons in the human brain. In 1986, Hinton and his collaborators developed the breakthrough approach based on the backpropagation algorithm, a central mechanism by which artificial neural networks learn, that would realize the promise of neural networks and form the current foundation of that technology.

Hinton and his colleagues in Toronto built on that initial work with a number of critical developments that enhanced the potential of AI and helped usher in today's revolution in deep learning with applications in speech and image recognition, self-driving vehicles, automated diagnosis of images and language, and more.

"I believe that the spectacular recent progress in large language models, image generation and protein structure prediction is evidence that the deep learning revolution has only just started," Hinton says.

Go here to see the original:
Deep learning pioneer Geoffrey Hinton receives prestigious Royal Medal from the Royal Society - University of Toronto

Best Machine Learning Books to Read This Year [2022 List] – CIO Insight


Machine learning (ML) books are a valuable resource for IT professionals looking to expand their ML skills or pursue a career in machine learning. In turn, this expertise helps organizations automate and optimize their processes and make data-driven decisions. Machine learning books can help ML engineers learn a new skill or brush up on old ones.

Beginners and seasoned experts alike can benefit from adding machine learning books to their reading lists, though the right book depends on the learner's goals. Some books serve as an entry point to the world of machine learning, while others build on existing knowledge.

The books in this list are roughly ranked in order of difficulty: beginners should avoid pursuing the books toward the end until they've mastered the concepts introduced in the books at the top of the list.

Machine Learning for Absolute Beginners is an excellent introduction to the machine learning field of study. It's a clear and concise overview of the high-level concepts that drive machine learning, so it's ideal for beginners. The e-book format has free downloadable resources, code exercises, and video tutorials to satisfy a variety of learning styles.

Readers will learn the basic ML libraries and other tools needed to build their first model. In addition, this book covers data scrubbing techniques, data preparation, regression analysis, clustering, and bias/variance. This book may be a bit too basic for readers who are interested in learning more about coding, deep learning, or other advanced skills.

As the name implies, The Hundred-Page Machine Learning Book provides a brief overview of machine learning and the mathematics involved. Its suitable for beginners, but some knowledge of probability, statistics, and applied mathematics will help readers get through the material faster.

The book covers a broad range of ML topics at a high level and focuses on the aspects of ML that are of significant practical value. These include:

Several reviewers said that the text explains complicated topics in a way that is easy for most readers to understand. It doesn't dive into any one topic too deeply, but it provides several practice exercises and links to other resources for further reading.

Introduction to Machine Learning with Python is a starting point for aspiring data scientists who want to learn about machine learning through Python frameworks. It doesn't require any prior knowledge of machine learning or Python, though familiarity with the NumPy and matplotlib libraries will enhance the learning experience.

In this book, readers will gain a foundational understanding of machine learning concepts and the benefits and drawbacks of using standard ML algorithms. It also explains how all of the algorithms behind various Python libraries fit together in a way that's easy to understand for even the most novice learners.

Python Machine Learning by Example builds on existing machine learning knowledge for engineers who want to dive deeper into Python programming. Each chapter demonstrates the practical application of common Python ML skills through concrete examples. These skills include:

This book walks through each problem with a step-by-step guide for implementing the right Python technique. Readers should have prior knowledge of both machine learning and Python, and some reviewers recommended supplementing this guide with more theoretical reference materials for advanced comprehension.

Hands-on Machine Learning with Scikit-Learn, Keras & TensorFlow provides a practical introduction to machine learning with a focus on three Python frameworks. Readers will gain an understanding of numerous machine learning concepts and techniques, including linear regression, neural networks, and deep learning. Then, readers can apply what they learn to practical exercises throughout the book.

Though this book is marketed toward beginners, some reviewers said it requires a basic understanding of machine learning principles. With this in mind, it may be better suited for readers who want to refresh their existing knowledge through concrete examples.

Machine Learning for Hackers is written for experienced programmers who want to maximize the impact of their data. The text builds on existing knowledge of the R programming language to create basic machine learning algorithms and analyze datasets.

Each chapter walks through a different machine learning challenge to illustrate various concepts. These include:

This book is best suited for intermediate learners who are fluent in R and want to learn more about the practical applications of machine learning code. Students looking to delve into machine learning theory should opt for a more advanced book like Deep Learning, Hands-on Machine Learning, or Mathematics for Machine Learning.

Pattern Recognition and Machine Learning is an excellent reference for understanding statistical methods in machine learning. It provides practical exercises to introduce the reader to comprehensive pattern recognition techniques.

The text is broken into chapters that cover the following concepts:

Readers should have a thorough understanding of linear algebra and multivariable calculus, so it may be too advanced for beginners. Familiarity with basic probability theory, decision theory, and information theory will make the material easier to understand as well.

Mathematics for Machine Learning teaches the fundamental mathematical concepts necessary for machine learning. These topics include:

Some reviewers said this book leans more into mathematical theorems than practical application, so it's not recommended for those without prior experience in applied mathematics. However, it's one of the few resources that bridge the gap between mathematics and machine learning, so it's a worthwhile investment for intermediate learners.

For advanced learners, Deep Learning covers the mathematics and concepts that power deep learning, a subset of machine learning that makes human-like decisions. This book walks through deep learning computations, techniques, and research including:

There are about 30 pages that cover practical applications of deep learning like computer vision and natural language processing, but the majority of the book deals with the theory behind deep learning. With this in mind, readers should have a working knowledge of machine learning concepts before delving into this text.

Read next: Ultimate Machine Learning Certification Guide

Read the original here:
Best Machine Learning Books to Read This Year [2022 List] - CIO Insight

PhD Position – Machine learning to increase geothermal energy efficiency, Karlsruhe Institute – ThinkGeoEnergy

The Karlsruhe Institute of Technology in Germany has an open PhD position for a project that will use machine learning to model scaling formation in cascade geothermal operations.

The Karlsruhe Institute of Technology (KIT) in Germany currently has an open PhD position in the upcoming Machine Learning for Enhancing Geothermal energy production (MALEG) project. Interested applicants may visit the official KIT page for more details on the application. Submissions will be accepted only until September 30, 2022.

The target of the MALEG project is the design and optimization of cascade production schemes aiming for the highest possible energy output in geothermal energy facilities by preventing scaling. The enhanced scaling potential of lower return temperatures is one key challenge as geothermal cascade use becomes a more common strategy to increase efficiency.

The research will focus on the development of a machine learning tool to quantify the impact of the enhanced cooling on the fluid-mineral equilibrium and to optimize the operations economically. The tool will be based on results from widely applied deterministic models and experimental data collected at geothermal plants in Germany, Austria, and Turkey by our international project partners. Once fully implemented, the MALEG tool will work as a digital twin of the power plant, ready to assess and predict scaling formation processes for geothermal production from different geological settings.

The ideal candidate should hold a master's degree in geosciences or geophysics with a sound interest in aqueous geochemistry and experience in numerical modeling.

Source: Karlsruhe Institute of Technology

Read the original:
PhD Position - Machine learning to increase geothermal energy efficiency, Karlsruhe Institute - ThinkGeoEnergy

AI and Machine Learning in Finance: How Bots are Helping the Industry – ReadWrite

Artificial intelligence and ML are making considerable inroads in finance. They are a critical aspect of various financial applications, including evaluating risks, managing assets, calculating credit scores, and approving loans.

Businesses use AI and ML:

Taking the above points into account, it's no wonder that companies like Forbes and VentureBeat are using AI to predict cash flow and detect fraud.

In this article, we present the financial domain areas in which AI and ML have a more significant impact. We'll also discuss why financial companies should care about and implement these technologies.

Machine learning is a branch of artificial intelligence that allows systems to learn and improve without explicit programming. Simply put, data scientists train the ML model with existing data sets, and the model automatically adjusts its parameters to improve the outcome.

According to Statista, digital payments are expected to show an annual growth rate of 12.77% and grow to 20% by 2026. This vast volume of global revenue generated online requires an intelligent fraud-detection system.

Source: Mordor Intelligence

Traditionally, to check the authenticity of users, fraud-detection systems analyze transactions using factors like location, merchant ID, the amount spent, etc. However, while this method is appropriate for a few transactions, it cannot cope with increased transaction volumes.

Given the surge in digital payments, businesses can't rely on traditional fraud-detection methods to process payments. This has given rise to AI-based systems with advanced features.

An AI and ML-powered payment gateway will look at various factors to evaluate the risk score. These technologies consider a large volume of data (location of the merchant, time zone, IP address, etc.) to detect unexpected anomalies, and verify the authenticity of the customer.

Additionally, the finance industry, through AI, can process transactions in real-time, allowing the payment industry to process large transactions with high accuracy and low error rates.
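As a rough illustration of how such risk scoring can work, the sketch below trains a classifier on a handful of hypothetical transaction features (amount, hour of day, whether the IP country matches the billing country) and scores an incoming payment. The data, feature set, and model choice are illustrative assumptions, not those of any real payment gateway.

```python
# Minimal sketch (not any vendor's actual system): scoring a transaction's
# fraud risk with a model trained on labeled historical transactions.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical historical transactions with a fraud label.
history = pd.DataFrame({
    "amount":     [25.0, 1200.0, 8.5, 950.0, 40.0, 3100.0],
    "hour":       [14,   3,      10,  2,     18,   4],
    "ip_matches": [1,    0,      1,   0,     1,    0],
    "is_fraud":   [0,    1,      0,   1,     0,    1],
})

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(history[["amount", "hour", "ip_matches"]], history["is_fraud"])

# Score an incoming transaction in real time: the predicted probability
# acts as the risk score a gateway could threshold or route for review.
incoming = pd.DataFrame({"amount": [2700.0], "hour": [3], "ip_matches": [0]})
risk_score = model.predict_proba(incoming)[0, 1]
print(f"fraud risk score: {risk_score:.2f}")
```

In practice the feature set would be far richer (device fingerprints, merchant history, velocity counts), but the pattern of training on labeled history and scoring each transaction as it arrives is the same.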

The financial sector, including banks, trading firms, and other fintech companies, is using AI to reduce operational costs, improve productivity, enhance the user experience, and improve security.

The benefits of AI and ML revolve around their ability to work with various datasets. So let's have a quick look at some other ways AI and ML are making inroads into this industry:

Considering how much people are investing in automation, AI significantly impacts the payment landscape. It improves efficiency and helps businesses rethink and reconstruct their processes. For example, businesses can use AI to decrease credit card processing time, increase automation, and seamlessly improve cash flow.

You can use AI and machine learning for credit and lending predictions, security, trading, banking, and process optimization.

Human error has always been a huge problem; machine learning models can reduce the errors that arise when humans perform repetitive tasks.

Incorporating security and ease of use is a challenge that AI can help the payment industry overcome. Merchants and clients want a payment system that is easy to use and authentic.

Until now, customers have had to perform various actions to authenticate themselves and complete a transaction. With AI, payment providers can streamline transactions while keeping risk low for customers.

AI can efficiently perform high-volume, labor-intensive tasks like scraping and formatting data. AI-based businesses are also focused and efficient; they have minimal operational costs and can be used in areas like:

Creating more Value:

AI and machine learning models can generate more value for their customers. For instance:

Improved customer experience: Using bots, financial sectors like banks can eliminate the need to stand in long queues. Payment gateways can automatically reach new customers by gathering their historical data and predicting user behavior. Besides, AI used in credit scoring helps detect fraudulent activity.

There are various ways in which machine learning and artificial intelligence are being employed in the finance industry. Some of them are:

Process Automation:

Process automation is one of the most common applications as the technology helps automate manual and repetitive work, thereby increasing productivity.

Moreover, AI and ML can easily access data, follow and recognize patterns and interpret the behavior of customers. This could be used for the customer support system.

Minimizing Debit and Credit Card Frauds:

Machine learning algorithms help detect transactional fraud by analyzing various data points that mostly go unnoticed by humans. ML also reduces the number of false rejections and improves real-time approvals by gauging the client's behavior on the Internet.

Apart from spotting fraudulent transactions, AI-powered technology is used to identify suspicious account behavior in real time. Today, banks already have monitoring systems trained on historical payment data.

Reducing False Card Declines:

Payment transactions declined at checkout can be frustrating for customers and carry huge repercussions for banks and their reputations. Card transactions are declined when a transaction is flagged as fraud or the payment amount crosses a limit. AI-based systems are used to identify and resolve such transaction issues.

The influx of AI in the financial sector has raised new concerns about its transparency and data security. Companies must be aware of these challenges and follow safeguard measures:

One of the main challenges of AI in finance is that much of the data gathered is confidential and sensitive. The right data partner will offer various security options, adhere to standards, and protect data with the appropriate certifications and regulations.

Creating AI models in finance that provide accurate predictions is only successful if they can be explained to and understood by clients. In addition, since customers' information is used to develop such models, they want to ensure that their personal information is collected, stored, and handled securely.

So, it is essential to maintain transparency and trust in the finance industry to make customers feel safe with their transactions.

Apart from simply implementing AI in the online finance industry, the industry leaders must be able to adapt to the new working models with new operations.

Financial institutions often work with substantial unorganized data sets in vertical silos. Also, connecting dozens of data pipeline components and tons of APIs on top of security to leverage a silo is not easy. So, financial institutions need to ensure that their gathered data is appropriately structured.

AI and ML are undoubtedly the future of the financial sector; the vast volume of processes, transactions, data, and interactions involved makes them ideal for various applications. By incorporating AI, the finance sector will gain vast data-processing capabilities at the best prices, while clients will enjoy an enhanced customer experience and improved security.

Of course, the power of AI within transaction banking can only be realized to the extent the organization actually uses it. Today, AI is very much a work in progress, but its challenges can be addressed by using the technology thoughtfully. Lastly, AI will be the future of finance; you must be ready to embrace its revolution.

Featured Image Credit: Photo by Anna Nekrashevich; Pexels; Thank you!

Read the original:
AI and Machine Learning in Finance: How Bots are Helping the Industry - ReadWrite

Are You Making These Deadly Mistakes With Your AI Projects? – Forbes

Since data is at the heart of AI, it should come as no surprise that AI and ML systems need enough good quality data to learn. In general, a large volume of good quality data is needed, especially for supervised learning approaches, in order to properly train the AI or ML system. The exact amount of data needed may vary depending on which pattern of AI you're implementing, the algorithm you're using, and other factors such as in-house versus third-party data. For example, neural nets need a lot of data to be trained, while decision trees or Bayesian classifiers don't need as much data to still produce high quality results.

So you might think more is better, right? Well, think again. Organizations with lots of data, even exabytes, are realizing that having more data is not the solution to their problems as they might expect. Indeed, more data, more problems. The more data you have, the more data you need to clean and prepare. The more data you need to label and manage. The more data you need to secure, protect, mitigate bias, and more. Small projects can rapidly turn into very large projects when you start multiplying the amount of data. In fact, many times, lots of data kills projects.

Clearly the missing step between identifying a business problem and getting the data squared away to solve that problem is determining which data you need and how much of it you really need. You need enough, but not too much. Goldilocks data is what people often say: not too much, not too little, but just right. Unfortunately, far too often, organizations are jumping into AI projects without first addressing an understanding of their data. Questions organizations need to answer include figuring out where the data is, how much of it they already have, what condition it is in, what features of that data are most important, use of internal or external data, data access challenges, requirements to augment existing data, and other crucial factors and questions. Without these questions answered, AI projects can quickly die, even drowning in a sea of data.

Getting a better understanding of data

In order to understand just how much data you need, you first need to understand how and where data fits into the structure of AI projects. One visual way of understanding the increasing levels of value we get from data is the DIKUW pyramid (sometimes also referred to as the DIKW pyramid) which shows how a foundation of data helps build greater value with Information, Knowledge, Understanding and Wisdom.

DIKW pyramid

With a solid foundation of data, you can gain additional insights at the next information layer, which helps you answer basic questions about that data. Once you have made basic connections between data to gain informational insight, you can find patterns in that information to gain an understanding of how various pieces of information are connected together for greater insight. Building on the knowledge layer, organizations can get even more value from understanding why those patterns are happening, providing an understanding of the underlying patterns. Finally, the wisdom layer is where you can gain the most value from information by providing insights into the cause and effect of information decision making.

This latest wave of AI focuses most on the knowledge layer, since machine learning provides insight on top of the information layer to identify patterns. Unfortunately, machine learning reaches its limits in the understanding layer, since finding patterns isn't sufficient for reasoning. We have machine learning, but not the machine reasoning required to understand why the patterns are happening. You can see this limitation in effect any time you interact with a chatbot. While machine learning-enabled NLP is really good at understanding your speech and deriving intent, it runs into limitations when trying to understand and reason. For example, if you ask a voice assistant if you should wear a raincoat tomorrow, it doesn't understand that you're asking about the weather. A human has to provide that insight to the machine because the voice assistant doesn't know what rain actually is.

Avoiding Failure by Staying Data Aware

Big data has taught us how to deal with large quantities of data. Not just how it's stored, but how to process, manipulate, and analyze all that data. Machine learning has added more value by being able to deal with the wide range of different types of unstructured, semi-structured or structured data collected by organizations. Indeed, this latest wave of AI is really the big data-powered analytics wave.

But it's exactly for this reason that some organizations are failing so hard at AI. Rather than running AI projects with a data-centric perspective, they are focusing on the functional aspects. To get a handle on their AI projects and avoid deadly mistakes, organizations need a better understanding not only of AI and machine learning but also of the Vs of big data. It's not just about how much data you have, but also the nature of that data. Some of those Vs of big data include:

With decades of experience managing big data projects, organizations that are successful with AI are primarily successful with big data. The ones that are seeing their AI projects die are the ones who are coming at their AI problems with application development mindsets.

Too Much of the Wrong Data, and Not Enough of the Right Data is Killing AI Projects

While AI projects start off on the right foot, the lack of the necessary data and the lack of understanding and then solving real problems are killing AI projects. Organizations are powering forward without actually having a real understanding of the data that they need and the quality of that data. This poses real challenges.

One of the reasons why organizations are making this data mistake is that they are running their AI projects without any real approach to doing so, other than using Agile or app dev methods. However, successful organizations have realized that using data-centric approaches focus on data understanding as one of the first phases of their project approaches. The CRISP-DM methodology, which has been around for over two decades, specifies data understanding as the very next thing to do once you determine your business needs. Building on CRISP-DM and adding Agile methods, the Cognitive Project Management for AI (CPMAI) Methodology requires data understanding in its Phase II. Other successful approaches likewise require a data understanding early in the project, because after all, AI projects are data projects. And how can you build a successful project on a foundation of data without running your projects with an understanding of data? That's surely a deadly mistake you want to avoid.

See the original post here:
Are You Making These Deadly Mistakes With Your AI Projects? - Forbes

Prediction of mortality risk of health checkup participants using machine learning-based models: the J-SHC study | Scientific Reports – Nature.com

Participants

This study was conducted as part of the ongoing Study on the Design of a Comprehensive Medical System for Chronic Kidney Disease (CKD) Based on Individual Risk Assessment by Specific Health Examination (J-SHC Study). A specific health checkup is conducted annually for all residents aged 40–74 years covered by the National Health Insurance in Japan. In this study, a baseline survey was conducted in 685,889 people (42.7% males, aged 40–74 years) who participated in specific health checkups from 2008 to 2014 in eight regions (Yamagata, Fukushima, Niigata, Ibaraki, Toyonaka, Fukuoka, Miyazaki, and Okinawa prefectures). The details of this study have been described elsewhere [11]. Of the 685,889 baseline participants, 169,910 were excluded from the study because baseline data on lifestyle information or blood tests were not available. In addition, 399,230 participants with a survival follow-up of fewer than 5 years from the baseline survey were excluded. Therefore, 116,749 patients (42.4% men) with a known 5-year survival or mortality status were included in this study.

This study was conducted in accordance with the Declaration of Helsinki guidelines. This study was approved by the Ethics Committee of Yamagata University (Approval No. 2008103). All data were anonymized before analysis; therefore, the ethics committee of Yamagata University waived the need for informed consent from study participants.

For the validation of a predictive model, the most desirable approach is a prospective study on unknown data. In this study, data on health checkup dates were available. Therefore, we divided the total data into training and test datasets to build and test predictive models based on health checkup dates. The training dataset consisted of 85,361 participants who participated in the study in 2008. The test dataset consisted of 31,388 participants who participated in this study from 2009 to 2014. These datasets were temporally separated, and there were no overlapping participants. This method evaluates the model in a manner similar to a prospective study and has the advantage of demonstrating temporal generalizability. For preprocessing, clipping was performed for the 0.01% most extreme values, and normalization was performed.
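A minimal sketch of this preprocessing is shown below. It assumes hypothetical column names and interprets the 0.01% clipping as trimming the extreme tails; it is illustrative only and not the study's actual code.

```python
# Illustrative preprocessing sketch: temporal train/test split, outlier
# clipping, and normalisation using statistics from the training data only.
import pandas as pd

data = pd.read_csv("checkup_data.csv")          # hypothetical file
train = data[data["checkup_year"] == 2008].copy()
test  = data[data["checkup_year"] >= 2009].copy()

numeric_cols = ["sbp", "dbp", "hba1c", "egfr"]  # example variable names
# Clip the extreme 0.01% tails, with limits learned from the training set.
lower = train[numeric_cols].quantile(0.0001)
upper = train[numeric_cols].quantile(0.9999)
train[numeric_cols] = train[numeric_cols].clip(lower, upper, axis=1)
test[numeric_cols]  = test[numeric_cols].clip(lower, upper, axis=1)

# Normalise (z-score) with training means and standard deviations.
mean, std = train[numeric_cols].mean(), train[numeric_cols].std()
train[numeric_cols] = (train[numeric_cols] - mean) / std
test[numeric_cols]  = (test[numeric_cols] - mean) / std
```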

Information on 38 variables was obtained during the baseline survey of the health checkups. When there were highly correlated variables (correlation coefficient greater than 0.75), only one of these variables was included in the analysis. High correlations were found between body weight, abdominal circumference, body mass index, hemoglobin A1c (HbA1c), fasting blood sugar, and AST and alanine aminotransferase (ALT) levels. We then used body weight, HbA1c level, and AST level as explanatory variables. Finally, we used the following 34 variables to build the prediction models: age, sex, height, weight, systolic blood pressure, diastolic blood pressure, urine glucose, urine protein, urine occult blood, uric acid, triglycerides, high-density lipoprotein cholesterol (HDL-C), LDL-C, AST, γ-glutamyl transpeptidase (γ-GTP), estimated glomerular filtration rate (eGFR), HbA1c, smoking, alcohol consumption, medication (for hypertension, diabetes, and dyslipidemia), history of stroke, heart disease, and renal failure, weight gain (more than 10 kg since age 20), exercise (more than 30 min per session, more than 2 days per week), walking (more than 1 h per day), walking speed, eating speed, supper within 2 h before bedtime, skipping breakfast, late-night snacks, and sleep status.
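Continuing the sketch above, the correlation screen could be implemented along the following lines (column names remain hypothetical): among pairs of variables with a correlation above 0.75, only one representative is kept.

```python
# Illustrative correlation-based variable screening (|r| > 0.75).
import numpy as np

corr = train[numeric_cols].corr().abs()
# Keep only the upper triangle so each pair is considered once.
upper_tri = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [col for col in upper_tri.columns if (upper_tri[col] > 0.75).any()]
selected_cols = [c for c in numeric_cols if c not in to_drop]
print("dropped due to high correlation:", to_drop)
```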

The values of each item in the training dataset were compared between the alive and deceased groups using the chi-square test, Student's t-test, and Mann–Whitney U test, and significant differences (P < 0.05) were marked with an asterisk (*) (Supplementary Tables S1 and S2).

We used two machine learning-based methods (gradient boosting decision tree [XGBoost], neural network) and one conventional method (logistic regression) to build the prediction models. All the models were built using Python 3.7. We used the XGBoost library for GBDT, TensorFlow for neural network, and Scikit-learn for logistic regression.

The data obtained in this study contained missing values. XGBoost can be trained and make predictions even with missing values because of its design; however, neural networks and logistic regression cannot. Therefore, we imputed the missing values using the k-nearest neighbor method (k = 5), and the test data were imputed using an imputer trained only on the training data.
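A sketch of this imputation step, assuming the training and test frames from the earlier sketches and a hypothetical feature_cols list, could use scikit-learn's KNNImputer fitted on the training data only:

```python
# Illustrative k-nearest neighbour imputation (k = 5), fitted on training
# data and then applied to both datasets.
from sklearn.impute import KNNImputer

feature_cols = selected_cols            # hypothetical final variable list
imputer = KNNImputer(n_neighbors=5)
X_train = imputer.fit_transform(train[feature_cols])
X_test  = imputer.transform(test[feature_cols])
```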

The parameters required for each model were determined on the training data using the RandomizedSearchCV class of the Scikit-learn library, repeating fivefold cross-validation 5000 times.
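For the XGBoost model, such a search might look like the sketch below. The parameter distributions are illustrative placeholders rather than the grid used in the study, and y_train stands for the 5-year mortality labels derived from the training data.

```python
# Illustrative hyperparameter search with fivefold cross-validation.
from scipy.stats import randint, uniform
from sklearn.model_selection import RandomizedSearchCV
from xgboost import XGBClassifier

param_dist = {
    "n_estimators": randint(100, 1000),
    "max_depth": randint(2, 10),
    "learning_rate": uniform(0.01, 0.3),
}
search = RandomizedSearchCV(
    XGBClassifier(eval_metric="logloss"),
    param_distributions=param_dist,
    n_iter=5000,          # the study repeated the search 5000 times
    cv=5,
    scoring="roc_auc",
    random_state=0,
)
search.fit(X_train, y_train)   # y_train: 5-year mortality labels (assumed)
best_model = search.best_estimator_
```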

The performance of each prediction model was evaluated by predicting the test dataset, drawing a receiver operating characteristic (ROC) curve, and computing the area under the curve (AUC). In addition, the accuracy, precision, recall, F1 score (the harmonic mean of precision and recall), and confusion matrix were calculated for each model. To assess the importance of explanatory variables for the predictive models, we used SHAP and obtained SHAP values that express the influence of each explanatory variable on the output of the model [4,12]. The workflow diagram of this study is shown in Fig. 5.
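The evaluation and SHAP analysis could be reproduced roughly as follows, assuming best_model, X_test, y_test, and feature_cols carried over from the earlier sketches; this is a sketch of the general workflow, not the study's code.

```python
# Illustrative evaluation: ROC AUC, standard metrics, confusion matrix,
# and SHAP values for variable importance.
import shap
from sklearn.metrics import (roc_auc_score, accuracy_score, precision_score,
                             recall_score, f1_score, confusion_matrix)

prob = best_model.predict_proba(X_test)[:, 1]
pred = best_model.predict(X_test)

print("AUC:      ", roc_auc_score(y_test, prob))
print("accuracy: ", accuracy_score(y_test, pred))
print("precision:", precision_score(y_test, pred))
print("recall:   ", recall_score(y_test, pred))
print("F1:       ", f1_score(y_test, pred))
print(confusion_matrix(y_test, pred))

# SHAP values quantify each variable's contribution to the prediction.
explainer = shap.TreeExplainer(best_model)
shap_values = explainer.shap_values(X_test)
shap.summary_plot(shap_values, X_test, feature_names=feature_cols)
```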

Workflow diagram of development and performance evaluation of predictive models.

See original here:
Prediction of mortality risk of health checkup participants using machine learning-based models: the J-SHC study | Scientific Reports - Nature.com

This Smart Doorbell Responds to Meowing Cats Using Machine Learning and IoT – Hackster.io

Those who own an outdoor cat, or even several, might run into the occasional problem of having to let them back in. Because he found it annoying to constantly monitor for when his cat wanted to come inside the house, GitHub user gamename opted for a more automated system.

The solution gamename came up with involves listening to ambient sounds with a single Raspberry Pi and an attached USB microphone. Whenever the locally-running machine learning model detects a meow, it sends a message to an AWS service over the internet where it can then trigger a text to be sent. This has the advantage of limiting false events while simultaneously providing an easy way for the cat to be recognized at the door.

This project started by installing the AWS command-line interface (CLI) onto the Raspberry Pi 4 and then signing in with an account. From here, gamename registered a new IoT device, downloaded the resulting configuration files, and ran the setup script. After quickly updating some security settings, he created a new function that waits for messages coming from the MQTT service and triggers a text message to be sent with the help of the SNS service.
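The article does not include that function's code, but a hedged sketch of such a handler, with a placeholder SNS topic ARN and message field, might look like this:

```python
# Hypothetical sketch of a Lambda function wired to an AWS IoT rule:
# when a "meow detected" message arrives via MQTT, publish a notification
# through SNS. The topic ARN and payload fields are placeholders, not
# values from the original project.
import json
import boto3

sns = boto3.client("sns")
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:cat-at-the-door"  # placeholder

def lambda_handler(event, context):
    # The IoT rule passes the MQTT payload through as the event.
    detail = event.get("message", "Cat detected at the door")
    sns.publish(TopicArn=TOPIC_ARN, Message=detail)
    return {"statusCode": 200, "body": json.dumps("notification sent")}
```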

After this plethora of services and configurations had been added to the AWS project, gamename moved on to the next step of testing whether messages are sent at the right time. His test script simply emulates a positive result by sending the certificates, key, topic, and message to the endpoint, after which the user can watch the text appear on their phone a bit later.

The Raspberry Pi and microSD card were both placed into an off-the-shelf chassis, which sits just inside the house's entrance. After this, the microphone was connected with the help of two RJ45-to-USB cables that allow the microphone to sit outside, inside a waterproof housing, up to 150 feet away.

Running on the Pi is a custom bash script that starts every time the board boots up, and its role is to launch the Python program. This causes the Raspberry Pi to read samples from the microphone and pass them to a TensorFlow audio classifier, which attempts to recognize the sound clip. If the primary noise is a cat, then the AWS API is called in order to publish the message to the MQTT topic. More information about this project can be found here in gamename's GitHub repository.
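A simplified sketch of that loop is shown below. It assumes a YAMNet classifier from TensorFlow Hub, the sounddevice library for microphone capture, and paho-mqtt with placeholder certificate paths, endpoint, and topic name; gamename's actual implementation may differ in all of these details.

```python
# Minimal sketch (not the author's exact code): sample the microphone,
# classify the clip with a TensorFlow Hub audio model, and publish to an
# AWS IoT MQTT topic when a cat sound is detected.
import csv
import numpy as np
import sounddevice as sd
import tensorflow_hub as hub
import paho.mqtt.client as mqtt

model = hub.load("https://tfhub.dev/google/yamnet/1")
class_names = [row["display_name"] for row in
               csv.DictReader(open(model.class_map_path().numpy()))]

client = mqtt.Client()
client.tls_set(ca_certs="AmazonRootCA1.pem",          # placeholder paths
               certfile="device.pem.crt",
               keyfile="private.pem.key")
client.connect("xxxxxxxx-ats.iot.us-east-1.amazonaws.com", 8883)  # placeholder
client.loop_start()

SAMPLE_RATE = 16000  # YAMNet expects 16 kHz mono float32 audio

while True:
    clip = sd.rec(int(3 * SAMPLE_RATE), samplerate=SAMPLE_RATE,
                  channels=1, dtype="float32")
    sd.wait()
    scores, _, _ = model(np.squeeze(clip))
    top_class = class_names[int(np.argmax(scores.numpy().mean(axis=0)))]
    if top_class in ("Cat", "Meow"):
        client.publish("cat/door", '{"message": "meow detected"}')
```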

View post:
This Smart Doorbell Responds to Meowing Cats Using Machine Learning and IoT - Hackster.io

Tackling the reproducibility and driving machine learning with digitisation – Scientific Computing World

Dr Birthe Nielsen discusses the role of the Methods Database in supporting life sciences research by digitising methods data across different life science functions.

Reproducibility of experiment findings and data interoperability are two of the major barriers facing life sciences R&D today. Independently verifying findings by re-creating experiments and generating the same results is fundamental to progressing research to the next stage in its lifecycle, be it advancing a drug to clinical development, or a product to market. Yet, in the field of biology alone, one study found that 70 per cent of researchers are unable to reproduce the findings of other scientists, and 60 per cent are unable to reproduce their own findings.

This causes delays to the R&D process throughout the life sciences ecosystem. For example, biopharmaceutical companies often use external Contract Research Organisations (CROs) to conduct clinical studies. Without a centralised repository to provide consistent access, analytical methods are often shared with CROs via email or even physical documents, not in a standard format but using inconsistent terminology. This leads to unnecessary variability and several versions of the same analytical protocol, which makes it very challenging for a CRO to re-establish and revalidate methods without a labour-intensive process that is open to human interpretation and thus error.

To tackle issues like this, the Pistoia Alliance launched the Methods Hub project. The project aims to overcome the issue of reproducibility by digitising methods data across different life science functions, and ensuring data is FAIR (Findable, Accessible, Interoperable, Reusable) from the point of creation. This will enable seamless and secure sharing within the R&D ecosystem, reduce experiment duplication, standardise formatting to make data machine-readable, and increase reproducibility and efficiency. Robust data management is also the building block for machine learning and is the stepping-stone to realising the benefits of AI.

Digitisation of paper-based processes increases the efficiency and quality of methods data management. But it goes beyond manually keying in method parameters on a computer or using an Electronic Lab Notebook: a digital and automated workflow increases efficiency, instrument usage, and productivity. Applying shared data standards ensures consistency and interoperability, in addition to fast and secure transfer of information between stakeholders.

One area that organisations need to address to comply with FAIR principles, and a key area in which the Methods Hub project helps, is how analytical methods are shared. This includes replacing free-text data capture with a common data model and standardised ontologies. For example, in a High-Performance Liquid Chromatography (HPLC) experiment, rather than manually typing out the analytical parameters (pump flow, injection volume, column temperature, etc.), the scientist will simply download a method, which will automatically populate the execution parameters in any given Chromatography Data System (CDS). This not only saves time during data entry, but the common format eliminates room for human interpretation or error.
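As a purely illustrative example of the idea, and not the Methods Hub's actual data model, a vendor-neutral method record and the code that applies it to a hypothetical CDS driver might look like this:

```python
# Illustrative sketch only: a simplified, vendor-neutral representation of
# HPLC method parameters. Field names and the cds_driver interface are
# hypothetical and do not reflect the real Methods Hub schema.
hplc_method = {
    "name": "assay_method_v3",
    "pump_flow_ml_min": 1.0,
    "injection_volume_ul": 10.0,
    "column_temperature_c": 40.0,
    "detector_wavelength_nm": 254,
}

def apply_method(cds_driver, method: dict) -> None:
    """Populate a CDS run with parameters from a shared method record."""
    cds_driver.set_flow_rate(method["pump_flow_ml_min"])
    cds_driver.set_injection_volume(method["injection_volume_ul"])
    cds_driver.set_column_temperature(method["column_temperature_c"])
    cds_driver.set_wavelength(method["detector_wavelength_nm"])
```

Because the record travels in a common format, each vendor only needs a small adapter like apply_method instead of scientists re-keying parameters by hand.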

Additionally, creating a centralised repository like the Methods Hub in a vendor-neutral format is a step towards greater cyber-resiliency in the industry. When information is stored locally on a PC or an ELN and is not backed up, a single cyberattack can wipe it out instantly. Creating shared spaces for these notes via the cloud protects data and ensures it can be easily restored.

A proof of concept (PoC) via the Methods Hub project was recently completed to demonstrate the value of methods digitisation. The PoC involved the digital transfer, via the cloud, of analytical HPLC methods, proving it is possible to move analytical methods securely between two different companies and CDS vendors with ease. It has been successfully tested in labs at Merck and GSK, where there has been an effective transfer of HPLC-UV information between different systems. The PoC delivered a series of critical improvements to methods transfer that eliminate manual keying of data; reduce risk, steps, and errors; and increase overall flexibility and interoperability.

The Alliance project team is now working to extend the platform's functionality to connect analytical methods with results data, which would be an industry first. The team will also be adding support for columns and additional hardware, as well as other analytical techniques such as mass spectrometry and nuclear magnetic resonance (NMR) spectroscopy. It also plans to identify new use cases and further develop the cloud platform that enables secure methods transfer.

If industry-wide data standards and approaches to data management are to be agreed on and implemented successfully, organisations must collaborate. The Alliance recognises methods data management is a big challenge for the industry, and the aim is to make Methods Hub an integral part of the system infrastructure in every analytical lab.

Tackling issues such as digitisation of methods data doesn't just benefit individual companies but will have a knock-on effect for the whole life sciences industry. Introducing shared standards accelerates R&D, improves quality, and reduces the cost and time burden on scientists and organisations. Ultimately this ensures that new therapies and breakthroughs reach patients sooner. We are keen to welcome new contributors to the project, so we can continue discussing common barriers to successful data management and work together to develop new solutions.

Dr Birthe Nielsen is the Pistoia Alliance Methods Database project manager

View original post here:
Tackling the reproducibility and driving machine learning with digitisation - Scientific Computing World