Can machine learning algorithms perform better than multiple linear regression in predicting nitrogen excretion from lactating dairy cows | Scientific…

All experiments were conducted at the Agri-Food and Biosciences Institute (AFBI) farm at Hillsborough, County Down, UK. All experiments and procedures complied with the requirements of the UK Animals (Scientific Procedures) Act 1986 and were approved by the AFBI Hillsborough Ethical Review Group. All experiments were performed in accordance with relevant guidelines and regulations (following the ARRIVE guidelines [26]).

Data used were collated from 43 total diet digestibility studies with 951 lactating dairy cows undertaken at the Agri-Food and Biosciences Institute in Northern Ireland over a period of 26 years (1990–2015). Data from studies undertaken between 1990 and 2002 were used as the training dataset (n = 564), and data from studies undertaken between 2005 and 2015 as the testing dataset (n = 387). The training data were used to develop prediction models for MN using MLR and the three selected machine learning algorithms (ANN, RFR and SVR). These new models were then tested for their predictive performance on the training dataset by tenfold cross validation. The testing dataset was used for the independent evaluation and comparison of the predictive ability of the different modeling approaches. Information on the two datasets (numbers of experiments, cow genotypes and forage types offered) is presented in Table 10. Data on live weight, milk production, feed intake, N intake and outputs are presented in Table 11. The datasets used in the present study covered a wide range of cow genetic merit and broad ranges in LW (379–781 kg), MY (5.1–40.2 kg/d), total dry matter intake (7.54–26.6 kg/d), FP (0.21–1.00), DNC (19.0–38.0 g/kg DM), diet metabolizable energy concentration (DMEC, 9.68–19.4 MJ/kg DM) and NI (155–874 g/d), which represent typical dairy production conditions managed within grassland-based dairy systems in western and northern Europe.

Cows were housed in free-stall cubicle accommodation for at least 20 d before commencing digestibility trials in metabolism units for 8 d, with feed intake, milk production and feces and urine outputs measured during the final 6 d. Throughout each experiment, cows were offered experimental diets ad libitum and had free access to water. During the final 6 d, the following measurements were carried out for each individual cow to generate the total digestibility data used in the present study. Forages and concentrates offered and refused were recorded daily and sampled for analysis of feed dry matter (DM), N concentration and forage proportion. Feces and urine outputs were collected daily and sampled for DM (feces only) and N concentration. Milk yield was recorded daily and sampled for analysis of fat, protein and lactose concentrations. Live weight was measured on the first and last days in the metabolism unit. Details of feed intake, feces and urine collection, and the methods used for analysis of feed, feces, urine and milk samples were described by Yan et al. [6].

Because features (variables) in raw data may have different dynamic ranges, which can result in poor model performance, it is recommended to normalize them to make ANN training more efficient [10]. In the present study, all input data for the ANN models were normalized into the interval [0, 1] using the min–max normalization technique [27] (Eq. 1):

$$X_{norm} = \frac{X - X_{min}}{X_{max} - X_{min}}$$

(1)

where \(X\) is the original value, \(X_{norm}\) is the normalized value, and \(X_{min}\) and \(X_{max}\) are the minimum and maximum values of the input data.

After finding the optimal tuning parameters, all normalized MN values obtained by the ANN models were denormalized back to their original scale using Eq. (2) [27]:

$$Y = Y_{norm} \times \left( Y_{max} - Y_{min} \right) + Y_{min}$$

(2)

where \(Y_{norm}\) is the normalized value, \(Y\) is the denormalized value, and \(Y_{min}\) and \(Y_{max}\) are the minimum and maximum values of the output data.
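The normalization and denormalization steps of Eqs. (1) and (2) can be sketched in a few lines. The analyses in the paper were carried out in R; the following is an illustrative Python version (function names are the author's own, not from the study):

```python
import numpy as np

def minmax_normalize(x):
    """Scale values into [0, 1] per Eq. (1): (x - min) / (max - min)."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    return (x - x_min) / (x_max - x_min), x_min, x_max

def minmax_denormalize(x_norm, x_min, x_max):
    """Map [0, 1] values back to the original scale per Eq. (2)."""
    return np.asarray(x_norm, dtype=float) * (x_max - x_min) + x_min
```

Retaining the original minimum and maximum is essential: the same pair must be reused to denormalize the model's predicted MN values.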

Feature selection is an essential step in model development and can hugely impact the generalization and predictive ability of models [10,28]. In the present study, a hybrid knowledge-based and data-driven approach was developed and implemented to select features. Knowledge of animal science and of the digestibility trial process was applied to diagnose and remove irrelevant features before the data-driven feature selection process was implemented. For instance, the features feces N output (FN) and urine N output (UN) were excluded from the feature set in the present study on the basis of prior background and expert knowledge: because the UN and FN data were obtained from analyzing urine and feces samples and then summed to form the new feature MN, both FN and UN are heavily correlated with MN. Their inclusion in the feature list might cause poor generalization performance of the models. Furthermore, the optimal features selected by the data-driven approach may need to be reviewed against background knowledge in animal science according to the intended scenarios of model application. For instance, several variables included in the datasets used in this study (e.g. NI and FP) may not be available on commercial farms. Therefore, an alternative feature (concentrate dry matter intake, CDMI) was selected and included in the feature list based on domain knowledge, and a new ANN model suited to commercial farms was then developed.

The filter method was applied for feature selection using the Pearson correlation matrix and the variance inflation factor (VIF) technique. The first step was to use the Pearson correlation matrix to identify features that might correlate with each other in the prediction of MN excretion, because using correlated features could bias model outcomes and degrade performance. If two features were heavily correlated, the less important one was removed from the feature set to minimize adverse effects on model performance. Afterwards, VIF analysis was applied to detect multicollinearity; the VIF has been widely used as a measure of the degree of multicollinearity among input features. A VIF score was calculated for each feature and those with high values were removed. The threshold for the VIF analysis was 5, and features with a VIF score below this threshold were selected. The VIF scores were computed with the vif function in R [29].
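The VIF for feature j is 1/(1 − R²ⱼ), where R²ⱼ comes from regressing feature j on the remaining features. The paper used R's vif function; a minimal Python sketch of the same computation and the drop-above-5 rule (helper names are illustrative, not from the study) might look like:

```python
import numpy as np

def vif_scores(X):
    """VIF for each column of X (n_samples x n_features):
    VIF_j = 1 / (1 - R^2_j), with R^2_j from regressing column j
    on all remaining columns (with an intercept)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    scores = []
    for j in range(p):
        y = X[:, j]
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        scores.append(1.0 / (1.0 - r2))
    return np.array(scores)

def select_by_vif(X, names, threshold=5.0):
    """Iteratively drop the feature with the highest VIF above the threshold."""
    names = list(names)
    X = np.asarray(X, dtype=float)
    while X.shape[1] > 1:
        scores = vif_scores(X)
        worst = int(np.argmax(scores))
        if scores[worst] <= threshold:
            break
        X = np.delete(X, worst, axis=1)
        names.pop(worst)
    return names
```

Dropping one feature at a time and recomputing matters, because removing a single collinear feature can bring the VIFs of the others back below the threshold.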

In the present study, four models based on MLR, ANN, RFR and SVR were developed using the training dataset, and these new models were tested using the testing dataset to compare their prediction performance for MN outputs in lactating dairy cows (presented later). MLR with a stepwise procedure for selection of independent variables was used as the benchmark model, since it is a well-known technique that has been applied to modelling in a wide range of applications in animal science research. The alternative modeling approaches proposed in the present study were ANN, RFR and SVR. To compare the performance of models developed with the different approaches and to ensure that the same resampling sets were used between calls, the same random number seeds were set prior to training, fitting and testing the models. All statistical analyses were performed with R [29].
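The fixed-seed device described above guarantees that every model sees identical cross-validation folds. As an illustration (in Python rather than the R used in the study; the seed value is arbitrary), the shared folds could be generated once up front:

```python
import numpy as np

def make_folds(n_obs, k=10, seed=42):
    """Shuffle indices once with a fixed seed and split them into k folds,
    so every model is trained and tested on identical resampling sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_obs)
    return np.array_split(idx, k)

# e.g. tenfold splits for the 564-record training dataset
folds = make_folds(564, k=10, seed=42)
```

Passing the same `folds` object to every model's evaluation loop removes fold assignment as a source of variation when the four models are compared.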

The MLR model (Eq. 3) selected in the present study for the prediction of MN output was published in 2006 [6] and was developed using the same training dataset listed in Table 2. To improve the estimation of the regression parameters, experiment was included as a random factor during development of the MLR model. The dataset had a large range within each dependent and independent variable (e.g., MN, NI, LW, MY, FP and DNC), which is vital to ensure the development of a robust regression model applicable under various farming conditions [10].

$$\text{MN (g/d)} = 0.749\,\text{NI} + 0.065\,\text{LW} - 1.515\,\text{MY} - 17.0$$

(3)

where NI, LW and MY are N intake (g/d), live weight (kg) and milk yield (kg/d), respectively.
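Eq. (3) is a direct linear combination of the three inputs, so it translates into a one-line function (this Python form is the author's illustration; the study itself worked in R):

```python
def predict_mn(ni, lw, my):
    """Manure N output (g/d) per Eq. (3):
    MN = 0.749*NI + 0.065*LW - 1.515*MY - 17.0,
    with NI in g/d, LW in kg and MY in kg/d."""
    return 0.749 * ni + 0.065 * lw - 1.515 * my - 17.0
```

For example, a cow with NI = 500 g/d, LW = 600 kg and MY = 30 kg/d gives a predicted MN of about 351 g/d.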

In the present study, the ANN was fitted using the R package neuralnet, which was built to train neural networks in the context of regression analyses. The details of ANN training and the application of neuralnet were described by Günther and Fritsch [30]. Multilayer perceptron networks trained with backpropagation learning algorithms were used; they consist of an input layer, hidden layer(s) and an output layer. The input variables were obtained using the feature selection algorithm described in the section "Knowledge-based and data driven feature selection", and the neuron in the output layer represents MN. The ANN models were trained based on the selection of training algorithms and learning parameters, including the number of hidden layer(s), the number of neurons in the hidden layer(s), the error function, the threshold for the partial derivatives of the error function as the stopping criterion, and the activation function. The optimal number of hidden layer(s), number of neuron(s) in the hidden layer(s), learning algorithm, learning rate and other learning parameters were obtained on the basis of prediction performance measured as relative root mean square error (RRMSE, Eq. 6) with tenfold cross validation, and the best topology/architecture was then finalized.
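To make the multilayer perceptron structure concrete, the forward pass of a single-hidden-layer network with a sigmoid activation (the logistic function, neuralnet's default) can be sketched as follows. This is a generic illustration in Python, not the trained architecture from the study; the layer sizes and weights are placeholders:

```python
import numpy as np

def sigmoid(z):
    """Logistic activation applied element-wise to the hidden layer."""
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a single-hidden-layer perceptron:
    normalized inputs -> sigmoid hidden layer -> linear output neuron (MN)."""
    h = sigmoid(x @ w_hidden + b_hidden)   # hidden-layer activations
    return h @ w_out + b_out               # single linear output
```

Backpropagation then adjusts `w_hidden`, `b_hidden`, `w_out` and `b_out` to minimize the error function on the normalized training data.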

The RFR is an ensemble machine learning method and a nonparametric technique derived from classification and regression trees, which are constructed using a bootstrap aggregating (bagging) method from the training data [31]. In RFR, prediction is conducted by averaging the individual tree predictions. A detailed description of RFR theory can be found in the report by Breiman [32]. The RFR was implemented with the randomForest function in R (version 3.6.1). To select the optimal hyperparameters for the learning algorithm, a tuning process was performed based on the R package ranger. The hyperparameters included the number of trees to grow (ntree), the number of randomly drawn candidate variables (mtry), sample size and node size. A grid search strategy was used to choose the candidate hyperparameter values, and the performance of the trained algorithm with different hyperparameter values was evaluated as RRMSE (Eq. 6) using tenfold cross validation.
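The grid search procedure itself, i.e. enumerating every hyperparameter combination and scoring each by tenfold-cross-validated RRMSE, is model-agnostic. The sketch below illustrates it in Python with closed-form ridge regression standing in for the random forest (the study used ranger in R; the penalty `lam` plays the role of a tunable hyperparameter like mtry, and all names are the author's own):

```python
import itertools
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression; intercept left unpenalized via centering."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(Xc.shape[1]), Xc.T @ yc)
    return beta, y_mean - x_mean @ beta

def cv_rrmse(X, y, lam, k=10, seed=42):
    """RRMSE (Eq. 6) averaged over k cross-validation folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    rrmses = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        beta, b0 = ridge_fit(X[train_idx], y[train_idx], lam)
        pred = X[test_idx] @ beta + b0
        rmse = np.sqrt(np.mean((y[test_idx] - pred) ** 2))
        rrmses.append(rmse / y[test_idx].mean() * 100)
    return float(np.mean(rrmses))

def grid_search(X, y, grid):
    """Score every hyperparameter combination; keep the lowest RRMSE."""
    keys = list(grid)
    best = None
    for values in itertools.product(*(grid[k] for k in keys)):
        params = dict(zip(keys, values))
        score = cv_rrmse(X, y, **params)
        if best is None or score < best[1]:
            best = (params, score)
    return best
```

With a random forest, `grid` would instead range over ntree, mtry, sample size and node size, but the enumerate-score-select loop is identical.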

The SVR uses similar principles to the support vector machine, a supervised non-parametric statistical learning technique that uses kernel functions and the maximum margin algorithm to solve nonlinear problems [33]. The detailed theoretical background and description of SVR can be found in the report by Cristianini and Shawe-Taylor [34]. The SVR model performs regression estimation by risk minimization, where the risk is measured by a loss function. In this study, the R package e1071 was used and the svm function was implemented to fit the SVR model. The radial basis kernel function, one of the most commonly used kernel types, was employed in the training and prediction process. Parameter tuning was performed by grid search over supplied parameter ranges, and the combination of parameters with the lowest RMSE was selected. The performance of the SVR model was measured as RRMSE (Eq. 6) with tenfold cross validation.
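Two pieces of the SVR machinery mentioned above have compact definitions: the epsilon-insensitive loss commonly used for the risk term, and the radial basis kernel. The following Python sketch is illustrative only (the epsilon and gamma values are arbitrary; the study's tuned values are not given here):

```python
import numpy as np

def epsilon_insensitive_loss(y_true, y_pred, eps=0.1):
    """SVR loss: residuals inside the eps-tube cost nothing;
    larger residuals cost |residual| - eps."""
    return np.maximum(0.0, np.abs(np.asarray(y_true) - np.asarray(y_pred)) - eps)

def rbf_kernel(x1, x2, gamma=0.5):
    """Radial basis function kernel: exp(-gamma * ||x1 - x2||^2)."""
    diff = np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float)
    return np.exp(-gamma * np.dot(diff, diff))
```

The tube width eps and the kernel width gamma are exactly the kind of parameters swept by the grid search described above.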

The MLR model and the three new models (ANN, RFR and SVR) were developed and compared in terms of their prediction performance for MN outputs in lactating dairy cows based on the datasets listed in Table 2. The predictive performance of the models was evaluated using the coefficient of determination (R2), root mean square error (RMSE), relative root mean square error (RRMSE) and concordance correlation coefficient (CCC), based on the actual and predicted values. The R2 was calculated using Eq. (4). The RMSE and RRMSE were produced in a tenfold cross validation process (10 RMSE values generated) using Eq. (5) [35] and Eq. (6) [36], respectively. The CCC, a further measure of the agreement between observed and predicted values, was given by Eq. (7) [37]. Tenfold cross validation was used to evaluate the prediction performance of these models (MLR, ANN, RFR and SVR). The RMSE, RRMSE and CCC values (n = 10) obtained through the tenfold cross validation were compared among the four models using one-way analysis of variance followed by Tukey's honest significant difference (HSD) test (α = 0.05). The same cross validation folds were used for all modeling scenarios so that performance could be compared across all of the models.

$$R^{2} = 1 - \frac{\sum \left( y_{i} - \hat{y}_{i} \right)^{2}}{\sum \left( y_{i} - \bar{y} \right)^{2}}$$

(4)

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left( y_{i} - \hat{y}_{i} \right)^{2}}$$

(5)

$$RRMSE = \left( RMSE/\bar{y} \right) \times 100$$

(6)

$$CCC = \frac{2\, r\, S_{\hat{y}}\, S_{y}}{S_{\hat{y}}^{2} + S_{y}^{2} + \left( \sum_{i=1}^{n} \frac{y_{i} - \hat{y}_{i}}{n} \right)^{2}}$$

(7)

where \(y_{i}\) is the actual MN, \(\hat{y}_{i}\) is the predicted MN, \(\bar{y}\) is the mean of the actual MN, n is the number of observations, r is the Pearson correlation coefficient between \(\hat{y}_{i}\) and \(y_{i}\), and \(S_{\hat{y}}\) and \(S_{y}\) are the respective standard deviations.
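Eqs. (4) through (7) can be computed together from the actual and predicted values. A Python sketch (population standard deviations assumed, matching the CCC form above; the study computed these in R):

```python
import numpy as np

def evaluation_metrics(y, y_hat):
    """R2 (Eq. 4), RMSE (Eq. 5), RRMSE (Eq. 6) and CCC (Eq. 7)
    from actual values y and predicted values y_hat."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    n = len(y)
    resid = y - y_hat
    r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
    rmse = np.sqrt((resid @ resid) / n)
    rrmse = rmse / y.mean() * 100.0
    r = np.corrcoef(y, y_hat)[0, 1]          # Pearson correlation
    s_y, s_hat = y.std(), y_hat.std()        # population standard deviations
    ccc = (2 * r * s_hat * s_y) / (s_hat**2 + s_y**2 + resid.mean() ** 2)
    return {"R2": r2, "RMSE": rmse, "RRMSE": rrmse, "CCC": ccc}
```

Note how the CCC differs from r: a prediction with perfect correlation but a constant offset keeps r = 1 yet is penalized in the CCC through the squared mean-residual term in the denominator.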
