Accelerating large-scale neural network training on CPUs with ThirdAI and AWS Graviton | Amazon Web Services – AWS Blog

This guest post is written by Vihan Lakshman, Tharun Medini, and Anshumali Shrivastava from ThirdAI.

Large-scale deep learning has recently produced revolutionary advances in a vast array of fields. Although this progress in artificial intelligence is remarkable, the financial costs and energy consumption required to train these models have emerged as a critical bottleneck due to the need for specialized hardware like GPUs. Traditionally, even modestly sized neural models have required costly hardware accelerators for training, which limits the number of organizations with the financial means to take full advantage of this technology.

Founded in 2021, ThirdAI Corp. is a startup dedicated to the mission of democratizing artificial intelligence technologies through algorithmic and software innovations that fundamentally change the economics of deep learning. We have developed a sparse deep learning engine, known as BOLT, that is specifically designed for training and deploying models on standard CPU hardware as opposed to costly and energy-intensive accelerators like GPUs. Many of our customers have reported strong satisfaction with ThirdAI's ability to train and deploy deep learning models for critical business problems on cost-effective CPU infrastructure.

In this post, we investigate the potential of the AWS Graviton3 processor to accelerate neural network training for ThirdAI's unique CPU-based deep learning engine.

At ThirdAI, we achieve these breakthroughs in efficient neural network training on CPUs through proprietary dynamic sparse algorithms that activate only a subset of neurons for a given input (see the following figure), thereby side-stepping the need for full dense computations. Unlike other approaches to sparse neural network training, ThirdAI uses locality-sensitive hashing to dynamically select neurons for a given input, as shown by the bold lines in the figure. In certain cases, we have even observed that our sparse CPU-based models train faster than the comparable dense architecture on GPUs.

Given that many of our target customers operate in the cloud (and among those, the majority use AWS), we were excited to try out the AWS Graviton3 processor to see if the impressive price-performance improvements of Amazon's silicon innovation would translate to our unique workload of sparse neural network training and thereby provide further savings for customers. Although both the research community and the AWS Graviton team have delivered exciting advances in accelerating neural network inference on CPU instances, we at ThirdAI are, to our knowledge, the first to seriously study how to train neural models on CPUs efficiently.

As shown in our results, we observed a significant training speedup with AWS Graviton3 over the comparable Intel and NVIDIA instances on several representative modeling workloads.

For our evaluation, we considered two comparable AWS CPU instances: a c6i.8xlarge machine powered by Intel's Ice Lake processor and a c7g.8xlarge powered by AWS Graviton3. The following table summarizes the details of each instance.

For our first evaluation, we focus on the problem of extreme multi-label classification (XMC), an increasingly popular machine learning (ML) paradigm with a number of practical applications in search and recommendations (including at Amazon). For our evaluation, we focus on the public Amazon-670K product recommendation task, which, given an input product, identifies similar products from a collection of over 670,000 items.

In this experiment, we benchmark ThirdAI's BOLT engine against TensorFlow 2.11 and PyTorch 2.0 on the aforementioned hardware choices: Intel Ice Lake, AWS Graviton3, and an NVIDIA T4G GPU. For our experiments on Intel and AWS Graviton, we use the AWS Deep Learning AMI (Ubuntu 18.04) version 59.0. For our GPU evaluation, we use the NVIDIA GPU-Optimized Arm64 AMI, available via the AWS Marketplace. For this evaluation, we use the SLIDE model architecture, which achieves both competitive performance on this extreme classification task and strong training performance on CPUs. For our TensorFlow and PyTorch comparisons, we implement the analogous version of the SLIDE multi-layer perceptron (MLP) architecture with dense matrix multiplications. We train each model for five epochs (full passes through the training dataset) with a fixed batch size of 256 and learning rate of 0.001. We observed that all models achieved the same test accuracy of 33.6%.
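
For reference, the sketch below shows what such a dense MLP baseline looks like in PyTorch under the stated training settings (batch size 256, learning rate 0.001, five epochs). It is an illustration only: the layer sizes and data are placeholders, not ThirdAI's BOLT or the exact SLIDE-MLP configuration used in the benchmark.

```python
# Minimal PyTorch sketch of a dense MLP baseline trained with the benchmark
# settings (batch size 256, lr 0.001, 5 epochs). Dimensions and data are
# placeholders; the real Amazon-670K task has roughly 670,000 output classes.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

INPUT_DIM, HIDDEN_DIM, NUM_CLASSES = 1000, 128, 5000   # placeholder sizes

model = nn.Sequential(
    nn.Linear(INPUT_DIM, HIDDEN_DIM),
    nn.ReLU(),
    nn.Linear(HIDDEN_DIM, NUM_CLASSES),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

X = torch.randn(2048, INPUT_DIM)                        # synthetic stand-in data
y = torch.randint(0, NUM_CLASSES, (2048,))
loader = DataLoader(TensorDataset(X, y), batch_size=256, shuffle=True)

for epoch in range(5):                                  # five full passes over the data
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()
```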

The following chart compares the training time of ThirdAI's BOLT to TensorFlow 2.11 and PyTorch 2.0 on the Amazon-670K extreme classification benchmark. All models achieve the same test precision. We observe that AWS Graviton3 considerably accelerates the performance of BOLT out of the box, with no customizations needed, by approximately 40%. ThirdAI's BOLT on AWS Graviton3 also achieves considerably faster training than the TensorFlow or PyTorch models trained on the GPU. Note that there is no ThirdAI result on the NVIDIA GPU benchmark because BOLT is designed to run on CPUs. We do not include TensorFlow and PyTorch CPU benchmarks because of the prohibitively long training time.

The following table summarizes the training time and test accuracy for each processor or specialized processor (GPU).

For our second evaluation, we focus on the popular Yelp Polarity sentiment analysis benchmark, which involves classifying a review as positive or negative. For this evaluation, we compare ThirdAI's Universal Deep Transformers (UDT) model against a fine-tuned DistilBERT network, a compressed pre-trained language model that achieves near-state-of-the-art performance with reduced inference latency. Because fine-tuning DistilBERT models on a CPU would take a prohibitively long time (at least several days), we benchmark ThirdAI's CPU-based models against DistilBERT fine-tuned on a GPU. We train all models with a batch size of 256 for a single pass through the data (one epoch). We note that we can achieve slightly higher accuracy with BOLT with additional passes through the data, but we restrict ourselves to a single pass in this evaluation for consistency.

As shown in the following figure, AWS Graviton3 again accelerates ThirdAI's UDT model training considerably. Furthermore, UDT is able to achieve comparable test accuracy to DistilBERT with a fraction of the training time and without the need for a GPU. We note that there has also been recent work in optimizing the fine-tuning of Yelp Polarity on CPUs. Our models, however, still achieve greater efficiency gains and avoid the cost of pre-training, which is substantial and requires the use of hardware accelerators like GPUs.

The following table summarizes the training time, test accuracy, and inference latency.

For our final evaluation, we focus on the problem of multi-class text classification, which involves assigning a label to a given input text from a set of more than two output classes. We focus on the DBPedia benchmark, which consists of 14 possible output classes. Again, we see that AWS Graviton3 accelerates UDT performance over the comparable Intel instance by roughly 40%. We also see that BOLT achieves comparable results to the DistilBERT transformer-based model fine-tuned on a GPU while achieving sub-millisecond latency.

The following table summarizes the training time, test accuracy, and inference latency.

We have designed our BOLT software for compatibility with all major CPU architectures, including AWS Graviton3. In fact, we didn't have to make any customizations to our code to run on AWS Graviton3. Therefore, you can use ThirdAI for model training and deployment on AWS Graviton3 with no additional effort. In addition, as detailed in our recent research whitepaper, we have developed a set of novel mathematical techniques to automatically tune the specialized hyperparameters associated with our sparse models, allowing our models to work well immediately out of the box.

We also note that our models primarily work well for search, recommendation, and natural language processing tasks that typically feature large, high-dimensional output spaces and a requirement of extremely low inference latency. We are actively working on extending our methods to additional domains, such as computer vision, but be aware that our efficiency improvements do not translate to all ML domains at this time.

In this post, we investigated the potential of the AWS Graviton3 processor to accelerate neural network training for ThirdAI's unique CPU-based deep learning engine. Our benchmarks on search, text classification, and recommendation tasks suggest that AWS Graviton3 can accelerate ThirdAI's model training workloads by 30-40% over the comparable x86 instances, with a price-performance improvement of nearly 50%. Furthermore, because AWS Graviton3 instances are available at a lower cost than the analogous Intel and NVIDIA machines and enable shorter training and inference times, you can further unlock the value of the AWS pay-as-you-go usage model by using lower-cost machines for shorter durations of time.

We are very excited by the price and performance savings of AWS Graviton3 and will look to pass on these improvements to our customers so they can enjoy faster ML training and inference with improved performance on low-cost CPUs. As customers of AWS ourselves, we are delighted by the speed at which AWS Graviton3 allows us to experiment with our models, and we look forward to using more cutting-edge silicon innovation from AWS going forward. The Graviton Technical Guide is a good resource to consider while evaluating your ML workloads to run on Graviton. You can also try the Graviton t4g instances free trial.

The content and opinions in this post are those of the third-party author, and AWS is not responsible for the content or accuracy of this post. At the time of writing, the most current instances were c6i, and hence the comparison was done with c6i instances.

Vihan Lakshman is a research scientist at ThirdAI Corp. focused on developing systems for resource-efficient deep learning. Prior to ThirdAI, he worked as an Applied Scientist at Amazon and received undergraduate and master's degrees from Stanford University. Vihan is also a recipient of a National Science Foundation research fellowship.

Tharun Medini is the co-founder and CTO of ThirdAI Corp. He did his PhD in hashing algorithms for search and information retrieval at Rice University. Prior to ThirdAI, Tharun worked at Amazon and Target. Tharun is the recipient of numerous awards for his research, including the Ken Kennedy Institute BP Fellowship, the American Society of Indian Engineers Scholarship, and a Rice University Graduate Fellowship.

Anshumali Shrivastava is an associate professor in the computer science department at Rice University. He is also the Founder and CEO of ThirdAI Corp, a company that is democratizing AI to commodity hardware through software innovations. His broad research interests include probabilistic algorithms for resource-frugal deep learning. In 2018, Science News named him one of the top 10 scientists under 40 to watch. He is a recipient of the National Science Foundation CAREER Award, a Young Investigator Award from the Air Force Office of Scientific Research, a machine learning research award from Amazon, and a Data Science Research Award from Adobe. He has won numerous paper awards, including Best Paper Awards at NIPS 2014 and MLSys 2022, as well as the Most Reproducible Paper Award at SIGMOD 2019. His work on efficient machine learning technologies on CPUs has been covered by popular press including the Wall Street Journal, the New York Times, TechCrunch, and NDTV.

Read the original post:
Accelerating large-scale neural network training on CPUs with ThirdAI and AWS Graviton | Amazon Web Services - AWS Blog

AI meets green: The future of environmental protection with ChatGPT – EurekAlert

[Image: Graphical abstract. Credit: Eco-Environment & Health]

A recent study introduces a novel paradigm combining ChatGPT with machine learning (ML) to significantly ease the application of ML in environmental science. This approach promises to bridge knowledge gaps and democratize the use of complex ML models for environmental sustainability.

The rapid growth of environmental data presents a significant challenge in analyzing complex pollution networks. While ML has been a pivotal tool, its widespread adoption has been hindered by a steep learning curve and a significant knowledge gap among environmental scientists.

A new study (DOI: https://doi.org/10.1016/j.eehl.2024.01.006), published in Eco-Environment & Health on February 3, 2024, has developed a groundbreaking approach that merges ChatGPT with machine learning to streamline its use in environmental science.

This research introduces a user-friendly framework, aptly named "ChatGPT + ML + Environment," designed to democratize the application of machine learning in environmental studies. By simplifying the complex processes of data handling, model selection, and algorithm training, this paradigm empowers environmental scientists, regardless of their computational expertise, to leverage machine learning's full potential. The method involves using ChatGPT's intuitive conversational interface to guide users through the intricate steps of machine learning, from initial data analysis to the interpretation of results.

Highlights:
- A new paradigm of ChatGPT + Machine Learning (ML) + Environment is presented.
- The novelty and knowledge gaps of ML for decoupling the complexity of environmental big data are discussed.
- The new paradigm guided by GPT reduces the threshold of using machine learning in environmental research.
- The importance of secondary training for using ChatGPT + ML + Environment in the future is highlighted.

Lead researcher Haoyuan An states, "This new paradigm not only simplifies the application of ML in our field but also opens up untapped potential for environmental research, making it accessible to a broader range of scientists without the need for deep technical knowledge."

The integration of ChatGPT with ML can dramatically lower the barriers to employing advanced data analysis in environmental science, allowing for more efficient pollution monitoring, policy-making, and sustainability research. It marks a significant step toward more informed environmental decision-making and the potential for groundbreaking discoveries in the field.


References

DOI

10.1016/j.eehl.2024.01.006

Original Source URL

https://doi.org/10.1016/j.eehl.2024.01.006

Funding information

This work was financially supported by the National Key R&D Program of China (No. 2023YFF0614200), National Natural Science Foundation of China (Nos. 22222610, 22376202, 22193051), and the Chinese Academy of Sciences (Nos. ZDBS-LY-DQC030, YSBR-086). D. L. acknowledges the support from the Youth Innovation Promotion Association of CAS.

About Eco-Environment & Health

Eco-Environment & Health (EEH) is an international and multidisciplinary peer-reviewed journal designed for publications on the frontiers of the ecology, environment and health as well as their related disciplines. EEH focuses on the concept of "One Health" to promote green and sustainable development, dealing with the interactions among ecology, environment and health, and the underlying mechanisms and interventions. Our mission is to be one of the most important flagship journals in the field of environmental health.

Journal: Eco-Environment & Health

Article title: A new ChatGPT-empowered, easy-to-use machine learning paradigm for environmental science

Article publication date: 3-Feb-2024

COI statement: The authors declare that they have no competing interests.

Disclaimer: AAAS and EurekAlert! are not responsible for the accuracy of news releases posted to EurekAlert! by contributing institutions or for the use of any information through the EurekAlert system.

Link:
AI meets green: The future of environmental protection with ChatGPT - EurekAlert

Exploration and machine learning model development for T2 NSCLC with bronchus infiltration and obstructive … – Nature.com

Clinical characteristics of T2 stage NSCLC patients in different groups

Variations in clinical characteristics between the MBI/(P/ATL) and non-MBI/(P/ATL) groups were prominently attributed to the diameter linked to the T2 stage (Table 1). Notable disparities existed in gender distribution, with the MBI/(P/ATL) group demonstrating a higher proportion of males (58.4%/55.3% vs. 53.4%) and a heightened occurrence of Squamous Cell Carcinoma (46.0%/40.8% vs. 32.7%). Significantly, a larger proportion of primary sites in the main bronchus were identified in the MBI/(P/ATL) group (14.1%/7.8% vs. 1.7%), accompanied by a more advanced histologic grading (p<0.001).

The MBI/(P/ATL) group, especially the P/ATL subgroup, exhibited higher incidences of lymph node involvement (N0: 41.8%/34.0% vs. 53.0%). Regarding treatment modalities, the MBI/(P/ATL) group displayed a stronger propensity to undergo chemotherapy (48.0%/51.1% vs. 41.7%) and radiation therapy (43.2%/46.8% vs. 38.2%). Compared to the MBI/None groups, the incidence of surgery was markedly lower in the P/ATL subgroup (26.5% vs. 49.9%/46.1%). Moreover, among patients who underwent surgery, the MBI/(P/ATL) group received preoperative induction therapy or postoperative adjuvant therapy, rather than surgery alone, in a much higher proportion of cases than the non-MBI/(P/ATL) group (41.3%/54.7% vs. 36.6%).

In relation to tumor diameter, the non-MBI/(P/ATL) group had a larger diameter due to the incorporation of cases surpassing 3 cm. In general, profound differences in clinical characteristics were observed between the groups, with the MBI/(P/ATL) group manifesting extensive disparities, especially within the P/ATL subgroup, compared to the non-MBI/(P/ATL) group.

Through Kaplan-Meier survival analysis, it was discerned that the OS for the MBI (diameter > 3 cm) group was adversely impacted in comparison to the non-MBI/(P/ATL) group (p=0.012) (Fig. 1A). Notably, regardless of the diameter size, the OS for the non-MBI/(P/ATL) group was significantly superior to that of the P/ATL group (p<0.0001) (Fig. 1B).

Kaplan-Meier analysis of patients with different T2 types of NSCLC. (A,B) Kaplan-Meier analysis of overall survival (OS) in the Pneumonia or Atelectasis (P/ATL) and Main Bronchus Infiltration (MBI) groups versus the groups without P/ATL and MBI, prior to propensity score matching (PSM). (C,D) Kaplan-Meier analysis of OS in the P/ATL and MBI groups versus the non-MBI and P/ATL groups following PSM. (E,F) Kaplan-Meier analysis of cancer-specific survival (CSS) in the P/ATL and MBI groups versus the non-MBI and P/ATL groups after PSM.

Given the pronounced heterogeneity in clinical characteristics among the three groups, we adopted the propensity score matching (PSM) method to mitigate the impact of diverse background variables, thereby harmonizing potential prognostic factors between the P/ATL and MBI groups and the non-MBI/(P/ATL) group. This approach ensured that the p-values from t-tests or chi-square tests for all clinical characteristics between the respective groups exceeded 0.1, indicating a balanced comparison (Supplementary data 1). Following this adjustment, we analyzed OS and cancer-specific survival (CSS) using the KM method for the P/ATL vs. None groups and the MBI vs. None groups, respectively. Our findings revealed that the P/ATL group exhibited a significantly poorer prognosis than the None group, with p-values of 0.00015 for OS and 0.00021 for CSS (Fig. 1C,E). Conversely, the MBI group's prognosis was marginally inferior compared to the None group, with p-values of 0.037 for OS and 0.016 for CSS (Fig. 1D,F).
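
As an illustration of this matching step, the following Python sketch shows one common way to implement 1:1 nearest-neighbour propensity score matching. The column names, the treatment indicator, and the covariate list are hypothetical placeholders; this is not the authors' code.

```python
# Illustrative 1:1 nearest-neighbour propensity score matching (not the authors' code).
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

def propensity_match(df: pd.DataFrame, treatment_col: str, covariates: list) -> pd.DataFrame:
    """Match each treated patient (e.g., P/ATL) to the nearest control (None group)."""
    ps_model = LogisticRegression(max_iter=1000)
    ps_model.fit(df[covariates], df[treatment_col])
    df = df.assign(ps=ps_model.predict_proba(df[covariates])[:, 1])

    treated = df[df[treatment_col] == 1]
    control = df[df[treatment_col] == 0]

    nn = NearestNeighbors(n_neighbors=1).fit(control[["ps"]])
    _, idx = nn.kneighbors(treated[["ps"]])
    matched_control = control.iloc[idx.ravel()]
    return pd.concat([treated, matched_control])

# Usage (hypothetical column names):
# matched = propensity_match(cohort, "p_atl", ["age", "sex", "histology", "grade", "n_stage"])
```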

Our findings indicate that at the T2 stage, both the MBI and P/ATL groups demonstrate an elevated risk for lymph node metastasis. To ascertain whether MBI and P/ATL act as independent risk factors for lymph node metastases, we employed a multifactorial logistic regression analysis. The results showed that individuals in the MBI/(P/ATL) group had a notably higher risk of lymph node metastasis compared to those in the non-MBI/(P/ATL) group. In detail, MBI was found to be an independent risk factor for lymph node metastasis (OR=1.69, 95% CI 1.55-1.85, p<0.001), as was P/ATL (OR=2.10, 95% CI 1.93-2.28, p<0.001) (Table 2).

To evaluate the optimal treatment for NSCLC patients with two specific types of T2 tumors, we integrated seven treatment modalities: None, Radiation Therapy Alone, Chemotherapy Alone, Radiation + Chemotherapy, Surgery Alone, Initial Surgery Followed by Adjuvant Treatment, and Induction Therapy Followed by Surgery. We conducted a multifactorial Cox regression analysis of OS to assess the prognostic impact of these treatments in patients with P/ATL and MBI, respectively, using Surgery Alone as the reference group (Table 3). The results indicated that surgical treatments significantly outperformed both Radiotherapy Alone and Chemotherapy Alone, as well as the combination of Radiotherapy and Chemotherapy, in both subgroups. Specifically, in patients with MBI, Initial Surgery Followed by Adjuvant Treatment (HR=0.77, 95% CI 0.67-0.90, p=0.001) and Induction Therapy Followed by Surgery (HR=0.65, 95% CI 0.48-0.87, p=0.003) were significantly more effective than Surgery Alone. Conversely, for patients with P/ATL, neither Initial Surgery Followed by Adjuvant Treatment (HR=1.17, 95% CI 0.99-1.37, p=0.067) nor Induction Therapy Followed by Surgery (HR=1.05, 95% CI 0.78-1.40, p=0.758) showed any advantage over Surgery Alone.

Given the limited therapeutic options for patients with distant metastases, we analyzed KM survival with different therapeutic strategies for patients with P/ATL and MBI at stages N0-1M0 and N2-3M0, respectively. In patients with MBI at the N2-3M0 stage, preoperative Induction Therapy significantly improved prognosis, illustrating a marked enhancement in outcomes. For the N0-1M0 stage in MBI patients, while there was a clear improvement in median survival with preoperative Induction Therapy, this improvement did not reach statistical significance. Additionally, postoperative Adjuvant Therapy substantially improved outcomes over Surgery Alone for MBI patients across both N0-1M0 and N2-3M0 stages (Fig. 2A,B). Conversely, these treatments did not yield significant benefits for patients with P/ATL (Fig. 2C,D). Moreover, in both subgroups for the N0-1M0 stage, prognosis following Surgery Alone was significantly better than with Chemoradiotherapy, whereas at the N2-3M0 stage, Surgery Alone did not show superiority over Chemoradiotherapy in terms of prognosis (Fig. 2).

Kaplan-Meier analysis comparing the effectiveness of various treatment modalities in patients with Main Bronchus Infiltration (MBI) or Pneumonia/Atelectasis (P/ATL) based on nodal involvement. (A) Overall survival (OS) associated with different treatment approaches in MBI patients classified as N0-1M0. (B) OS associated with different treatment approaches in MBI patients classified as N2-3M0. (C) OS associated with different treatment approaches in P/ATL patients classified as N0-1M0. (D) OS associated with different treatment approaches in P/ATL patients classified as N2-3M0.

Given the potential notable disparities in clinicopathologic variables and prognoses across the MBI and P/ATL subgroups, we aimed to delve deeper into the varying impacts that different factors might exhibit on mortality within these subgroups. Accordingly, multifactorial logistic regression was applied to analyze the 5-year OS rate within the MBI and P/ATL subgroups. In the MBI group, sex, histologic type, grade, age, N stage, M stage, site, marital status and treatment type were identified as independent factors associated with 5-year OS. In the P/ATL group, histologic type, grade, age, race, N stage, M stage and treatment type were recognized as independent factors associated with 5-year OS (Supplementary data 2).

We incorporated the factors independently correlated with 5-year OS from the MBI and P/ATL groups for prognostic modeling. The patients were randomized into training and test groups at a 7:3 ratio. Subsequently, the best parameters for each model were adjusted and training was conducted within the training set to optimize performance. In the validation set, we performed ROC and DCA analyses of the MBI and P/ATL groups for all models (Fig. 3A,B). The XGBoost model demonstrated the best AUC, with 0.814 and 0.853 in the MBI and P/ATL groups, respectively, and the DCA curves further affirmed that the XGBoost model secures a higher net benefit compared to other models across varying threshold ranges (Fig. 3C,D). The specific performance of each model in the test set is shown in Supplementary Data 3. In addition, we performed the DeLong test and found that the XGBoost model significantly outperforms the rest of the models in both MBI and P/ATL (Supplementary Data 4).
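
To make the evaluation pipeline concrete, the sketch below shows a generic 7:3 split and ROC-AUC evaluation for an XGBoost classifier. The feature matrix and labels are synthetic placeholders, and hyperparameter tuning, DCA, calibration, and the DeLong test are omitted; it is not the authors' code.

```python
# Illustrative 7:3 split and ROC-AUC evaluation for a 5-year OS classifier
# (synthetic placeholder data; tuning, DCA, and DeLong test omitted).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
from xgboost import XGBClassifier

X, y = make_classification(n_samples=2000, n_features=12, random_state=0)  # placeholder cohort

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

model = XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.05, eval_metric="logloss")
model.fit(X_train, y_train)

auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"Test AUC: {auc:.3f}")
```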

Receiver Operating Characteristic Curve (ROC) and Decision Curve Analysis (DCA) analyses of Main Bronchus Infiltration (MBI) and Pneumonia/Atelectasis (P/ATL) groups. (A) ROC curves for each model in the MBI group. (B) ROC curves for each model in the P/ATL group. (C) DCA curves for each model in the MBI group. (D) DCA curves for each model in the P/ATL group.

Consequently, the calibration curves for the XGBoost model in both the MBI and P/ATL groups within the test set were also plotted, revealing commendable predictive performance of the model (Fig. 4A,B). Additionally, we scrutinized the importance scores of the variables in both models (Fig. 4C,D).

Calibration curves and feature significance plots of the XGBoost model for Main Bronchus Infiltration (MBI) and Pneumonia/Atelectasis (P/ATL) groups. (A) Calibration curve of the XGBoost model for the MBI group. (B) Calibration curve of the XGBoost model for the P/ATL group. (C) Feature significance plot of the XGBoost model for the MBI group. (D) Feature significance plot of the XGBoost model for the P/ATL group.

To assist researchers and clinicians in utilizing our prognostic model, we developed user-friendly web applications for the stage T2 NSCLC MBI and P/ATL groups, respectively (Fig. 5A,B). The web interface allows users to input clinical features of new samples, and the application can then help predict survival probabilities and survival status based on the patient's information. The model can also help clinicians develop appropriate treatment strategies for this subgroup of patients by first entering the other parameters of a particular patient and then examining the change in 5-year survival under different treatments. For example, for a married male MBI patient aged 65-74 years with grade III, T2N3M0 lung adenocarcinoma located in the upper lobe, the predicted 5-year OS was 19.07% with chemoradiotherapy, 23.83% with surgery alone, 35.51% with induction therapy followed by surgery, and 31.28% with initial surgery followed by adjuvant treatment.

See the original post here:
Exploration and machine learning model development for T2 NSCLC with bronchus infiltration and obstructive ... - Nature.com

Putting AI into the hands of people with problems to solve – MIT News

As Media Lab students in 2010, Karthik Dinakar SM '12, PhD '17 and Birago Jones SM '12 teamed up for a class project to build a tool that would help content moderation teams at companies like Twitter (now X) and YouTube. The project generated a huge amount of excitement, and the researchers were invited to give a demonstration at a cyberbullying summit at the White House; they just had to get the thing working.

The day before the White House event, Dinakar spent hours trying to put together a working demo that could identify concerning posts on Twitter. Around 11 p.m., he called Jones to say he was giving up.

Then Jones decided to look at the data. It turned out Dinakar's model was flagging the right types of posts, but the posters were using teenage slang terms and other indirect language that Dinakar didn't pick up on. The problem wasn't the model; it was the disconnect between Dinakar and the teens he was trying to help.

"We realized then, right before we got to the White House, that the people building these models should not be folks who are just machine-learning engineers," Dinakar says. "They should be people who best understand their data."

The insight led the researchers to develop point-and-click tools that allow nonexperts to build machine-learning models. Those tools became the basis for Pienso, which today is helping people build large language models for detecting misinformation, human trafficking, weapons sales, and more, without writing any code.

"These kinds of applications are important to us because our roots are in cyberbullying and understanding how to use AI for things that really help humanity," says Jones.

As for the early version of the system shown at the White House, the founders ended up collaborating with students at nearby schools in Cambridge, Massachusetts, to let them train the models.

"The models those kids trained were so much better and more nuanced than anything I could've ever come up with," Dinakar says. "Birago and I had this big 'Aha!' moment where we realized empowering domain experts, which is different from democratizing AI, was the best path forward."

A project with purpose

Jones and Dinakar met as graduate students in the Software Agents research group of the MIT Media Lab. Their work on what became Pienso started in Course 6.864 (Natural Language Processing) and continued until they earned their master's degrees in 2012.

It turned out 2010 wasn't the last time the founders were invited to the White House to demo their project. The work generated a lot of enthusiasm, but the founders worked on Pienso part time until 2016, when Dinakar finished his PhD at MIT and deep learning began to explode in popularity.

"We're still connected to many people around campus," Dinakar says. "The exposure we had at MIT, the melding of human and computer interfaces, widened our understanding. Our philosophy at Pienso couldn't be possible without the vibrancy of MIT's campus."

The founders also credit MIT's Industrial Liaison Program (ILP) and Startup Accelerator (STEX) for connecting them to early partners.

One early partner was Sky UK. The company's customer success team used Pienso to build models to understand their customers' most common problems. Today those models are helping to process half a million customer calls a day, and the founders say they have saved the company over 7 million pounds to date by shortening the length of calls into the company's call center.

"The difference between democratizing AI and empowering people with AI comes down to who understands the data best: you, or a doctor, or a journalist, or someone who works with customers every day?" Jones says. "Those are the people who should be creating the models. That's how you get insights out of your data."

In 2020, just as Covid-19 outbreaks began in the U.S., government officials contacted the founders to use their tool to better understand the emerging disease. Pienso helped experts in virology and infectious disease set up machine-learning models to mine thousands of research articles about coronaviruses. Dinakar says they later learned the work helped the government identify and strengthen critical supply chains for drugs, including the popular antiviral remdesivir.

"Those compounds were surfaced by a team that did not know deep learning but was able to use our platform," Dinakar says.

Building a better AI future

Because Pienso can run on internal servers and cloud infrastructure, the founders say it offers an alternative for businesses being forced to donate their data by using services offered by other AI companies.

"The Pienso interface is a series of web apps stitched together," Dinakar explains. "You can think of it like an Adobe Photoshop for large language models, but in the web. You can point and import data without writing a line of code. You can refine the data, prepare it for deep learning, analyze it, give it structure if it's not labeled or annotated, and you can walk away with a fine-tuned large language model in a matter of 25 minutes."

Earlier this year, Pienso announced a partnership with GraphCore, which provides a faster, more efficient computing platform for machine learning. The founders say the partnership will further lower barriers to leveraging AI by dramatically reducing latency.

"If you're building an interactive AI platform, users aren't going to have a cup of coffee every time they click a button," Dinakar says. "It needs to be fast and responsive."

The founders believe their solution is enabling a future where more effective AI models are developed for specific use cases by the people who are most familiar with the problems they are trying to solve.

"No one model can do everything," Dinakar says. "Everyone's application is different, their needs are different, their data is different. It's highly unlikely that one model will do everything for you. It's about bringing a garden of models together and allowing them to collaborate with each other and orchestrating them in a way that makes sense, and the people doing that orchestration should be the people who understand the data best."

See the article here:
Putting AI into the hands of people with problems to solve - MIT News

Untangling Truths and Myths of Machine Learning – TechiExpert.com

In the tech world, people often use words like AI and Machine Learning like they mean the same thing, but they don't. This mix-up causes problems, especially when businesses try to use Machine Learning in their operations.

Sure, AI sounds impressive. It conjures images of futuristic robots and advanced intelligence. But here is the truth: most of what we call AI today is not really that intelligent. It is mostly about doing math and guessing what might happen, rather than thinking like a person.

The problem arises when businesses buy into the hype without understanding what they are getting into. They hear AI and think it is a magic bullet that will solve all their problems. But the reality is far from it. Many Machine Learning projects never make it past the modeling phase, let alone into actual deployment where the value lies.

Take self-driving cars, for example. A few years ago, they were touted as the future of transportation, but now? They are more like this decade's jetpack: cool in theory, but far from reality. Why? Because we did not realize how hard it would be to put these things into action.

And it is not just self-driving cars. Across industries, businesses struggle to deploy Machine Learning models because they lack the proper infrastructure or simply don't understand the value.

But here is the deal: things can be different. If businesses plan well and know what Machine Learning can and can't do, they can use these models successfully and get good results. It is not simple, but it can happen.

So let us stop making AI sound fancy and start thinking about what is important: using Machine Learning to actually help businesses in a practical way.

Go here to read the rest:
Untangling Truths and Myths of Machine Learning - TechiExpert.com

Transfer learning with graph neural networks for improved molecular property prediction in the multi-fidelity setting – Nature.com

We start with a brief review of transfer learning and a formal description of our problem setting. This is followed by a section covering the preliminaries of graph neural networks (GNNs), including standard and adaptive readouts, as well as our supervised variational graph autoencoder architecture. Next, we formally introduce the considered transfer learning strategies, while also providing a brief overview of the frequently used approach for transfer learning in deep learning: a two-stage learning mechanism consisting of pre-training and fine-tuning of a part or the whole (typically non-geometric) neural network [14]. In the Results section, we perform an empirical study validating the effectiveness of the proposed approaches relative to the latter and state-of-the-art baselines for learning with multi-fidelity data.

Let \(\mathcal{X}\) be an instance space and \(X=\{x_1,\ldots,x_n\}\subset\mathcal{X}\) a sample from some marginal distribution \(\rho_{\mathcal{X}}\). A tuple \(\mathcal{D}=(\mathcal{X},\rho_{\mathcal{X}})\) is called a domain. Given a specific domain \(\mathcal{D}\), a task \(\mathcal{T}\) consists of a label space \(\mathcal{Y}\) and an objective predictive function \(f:\mathcal{X}\to\mathcal{Y}\) that is unknown and needs to be learnt from training data given by examples \((x_i,y_i)\in\mathcal{X}\times\mathcal{Y}\) with \(i=1,\ldots,n\). To simplify the presentation, we restrict ourselves to the setting where there is a single source domain \(\mathcal{D}_S\) and a single target domain \(\mathcal{D}_T\). We also assume that \(\mathcal{X}_T\subseteq\mathcal{X}_S\), and denote with \(\mathcal{D}_S=\{(x_{S_1},y_{S_1}),\ldots,(x_{S_n},y_{S_n})\}\) and \(\mathcal{D}_T=\{(x_{T_1},y_{T_1}),\ldots,(x_{T_m},y_{T_m})\}\) the observed examples from the source and target domains. While the source domain task is associated with low-fidelity data, the target domain task is considered to be sparse and high-fidelity, i.e., it holds that \(m\ll n\).

Transfer learning is defined as follows [54,55]. Given a source domain \(\mathcal{D}_S\) and a learning task \(\mathcal{T}_S\), a target domain \(\mathcal{D}_T\) and learning task \(\mathcal{T}_T\), transfer learning aims to help improve the learning of the target predictive function \(f_T\) in \(\mathcal{D}_T\) using the knowledge in \(\mathcal{D}_S\) and \(\mathcal{T}_S\), where \(\mathcal{D}_S\neq\mathcal{D}_T\) or \(\mathcal{T}_S\neq\mathcal{T}_T\).

The goal in our problem setting is, thus, to learn the objective function \(f_T\) in the target domain \(\mathcal{D}_T\) by leveraging the knowledge from the low-fidelity domain \(\mathcal{D}_S\). The main focus is on devising a transfer learning approach for graph neural networks based on feature representation transfer. We propose extensions for two different learning settings: transductive and inductive learning. In the transductive transfer learning setup considered here, the target domain is constrained to the set of instances observed in the source dataset, i.e., \(\mathcal{X}_T\subseteq\mathcal{X}_S\). Thus, the task in the target domain requires us to make predictions only at points observed in the source task/domain. In the inductive setting, we assume that source and target domains could differ in the marginal distribution of instances, i.e., \(\rho_{\mathcal{X}_S}\neq\rho_{\mathcal{X}_T}\). For both learning settings, we assume that the source domain dataset is significantly larger as it is associated with low-fidelity simulations/approximations.

Here, we follow the brief description of GNNs from [8]. A graph G is represented by a tuple \(G=(\mathcal{V},\mathcal{E})\), where \(\mathcal{V}\) is the set of nodes (or vertices) and \(\mathcal{E}\subseteq\mathcal{V}\times\mathcal{V}\) is the set of edges. Here, we assume that the nodes are associated with feature vectors \(\mathbf{x}_u\) of dimension d for all \(u\in\mathcal{V}\). The graph structure is represented by A, the adjacency matrix of a graph G such that \(A_{uv}=1\) if \((u,v)\in\mathcal{E}\) and \(A_{uv}=0\) otherwise. For a node \(u\in\mathcal{V}\), the set of neighbouring nodes is denoted by \(\mathcal{N}_u=\{v\mid(u,v)\in\mathcal{E}\vee(v,u)\in\mathcal{E}\}\). Assume also that a collection of graphs with corresponding labels \(\{(G_i,y_i)\}_{i=1}^{n}\) has been sampled independently from a target probability measure defined over \(\mathcal{G}\times\mathcal{Y}\), where \(\mathcal{G}\) is the space of graphs and \(\mathcal{Y}\subset\mathbb{R}\) is the set of labels. From now on, we consider that a graph G is represented by a tuple \((X_G,A_G)\), with \(X_G\) denoting the matrix with node features as rows and \(A_G\) the adjacency matrix. The inputs of graph neural networks consist of such tuples, outputting predictions over the label space. In general, GNNs learn permutation-invariant hypotheses that give consistent predictions for the same graph when presented with permuted nodes. This property is achieved through neighbourhood aggregation schemes and readouts that give rise to permutation-invariant hypotheses. Formally, a function f defined over a graph G is called permutation invariant if, for any permutation matrix P, \(f(PX_G,PA_GP^{\top})=f(X_G,A_G)\). The node features \(X_G\) and the graph structure (adjacency matrix) \(A_G\) are used to first learn representations of nodes \(\mathbf{h}_v\), for all \(v\in\mathcal{V}\). Permutation invariance in the neighbourhood aggregation schemes is enforced by employing standard pooling functions: sum, mean, or maximum. As succinctly described in [56], typical neighbourhood aggregation schemes characteristic of GNNs can be described by two steps:

$$\mathbf{a}_v^{(k)}=\operatorname{AGGREGATE}\left(\left\{\mathbf{h}_u^{(k-1)}\;\middle|\;u\in\mathcal{N}_v\right\}\right)\quad\text{and}\quad\mathbf{h}_v^{(k)}=\operatorname{COMBINE}\left(\mathbf{h}_v^{(k-1)},\,\mathbf{a}_v^{(k)}\right)$$

(1)

where \(\mathbf{h}_u^{(k)}\) is the representation of node \(u\in\mathcal{V}\) at the output of the k-th iteration.

After k iterations, the representation of a node captures the information contained in its k-hop neighbourhood. For graph-level tasks such as molecular prediction, the last iteration is followed by a readout (also called pooling) function that aggregates the node features \(\mathbf{h}_v\) into a graph representation \(\mathbf{h}_G\). To enforce permutation-invariant hypotheses, it is again common to employ the standard pooling functions as readouts, namely sum, mean, or maximum.
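
To make the aggregation and readout steps concrete, the following is a minimal dense-adjacency sketch in PyTorch: sum aggregation, a single linear COMBINE layer, and a sum readout. It is only an illustration of the generic scheme in Eq. (1), not the architecture used in this work.

```python
# Minimal illustration of Eq. (1): sum AGGREGATE + linear COMBINE, followed by
# a sum readout to obtain a graph-level representation (not the paper's model).
import torch
import torch.nn as nn

class SimpleMPLayer(nn.Module):
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.combine = nn.Linear(2 * in_dim, out_dim)

    def forward(self, H: torch.Tensor, A: torch.Tensor) -> torch.Tensor:
        # AGGREGATE: sum the representations of neighbouring nodes (A is the adjacency matrix).
        a = A @ H
        # COMBINE: mix each node's previous representation with its aggregated neighbourhood.
        return torch.relu(self.combine(torch.cat([H, a], dim=-1)))

# Toy graph: 4 nodes, 8-dimensional features, symmetric adjacency.
X = torch.randn(4, 8)
A = torch.tensor([[0., 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]])

layer = SimpleMPLayer(8, 16)
h = layer(X, A)            # node representations after one aggregation step
h_graph = h.sum(dim=0)     # sum readout -> graph-level embedding
```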

Standard readout functions (i.e., sum, mean, and maximum) in graph neural networks do not have any parameters and are, thus, not amenable to transfer learning between domains. Motivated by this, we build on our recent work [8] that proposes a neural network architecture to aggregate learnt node representations into graph embeddings. This allows for freezing the part of a GNN architecture responsible for learning effective node representations and fine-tuning the readout layer in small-sample downstream tasks. In the remainder of the section, we present a Set Transformer readout that retains the permutation invariance property characteristic of standard pooling functions. Henceforth, suppose that after completing a pre-specified number of neighbourhood aggregation iterations, the resulting node features are collected into a matrix \(\mathbf{H}\in\mathbb{R}^{M\times D}\), where M is the maximal number of nodes that a graph can have in the dataset and D is the dimension of the output node embedding. For graphs with fewer than M vertices, H is padded with zeros.

Recently, an attention-based neural architecture for learning on sets has been proposed by Lee et al. [57]. The main difference compared to the classical attention model proposed by Vaswani et al. [9] is the absence of positional encodings and dropout layers. As graphs can be seen as sets of nodes, we leverage this architecture as a readout function in graph neural networks. For the sake of brevity, we omit the details of classical attention models [9] and summarise only the adaptation to sets (and thus graphs). The Set Transformer (ST) takes as input matrices with set items (in our case, graph nodes) as rows and generates graph representations by composing encoder and decoder modules implemented using attention:

$$\operatorname{ST}(\mathbf{H})=\frac{1}{K}\sum_{k=1}^{K}\left[\operatorname{Decoder}\left(\operatorname{Encoder}\left(\mathbf{H}\right)\right)\right]_{k}$$

(2)

where \(\left[\cdot\right]_{k}\) refers to a computation specific to head k of a multi-head attention module. The encoder-decoder modules follow the definition of Lee et al. [57]:

$$\operatorname{Encoder}\left(\mathbf{H}\right)=\operatorname{MAB}^{n}\left(\mathbf{H},\,\mathbf{H}\right)$$

(3)

$$\operatorname{Decoder}\left(\mathbf{Z}\right)=\operatorname{FF}\left(\operatorname{MAB}^{m}\left(\operatorname{PMA}\left(\mathbf{Z}\right),\,\operatorname{PMA}\left(\mathbf{Z}\right)\right)\right)$$

(4)

$$\operatorname{PMA}\left(\mathbf{Z}\right)=\operatorname{MAB}\left(\mathbf{s},\,\operatorname{FF}\left(\mathbf{Z}\right)\right)$$

(5)

$$\operatorname{MAB}\left(\mathbf{X},\,\mathbf{Y}\right)=\mathbf{A}+\operatorname{FF}\left(\mathbf{A}\right)$$

(6)

$$\mathbf{A}=\mathbf{X}+\operatorname{MultiHead}\left(\mathbf{X},\,\mathbf{Y},\,\mathbf{Y}\right).$$

(7)

Here, H denotes the node features after neighbourhood aggregation and Z is the encoder output. The encoder is a chain of n classical multi-head attention blocks (MAB) without positional encodings. The decoder component consists of a pooling by multi-head attention block (PMA) (which uses a learnable seed vector s within a multi-head attention block to create an initial readout vector) that is further processed via a chain of m self-attention modules and a linear projection block (also called feedforward, FF). In contrast to typical set-based neural architectures that process individual items in isolation (most notably deep sets [58]), the presented adaptive readouts account for interactions between all the node representations generated by the neighbourhood aggregation scheme. A particularity of this architecture is that the dimension of the graph representation can be disentangled from the node output dimension and the aggregation scheme.
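
As a rough sketch of Eqs. (2)-(7), the PyTorch module below implements a PMA-style adaptive readout with a learnable seed vector and multi-head attention over the node embeddings. It omits the full encoder stack and the averaging over K decoder heads, so it should be read as a simplified illustration under those assumptions rather than the exact architecture of refs. [8,57].

```python
# Simplified PMA-style adaptive readout (cf. Eqs. (2)-(7)); a single learnable
# seed attends over node embeddings. Not the full Set Transformer of refs. 8/57.
import torch
import torch.nn as nn

class PMAReadout(nn.Module):
    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.seed = nn.Parameter(torch.randn(1, 1, dim))        # the seed vector s
        self.mha = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, H: torch.Tensor) -> torch.Tensor:
        # H: (batch, max_nodes, dim) node embeddings (zero-padded for smaller graphs).
        s = self.seed.expand(H.size(0), -1, -1)
        attn, _ = self.mha(s, self.ff(H), self.ff(H))
        A = s + attn                        # A = X + MultiHead(X, Y, Y), cf. Eq. (7)
        return (A + self.ff(A)).squeeze(1)  # MAB(X, Y) = A + FF(A), cf. Eq. (6)

readout = PMAReadout(dim=64)
graphs = torch.randn(8, 30, 64)             # batch of 8 padded graphs
graph_embeddings = readout(graphs)           # shape: (8, 64)
```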

We start with a review of variational graph autoencoders (VGAEs), originally proposed by Kipf and Welling [59], and then introduce a variation that allows for learning a predictive model operating in the latent space of the encoder. More specifically, we propose to jointly train the autoencoder together with a small predictive model (multi-layer perceptron) operating in its latent space by including an additional loss term that accounts for the target labels. Below, we follow the brief description of [6].

A variational graph autoencoder consists of a probabilistic encoder and decoder, with several important differences compared to standard architectures operating on vector-valued inputs. The encoder component is obtained by stacking graph convolutional layers to learn the parameter matrices \(\boldsymbol{\mu}\) and \(\boldsymbol{\sigma}\) that specify the Gaussian distribution of a latent-space encoding. More formally, we have that

$$q(\mathbf{Z}\mid\mathbf{X},\,\mathbf{A})=\prod_{i=1}^{N}q(\mathbf{z}_{i}\mid\mathbf{X},\mathbf{A})\quad\text{and}\quad q(\mathbf{z}_{i}\mid\mathbf{X},\,\mathbf{A})=\mathcal{N}\!\left(\mathbf{z}_{i}\mid\boldsymbol{\mu}_{i},\,\operatorname{diag}(\boldsymbol{\sigma}_{i}^{2})\right),$$

(8)

with \(\boldsymbol{\mu}=\operatorname{GCN}_{\mu,n}(\mathbf{X},\mathbf{A})\) and \(\log\boldsymbol{\sigma}=\operatorname{GCN}_{\sigma,n}(\mathbf{X},\mathbf{A})\). Here, \(\operatorname{GCN}_{\mu,n}\) is a graph convolutional neural network with n layers, X is a node feature matrix, A is the adjacency matrix of the graph, and \(\mathcal{N}\) denotes the Gaussian distribution. Moreover, the model typically assumes the existence of self-loops, i.e., the diagonal of the adjacency matrix consists of ones.

The decoder reconstructs the entries in the adjacency matrix by passing the inner product between latent variables through the logistic sigmoid. More formally, we have that

$$p(\mathbf{A}\mid\mathbf{Z})=\prod_{i=1}^{N}\prod_{j=1}^{N}p(\mathbf{A}_{ij}\mid\mathbf{z}_{i},\,\mathbf{z}_{j})\quad\text{and}\quad p(\mathbf{A}_{ij}=1\mid\mathbf{z}_{i},\,\mathbf{z}_{j})=\tau(\mathbf{z}_{i}^{\top}\mathbf{z}_{j}),$$

(9)

where \(\mathbf{A}_{ij}\) are entries in the adjacency matrix A and \(\tau(\cdot)\) is the logistic sigmoid function. A variational graph autoencoder is trained by optimising the evidence lower-bound loss function, which can be seen as the combination of a reconstruction and a regularisation term:

$$\tilde{\mathcal{L}}(\mathbf{X},\,\mathbf{A})=\underbrace{\mathbb{E}_{q(\mathbf{Z}\mid\mathbf{X},\mathbf{A})}\left[\log p(\mathbf{A}\mid\mathbf{Z})\right]}_{\mathcal{L}_{\mathrm{RECON}}}-\underbrace{\operatorname{KL}\left[q(\mathbf{Z}\mid\mathbf{X},\mathbf{A})\,\|\,p(\mathbf{Z})\right]}_{\mathcal{L}_{\mathrm{REG}}}$$

(10)

where \(\operatorname{KL}\left[q(\cdot)\,\|\,p(\cdot)\right]\) is the Kullback-Leibler divergence between the variational distribution \(q(\cdot)\) and the prior \(p(\cdot)\). The prior is assumed to be a Gaussian distribution given by \(p(\mathbf{Z})=\prod_{i}p(\mathbf{z}_{i})=\prod_{i}\mathcal{N}(\mathbf{z}_{i}\mid 0,\,\mathbf{I})\). As the adjacency matrices of graphs are typically sparse, instead of taking all the negative entries when training, one typically performs sub-sampling of entries with \(\mathbf{A}_{ij}=0\).
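
For orientation, a minimal dense-adjacency VGAE can be sketched as follows: a shared graph-convolution layer, two heads producing the mean and log-variance (cf. Eq. (8)), and an inner-product decoder (cf. Eq. (9)). The layer sizes and the use of a simple normalised-adjacency GCN are assumptions made for illustration; this is not the exact architecture used in the paper.

```python
# Minimal dense-adjacency VGAE sketch (cf. Eqs. (8)-(9)); simplified illustration,
# not the paper's exact implementation.
import torch
import torch.nn as nn

def normalize_adj(A: torch.Tensor) -> torch.Tensor:
    """Symmetrically normalise the adjacency matrix after adding self-loops."""
    A_hat = A + torch.eye(A.size(0))
    d = A_hat.sum(dim=1)
    D_inv_sqrt = torch.diag(d.pow(-0.5))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

class VGAE(nn.Module):
    def __init__(self, in_dim: int, hidden_dim: int, latent_dim: int):
        super().__init__()
        self.shared = nn.Linear(in_dim, hidden_dim)
        self.mu_head = nn.Linear(hidden_dim, latent_dim)      # plays the role of GCN_mu
        self.logvar_head = nn.Linear(hidden_dim, latent_dim)  # plays the role of GCN_sigma

    def encode(self, X, A_norm):
        h = torch.relu(A_norm @ self.shared(X))
        return A_norm @ self.mu_head(h), A_norm @ self.logvar_head(h)

    def decode(self, Z):
        # Inner-product decoder: sigmoid(z_i^T z_j) gives edge probabilities.
        return torch.sigmoid(Z @ Z.t())

    def forward(self, X, A):
        A_norm = normalize_adj(A)
        mu, logvar = self.encode(X, A_norm)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterisation
        return self.decode(z), mu, logvar
```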

We extend this neural architecture by adding a feedforward component operating on the latent space and account for its effectiveness via the mean squared error loss term that is added to the optimisation objective. More specifically, we optimise the following loss function:

$$\mathcal{L}(\mathbf{X},\,\mathbf{A},\,\mathbf{y})=\tilde{\mathcal{L}}(\mathbf{X},\,\mathbf{A})+\frac{1}{N}\sum_{i=1}^{N}\left\|\nu(\mathbf{Z}_{i})-\mathbf{y}_{i}\right\|^{2},$$

(11)

where \(\nu(\mathbf{Z})\) is the predictive model operating on the latent space embedding Z associated with graph (X, A), y is the vector with target labels, and N is the number of labelled instances. Figure 2 illustrates the setting and our approach to transfer learning using supervised variational graph autoencoders.
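
The combined objective in Eq. (11) amounts to adding a supervised mean-squared-error term to the (negative) ELBO. The sketch below shows this combination; the graph-level latent `z_graph`, the small MLP `nu`, and the equal weighting of the three terms are illustrative assumptions, not details taken from the paper.

```python
# Sketch of the supervised VGAE objective (Eq. (11)): reconstruction + KL + MSE.
# `nu` is a small MLP operating on a graph-level latent representation `z_graph`.
import torch
import torch.nn.functional as F

def supervised_vgae_loss(adj_pred, adj_true, mu, logvar, z_graph, y, nu):
    recon = F.binary_cross_entropy(adj_pred, adj_true)              # reconstruction term
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())   # KL regularisation term
    supervised = F.mse_loss(nu(z_graph), y)                         # label term of Eq. (11)
    return recon + kl + supervised
```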

We note that our supervised variational graph autoencoder resembles the joint property prediction variational autoencoder (JPP-VAE) proposed by Gómez-Bombarelli et al. [39]. Their approach was devised for generative purposes, which we do not consider here. The main difference to our approach, however, is the fact that JPP-VAE is a sequence model trained directly on the SMILES [60] string representation of molecules using recurrent neural networks, a common approach in generative models [61,62]. The transition from traditional VAEs to geometric deep learning (graph data) in the first place, and then to molecular structures, is not a trivial process for at least two reasons. Firstly, a variational graph autoencoder only reconstructs the graph connectivity information (i.e., the equivalent of the adjacency matrix) and not the node (atom) features, according to the original definition by Kipf and Welling. This is in contrast to traditional VAEs, where the latent representation is directly optimised against the actual input data. The balance between reconstruction functions (for the connectivity and the node features, respectively) is thus an open question in geometric deep learning. Secondly, for molecule-level tasks such as prediction and latent space representation, the readout function of the variational graph autoencoder is crucial. As we have previously explored in [8] and further validate in the Results section, standard readout functions such as sum, mean, or maximum lead to uninformative representations that are similar to completely unsupervised training (i.e., not performing well in transfer learning tasks). Thus, the supervised or guided variational graph autoencoders presented here are also an advancement in terms of graph representation learning for modelling challenging molecular tasks at the multi-million scale.

In the context of quantum chemistry and the design of molecular materials, the most computationally demanding task corresponds to the calculation of an energy contribution that constitutes only a minor fraction of the total energy, while the majority of the remaining calculations can be accounted for via efficient proxies [28]. Motivated by this, Ramakrishnan et al. [28] have proposed an approach known as Δ-machine learning, where the desired molecular property is approximated by learning an additive correction term for a low-fidelity proxy. For linear models, an approach along these lines can be seen as feature augmentation, where instead of the constant bias term one appends the low-fidelity approximation as a component to the original representation of an instance. More specifically, if we represent a molecule in the low-fidelity domain via \(\mathbf{x}\in\mathcal{X}_S\), then the representation transfer for \(\mathcal{D}_T\) can be achieved via the feature mapping

$$\Psi_{\mathrm{Label}}(\mathbf{x})=\Vert\left(\,f_{S}(\mathbf{x}),\,\mathbf{x}\right)$$

(12)

where \(\Vert(\cdot,\cdot)\) denotes concatenation in the last tensor dimension and \(f_S\) is the objective prediction function associated with the source (low-fidelity) domain \(\mathcal{D}_S\) defined in the "Overview of transfer learning and problem setting" section. We consider this approach in the context of transfer learning for general methods (including GNNs) and standard baselines that operate on molecular fingerprints (e.g., support vector machines, random forests, etc.). A limitation of this approach is that it constrains the high-fidelity domain to the transductive setting and to instances that have been observed in the low-fidelity domain. A related set of methods in the drug discovery literature, called high-throughput fingerprints [34,35,36,37], function in effectively the same manner, using a vector of hundreds of experimental single-dose (low-fidelity) measurements and optionally a standard molecular fingerprint as a general molecular representation (i.e., not formulated specifically for transductive or multi-fidelity tasks). In these cases, the burden of collecting the low-fidelity representation is substantial, involving potentially hundreds of experiments (assays) that are often disjoint, resulting in sparse fingerprints and no practical way to make predictions about compounds that have not been part of the original assays. In drug discovery in particular, it is desirable to extend beyond this setting and enable predictions for arbitrary molecules, i.e., outside of the low-fidelity domain. Such a model would enable property prediction for compounds before they are physically synthesised, a paradigm shift compared to existing HTS approaches. To overcome the transductive limitation, we consider a feature augmentation approach that leverages low-fidelity data to learn an approximation of the objective function in that domain. Then, transfer learning to the high-fidelity domain happens via the augmented feature map

$$\Psi_{\mathrm{(Hybrid\ label)}}(\mathbf{x})=\begin{cases}\Vert\left(\,f_{S}(\mathbf{x}),\,\mathbf{x}\right) & \text{if }\mathbf{x}\in\mathcal{X}_{S},\\ \Vert\left(\,\tilde{f}_{S}(\mathbf{x}),\,\mathbf{x}\right) & \text{otherwise,}\end{cases}$$

(13)

where \(\tilde{f}_{S}\) is an approximation of the low-fidelity objective function \(f_{S}\). This is a hybrid approach that allows extending to the inductive setting, with a different treatment of instances observed in the low-fidelity domain and instances associated exclusively with the high-fidelity task. Another possible extension that treats all instances in the high-fidelity domain equally is via the map \(\Psi_{\mathrm{(Predicted\ label)}}\) that augments the input feature representation using an approximate low-fidelity objective \(\tilde{f}_{S}\), i.e.,

$$\Psi_{(\mathrm{Predicted\ label})}(\mathbf{x}) = \parallel \left( \tilde{f}_S(\mathbf{x}),\, \mathbf{x} \right)$$

(14)
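The three augmentation maps above amount to appending a scalar low-fidelity signal to whatever feature vector a downstream model consumes. As a rough illustration only, and not the authors' code, the following minimal NumPy sketch assumes a molecule is already featurised as a 1-D array, that measured low-fidelity labels live in a hypothetical low_fidelity_lookup dictionary, and that low_fidelity_model is any fitted regressor with a scikit-learn-style predict method standing in for $\tilde{f}_S$:

```python
import numpy as np

def augment_label(x, y_low):
    """Psi_Label / Psi_(Predicted label): prepend a (measured or predicted)
    low-fidelity value to the original feature vector (Eqs. 12 and 14)."""
    return np.concatenate([np.atleast_1d(y_low), x])

def augment_hybrid(x, low_fidelity_lookup, low_fidelity_model):
    """Psi_(Hybrid label) (Eq. 13): use the measured low-fidelity label when
    the molecule was observed in the low-fidelity domain, otherwise fall back
    to the approximation produced by a model fitted on the low-fidelity data."""
    key = tuple(x)  # hypothetical identifier; in practice a SMILES string or ID
    if key in low_fidelity_lookup:                 # x observed in X_S
        y_low = low_fidelity_lookup[key]           # measured f_S(x)
    else:                                          # inductive case
        y_low = low_fidelity_model.predict(x.reshape(1, -1))[0]  # approximate f~_S(x)
    return np.concatenate([[y_low], x])
```

In this sketch, the Label map passes the measured value directly, while the Predicted label map simply calls the same helper with the model prediction for every instance.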

Our final feature augmentation amounts to learning a latent representation of molecules in the low-fidelity domain using a supervised autoencoder (see the Supervised variational graph autoencoders section), and then training a high-fidelity model whose own latent representation is concatenated with this fixed low-fidelity embedding. This approach also lends itself to the inductive setting. More formally, transfer learning in this case can be achieved via the feature mapping

$$\Psi_{\mathrm{Embeddings}}(\mathbf{x}) = \parallel \left( \psi_S(\mathbf{x}),\, \psi_T(\mathbf{x}) \right)$$

(15)

where $\psi_S(\mathbf{x})$ is the latent embedding obtained by training a supervised autoencoder on the low-fidelity data $\mathcal{D}_S$, and $\psi_T(\mathbf{x})$ represents the latent representation of a model trained on the sparse high-fidelity task. Note that $\psi_S(\mathbf{x})$ is fixed (the output of the low-fidelity model, which is trained separately), while $\psi_T(\mathbf{x})$ is the current embedding of the high-fidelity model that is being learnt alongside $\psi_S(\mathbf{x})$ and can be updated.
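A minimal PyTorch sketch of the $\Psi_{\mathrm{Embeddings}}$ map is given below; it is an illustration under our own assumptions rather than the published implementation. Here psi_s stands for the encoder pre-trained on $\mathcal{D}_S$ (kept fixed), psi_t for the trainable high-fidelity encoder, and the module names and head sizes are hypothetical:

```python
import torch
import torch.nn as nn

class EmbeddingAugmentedModel(nn.Module):
    """Sketch of the Psi_Embeddings map (Eq. 15): a frozen low-fidelity
    encoder psi_S is concatenated with a trainable high-fidelity encoder
    psi_T, and the joint representation feeds a small regression head."""

    def __init__(self, psi_s: nn.Module, psi_t: nn.Module, dim_s: int, dim_t: int):
        super().__init__()
        self.psi_s = psi_s                    # pre-trained on the low-fidelity data
        for p in self.psi_s.parameters():     # psi_S stays fixed during training
            p.requires_grad = False
        self.psi_t = psi_t                    # learnt on the high-fidelity task
        self.head = nn.Sequential(
            nn.Linear(dim_s + dim_t, 128), nn.ReLU(), nn.Linear(128, 1)
        )

    def forward(self, x):
        with torch.no_grad():
            z_s = self.psi_s(x)               # fixed low-fidelity embedding
        z_t = self.psi_t(x)                   # current high-fidelity embedding
        return self.head(torch.cat([z_s, z_t], dim=-1))
```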

Supervised pre-training and fine-tuning is a transfer learning strategy that has previously proven successful for non-graph neural networks in the context of energy prediction for small organic molecules. In its simplest form, as previously used by Smith et al.14, the strategy consists of first training a model on the low-fidelity data $\mathcal{D}_S$ (the pre-training step). Afterwards, the model is retrained on the high-fidelity data $\mathcal{D}_T$, such that it now outputs predictions at the desired fidelity level (the fine-tuning step). For the fine-tuning step, certain layers of the neural network are typically frozen, meaning that gradient computation is disabled for them. In other words, their weights are fixed to the values learnt during the pre-training step and are not updated. This technique reduces the number of learnable parameters, thus helping to avoid over-fitting to a smaller high-fidelity dataset and reducing training times. Formally, we assume that we have a low-fidelity predictor $\tilde{f}_S$ (corresponding to pre-training) and define the steps required to re-train or fine-tune a blank model $\tilde{f}_{T_0}$ (in domain $\mathcal{T}$) into a high-fidelity predictor $\tilde{f}_T$

$$\mathbf{W}_S = \mathrm{Weights}(\tilde{f}_S) \quad (\text{Extract weights of the pre-trained model } \tilde{f}_S)$$

(16)

$$\mathbf{W}_S = \mathrm{Freeze}(\mathbf{W}_{S_{\mathrm{GCN}}}, \ldots) \quad (\text{Freeze components, e.g., GCN layers})$$

(17)

$$\tilde{f}_{T_0} = \mathbf{W}_S \quad (\text{Assign the weights of } \tilde{f}_S \text{ to a blank model } \tilde{f}_{T_0})$$

(18)

where $\tilde{f}_{T_0}$ is fine-tuned into $\tilde{f}_T$. As a baseline, we define a simple equivalent to the neural network in Smith et al., where we pre-train and fine-tune a supervised VGAE model with the sum readout and without any frozen layers. This is justified by GNNs having a small number of layers to avoid well-known problems such as oversmoothing. As such, the entire VGAE is fine-tuned and the strategy is termed (Tune VGAE):

$$\mathbf{W}_S = \mathrm{Freeze}(\varnothing) \quad (\text{No component is frozen})$$

(19)

$$\tilde{f}_{T_0} = \mathbf{W}_S \quad (\text{Assign initial weights})$$

(20)

$$\Psi_{(\mathrm{Tune\ VGAE})}(\mathbf{x}) = \tilde{f}_T(\mathbf{x}) \quad (\text{The final model is the fine-tuned } \tilde{f}_T)$$

(21)
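A compact sketch of this pre-train and fine-tune recipe (Eqs. (16)-(21)), assuming PyTorch and a generic data loader that yields graphs together with their high-fidelity labels, might look as follows; the function and loader names are ours, not the paper's:

```python
import copy
import torch

def fine_tune_vgae(pretrained_model, train_loader_hf, epochs=50, lr=1e-4):
    """Tune VGAE baseline (Eqs. 19-21): copy the weights of the model
    pre-trained on low-fidelity data into a fresh model and fine-tune
    every parameter on the high-fidelity data (nothing is frozen)."""
    model = copy.deepcopy(pretrained_model)          # blank model receives W_S
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.MSELoss()
    for _ in range(epochs):
        for graphs, y_hf in train_loader_hf:         # high-fidelity labels
            optimizer.zero_grad()
            loss = loss_fn(model(graphs).squeeze(-1), y_hf)
            loss.backward()
            optimizer.step()
    return model                                     # the fine-tuned model
```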

Standard GNN readouts such as the sum operator are fixed functions with no learnable parameters. In contrast, adaptive readouts are implemented as neural networks, and the overall GNN becomes a modular architecture composed of (1) the supervised VGAE layers and (2) an adaptive readout. Consequently, there are three possible ways to freeze components at this level: (i) frozen graph convolutional layers and trainable readout, (ii) trainable graph layers and frozen readout, and (iii) trainable graph layers and trainable readout (no freezing). After a preliminary study on a representative collection of datasets, we decided to follow strategy (i) due to empirically strong results and overall originality for transfer learning with graph neural networks. More formally, we have that

$$\mathbf{W}_S = \mathrm{Freeze}(\mathbf{W}_{S_{\mathrm{GCN}}}) \quad (\text{Freeze all GCN layers})$$

(22)

$$\tilde{f}_{T_0} = \mathbf{W}_S \quad (\text{Assign initial weights})$$

(23)

$$\Psi_{(\mathrm{Tune\ readout})}(\mathbf{x}) = \tilde{f}_T(\mathbf{x}) \quad (\text{The final model is the fine-tuned } \tilde{f}_T)$$

(24)
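In a typical PyTorch implementation, strategy (i) reduces to disabling gradients for the convolutional stack and optimising only the readout. The snippet below is a sketch under the assumption that the model exposes its convolutional layers through a gcn_layers attribute; that attribute name is hypothetical:

```python
import torch

def freeze_gcn_train_readout(model, lr=1e-4):
    """Strategy (i) / Tune readout (Eqs. 22-24): disable gradients for the
    graph convolutional layers and optimise only the adaptive readout
    (and any prediction head) on the high-fidelity data."""
    for p in model.gcn_layers.parameters():   # hypothetical attribute holding the GCN stack
        p.requires_grad = False               # frozen: weights keep their pre-trained values
    trainable = [p for p in model.parameters() if p.requires_grad]
    return torch.optim.Adam(trainable, lr=lr)
```

Passing only the still-trainable parameters to the optimiser mirrors the reduction in learnable parameters described above.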

For drug discovery tasks, low-fidelity (LF) data consists of single-dose measurements (SD, performed at a single concentration) for a large collection of compounds. The high-fidelity (HF) data consists of dose-response (DR) measurements, corresponding to multiple different concentrations, that are available for a small collection of compounds (see Fig. 1, top). In the quantum mechanics experiments, we have opted for the recently released QMugs dataset with 657K unique drug-like molecules and 12 quantum properties. The data originating from semi-empirical GFN2-xTB simulations act as the low-fidelity task, and the high-fidelity component is obtained via density-functional theory (DFT) calculations (ωB97X-D/def2-SVP). The resulting multi-fidelity datasets are defined as datasets where SMILES-encoded molecules are associated with two measurements at different fidelity levels.

As modelling large-scale high-throughput screening data and transfer learning in this context are understudied applications, a significant effort was made to carefully select and filter suitable data from public (PubChem) and proprietary (AstraZeneca) sources, covering a multitude of different settings. To this end, we have assembled several multi-fidelity drug discovery datasets (Fig. 1, top) from PubChem, aiming to capture the heterogeneity intrinsic to large-scale screening campaigns, particularly in terms of assay types, screening technologies, concentrations, scoring metrics, protein targets, and scope. This has resulted in 23 multi-fidelity datasets (Supplementary Table 1) that are now part of the concurrently published MF-PCBA collection29. We have also curated 16 multi-fidelity datasets based on historical AstraZeneca (AZ) HTS data (Supplementary Table 2), the emphasis now being put on expanding the number of compounds in the primary (1 million+) and confirmatory (1,000 to 10,000) screens. The search, selection, and filtering steps, along with the naming convention, are detailed in Supplementary Notes 5 and in ref. 29. As the QMugs dataset contains a few erroneous calculations, we apply a filtering protocol similar to the one used for the drug discovery data and remove values that diverge by more than 5 standard deviations, which removes just over 1% of the molecules present. The QMugs properties are listed in Supplementary Table 3. For the transductive setting, we selected a diverse and challenging set of 10K QMugs molecules (Supplementary Notes 5.1), which resembles the drug discovery setting.
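The 5-standard-deviation filter described above is straightforward to reproduce; the following pandas sketch uses a hypothetical column name and is not taken from the original code base:

```python
import pandas as pd

def filter_outliers(df: pd.DataFrame, column: str, n_std: float = 5.0) -> pd.DataFrame:
    """Drop rows whose value in `column` lies more than n_std standard
    deviations away from the column mean (the protocol described above
    removes roughly 1% of the QMugs molecules)."""
    mean, std = df[column].mean(), df[column].std()
    mask = (df[column] - mean).abs() <= n_std * std
    return df[mask]

# Hypothetical usage:
# qmugs = filter_outliers(qmugs, "dft_total_energy")
```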

While methods to artificially generate multi-fidelity data with desired fidelity correlations have recently been proposed63, we did not pursue this direction as remarkably large collections of real-world multi-fidelity data are available, covering a large range of fidelity correlations and diverse chemical spaces. Furthermore, the successful application of such techniques to molecular data is yet to be demonstrated.

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Read more:
Transfer learning with graph neural networks for improved molecular property prediction in the multi-fidelity setting - Nature.com

AI shows promise but remains limited for heart/stroke care – Idaho Business Review

Artificial intelligence has the potential to change many aspects of cardiovascular care, but not right away, a new report says.

Existing AI and machine-learning digital tools are promising, according to the scientific statement from the American Heart Association. Such tools already have shown they can help screen patients and guide researchers in developing new treatments. The report was published Wednesday in the journal Circulation.

But, the authors said, research hasn't shown that AI-based tools improve care enough to justify their widespread use.

"There is an urgent need to develop programs that will accelerate the education of the science behind AI/machine learning tools, thus accelerating the adoption and creation of manageable, cost-effective, automated processes," Dr. Antonis Armoundas, who led the statement writing committee, said in a news release. He is a principal investigator at the Cardiovascular Research Center at Boston's Massachusetts General Hospital.

"We need more AI/machine learning-based precision medicine tools to help address core unmet needs in medicine that can subsequently be tested in robust clinical trials," said Armoundas, who also is an associate professor of medicine at Harvard Medical School.

The report is the AHA's first scientific statement on artificial intelligence. It looks at the state of research into AI and machine learning in cardiovascular medicine and suggests what may be needed for safe, effective widescale use.

"Here, we present the state of the art, including the latest science regarding specific AI uses, from imaging and wearables to electrocardiography and genetics," Armoundas said.

AI can analyze data and make predictions, typically for narrowly defined tasks. Machine learning uses mathematical models and algorithms to detect patterns in large datasets that may not be evident to human observers alone. Deep learning, a subfield of machine learning, is used in image recognition and interpretation.

Researchers have used such technologies to analyze electronic health records to compare the effectiveness of tests and treatments, and, more recently, to create models that inform care decisions.

The report notes several ways digital tools might help cardiovascular patients.

Imaging, for example, is important for diagnosing heart attacks and strokes. AI and machine-learning tools could address inconsistencies in human interpretation and relieve overburdened experts.

AI has helped automate analysis of electrocardiograms, which measure the heart's electrical activity, by identifying subtle results that human experts might not see.

And with implantable and wearable technologies providing steady streams of health information, AI could help remotely monitor patients and spot when something is amiss.

But the report also spells out many challenges and limits.

With imaging, for example, broad use of AI and machine learning for interpreting tests is challenging because the data available to study is limited. Researchers also need to prove AI technology works in each area where it will be used.

With implantable and wearable tech, the research gaps include ways to identify which patients and conditions may be best for AI- and machine learning-enabled remote monitoring. Other issues include how to address cost-effectiveness, privacy, safety and equitable access.

More broadly, protocols on how information is organized and shared are critical, the report says, and potential ethical, legal and regulatory issues need to be addressed.

And while AI algorithms have enhanced the ability to interpret genetic variants and abnormalities, the writing committee warned of limits. Such algorithms, the committee wrote, still require training on human-derived data that can be error-prone and inaccurate.

Excerpt from:
AI shows promise but remains limited for heart/stroke care - Idaho Business Review

9 Top AI Governance Tools 2024 – eWeek


AI governance tools are software or platforms that help organizations manage and regulate the development, deployment, and use of artificial intelligence (AI) systems. By supporting disciplined AI governance, these tools provide features and functionalities that help organizations implement ethical and responsible AI practices and also create competitive advantage.

We analyzed the best AI governance software for different teams and organizations, their features, pricing, and strengths and weaknesses to help you determine the best tool for your business.

See the high-level feature and pricing comparison of the top-rated artificial intelligence governance tools and software to help you determine the best solution for your business.

IBM Cloud Pak for Data is an integrated data and AI platform that helps organizations accelerate their journey to AI-driven insights. Built on a multicloud architecture, it provides a unified view of data and AI services, enabling data engineers, data scientists, and business analysts to collaborate and build AI models faster.

The platform includes a wide array of governance features, including data cataloging, data lineage, data quality monitoring, and compliance management. IBMs end-to-end governance capabilities allow organizations to govern their AI projects by addressing key concerns such as data privacy, security, compliance, and model explainability.

IBM requires prospective buyers to contact its sales team for custom quotes. However, our research found that the IBM Cloud Pak for Data Standard Option with 48 VPCs costs $19,824 per month, or $237,888 per year. Meanwhile, the IBM Cloud Pak for Data Enterprise Option with 72 VPCs costs $59,400 per month, or $712,800 billed annually.

Further research shows that the IBM Cloud Pak for Data standard edition costs $350 per month per virtual processor core, while the enterprise edition costs $699 per month per virtual processor core. You can also try the tool free of cost for 60 days before making a financial commitment.

Amazon SageMaker offers developers and data scientists an all-in-one integrated development environment (IDE) that lets you build, train, and deploy ML models at scale using tools like notebooks, debuggers, profilers, pipelines, and MLOps.

SageMaker provides various tools and capabilities, including built-in algorithms, a data-labeling feature, model tuning, automatic scaling, and hosting options. It simplifies machine learning workflow, from data preparation to model deployment, and offers an integrated development environment for managing the entire process. The platform enables you to manage and control access to your ML projects, models, and data, ensuring compliance, accountability, and transparency in your ML workflows.

SageMaker integrates with AWS services such as AWS Glue for data integration, AWS Lambda for serverless computing, and Amazon CloudWatch for monitoring and logging.

SageMaker offers two choices for payment: on-demand pricing that offers no minimum fees and no upfront commitments, and the SageMaker savings plans that provide a usage-based pricing model. You can scroll through the platform's pricing page for your actual rate.

Dataiku DSS (Data Science Studio) is a collaborative and end-to-end data science platform that enables data teams to build, deploy, and monitor predictive analytics and machine learning models. It provides a visual interface for data preparation, analysis, and AI modeling, as well as the ability to deploy models into production and monitor their performance.

Dataiku DSS strongly focuses on collaboration, allowing both technical users (coders) and business users (noncoders) to work together on data projects in a shared workspace. This collaborative feature enables data scientists, data analysts, business analysts, and AI consumers to contribute their expertise and insights to the data projects.

As part of its AI governance initiative, Dataiku Govern centralizes the tracking of multiple data initiatives, ensuring that the proper workflows and processes are in place to deliver Responsible AI. This centralized oversight is significant as companies scale their AI footprint and embark on generative AI initiatives, as it helps maintain visibility into the various projects and reduces the risk of potential issues.

The following are the different plans offered by Dataiku. To get your actual rate, contact the company for a custom quote. The company also offers a 14-day free trial.

Dataiku DSS offers several capabilities, including data preparation, data visualization, machine learning, DataOps, MLOps, analytic apps, collaboration, governance, explainability, and architecture. It also supports functionalities through plug-ins and connectors such as OpenAI GPT, Geo Router, GDPR, Splunk, Collibra Connector, and more.

Azure Machine Learning supports AI governance by providing tools, services, and frameworks that streamline the machine learning process, from data preparation and model training to deployment and monitoring, enabling data scientists and developers to build, train, and deploy machine learning models at scale.

Azure Machine Learning provides tools and features that enable users to implement Responsible AI practices in their machine learning projects. This includes features such as model interpretability, fairness, and transparency that help data scientists and developers understand and mitigate potential biases, ensure the ethical use of their models, and maintain transparency in the decision-making process.

Azure Machine Learning responsible AI is built on six core principles: fairness, reliability and safety, privacy and security, inclusiveness, transparency, and accountability in machine learning models and processes. These principles are, in essence, the core of AI governance.

Azure offers three pricing options: pay-as-you-go, Azure savings plan for compute, and reservations. You can consult the Azure pricing table to learn about your rates or contact the company's sales team for personalized quotes.

Datatron MLOps offers an AI model monitoring and AI governance platform that helps organizations manage and optimize their MLOps. The platform provides robust monitoring and tracking features to ensure that models perform as expected and meet compliance standards. This includes real-time model performance monitoring, data drift identification, and setting up alerts and notifications for any anomalies or deviations.

The platform provides a unified dashboard to monitor the performance and health of deployed models in real time, allowing organizations to identify and address issues proactively. Datatron's explainability capability plays a critical role in risk management and compliance. It provides insights into how AI models make decisions, enabling organizations to understand and evaluate the potential biases or risks associated with these judgments.

This helps businesses ensure fairness, transparency, and accountability in their AI systems, which is particularly important in regulated industries.

Contact the company for a personalized quote.

Qlik Staige is an AI governance solution enabling AI-powered analytics, allowing businesses to create dynamic visualizations, generate natural language readouts that provide easy-to-understand summaries, and enable interactive, conversation-based experiences with data.

The platform empowers businesses to harness the capability of AI while maintaining control, security, and governance over their AI models and data. It provides data integration, quality assurance, and transformation capabilities to create AI-ready datasets. The tool facilitates the automation of machine learning processes, allowing analytics teams to generate predictions with explainability and integrate models in real time for comprehensive what-if analysis.

Monitaur facilitates orchestration and collaboration among various teams and stakeholders involved in the AI model development and deployment process, including ML engineers, data scientists, compliance officers, underwriters, and executive decision-makers. Monitaur helps organizations demonstrate that their AI models are compliant and trustworthy by centralizing governance processes and providing a library of standard policy controls.

The platform is especially beneficial for regulated industries with strict standards and compliance requirements. Thanks to its centrally managed library, organizations in these industries can adhere to regulations and easily demonstrate compliance.

Contact the company for a personalized quote.

Holistic AIs Governance Platform offers a range of features and functionalities to address the various aspects of AI governance, including risk management and compliance. The platform enables organizations to conduct comprehensive audits of their AI systems and generates detailed audit reports documenting the systems performance, vulnerabilities, and areas requiring improvement. The reporting functionality also includes context-specific impact analysis to understand the implications of AI systems on business processes and stakeholders.

Holistic AI supports regulation-specific assessments, ensuring your AI systems comply with relevant laws and regulations. It helps you map, mitigate, and monitor risks associated with specific rules, enabling you to comply.

Contact the company for quotes.

The Credo AI governance platform caters to the needs of AI-powered enterprises by offering features such as a centralized repository of AI metadata, a risk center for visualizing AI risk and value, and automated governance reports for building trust with stakeholders. It also offers an AI registry for tracking AI initiatives, and an AI governance workspace for collaboration on AI use cases.

Credo AI generates automated governance reports, including model cards, impact assessments, reports, and dashboards, which can be shared with executives, board members, customers, and regulators to build trust and transparency around AI initiatives. The company's AI registry feature provides visibility into the risk and value of all AI projects by registering them and capturing metadata to prioritize projects based on revenue potential, impact, and risk.

Available upon request.

Many factors help determine the best AI governance software for your business. Some solutions excel in data and AI privacy regulations, while others are well suited for setting compliance standards, ethical guidelines, or risk assessment.

When shopping for the best AI governance solution, you should look for software that offers features such as data governance, model management, compliance automation, and monitoring capabilities. Depending on the nature of your business, you may need industry-specific AI governance software tailored to meet your sectors unique requirements.

For example, healthcare organizations may need software compliant with HIPAA regulations, while financial institutions may require fraud detection and risk assessment tools. Conduct thorough research, evaluate your options, and consider your needs and budget to determine the best AI governance software for your business.

We looked at the cost of the software and whether it provides value for the price. Tools that offer free trials and transparent pricing earned higher marks in this category.

The feature set of the AI governance software was a significant factor in our evaluation. We assessed the range of features offered by each platform.

We also considered whether the software could be customized to meet the specific needs of different organizations.

We assessed the software's user interface and user experience to determine how easy it is for users to navigate, set up, and use the software. We considered whether the software offers intuitive workflows and customization options.

We evaluated the level of customer support the software provider offers, including availability, responsiveness, and expertise. We looked at support channels, documentation, training resources, and user communities.

AI governance practices align with AI ethical considerations by ensuring that AI systems are developed, deployed, and used to uphold ethical principles such as fairness, transparency, accountability, and privacy.

Industries such as financial services, healthcare, and technology are leading the adoption of AI governance, as these sectors often deal with sensitive data and high-stakes decisions where ethical considerations are crucial.

As more and more organizations across various sectors continue to implement artificial intelligence solutions in their workflow, it becomes critical to have AI governance in place to ensure the responsible and ethical use of AI.

If AI is left unchecked, it can quickly become a source of biased decisions, privacy breaches, and other unintended consequences. Therefore, AI governance tools should not be an afterthought but instead an integral part of your company's AI strategy.

For a full portrait of the AI vendors serving a wide array of business needs, read our in-depth guide: 150+ Top AI Companies 2024

See original here:
9 Top AI Governance Tools 2024 - eWeek

AI that designs and runs networks might not be far off – Light Reading

Artificial intelligence (AI) can already be unleashed to write sonnets in the style of Shakespeare or music that evokes Beethoven. But what if an AI could bypass the Open Systems Interconnection (OSI) model, the decades-old system for conceptualizing network design, and come up with a better air interface than anything a human has produced? If it sounds like pure science fiction, think again.

Deep within the darkest laboratories in Sweden, where Ericsson pioneers wireless research, work has already started on potentially taking AI to this next level in network design. "You could let the algorithm figure out a better way," said Erik Ekudden, Ericsson's chief technology officer, at a recent press event in London. Companies such as Qualcomm and Picocom, a small developer of silicon for small cells, are also thought to be exploring the possibilities. It all raises the prospect of network technologies designed entirely by machines, beyond the comprehension of the world's smartest scientists.

As it exists today, the OSI model imagines the network in seven layers, starting with the physical device links and moving all the way up to customer-facing applications. None of this is especially scientific, but it allows even the cleverest specialists to make sense of the whole shebang rather than just understanding their own contributions. "We've done that to make it intelligible to us," said Gabriel Brown, a principal analyst with Heavy Reading (a Light Reading sister company) on a recent Telecoms.com podcast. "But an AI-native thing doesn't have to have those limitations on it."

The technology could feasibly work by combining the advanced pattern-recognition principles of generative AI and large language models with the time series data found in a radio link. "You can start making relations between that," Brown told the Telecoms.com podcast. "You use the same technologies, the same computing ideas, to develop a much more efficient system."

Trusting your AI

Scrapping the OSI model, though, would inevitably conjure alarming thoughts of AI-created technologies that no person can understand, and subsequent Armageddon if the AI goes haywire. "If you apply new AI technologies to rebuild or build a new system, of course it would have to be not a black box but a very open box so that we can check what really goes on," said Ekudden, emphasizing the need for what he calls "trustworthy AI."

Keen to demonstrate a commitment to AI transparency, Ericsson this month added an "Explainable AI" feature to its latest software products. The basic idea is to show a telco how the AI-powered technology reached the conclusions it did. Ekudden, though, sounds unimpressed with broader government efforts in this trustworthiness area. "Current regulation, even after the UK summit, is not very helpful," he said.

Held in November, that summit featured Rishi Sunak, the UK's prime minister, in conversation with Elon Musk, naturally spotlighting generative AI and social media. But non-generative AI has already been used heavily to optimize networks, Ekudden points out. "We cannot go back."

Network designs that go far beyond what people have accomplished would be revolutionary, akin to an AI that fooled literary critics into thinking it was a human novelist with a fresh and unique style, or one that made other scientific breakthroughs. But Ericsson's CTO plays down any likelihood that generative AI can produce something of major value.

"It depends on how good or bad a job we have done as humans, because the beauty of generative AI is that it really mimics humans very well," he said. With today's networks now optimized to a high level, humans are not even the best reference point for newer forms of AI. "Machines are already doing that better," explained Ekudden. "The kind of data-driven machine-learning capabilities that we have employed to build the best coding scheme, the best OSI stack, are pretty good. If you want some level of generative AI to outperform that, you really need to do a good job at generative AI."

He is not the only human doubting AI will have much impact anytime soon. "Broadly, these AI systems work by pattern recognition," said William Webb, the chief technology officer of Access Partnership, a consulting company, and a former director at UK regulatory body Ofcom. "They get trained on thousands to millions of examples which are already labelled as 'good' or 'bad.' They learn what patterns lead to 'good' and to 'bad' and can then influence future operation. But there isn't much labelled data so it's hard to understand what the AI would be trained on," he told Light Reading by email.

Webb is also dubious because the sheer quantity of network variables would require that a huge data set be used for training purposes. "There are good uses of AI in telecom networks, but it's not clear this is one of them," he said.

When machines give the orders

A far more realistic scenario in the next few years is that networks designed and installed by humans will be manageable without them. Much like carmakers, telecom players now refer to five levels of automation. Under definitions established by the TM Forum, a telecom standards group, Level 1 denotes "assisted operations and maintenance," while Level 5 is a "fully autonomous network."

Those may be technically possible in just a few years' time, according to Ekudden. But he sounds unconvinced they should be widely deployed, likening them to "robots on the streets" and self-driving cars outside controlled areas. "Unless you do that in a responsible way, so you are actually creating risks, I don't think it is a good idea to do it, and the same is true for networks," he said.

Nevertheless, Ericsson has already applied AI tools to automate parts of its managed services unit. Back in 2019, before that had been merged with other units to form the current cloud software and services business group, Peter Laurin, then Ericsson's managed services head, held AI responsible for some of the 8,000 job cuts at his unit in the previous year, more than a fifth of the former total.

Many big telcos have also been moving quickly to automate operations and technical activities. Shankar Arumugavelu, the chief information officer of Verizon, is already eyeing the transition to Level 4 capability described by the TM Forum as a "highly autonomous network" and he evidently believes technology is not the main barrier. "Today, some of the key decisions that are being made by humans are we comfortable letting that go and having the machine make that decision?" he said at a recent press briefing organized by the TM Forum. "I think that is the bridge we have to cross."

The transfer of decision-making responsibilities to AI would stoke obvious ethical concerns and threaten to make humans entirely redundant in this part of the telco business. But Arumugavelu envisages a set-up in which engineers act on the insights and recommendations of the AI. "Work goes to the people rather than people going to the work," he said. "This is the machine I am talking about that is sending and directing work to groups."

Headcount has fallen dramatically at Verizon and other large telcos in the last decade, as data-gathering by Light Reading has illustrated, although job cuts can be attributed in many cases to merger activity, the sale of assets and other, more mundane, efficiency measures. Yet Verizon has been able to grow annual sales by 2% since 2018, despite cutting more than 39,000 jobs or 27% of the total over that period.

A big question, though, is whether job cuts on the technology side will do much to boost profits. Scott Petty, the chief technology officer of Vodafone, thinks not. "That's not a massive driver of opex or costs in the organization," he said at the same TM Forum event, citing energy, leases and maintenance of software and equipment as much bigger expenses. "People is an important cost, but it is not the most important in the cost of a network."

Read more here:
AI that designs and runs networks might not be far off - Light Reading

Seeq Announces Generative AI Capabilities with Seeq AI Assistant – AiThority

New AI capabilities accelerate operational excellence across the industrial enterprise

Seeq, a leader in industrial analytics and AI, unveiled the Seeq AI Assistant, a generative AI (GenAI) resource embedded across its industrial analytics platform. The Seeq AI Assistant provides real-time assistance to users across the enterprise, empowering them to accelerate mastery of the Seeq platform, build advanced analytics, machine learning, and AI skills and knowledge, and accelerate insights to improve decision making in pursuit of operational excellence and sustainability.


In a recent study by Deloitte, 93% of industrial companies believe AI will be a game changer for driving growth and innovation in the industrial sector. The analytical insights required to bolster operational excellence continue encountering roadblocks due to a shortage of skills, siloed capabilities within organizations, and untapped stockpiles of time series data.

Seeq has over a decade of experience working with some of the most recognizable names in the oil & gas, chemicals, pharmaceuticals, and other industrial sectors to remove or mitigate these roadblocks. The Seeq AI Assistant provides organizations with the opportunity to further debottleneck their most precious resource the people at the frontlines of their processes and decisions.

GenAI is a type of artificial intelligence capable of generating new content, such as text, images, and code, in response to prompts entered by a user. GenAI models are trained on existing data to learn patterns that enable the creation of new content. While GenAI is a powerful technology, it isn't innately capable of generating information and guidance applicable within the complexity and context of an industrial production environment.

Seeq is uniquely positioned to drive industrial innovation with GenAI, given the company's expertise in industrial data and its open and extensible analytics platform that was developed to leverage and serve subject matter experts and their enterprise decisions. Seeq provides on-demand access to critical time series data, data contextualization capabilities, and established intellectual property. Utilizing the extensive body of advanced analytics, data science, machine learning, and coding knowledge held in Seeq technical documentation and its knowledge base, Seeq is operationalizing the power of GenAI for its customers. Combining these competencies with prompt engineering curated by the world-class analytics and learning engineers at Seeq, the Seeq AI Assistant generates accurate and actionable suggestions for analytical approaches and techniques, code generation, and more. Seeq also supports multiple providers and LLMs for organizational flexibility.

"With the Seeq AI Assistant, we expect to decrease our process experts' learning curve for advanced analytics and machine learning by 50% or possibly more," said Brian Scallan, Director of Continuous Improvement at Ascend Performance Materials. "For our extensive user base, this translates into immediate enhancements in process quality and yields, significantly elevating efficiency and value across the organization."

"By combining GenAI with advanced industrial analytics, organizations can unlock new levels of efficiency, accuracy, and innovation that deliver measurable business impact," said Dustin Johnson, Chief Technology Officer at Seeq. "Integrating the Seeq AI Assistant across the Seeq platform enables team members across industrial organizations to harness the power of GenAI to drive favorable operational excellence, profitability, workforce upskilling, and sustainability outcomes, and stay ahead in an increasingly competitive landscape."


In short, the Seeq AI Assistant empowers frontline experts in process engineering, data science and operations to rapidly bridge process, analytics and coding knowledge gaps, unlocking workflows and results that were previously time and effort prohibitive or impossible.

"GenAI capabilities are a powerful inclusion in analytics software as a way to democratize AI and machine learning," said Jonathan Lang, Research Director for IDC Industry Operations. "Based on conversations with industrial enterprises, GenAI offers a more natural interface to lower the barriers to data analytics, and Seeq has included features to alleviate one of the top concerns companies have about trust by including explainability to ensure the GenAI shows its work."

Seeq is available worldwide through a global partner network of system integrators, which provides training, services, and resale support for Seeq in over 40 countries, in addition to its global direct sales organization.



See more here:
Seeq Announces Generative AI Capabilities with Seeq AI Assistant - AiThority

Today’s AI Won’t Radically Transform Society, But It’s Already Reshaping Business – The Machine Learning Times

Eric Siegel had already been working in the machine learning world for more than 30 years by the time the rest of the world caught up with him. Siegel's been a machine learning (ML) consultant to Fortune 500 companies, an author, and a former Columbia University professor, and to him the last year or so of AI hype has gotten way out of hand.

Though the world has come to accept AI as our grand technological future, it's often hard to distinguish from classic ML, which has, in fact, been around for decades. ML predicts which ads we see online, it keeps inboxes free of spam, and it powers facial recognition. (Siegel's popular Machine Learning Week conference has been running since 2009.) AI, on the other hand, has lately come to refer to generative AI systems like ChatGPT, some of which are capable of performing humanlike tasks.

But Siegel thinks the term artificial intelligence oversells what today's systems can do. More importantly, in his new book The AI Playbook: Mastering the Rare Art of Machine Learning Deployment, which is due out in February, Siegel makes a more radical argument: that the hype around AI distracts from its now proven ability to carry out powerful, but unsexy tasks. For example, UPS was able to cut 185 million delivery miles and save $350 million annually, in large part by building an ML system to predict package destinations for hundreds of millions of addresses. Not exactly society-shattering, but certainly impactful.

The AI Playbook is an antidote to the overheated rhetoric of all-powerful AI. Whether you call it AI or ML (and yes, the terms get awfully blurry), the book helpfully lays out the key steps to deploying the technology we're now all obsessed with. Fast Company spoke to Siegel about why so many AI projects fail to get off the ground and how to get execs and engineers on the same page. The conversation has been edited for length and clarity.

As someone who's worked in the machine learning industry for decades, how has it been for you personally the last year watching the hype around AI since ChatGPT launched?

It's kind of over the top, right? There's a part of me that totally understands why the AI brand and concept has been so well adopted, and, indeed, as a child, that's what got me into all this in the first place. There is a side of me that I try to reserve for private conversations with friends that's frustrated with the hype and has been for a very long time. That hype just got about 10 or 20 times worse a year ago.

Why do you think the term artificial intelligence is so misleading now?

Everyone talks about that conference at Dartmouth in the 1950s, where they set out to sort of decide how they're going to create AI. [Editor's note: In 1956, leading scientists and philosophers met at the Dartmouth Summer Research Project on Artificial Intelligence. The conference is credited with launching AI as a discipline.] This meeting is almost always reported on and reiterated with reverence.

But, no, I mean, the problem is what they did with the branding and the concept of AI, a problem that still persists to this day. It's mythology that you can anthropomorphize a machine in a plausible way. Now, I don't mean that theoretically, that a machine could never be as all-capable as a human. But it's the idea that you can program a machine to do all the things the human brain or human mind does, which is a much, much, much more unwieldy proposition than people generally take into account.

And they mistake [AI's] progress and improvements on certain tasks, as impressive as they truly are, with progress towards human-level capability. So the attempt is to abstract the word intelligence away from humanity.

Your book focuses on how companies can use this technology in the real world. Whether you call it ML or AI, how can companies get this tech right?

By focusing on truly valuable operational improvements by way of machine learning. We see that focus on concrete value and realistic uses of today's technology. In part, the book is an antidote to the AI hype or a solution to it.

So what the book does is to break it down into a six-step process that I call BizML, the end-to-end practice for running a machine learning project. So that not only is the number-crunching sound, but in the end, it actually deploys and generates a true return to the organization.

To continue reading this article, click here.

Continue reading here:
Today's AI Won't Radically Transform Society, But It's Already Reshaping Business - The Machine Learning Times

Decoding AI Ethics The Spectator – The Spectator

With recent advancements in artificial intelligence (AI) technology, and pop culture teeming with stories of robotic uprising and man versus machine ("I'm sorry Dave, I'm afraid I can't do that"), it may seem as though our developing technology has been making breakthroughs at an alarming pace. From the birth of ChatGPT to Neuralink having its first person successfully receive a brain implant, recent rapid progress has generated conversations about the anxieties surrounding AI and even doomsday predictions.

However, while the high-profile releases in the past few years have seemed nothing short of exponential, the history of technology leading up to the current AI boom is an inextricable component of its current landscape. Associate Teaching Professor of Philosophy Eric Severson contextualizes AI on a continuum of technology, one that includes not only the development of computing but the relationship between humans and tools as a whole. Engaging with that history is a crucial component of understanding AI's role in our world today.

While the uneasiness around the capabilities of AI can be of valid concern, technological advancements have always faced a degree of polarization due to a lack of understanding.

Max Tran, a second-year computer science major, is the president of the Artificial and Intelligent Machine Learning Club (AnIMaL) and emphasized that AI is currently being used as a blanket term, which makes it difficult to differentiate the variance of technology.

"I think the machine learning side is being hidden by the marketing side of AI and generative AI. I think that is where part of the confusion and ambiguity comes from, because we're covering up the actual terms and it's making it harder to figure out what this is," Tran said.

Tran went on to explain that machine learning is related to AI, but that not all types of AI being marketed as such are AI by definition and rather fall into the subcategories of machine learning.

What's the main difference? Machine learning lacks intelligence and is only able to detect patterns in data using math-based algorithms. Tran believes that equating machine learning and AI can be greatly misleading, especially because the criteria for the two are constantly evolving.

Ensuring that perceptions of generative AI are definitionally correct so that users have the tools to properly understand new advancements is critical, but also thinking about its practical functions and the way it will slowly become more integrated into daily lives is another aspect of discussion.

Bryan Kim, a second-year computer science major and event coordinator for AnIMaL, thinks that suddenly having access to the Apple Vision Pro and the AI Pin may feel dystopian to a majority of people, but has the potential to improve accessibility for those that may need more assistance in everyday life.

As proven by our relationship with smartphones, Kim believes that with time we will become more reliant on technology utilizing AI. However, finding a balance between skepticism and receptivity is essential.

"Awareness is huge. Just knowing how it works changes a lot of how you view AI. Have an open mind but also have discernment," Kim said. "Society is going to change and we should expect that, but thinking about implications can help generate conversation."

Although some of the fear surrounding AI can stem from irrational notions, there have been instances where generative AI has done genuine harm, often through perpetuating harmful stereotypes and prejudices based on seemingly neutral prompts.

When the Washington Post requested a depiction of a productive person, the AI generated white men dressed in suits. Yet, when asked to generate an image of a person at social services, it mainly depicted people of color. Similar racially biased images were produced when it was asked to generate images of routine activities and common personality traits.

Severson raised concerns about how tools, including but not limited to AI, exist in the context of their society. Especially when that society maintains socioeconomic inequities or other forms of oppression, those same problems can be internalized and reproduced by the tools themselves.

"When we develop new tools in a sexist society, we should expect that they subtly and invisibly exacerbate the privileges experienced by men. In a white supremacist society, tools that we develop, with or without anyone's intentional effort, will often subtly or directly exacerbate the oppression of people of color," Severson said. "What we need to be aware of every time we make or take up a tool is that we do it in a society that is already bent away from justice. Tools are not neutral."

He compared the phenomenon to the history of medicine, wherein tools were developed among and with a particular demographic of young, college-educated men in minda history that still perpetuates medical discrimination to this day.

"Racialized outcomes in health care, education and criminal justice are really only explainable by systemic preferences that are carried without anyone's direct intention. Racism and sexism do not require intentionality to flourish. They flourish nonetheless," Severson said.

Similarly, if AI were to be continually developed without accounting for how it responds to and impacts existing social issues, it would continually perpetuate those unexamined problems. Severson emphasized that AI does not simply help its users learn the information they request, but shapes the way they learn and interact with the world epistemically.

Whether it be in classrooms or club meetings on the Seattle University campus, or the relationship between the self and society on a large scale, raising questions about the ethics of AI remains at the forefront of the current conversation.

Original post:
Decoding AI Ethics The Spectator - The Spectator

The future of artificial intelligence in trucking – CCJ

Jason Cannon: CCJ's 10-44 is brought to you by Chevron Delo heavy duty diesel engine oil. Now there's even more reasons to choose Delo.

Matt Cole: Artificial intelligence has come a long way in trucking to help improve efficiencies. How much more can it help?

Jason Cannon: You're watching CCJ's 10-44, a weekly episode that brings you the latest trucking industry news and updates from the editors of CCJ. Don't forget to subscribe and hit the bell for notifications so you'll never miss an installment of 10-44. Hey, everybody. Welcome back. I'm Jason Cannon and my co-host on the other side is Matt Cole. AI is not a new idea in trucking. It's been around for more than a decade, and over that time, its capabilities have only grown.

Matt Cole: AI is used in trucking to help improve safety, efficiency, performance and more, by helping people do their jobs better in many cases. Joining us this week is Yoav Amiel, chief information officer at RXO, who talks about the advancements in AI within trucking and where it might eventually lead.

Yoav Amiel: RXO is an asset light transportation company and it's a technology group. We build all the technologies that help the business grow over time. One of our biggest platforms that we have to drive transportation is called RXO Connect, and this platform in many ways sits on top of all the lines of business that we are serving, brokerage at the front, we have managed transportation, last mile and freight forwarding.

Now, this platform was built from the ground up, meaning that we had the luxury of building things in a microservices approach allowing us to build this innovation, and I know that today, we're going to focus a lot around AI and machine learning, and I think on that front, we've been practicing AI for more than a decade now. This is not new to us, but of course the more AI is evolving over time, we are progressing with that and making sure that we take advantage of all the new capabilities that are available for us.

Jason Cannon: A lot of times when people think about AI and automation, they think it's a threat to their job, but Yoav says it really should be viewed as a supplement to help us do our jobs more efficiently.

Yoav Amiel: AI, it's a science of making machines do things that would require let's call it a human-like intelligence. There are a lot of areas, techniques within the AI. Think about machine learning, deep learning, neural networks. A lot of progress is happening there, but it's important to know that it's not to replace the human intelligence. In many ways, it's to amplify our creativity and ability to complete tasks, and I look at it more of an augmented intelligence, for us to be able to be better and be able to spend our time in the most important task that we need to do.

Matt Cole: RXO recently launched an AI driven system of its own to streamline the check-in process for trucks at warehouses and distribution centers. Yoav tells us how it works after a word from 10-44 sponsor, Chevron Lubricants.

Speaker 4: These past few years have been less than easy. We've encountered challenges we never imagined we'd ever have to deal with, from makeshift home offices and video meetings to global supply chain uncertainty, price instability, market disruptions, and everything in between. Delivering the level of services and products our customers had come to expect was difficult for all of us. We can't change what's behind us, but we can definitely learn from it. We can adapt, evolve, and take steps to reset our thinking, adapt our strategies, and restore your trust in us to better meet your needs, now and in the future. That change begins today.

Today, we break with convention and introduce a rebalance line of Delo heavy duty engine oils. We've reduced our product line from four categories to two. Consolidated and simplified, this lineup removes complexity from the manufacturing processes, enhancing price stability and supply chain reliability so you can trust you'll have the premium products you need to keep your business always moving forward. Our break with convention optimizes the Delo lineup to allow you to provide your customers with the best synthetic blend and synthetic heavy duty engine oils in the market, fully available at prices you can rely on. It's your assurance that you'll be well positioned to be their trusted source for proven engine protection that keeps equipment on the job, giving your customers even more reasons to choose Delo.

Yoav Amiel: When we build technology, we make sure that we build it to drive a business lever. We don't just build technology for the sake of technology, and we make sure that it ties to either productivity or volume or margins overall. In this case, this is a productivity type of an initiative, and we saw that in our big warehouses and yards there are gate slowdowns when trucks are coming in. So we combine the video already coming from the gate, the CCTV that we have there, and apply the machine learning and AI capabilities to be able to extract the information about the truck and help the person at the gate to be more effective.

Instead of going to the truck and writing down the number of the truck and the driver details on a piece of paper, then going to the computer and typing that in, that actually allowed that person to be much more effective, and the moment a truck is coming and there is already an appointment in the system, it extracts the relevant information and is able to match it to that appointment, making the whole process of checking in and getting into the yard much, much more efficient.

From our measurements, on average, they reduced the wait time at the gate by about 30%. And we get a lot of positive reactions from both, of course, the carriers and the operations at the gate, and we don't want to stop there. There are a lot of opportunities to gain even more efficiencies within the yard, leveraging drones and being able to understand what is going on instead of having a human trying to go through that. And of course when a human is involved, sometimes we make errors, and the moment you make an error, that creates more delays or challenges for the process. So leveraging these types of techniques not only reduces the wait time, it reduces the error rate as well.

Jason Cannon: AI and trucking has evolved considerably over the last decade, and Yoav says there are still a lot of gains to be made.

Yoav Amiel: It started in the past by object recognition, image recognition, then it evolved into insights and system that came with recommendation like matching loads with carriers. And in today's world, I'm excited about things like task completion. I can just talk to you like let's talk about a futuristic state where a carrier can talk to a machine or type or whatever interface they want to interact with the machine, and say, "Book me a load for this week. I want to leave my facility Monday morning and I want to return by Friday, 5:00 PM." I don't know, maybe there is a birthday in the family or I want just to be there for dinner, and I want to spend one night in Chicago, and now just do it for me.

And the system will automatically find maybe multiple loads that can fit these requirements and of course the truck type and their certification that this specific driver have, and minimize the empty miles and maximize the revenue for the driver, and the driver does not or the carrier does not need to do anything. The system will automatically book the load, assign it to them, and this is the greatness. The way I look at AI and machine learning, it's a win-win type of a thing because everybody gets something out of it and we could focus most of our time in the things that we bring value as human beings to the surface.

Matt Cole: Yoav says the biggest benefits trucking has seen as a result of AI are the efficiencies gained from automation, load matching and more.

Yoav Amiel: Drilling down into the transportation industry, I think there are a lot of benefits. Of course, the first thing that comes to mind, and the example we just gave with the yard, is efficiency and automation, but there are a lot of areas in the transportation industry like load matching, by which I mean finding the right loads for the right carriers, route optimization, even warehouse workforce planning, and even document processing. With today's large language models, you can extract information from documents even if the document is not structured. We are actually using that at RXO as well. When a shipper asks us for a quote via email in an unstructured format, we use technology to extract the request and even automatically send a quote to that shipper. So that's around automation.

Another benefit, of course, is that it drives cost reduction: fuel, time on tasks, resource planning. As I mentioned, if you talk about warehouses, you don't want to find yourself with a bigger workforce than you really need, so the moment you plan right, you can save a lot of costs.

The other benefit around AI and machine learning is decision making, and I touched on that a little bit earlier. That ability to analyze massive, massive data sets: I'm not saying a human being cannot do that, but it would take a long time. Helping a person make a data-driven decision is a huge benefit of leveraging these machines. Overall, from a customer service standpoint, think about personalization. The systems today are personalized: one user who logs into a system sees a different flavor of the system according to their past behavior, attributes, or preferences. And there is the ability to provide 24/7 support. Gen AI is a big hype today, and these bots that can help you with task completion and provide self-service, a 24/7 self-service capability, are a huge, huge benefit that we gain from these types of engines.

Jason Cannon: While there are plenty of benefits to using AI, there are also drawbacks, including security, over-reliance, and a lot more.

Yoav Amiel: The drawbacks are, I would say, very similar to any other area, but I can call out a few. Of course, the first one is security. The moment you have AI engines and machine learning, you have a lot of data concentrated in one place, and then you start having data privacy concerns and, overall, bad actors that may try to misuse this information. In addition, what we see today is that the bad actors are becoming more sophisticated. They themselves are leveraging AI and machine learning to try to trick users in order to gain access to specific areas. So we need to be smarter and smarter over time to make sure that, from a security perspective and a privacy perspective, we are protecting ourselves and protecting the data.

Another area of AI that is a concern, or that we need to pay attention to, is fairness, or the bias that could be embedded within the data. A lot of engines rely on data from the internet, and that data could be biased; the engines learn from the data you feed them and can actually build bias into their recommendations. So we need to build a mechanism, and maybe it's funny to mention that it could itself be a machine learning mechanism, that looks at the results of the AI engine and flags areas where things seem biased towards one specific group or one specific type of action.

There is another area I mentioned, around explainability. In today's world, we create a dependency on machines, so first, of course, we need to make sure we have the processes and mechanisms to be able to proceed even if, for some reason, the computer or the system stops. Think about autonomous trucks and areas like that: what happens if the computer, for some reason, is not functioning? You need the ability to know how to drive the vehicle, or to take control remotely and address the situation, maybe by pulling over to the side of the road or something like that.

But with the black-box type of approach, we create a dependency: whatever the system recommends or serves to us, we consider it probably the best thing, or we consider it the truth. Today there is an effort to build explainable AI, called XAI, where the engines provide, together with the results, the reasoning behind them. So you could say, oh, you recommend this thing, but these are the reasons why: because you did that in the past, because other carriers like you optimized for this specific route. Having that explainability will allow us not just to understand what is going on, but even to flag anything that is not necessarily relying on the right information.

Matt Cole: Like it has over the last decade, AI will continue to grow over the next decade and beyond. What will that look like in trucking?

Yoav Amiel: First, of course, there are use cases that we are not even aware of and cannot even dream of, but among the things we can see in front of us, one is autonomous, self-driving trucks and vehicles, and even flying vehicles. You could think about drones delivering packages or even taking passengers with vertical takeoff and landing. In addition, there is what is referred to as connected infrastructure: the moment we get to a point where everything on the road is connected together. If you think about traffic lights and all the signals coming from the roads themselves, you could build a much more efficient transportation platform where vehicles drive on this infrastructure much more smoothly, with decision making and dynamic adjustment.

Thinking about traffic lights or anything like that, you could dynamically adjust them to the trucks or vehicles on the road and even minimize accidents. From a safety perspective, connected infrastructure will have a huge, huge impact on transportation overall. I mentioned connected infrastructure, but you could think about the inside of the vehicle as well: distracted drivers or drunk drivers. There is even monitoring the vehicle's health, being able to understand that something is about to break and provide enough lead time for alerts, so you can predict maintenance, which saves money and makes sure that all the vehicles on the road are in a safe state, again minimizing accidents or unplanned activities.

Jason Cannon: That's it for this week's 10-44. You can read more on ccjdigital.com. While you're there, sign up for our newsletter and stay up to date on the latest in trucking industry news and trends. If you have any questions or feedback, please let us know in the comments below. Don't forget to subscribe and hit the bell for notifications so you can catch us again next week.

See the original post:
The future of artificial intelligence in trucking - CCJ

AI-powered platform could help law enforcement get ahead of designer drugs – University of Alberta

An online platform powered by deep learning can predict the makeup of new psychoactive substances to help law enforcement in the fight against dangerous drugs.

Called NPS-MS, the platform houses a method that predicts novel psychoactive substances using deep learning, a type of machine learning in the field of artificial intelligence that involves training computing algorithms using large data sets to uncover complex relationships and create predictive models.

"Illegal drugs are a small group of very similar-looking structures," says Fei Wang, a doctoral student in the Department of Computing Science at the University of Alberta and first author on the international study. "The nature of psychoactive substances is that their structures are constantly evolving."

More than 1,000 such substances have been synthesized in the past decade, designed to mimic the effects of drugs like cocaine and methamphetamine while skirting laws that don't yet account for new chemical analogues.

"We hope this program will reduce the flow of illegal drugs that hurt people and society," says study co-author Russ Greiner, computing science professor and Canada CIFAR AI Chair at the Alberta Machine Intelligence Institute (Amii).

Laboratory work to identify novel psychoactive substances requires expensive reference data and labour-intensive testing to produce spectrographs, the chemical information references that can be used to confirm an unknown substance.

Wang's research began with programming machine learning tools to aid in studying human metabolites and small molecules. After adapting a machine learning method to identify novel psychoactive substances, NPS-MS was trained using results from DarkNPS, a generative model built at the U of A to predict the spectrograph of potential NPS compounds.

After researchers in Denmark noticed Wang's computing technology might apply to identifying novel psychoactive substances, NPS-MS successfully identified a variant of phencyclidine, more commonly known as PCP, without the use of any reference standards.

The NPS-MS algorithm uses a data set of 1,872 spectrographs to cross-reference 624 new psychoactive substances.

"With machine learning, there are no limitations to how many compounds we can collect for a data set," says Wang.

Wang says about 40,000 molecules have high-resolution spectrometry data available for forensic teams to cross-reference unknown substances, noting that databases containing more of the around 100 million known chemical substances can be expensive for labs to obtain.

NPS-MS will greatly reduce the amount of work involved for labs.

The research was supported by funding from Genome Canada, the Natural Sciences and Engineering Research Council of Canada and Alberta Machine Intelligence Institute with computational resources from the Digital Research Alliance of Canada.

The study, Deep Learning-Enabled MS/MS Spectrum Prediction Facilitates Automated Identification of Novel Psychoactive Substances, was published in Analytical Chemistry.

View original post here:
AI-powered platform could help law enforcement get ahead of designer drugs - University of Alberta

Genomic prediction in multi-environment trials in maize using statistical and machine learning methods | Scientific … – Nature.com

Phenotypic data

The data are composed of 265 single-cross hybrids from the maize breeding program of Embrapa Maize and Sorghum evaluated in eight combinations of trials/locations/years under irrigated (WW) and water-stress (WS) conditions at two locations in Brazil (Janaúba, Minas Gerais and Teresina, Piauí) over two years (2010 and 2011). The hybrids were obtained from crosses between 188 inbred lines and two testers. The inbred lines belong to heterotic groups: dent (85 inbred lines), flint (86 inbred lines), and an additional group, referred to as group C (17 inbred lines), which is unrelated to the dent and flint origins. The two testers are inbred lines belonging to the flint (L3) and dent (L228-3) groups. Among the inbred lines, 120 were crossed with both testers, 52 were crossed with the L228-3 tester only, and 16 lines were crossed with the L3 tester only. Silva et al. (2020) evaluated the genetic diversity and heterotic groups in the same database. These authors showed the existence of subgroups within each heterotic group. Therefore, because these groups were not genetically well defined and the Embrapa maize breeding program was at an early stage, the effects of allelic substitution in both groups were assumed to be the same. More details on the experimental design and procedures can be found in Dias et al.13,30.

The experiment originally included 308 entries, but hybrids that were not present in all environments were removed in order to evaluate genomic prediction within each environment, resulting in a total of 265 hybrids for analysis. Each trial consisted of 308 maize single-cross hybrids, randomly divided into six sets: sets 1 to 3 for crosses with L3 (61, 61, and 14 hybrids each), and sets 4 to 6 for crosses with L228-3 (80, 77, and 15 hybrids each). Four checks (commercial maize cultivars) were included in each set, and the experiment was designed in completely randomized blocks. Between trials, hybrids within each set remained the same, but hybrids and checks were randomly allocated to groups of plots within each set. This allocation varied between replicates of sets and between trials. The WS trials had three replications, except for the set containing 15 hybrids and the trials evaluated in 2010, which had two replications. All WW trials, except for the trial in 2011, had two replicates.

Two agronomic traits related to drought tolerance were analyzed: grain yield (GY) and female flowering time (FFT). GY was determined by weighing all grains in each plot, adjusted to 13% grain moisture, and converted to tons per hectare (t/ha), accounting for differences in plot sizes across trials. FFT was measured as the number of days from sowing until the stigmas appeared in 50% of the plants. A summary of means, standard deviations, and ranges for both evaluated traits is available in Table 1.

To conduct the analyses, hybrids considered outliers for the GY and FFT traits were removed (i.e., hybrids with phenotypic values more than 1.5 times the interquartile range above the third quartile or below the first quartile). The variation in predictive ability among T2, T1, and T0 hybrids is widely recognized31. However, the primary aim of our study was to compare different prediction methodologies in MET assays. In this study, there were 240 T2 hybrids and 68 T1 hybrids, where T2 hybrids had both parents evaluated in other hybrid combinations, while T1 hybrids were single-cross hybrids sharing one parent with the tested hybrids. Given the realistic nature of our scenario, we have a limited and imbalanced distribution of these hybrid groups, making a fair comparison challenging. Consequently, we opted to construct a training set comprising T2 and T0 hybrids.
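The 1.5 × IQR outlier rule described above can be expressed in a few lines. The following is a minimal Python sketch, not the study's actual pipeline; the column name "GY" and the file name are illustrative assumptions.

```python
import pandas as pd

def drop_iqr_outliers(df: pd.DataFrame, trait: str) -> pd.DataFrame:
    """Remove hybrids whose phenotype lies more than 1.5*IQR
    beyond the first or third quartile for the given trait."""
    q1, q3 = df[trait].quantile([0.25, 0.75])
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return df[(df[trait] >= lower) & (df[trait] <= upper)]

# Illustrative usage with hypothetical column names:
# pheno = pd.read_csv("phenotypes.csv")        # columns: hybrid, GY, FFT, ...
# pheno_gy = drop_iqr_outliers(pheno, "GY")
```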

To correct the phenotypic values for experimental design effects, each trial (WW and WS) in each environment was analyzed independently to obtain the Best Linear Unbiased Estimates (eBLUEs) for each hybrid for the two traits evaluated. The estimates were obtained based on the following model:

$$\mathbf{y} = 1\mu + \mathbf{X}_{1}\mathbf{r} + \mathbf{X}_{2}\mathbf{s} + \mathbf{X}_{3}\mathbf{h} + \mathbf{e} \quad (1)$$

where $\mathbf{y}\,(n \times 1)$ is the phenotype vector for $f$ replicates and $t$ sets of $p$ hybrids, and $n$ is the number of observations; $\mu$ is the mean; $\mathbf{r}\,(f \times 1)$ is the fixed-effect vector of replicates; $\mathbf{s}\,(t \times 1)$ is the fixed-effect vector of sets; $\mathbf{h}\,(p \times 1)$ is the fixed-effect vector of hybrids; and $\mathbf{e}\,(n \times 1)$ is the residual vector, with $\mathbf{e} \sim MVN(0, \mathbf{I}\sigma_{e}^{2})$, where $\mathbf{I}$ is an identity matrix of corresponding order and $\sigma_{e}^{2}$ is the residual variance. $\mathbf{X}_{1}\,(n \times f)$, $\mathbf{X}_{2}\,(n \times t)$, and $\mathbf{X}_{3}\,(n \times p)$ are the incidence matrices for their respective effects. The eBLUEs of each environment were used in further analyses.
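As a rough illustration of fitting model (1) within a single trial, the sketch below uses an ordinary least squares fit with replicate, set, and hybrid as fixed factors. The software used by the authors for this step is not stated here, and the column names are assumptions; the returned coefficients are hybrid effects relative to a reference hybrid, which would still need to be combined with the intercept to obtain adjusted means.

```python
import pandas as pd
import statsmodels.formula.api as smf

def hybrid_effects(trial_df: pd.DataFrame, trait: str) -> pd.Series:
    """Fit y = mu + replicate + set + hybrid (all fixed, as in Eq. 1)
    for one trial and return the estimated hybrid effects."""
    model = smf.ols(f"{trait} ~ C(replicate) + C(set) + C(hybrid)",
                    data=trial_df).fit()
    # Keep only the hybrid coefficients (relative to the reference level).
    return model.params.filter(like="C(hybrid)")

# Illustrative usage (hypothetical columns: replicate, set, hybrid, GY):
# gy_effects = hybrid_effects(trial_df, "GY")
```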

A total of 57,294 single nucleotide polymorphism (SNP) markers were obtained from the 188 inbred lines and two testers used as parents of the 265 single-cross hybrids. The genotyping-by-sequencing (GBS) strategy is detailed in Dias et al.13. For quality control, SNPs were discarded if the minor allele frequency was smaller than 5%, more than 20% of genotypes were missing, and/or more than 5% of genotypes were heterozygous. After filtering, missing data were imputed using NPUTE. Then, for each SNP, the genotypes of the hybrids were inferred from the genotypes of their parents (inbred line and tester). The number of SNPs per chromosome ranged from 3,121 (chromosome 10) to 7,705 (chromosome 1), totaling 47,127 markers.
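The three quality-control filters (MAF < 5%, > 20% missing, > 5% heterozygous) can be applied directly to a 0/1/2-coded genotype matrix. The following is a minimal NumPy sketch under that coding assumption; it is not the pipeline used in the study, and imputation with NPUTE is a separate step not shown.

```python
import numpy as np

def snp_qc_mask(geno: np.ndarray) -> np.ndarray:
    """Boolean mask over SNP columns passing the filters described above.
    geno: lines x SNPs matrix coded 0/1/2, with np.nan for missing calls."""
    obs = ~np.isnan(geno)
    n_obs = obs.sum(axis=0)
    missing = 1.0 - n_obs / geno.shape[0]                    # fraction missing
    het = np.where(obs, geno == 1, 0).sum(axis=0) / n_obs    # heterozygous fraction
    p = np.nanmean(geno, axis=0) / 2.0                       # allele frequency
    maf = np.minimum(p, 1.0 - p)                             # minor allele frequency
    return (maf >= 0.05) & (missing <= 0.20) & (het <= 0.05)

# keep = snp_qc_mask(geno)      # geno from the GBS pipeline
# geno_qc = geno[:, keep]
```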

The additive and dominance genomic relationship matrices were constructed32 based on information from the SNPs using the package AGHmatrix33, following VanRaden34 and Vitezica et al., respectively.
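The study built the A and D matrices with the R package AGHmatrix; purely as a point of comparison, the sketch below is a minimal NumPy version of VanRaden's additive relationship matrix under the assumption of a complete 0/1/2-coded genotype matrix. The dominance matrix of Vitezica et al. is not shown.

```python
import numpy as np

def vanraden_G(geno: np.ndarray) -> np.ndarray:
    """VanRaden additive genomic relationship matrix.
    geno: individuals x SNPs, coded 0/1/2, no missing values."""
    p = geno.mean(axis=0) / 2.0            # allele frequencies
    Z = geno - 2.0 * p                     # centred marker matrix
    denom = 2.0 * np.sum(p * (1.0 - p))    # scaling factor
    return Z @ Z.T / denom

# A = vanraden_G(hybrid_geno)   # hybrid genotypes inferred from the parents
```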

Genomic predictions were performed using the Genomic Best Linear Unbiased Prediction (GBLUP) method with the ASReml package v. 436. Two groups were considered: the first comprised the four environments under WW conditions, and the second the four environments under WS conditions. The linear model is described below:

$$\bar{\mathbf{y}} = \mu 1 + \mathbf{X}\mathbf{b} + \mathbf{Z}_{1}\mathbf{u}_{a} + \mathbf{Z}_{2}\mathbf{u}_{d} + \mathbf{e} \quad (2)$$

where $\bar{\mathbf{y}}\,(pq \times 1)$ is the vector of eBLUEs previously estimated for each environment, with $p$ hybrids and $q$ environments; $\mu$ is the mean; $\mathbf{b}\,(q \times 1)$ is the vector of environmental effects (fixed); $\mathbf{u}_{a}\,(pq \times 1)$ is the vector of individual additive genetic values nested within environments (random), with $\mathbf{u}_{a} \sim MVN\left(0, \left[\mathbf{I}_{q}\sigma_{u_a}^{2} + \rho_{a}(\mathbf{J}_{q} - \mathbf{I}_{q})\right] \otimes \mathbf{A}\right)$, where $\mathbf{A}$ is the genomic relationship matrix between individuals for additive effects, $\rho_{a}$ is the additive genetic correlation coefficient between environments, $\mathbf{I}_{q}\,(q \times q)$ is an identity matrix, $\mathbf{J}_{q}\,(q \times q)$ is a matrix of ones, and $\otimes$ denotes the Kronecker product; $\mathbf{u}_{d}\,(pq \times 1)$ is the vector of individual dominance genetic values nested within environments (random), with $\mathbf{u}_{d} \sim MVN\left(0, \left[\mathbf{I}_{q}\sigma_{u_d}^{2} + \rho_{d}(\mathbf{J}_{q} - \mathbf{I}_{q})\right] \otimes \mathbf{D}\right)$, where $\mathbf{D}$ is the genomic relationship matrix between individuals for dominance effects and $\rho_{d}$ is the dominance correlation coefficient between environments; and $\mathbf{e}\,(pq \times 1)$ is the vector of random residuals, with $\mathbf{e} \sim MVN(0, \mathbf{I}\sigma_{e}^{2})$. $\mathbf{X}\,(pq \times q)$, $\mathbf{Z}_{1}\,(pq \times pq)$, and $\mathbf{Z}_{2}\,(pq \times pq)$ are the incidence matrices for their respective effects, and $1\,(pq \times 1)$ is a vector of ones. The (co)variance components were obtained using the residual maximum likelihood (REML) method37.

Two alternative models were also used. The first, for genomic prediction, retained only additive effects by removing $\mathbf{u}_{d}$ from Eq. (2). The second model was used to estimate the genetic parameters within each environment separately.
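The study fitted model (2) with ASReml, including dominance and the Kronecker-structured covariance across environments. As a much-simplified illustration only, the sketch below solves Henderson's mixed model equations for a single-environment, additive-only GBLUP with variance components assumed known; it is not the authors' implementation.

```python
import numpy as np

def gblup_single_env(y, X, Z, A, sigma2_a, sigma2_e):
    """Solve Henderson's mixed model equations for
    y = X b + Z u_a + e, with u_a ~ N(0, A*sigma2_a) and e ~ N(0, I*sigma2_e).
    Returns the fixed-effect estimates b_hat and the GEBVs u_hat."""
    lam = sigma2_e / sigma2_a
    Ainv = np.linalg.inv(A + 1e-6 * np.eye(A.shape[0]))   # small jitter for stability
    lhs = np.vstack([
        np.hstack([X.T @ X, X.T @ Z]),
        np.hstack([Z.T @ X, Z.T @ Z + lam * Ainv]),
    ])
    rhs = np.concatenate([X.T @ y, Z.T @ y])
    sol = np.linalg.solve(lhs, rhs)
    n_fixed = X.shape[1]
    return sol[:n_fixed], sol[n_fixed:]

# b_hat, gebv = gblup_single_env(eblues, X_env, Z_hyb, A, sigma2_a, sigma2_e)
```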

The significance of random effects was tested using the Likelihood Ratio Test (LRT)38, given by:

$$LRT = 2\left(\log L_{c} - \log L_{r}\right) \quad (3)$$

where $\log L_{c}$ is the logarithm of the likelihood function of the complete model (with all effects included), and $\log L_{r}$ is the logarithm of the restricted likelihood function of the reduced model (without the effect under test). Effect significance was tested by comparing the LRT statistic to the chi-square ($\chi^2$) distribution with one degree of freedom at a 5% significance level39.
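The LRT of Eq. (3) and its chi-square p-value are straightforward to compute once the two log-likelihoods are available; a minimal SciPy sketch follows, with the numeric log-likelihoods purely illustrative.

```python
from scipy.stats import chi2

def likelihood_ratio_test(logl_complete: float, logl_reduced: float):
    """Eq. (3): LRT = 2*(logL_c - logL_r), compared against a chi-square
    distribution with one degree of freedom."""
    lrt = 2.0 * (logl_complete - logl_reduced)
    p_value = chi2.sf(lrt, df=1)
    return lrt, p_value

# lrt, p = likelihood_ratio_test(-1023.4, -1027.9)   # illustrative values
# significant = p < 0.05                             # 5% significance level
```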

The narrow-sense heritability ($h^{2}$), the proportion of variance explained by dominance effects ($d^{2}$), and the broad-sense heritability ($H^{2}$) for each trait were estimated following Falconer and Mackay (1996)35.

As in the previous analyses, the trials were divided between WW and WS conditions, and the potential of regression trees (RT) was explored using three algorithms: bagging, random forest, and boosting22. Bagging (Bag) is a methodology that aims to reduce the variance of regression trees22. It consists of drawing $D$ bootstrap samples (sampling with replacement), fitting $D$ models $\hat{f}^{1}(x), \hat{f}^{2}(x), \ldots, \hat{f}^{D}(x)$, and averaging the resulting predictions:

$$\hat{f}_{avg}(x) = \frac{1}{D}\sum_{d=1}^{D}\hat{f}^{d}(x) \quad (4)$$

This decreases the variability of the individual decision trees. The number of trees used in Bag is not a parameter that leads to overfitting; in practice, trees are added until the error stabilizes22. The number of trees sampled for Bag was set at 500.

Random forest (RF) was proposed by Ho40 and is an improvement on Bag that avoids highly correlated trees and improves accuracy in the selection of individuals. RF changes only the number of predictor variables used in each split: each time a split in a tree is considered, a random sample of $m$ variables is chosen as candidates from the complete set of $p$ variables. Hastie et al.21 suggest using $m = p/3$ predictor variables at each partition for regression trees. The number of trees for the RF was set at 500.

Boosting uses regression trees fitted sequentially, each adjusting the residuals of the previous model. The residual is updated with each tree that is grown, and the final prediction combines a large number of trees, such that:

$$\hat{f}(x) = \sum_{b=1}^{B}\lambda\,\hat{f}^{b}(x) \quad (5)$$

The function $\hat{f}(\cdot)$ refers to the final combination of the sequentially fitted trees, and $\lambda$ is the shrinkage parameter that controls the learning rate of the method. In addition, the method requires choosing the number of splits in each tree; this parameter controls the complexity of the boosted model and is known as the depth. For boosting, the number of trees sampled was 250, with a learning rate of 0.1 and a depth of 3.
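The study fitted these ensembles with the R packages randomForest and gbm (see below). Purely as an illustration of the stated settings (500 trees for bagging and RF, $m = p/3$ for RF; 250 trees, learning rate 0.1, depth 3 for boosting), here is a hedged scikit-learn equivalent; the gradient-boosting implementation differs in detail from gbm, so this is a sketch rather than a reproduction.

```python
from sklearn.ensemble import (BaggingRegressor, RandomForestRegressor,
                              GradientBoostingRegressor)

def build_tree_ensembles(n_features: int):
    """Regression-tree ensembles with the hyperparameters reported in the text."""
    bag = BaggingRegressor(n_estimators=500)                        # bagged trees
    rf = RandomForestRegressor(n_estimators=500,
                               max_features=max(1, n_features // 3))  # m = p/3
    boost = GradientBoostingRegressor(n_estimators=250,
                                      learning_rate=0.1,             # shrinkage
                                      max_depth=3)                    # depth
    return {"bagging": bag, "random_forest": rf, "boosting": boost}

# models = build_tree_ensembles(X_train.shape[1])
# for name, model in models.items():
#     model.fit(X_train, y_train)
```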

To perform hybrid prediction for each environment based on the MET dataset, we propose incorporating the location and year in which the experiments were carried out as factors in the data input file, together with the SNP markers, as predictors in the machine learning methodologies. The eBLUEs previously estimated by Eq. (1) were used as the response variable.
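One simple way to encode location and year as factors alongside the SNP markers is one-hot encoding. The pandas sketch below is an assumption about the data layout (one row per hybrid-environment record, SNPs indexed by hybrid), not the authors' input-file format.

```python
import pandas as pd

def build_features(snps: pd.DataFrame, records: pd.DataFrame) -> pd.DataFrame:
    """Combine SNP markers with one-hot-encoded location and year factors.
    records: columns 'hybrid', 'location', 'year' (one row per record);
    snps: hybrids x markers (0/1/2), indexed by hybrid identifier."""
    factors = pd.get_dummies(records[["location", "year"]].astype(str))
    X = pd.concat([records[["hybrid"]], factors], axis=1)
    X = X.merge(snps, left_on="hybrid", right_index=True, how="inner")
    return X.drop(columns="hybrid")

# X = build_features(snp_matrix, env_table)   # response: eBLUEs from Eq. (1)
```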

For the construction of the bagging and random forest models, the randomForest function from the randomForest package41 was used. For boosting, the gbm function from the gbm package42 was used. All analyses were implemented in the R software43.

Genomic predictions were carried out following Burgueño et al.16, considering two different prediction problems, CV1 and CV2, which simulate two possible scenarios a breeder can face. In CV1, the ability of the algorithms to predict the performance of hybrids that have not yet been evaluated in any field trial was evaluated. Thus, predictions derived from the CV1 scenario are entirely based on phenotypic and genotypic records from other related hybrids. In CV2, the ability of the algorithms to predict the performance of hybrids using data collected in other environments was evaluated. It simulates the prediction problem found in incomplete MET trials. Here, information from related individuals is used, and the prediction can benefit from genetic relationships between hybrids and correlations between environments. Within the CV2 scenario, two different situations of data imbalance were evaluated. In the first, called CV2 (50%), the tested hybrids were not present in half of the environments, while in the second, called CV2 (25%), the tested hybrids were not present in only 25% of the environments. Table 2 provides a hypothetical representation of this CV1, CV2 (50%), and CV2 (25%) validation scheme.

To separate the training and validation sets, a k-fold procedure was used with $k = 5$. The set of 265 hybrids was divided into five groups, with 80% of the hybrids used as the training population and the remaining 20% as the validation population. The hybrids were separated into sets proportionally containing all the crosses performed (Dent × Dent, Dent × Flint, Flint × Flint, C × Dent, C × Flint). The cross-validation process was performed separately for each trait, condition (WS or WW), and scenario (CV1, CV2-50%, and CV2-25%) and was repeated five times to assess the predictive ability of the analyses.
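As a rough sketch of the CV1 scheme, the code below holds out 20% of the hybrids across all environments in each fold, fits a scikit-learn-style model on the rest, and scores each fold with the Pearson correlation used as the predictive-ability measure in the next paragraph. CV2 would instead mask each held-out hybrid in only 50% or 25% of environments; that masking, and the proportional allocation by cross type, are not reproduced here.

```python
import numpy as np
from sklearn.model_selection import KFold
from scipy.stats import pearsonr

def cv1_predictive_ability(X, y, hybrids, model, n_splits=5, n_repeats=5):
    """CV1: hold out 20% of hybrids in every environment, fit on the rest,
    and score by the Pearson correlation between observed eBLUEs and
    predictions. Returns the mean predictive ability over folds/repeats."""
    unique_hyb = np.unique(hybrids)
    scores = []
    for rep in range(n_repeats):
        kf = KFold(n_splits=n_splits, shuffle=True, random_state=rep)
        for _, test_idx in kf.split(unique_hyb):
            held_out = set(unique_hyb[test_idx])
            test_mask = np.array([h in held_out for h in hybrids])
            model.fit(X[~test_mask], y[~test_mask])
            preds = model.predict(X[test_mask])
            scores.append(pearsonr(y[test_mask], preds)[0])
    return float(np.mean(scores))
```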

The predictive ability within each environment for each condition (WS and WW) was estimated by the Pearson correlation coefficient44 between the corrected phenotypic values (eBLUEs) from Eq. (1) for each environment and the GEBVs predicted by each fitted method.

The authors confirm that all methods described in the methods section were carried out in accordance with relevant guidelines. The authors also confirm that the handling of the plant materials used in the study complies with relevant institutional, national, and international guidelines and legislation.

The authors confirm that the appropriate permissions and/or licenses for the collection of plant or seed specimens were obtained.

Read more:
Genomic prediction in multi-environment trials in maize using statistical and machine learning methods | Scientific ... - Nature.com

New study uses machine learning to bridge the reality gap in quantum devices – University of Oxford

A study led by the University of Oxford has used the power of machine learning to overcome a key challenge affecting quantum devices. For the first time, the findings reveal a way to close the reality gap: the difference between predicted and observed behaviour from quantum devices. The results have been published in Physical Review X.

Functional variability is presumed to be caused by nanoscale imperfections in the materials that quantum devices are made from. Since there is no way to measure these directly, this internal disorder cannot be captured in simulations, leading to the gap in predicted and observed outcomes.

To address this, the research group used a physics-informed machine learning approach to infer these disorder characteristics indirectly. This was based on how the internal disorder affected the flow of electrons through the device.

Lead researcher Associate Professor Natalia Ares (Department of Engineering Science, University of Oxford) said: "As an analogy, when we play crazy golf, the ball may enter a tunnel and exit with a speed or direction that doesn't match our predictions. But with a few more shots, a crazy golf simulator, and some machine learning, we might get better at predicting the ball's movements and narrow the reality gap."

Associate Professor Ares added: "In the crazy golf analogy, it would be equivalent to placing a series of sensors along the tunnel, so that we could take measurements of the ball's speed at different points. Although we still can't see inside the tunnel, we can use the data to inform better predictions of how the ball will behave when we take the shot."

Not only did the new model find suitable internal disorder profiles to describe the measured current values, it was also able to accurately predict voltage settings required for specific device operating regimes.

Co-author David Craig, a PhD student at the Department of Materials, University of Oxford, added, "Similar to how we cannot observe black holes directly but we infer their presence from their effect on surrounding matter, we have used simple measurements as a proxy for the internal variability of nanoscale quantum devices. Although the real device still has greater complexity than the model can capture, our study has demonstrated the utility of using physics-aware machine learning to narrow the reality gap."

The study 'Bridging the reality gap in quantum devices with physics-aware machine learning' has been published in Physical Review X.

Visit link:
New study uses machine learning to bridge the reality gap in quantum devices - University of Oxford

Daily AI Roundup: Biggest Machine Learning, Robotic And Automation Updates – AiThority

This is our AI Daily Roundup. We are covering the top updates from around the world. The updates will feature state-of-the-art capabilities in artificial intelligence (AI), machine learning, robotic process automation, fintech, and human-system interactions.

We cover the role of AI Daily Roundup and its application in various industries and daily lives.

Ahead of NRF 2024, the retail industry's largest event, Google Cloud debuted several new AI and generative AI-powered technologies to help retailers personalize online shopping, modernize operations, and transform in-store technology rollouts.

Quantiphi, a leading AI-first digital engineering company, and Lambda, the GPU cloud and AI infrastructure company founded by deep learning engineers, have partnered to provide tailored AI solutions to enterprise customers and digital AI natives across multiple industries.

Quanta Computer Inc., a trailblazer in advanced technology solutions, and Ambarella, Inc., an edge AI semiconductor company, announced during CES the expansion of their strategic partnership. This collaboration is being broadened to include development with Ambarella's CV3-AD, CV7 and new N1 series AI systems-on-chip (SoCs), marking a significant capabilities advancement for cutting-edge AI products.

Patronus AI announced it is partnering with MongoDB to bring automated LLM evaluation and testing to enterprise customers. The joint offering will combine Patronus AI's capabilities with MongoDB's Atlas Vector Search product.

In a strategic move that anticipates the imminent shift in digital advertising, Zeotap Data, the leading provider of people-based digital audiences, has announced a partnership with Illuma, the leader in AI-powered expansion and optimisation. This collaboration offers a new tactic in the face of third-party cookie deprecation.


View post:
Daily AI Roundup: Biggest Machine Learning, Robotic And Automation Updates - AiThority

Toyota's Robots Are Learning to Do Housework by Copying Humans – WIRED

As someone who quite enjoys the Zen of tidying up, I was only too happy to grab a dustpan and brush and sweep up some beans spilled on a tabletop while visiting the Toyota Research Lab in Cambridge, Massachusetts last year. The chore was more challenging than usual because I had to do it using a teleoperated pair of robotic arms with two-fingered pincers for hands.

Courtesy of Toyota Research Institute

As I sat before the table, using a pair of controllers like bike handles with extra buttons and levers, I could feel the sensation of grabbing solid items, and also sense their heft as I lifted them, but it still took some getting used to.

After several minutes tidying, I continued my tour of the lab and forgot about my brief stint as a teacher of robots. A few days later, Toyota sent me a video of the robot I'd operated sweeping up a similar mess on its own, using what it had learned from my demonstrations combined with a few more demos and several more hours of practice sweeping inside a simulated world.

Autonomous sweeping behavior. Courtesy of Toyota Research Institute

Most robots, and especially those doing valuable labor in warehouses or factories, can only follow preprogrammed routines that require technical expertise to plan out. This makes them very precise and reliable but wholly unsuited to handling work that requires adaptation, improvisation, and flexibility, like sweeping or most other chores in the home. Having robots learn to do things for themselves has proven challenging because of the complexity and variability of the physical world and human environments, and the difficulty of obtaining enough training data to teach them to cope with all eventualities.

There are signs that this could be changing. The dramatic improvements we've seen in AI chatbots over the past year or so have prompted many roboticists to wonder if similar leaps might be attainable in their own field. The algorithms that have given us impressive chatbots and image generators are also already helping robots learn more efficiently.

The sweeping robot I trained uses a machine-learning system called a diffusion policy, similar to the ones that power some AI image generators, to come up with the right action to take next in a fraction of a second, based on the many possibilities and multiple sources of data. The technique was developed by Toyota in collaboration with researchers led by Shuran Song, a professor at Columbia University who now leads a robot lab at Stanford.

Toyota is trying to combine that approach with the kind of language models that underpin ChatGPT and its rivals. The goal is to make it possible to have robots learn how to perform tasks by watching videos, potentially turning resources like YouTube into powerful robot training resources. Presumably they will be shown clips of people doing sensible things, not the dubious or dangerous stunts often found on social media.

"If you've never touched anything in the real world, it's hard to get that understanding from just watching YouTube videos," Russ Tedrake, vice president of Robotics Research at Toyota Research Institute and a professor at MIT, says. The hope, Tedrake says, is that some basic understanding of the physical world, combined with data generated in simulation, will enable robots to learn physical actions from watching YouTube clips. "The diffusion approach is able to absorb the data in a much more scalable way," he says.

See the rest here:
Toyota's Robots Are Learning to Do Housework by Copying Humans - WIRED

Use of Non-invasive Machine Learning to Help Predict the Chronic Degree of Lupus Nephritis – Lupus Foundation of America

Using a non-invasive machine learning model based on ultrasound radiomic imaging to analyze features of the kidneys, such as shape and texture, researchers were able to predict the degree of kidney injury in people with lupus nephritis (LN, lupus-related kidney disease). Currently, a renal biopsy, an invasive test that can cause bleeding, pain, and other complications, is the most common way of assessing a person's degree of LN chronicity.

Using radiomics, the ultrasound images of 136 people with LN who had undergone renal biopsies were examined. The images were divided into two groups, a training set and a validation set, and seven machine learning models were constructed based on five ultrasound-based radiomics features to establish prediction models. The XGBoost model performed best in both the training and test sets.
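The article does not describe the modeling details, but the general pattern, an XGBoost classifier trained on a table of radiomic features with a held-out validation set, can be sketched as below. All variable names and hyperparameters are illustrative assumptions, not the study's configuration.

```python
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# features: radiomic descriptors (shape, texture, ...) per patient;
# labels: chronicity class derived from renal biopsy (illustrative setup).
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.3, stratify=labels, random_state=0)

model = xgb.XGBClassifier(n_estimators=200, max_depth=3,
                          learning_rate=0.1, eval_metric="logloss")
model.fit(X_train, y_train)

auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"Validation AUC: {auc:.3f}")
```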

Knowing the degree of kidney injury in people with LN can be useful to clinicians as they develop an individuals treatment plan. Learn more about lupus and the kidneys.

Read the study

Read the original here:
Use of Non-invasive Machine Learning to Help Predict the Chronic Degree of Lupus Nephritis - Lupus Foundation of America

Deep fake, AI and face swap in video edit. Deepfake and machine learning. Facial tracking, detection and recognition … – Frederick News Post


Excerpt from:
Deep fake, AI and face swap in video edit. Deepfake and machine learning. Facial tracking, detection and recognition ... - Frederick News Post