Get started quickly with AWS Trainium and AWS Inferentia using AWS Neuron DLAMI and AWS Neuron DLC | Amazon … – AWS Blog

Posted on June 12, 2024 by Danzig

Starting with the AWS Neuron 2.18 release, you can now launch Neuron DLAMIs (AWS Deep Learning AMIs) and Neuron DLCs (AWS Deep Learning Containers) with the latest released Neuron packages on the same day as the Neuron SDK release. When a Neuron SDK is released, youll now be notified of the support for Neuron DLAMIs and Neuron DLCs in the Neuron SDK release notes, with a link to the AWS documentation containing the DLAMI and DLC release notes. In addition, this release introduces a number of features that help improve user experience for Neuron DLAMIs and DLCs. In this post, we walk through some of the support highlights with Neuron 2.18.

The DLAMI is a pre-configured AMI that comes with popular deep learning frameworks like TensorFlow, PyTorch, Apache MXNet, and others pre-installed. This allows machine learning (ML) practitioners to rapidly launch an Amazon Elastic Compute Cloud (Amazon EC2) instance with a ready-to-use deep learning environment, without having to spend time manually installing and configuring the required packages. The DLAMI supports various instance types, including Neuron Trainium and Inferentia powered instances, for accelerated training and inference.

AWS DLCs provide a set of Docker images that are pre-installed with deep learning frameworks. The containers are optimized for performance and available in Amazon Elastic Container Registry (Amazon ECR). DLCs make it straightforward to deploy custom ML environments in a containerized manner, while taking advantage of the portability and reproducibility benefits of containers.

The Neuron Multi-Framework DLAMI for Ubuntu 22 provides separate virtual environments for multiple ML frameworks: PyTorch 2.1, PyTorch 1.13, Transformers NeuronX, and TensorFlow 2.10. DLAMI offers you the convenience of having all these popular frameworks readily available in a single AMI, simplifying their setup and reducing the need for multiple installations.

This new Neuron Multi-Framework DLAMI is now the default choice when launching Neuron instances for Ubuntu through the AWS Management Console, making it even faster for you to get started with the latest Neuron capabilities right from the Quick Start AMI list.

The existing Neuron DLAMIs for PyTorch 1.13 and TensorFlow 2.10 have been updated with the latest 2.18 Neuron SDK, making sure you have access to the latest performance optimizations and features for both Ubuntu 20 and Amazon Linux 2 distributions.

Neuron 2.18 also introduces support in Parameter Store, a capability of AWS Systems Manager, for Neuron DLAMIs, allowing you to effortlessly find and query the DLAMI ID with the latest Neuron SDK release. This feature streamlines the process of launching new instances with the most up-to-date Neuron SDK, enabling you to automate your deployment workflows and make sure youre always using the latest optimizations.

To provide customers with more deployment options, Neuron DLCs are now hosted both in the public Neuron ECR repository and as private images. Public images provide seamless integration with AWS ML deployment services such as Amazon EC2, Amazon Elastic Container Service (Amazon ECS), and Amazon Elastic Kubernetes Service (Amazon EKS); private images are required when using Neuron DLCs with Amazon SageMaker.

Prior to this release, Dockerfiles for Neuron DLCs were located within the AWS/Deep Learning Containers repository. Moving forward, Neuron containers can be found in the AWS-Neuron/ Deep Learning Containers repository.

The Neuron SDK documentation and AWS documentation sections for DLAMI and DLC now have up-to-date user guides about Neuron. The Neuron SDK documentation also includes a dedicated DLAMI section with guides on discovering, installing, and upgrading Neuron DLAMIs, along with links to release notes in AWS documentation.

AWS Trainium and AWS Inferentia are custom ML chips designed by AWS to accelerate deep learning workloads in the cloud.

You can choose your desired Neuron DLAMI when launching Trn and Inf instances through the console or infrastructure automation tools like AWS Command Line Interface (AWS CLI). After a Trn or Inf instance is launched with the selected DLAMI, you can activate the virtual environment corresponding to your chosen framework and begin using the Neuron SDK. If youre interested in using DLCs, refer to the DLC documentation section in the Neuron SDK documentation or the DLC release notes section in the AWS documentation to find the list of Neuron DLCs with the latest Neuron SDK release. Each DLC in the list includes a link to the corresponding container image in the Neuron container registry. After choosing a specific DLC, please refer to the DLC walkthrough in the next section to learn how to launch scalable training and inference workloads using AWS services like Kubernetes (Amazon EKS), Amazon ECS, Amazon EC2, and SageMaker. The following sections contain walkthroughs for both the Neuron DLC and DLAMI.

In this section, we provide resources to help you use containers for your accelerated deep learning model acceleration on top of AWS Inferentia and Trainium enabled instances.

The section is organized based on the target deployment environment and use case. In most cases, it is recommended to use a preconfigured DLC from AWS. Each DLC is preconfigured to have all the Neuron components installed and is specific to the chosen ML framework.

The PyTorch Neuron DLC images are published to ECR Public Gallery, which is the recommended URL to use for most cases. If youre working within SageMaker, use the Amazon ECR URL instead of the Amazon ECR Public Gallery. TensorFlow DLCs are not updated with the latest release. For earlier releases, refer to Neuron Containers. In the following sections, we provide the recommended steps for running an inference or training job in Neuron DLCs.

Prepare your infrastructure (Amazon EKS, Amazon ECS, Amazon EC2, and SageMaker) with AWS Inferentia or Trainium instances as worker nodes, making sure they have the necessary roles attached for Amazon ECR read access to retrieve container images from Amazon ECR: arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly.

When setting up hosts for Amazon EC2 and Amazon ECS, using Deep Learning AMI (DLAMI) is recommended. An Amazon EKS optimized GPU AMI is recommended to use in Amazon EKS.

You also need the ML job scripts ready with a command to invoke them. In the following steps, we use a single file, train.py, as the ML job script. The command to invoke it is torchrun nproc_per_node=2 nnodes=1 train.py.

Extend the Neuron DLC to include your ML job scripts and other necessary logic. As the simplest example, you can have the following Dockerfile:

This Dockerfile uses the Neuron PyTorch training container as a base and adds your training script, train.py, to the container.

Complete the following steps:

You can now run the extended Neuron DLC in different AWS services.

For Amazon EKS, create a simple pod YAML file to use the extended Neuron DLC. For example:

Use kubectl apply -f .yaml to deploy this pod in your Kubernetes cluster.

For Amazon ECS, create a task definition that references your custom Docker image. The following is an example JSON task definition:

This definition sets up a task with the necessary configuration to run your containerized application in Amazon ECS.

For Amazon EC2, you can directly run your Docker container:

For SageMaker, create a model with your container and specify the training job command in the SageMaker SDK:

This section walks through launching an Inf1, Inf2, or Trn1 instance using the Multi-Framework DLAMI in the Quick Start AMI list and getting the latest DLAMI that supports the newest Neuron SDK release easily.

The Neuron DLAMI is a multi-framework DLAMI that supports multiple Neuron frameworks and libraries. Each DLAMI is pre-installed with Neuron drivers and support all Neuron instance types. Each virtual environment that corresponds to a specific Neuron framework or library comes pre-installed with all the Neuron libraries, including the Neuron compiler and Neuron runtime needed for you to get started.

This release introduces a new Multi-Framework DLAMI for Ubuntu 22 that you can use to quickly get started with the latest Neuron SDK on multiple frameworks that Neuron supports as well as Systems Manager (SSM) parameter support for DLAMIs to automate the retrieval of the latest DLAMI ID in cloud automation flows.

For instructions on getting started with the multi-framework DLAMI through the console, refer to Get Started with Neuron on Ubuntu 22 with Neuron Multi-Framework DLAMI. If you want to use the Neuron DLAMI in your cloud automation flows, Neuron also supports SSM parameters to retrieve the latest DLAMI ID.

Complete the following steps:

Activate your desired virtual environment, as shown in the following screenshot.

After you have activated the virtual environment, you can try out one of the tutorials listed in the corresponding framework or library training and inference section.

Neuron DLAMIs support SSM parameters to quickly find Neuron DLAMI IDs. As of this writing, we only support finding the latest DLAMI ID that corresponds to the latest Neuron SDK release with SSM parameter support. In the future releases, we will add support for finding the DLAMI ID using SSM parameters for a specific Neuron release.

You can find the DLAMI that supports the latest Neuron SDK by using the get-parameter command:

For example, to find the latest DLAMI ID for the Multi-Framework DLAMI (Ubuntu 22), you can use the following code:

You can find all available parameters supported in Neuron DLAMIs using the AWS CLI:

You can also view the SSM parameters supported in Neuron through Parameter Store by selecting the neuron service.

You can use the AWS CLI to find the latest DLAMI ID and launch the instance simultaneously. The following code snippet shows an example of launching an Inf2 instance using a multi-framework DLAMI:

You can also use SSM parameters directly in launch templates. You can update your Auto Scaling groups to use new AMI IDs without needing to create new launch templates or new versions of launch templates each time an AMI ID changes.

When youre done running the resources that you deployed as part of this post, make sure to delete or stop them from running and accruing charges:

In this post, we introduced several enhancements incorporated into Neuron 2.18 that improve the user experience and time-to-value for customers working with AWS Inferentia and Trainium instances. Neuron DLAMIs and DLCs with the latest Neuron SDK on the same day as the release means you can immediately benefit from the latest performance optimizations, features, and documentation for installing and upgrading Neuron DLAMIs and DLCs.

Additionally, you can now use the Multi-Framework DLAMI, which simplifies the setup process by providing isolated virtual environments for multiple popular ML frameworks. Finally, we discussed Parameter Store support for Neuron DLAMIs that streamlines the process of launching new instances with the most up-to-date Neuron SDK, enabling you to automate your deployment workflows with ease.

Neuron DLCs are available both private and public ECR repositories to help you deploy Neuron in your preferred AWS service. Refer to the following resources to get started:

Niithiyn Vijeaswaran is a Solutions Architect at AWS. His area of focus is generative AI and AWS AI Accelerators. He holds a Bachelors degree in Computer Science and Bioinformatics. Niithiyn works closely with the Generative AI GTM team to enable AWS customers on multiple fronts and accelerate their adoption of generative AI. Hes an avid fan of the Dallas Mavericks and enjoys collecting sneakers.

Armando Diaz is a Solutions Architect at AWS. He focuses on generative AI, AI/ML, and data analytics. At AWS, Armando helps customers integrate cutting-edge generative AI capabilities into their systems, fostering innovation and competitive advantage. When hes not at work, he enjoys spending time with his wife and family, hiking, and traveling the world.

Sebastian Bustillo is an Enterprise Solutions Architect at AWS. He focuses on AI/ML technologies and has a profound passion for generative AI and compute accelerators. At AWS, he helps customers unlock business value through generative AI, assisting with the overall process from ideation to production. When hes not at work, he enjoys brewing a perfect cup of specialty coffee and exploring the outdoors with his wife.

Ziwen Ning is a software development engineer at AWS. He currently focuses on enhancing the AI/ML experience through the integration of AWS Neuron with containerized environments and Kubernetes. In his free time, he enjoys challenging himself with badminton, swimming and other various sports, and immersing himself in music.

Anant Sharma is a software engineer at AWS Annapurna Labs specializing in DevOps. His primary focus revolves around building, automating and refining the process of delivering software to AWS Trainium and Inferentia customers. Beyond work, hes passionate about gaming, exploring new destinations and following latest tech developments.

Roopnath Grandhi is a Sr. Product Manager at AWS. He leads large-scale model inference and developer experiences for AWS Trainium and Inferentia AI accelerators. With over 15 years of experience in architecting and building AI based products and platforms, he holds multiple patents and publications in AI and eCommerce.

Marco Punio is a Solutions Architect focused on generative AI strategy, applied AI solutions and conducting research to help customers hyperscale on AWS. He is a qualified technologist with a passion for machine learning, artificial intelligence, and mergers & acquisitions. Marco is based in Seattle, WA and enjoys writing, reading, exercising, and building applications in his free time.

Rohit Talluri is a Generative AI GTM Specialist (Tech BD) at Amazon Web Services (AWS). He is partnering with top generative AI model builders, strategic customers, key AI/ML partners, and AWS Service Teams to enable the next generation of artificial intelligence, machine learning, and accelerated computing on AWS. He was previously an Enterprise Solutions Architect, and the Global Solutions Lead for AWS Mergers & Acquisitions Advisory.

More here:
Get started quickly with AWS Trainium and AWS Inferentia using AWS Neuron DLAMI and AWS Neuron DLC | Amazon ... - AWS Blog

A machine learning-based approach for constructing remote photoplethysmogram signals from video cameras … – Nature.com

Posted on June 12, 2024 by Danzig

In this section, the methodology used in this study is presented, from the data processing techniques to the models used to construct the rPPG. A general visualization of the pipeline is presented in Fig.1.

From data processing to comparison of the reference photoplethysmogram (PPG) with the remote photoplethysmogram (rPPG) constructed by the model. CV cross-validation, RGB red, green, and blue channels, ML machine learning. Colors: the green signal refers to the rPPG reconstructed by the model, and the black signal refers to the fingertip PPG.

For this study, three public datasets were utilized:

LGI-PPGI: This dataset is published under the CC-BY-4.0 license. The study was supported by the German Federal Ministry of Education and Research (BMBF) under the grant agreement VIVID 01S15024 and by CanControls GmbH Aachen21. The LGI-PPGI dataset is a collection of videos featuring six participants, the sex of five is male and one is female. The participants were recorded while performing four activities: Rest, Talk, Gym (exercise on a bicycle ergometer), and Rotation (rotation of the head of the subject at different speeds). The videos were captured using a Logitech HD C270 webcam with a frame rate of 25 fps, and cPPG signals were collected using a CMS50E PPG device at a sampling rate of 60 Hz. The videos were shot in varying lighting conditions, with talking scenes recorded outdoors and other activities taking place indoors.

PURE: Access to this dataset is granted upon request. It received support from the Ilmenau University of Technology, the Federal State of Thuringia, and the European Social Fund (OP 2007-2013) under grant agreement N501/2009 for the project SERROGA (project number 2011FGR0107)26. The PURE dataset contains videos of 10 participants, of which eight have the sex male and two female, engaged in various activities classified as Steady, Talk, Slow Translation (average speed is 7% of the face height per second), Fast Translation (average speed is 14% of the face height per second), Small Rotation (average head angle of 20), and Medium Rotation (average head angle of 35). The videos were captured using a 640480 pixel eco274CVGE camera by SVS-Vistek GmbH, with a 30 fps frame rate and a 4.8 mm lens. The cPPG signals were collected using a CMS50E PPG device at a sampling rate of 60 Hz. The videos were shot in natural daylight, with the camera positioned at an average distance of 1.1 m from the participants faces.

MR-NIRP indoor: This dataset is openly accessible without any restrictions. It received funding under the NIH grant 5R01DK113269-0227. The MR-NIRP indoor video dataset is comprised of videos of eight participants, including six participants with sex male and two female, with different skin tones: 1 Asian, 4 Indian, and 3 Caucasian. The participants were recorded while performing Still and Motion activities, with talking and head movements being part of the latter. The videos were captured using a FLIR Blackfly BFLY-U3-23S6C-C camera with a resolution of 640640 and a frame rate of 30 fps. The cPPG signals were collected using a CMS 50D+ finger pulse oximeter at a sampling rate of 60 Hz.

Each dataset includes video recordings of participants engaged in various activities, alongside a reference cPPG signal recorded using a pulse oximeter. Table1 provides detailed characteristics of each dataset.

The datasets used in our research are not only publicly available but are also extensively utilized within the scientific community for various secondary analyses. All datasets received the requisite ethical approvals and informed consents, in accordance with the regulations of their respective academic institutions. This compliance facilitated the publication of the data in academic papers and its availability online. The responsibility for managing ethical compliance was handled by the original data providers. They ensured that these datasets were made available under terms that permit their use and redistribution with appropriate acknowledgment.

Given the extensive use of these datasets across multiple studies, additional IRB approval for secondary analyses of de-identified and publicly accessible data is typically not required. This practice aligns with the policies at ETH Zurich, which do not mandate further IRB approval for the use of publicly available, anonymized data.

A comprehensive description of each dataset, including its source, funding agency, and licensing terms, has been provided in the manuscript. This ensures full transparency and adherence to both ethical and legal standards.

Several steps were necessary to extract the rPPG signal from a single video. First, the regions of interest (RoI) were extracted from the face. We extracted information from the forehead and cheeks using the pyVHR framework28, which includes the software MediaPipe for the extraction of RoI from a human face29. The RoI extracted from every individual were composed of a total of 30 landmarks. Each landmark is a specific region of the face, represented by a number that indicates the location of that region. The landmarks 107, 66, 69, 109, 10, 338, 299, 296, 336, and 9 were extracted from the forehead, the landmarks 118, 119, 100, 126, 209, 49, 129, 203, 205, and 50 were extracted from the left cheek, and the landmarks 347, 348, 329, 355, 429, 279, 358, 423, 425, and 280 were extracted from the right cheek. Every landmark was composed of 3030 pixels, and the average across the red, green, and blue (RGB) channels was computed for every landmark. The numbers of the landmarks of each area represent approximately evenly spaced regions of that area.

After all the landmarks were extracted, the RGB signals of each landmark were used as input for the algorithms CHROM, LGI-PPGI, POS, and ICA. These algorithms were chosen because of their effectiveness in separating the color information related to blood flow from the color information not related to blood flow, as well as their ability to extract PPG signals from facial videos. CHROM separates the color information by projecting it onto a set of basis vectors, while LGI-PPGI uses local gradient information to extract PPG signals. POS uses a multi-channel blind source separation algorithm to extract signals from different sources, and ICA separates the PPG signals from the other sources of variation in the video. These methods were chosen based on their performance in previous studies and their ability to extract high-quality PPG signals from facial videos20,23.

For the data processing, the signals used as rPPG are the outputs of the algorithms ICA, CHROM, LGI, and POS, and the cPPG signals were resampled to the same fps as the rPPG. First, the filters detrend and bandpass were applied to both the rPPG and cPPG signals. Bandpass is a sixth-order Butterworth with a cutoff frequency of 0.654 Hz. The chosen frequency range was intended to filter out noise in both low and high frequencies. Next, the rPPG signals were filtered by removing low variance signals and were segmented into non-overlapping windows of 10 seconds, followed by minmax normalization. We applied histogram equalization to the obtained spatiotemporal maps, showing a general improvement in the performance of the methods.

Spectral analysis was performed on both the rPPG and cPPG signals by applying Welchs method to each window of the constructed rPPG and cPPG signals. The highest peak in the frequency domain was selected as the estimated HR, with alternative methods such as autocorrelation also tested. However, these methods showed minimal absolute differences in beats-per-minute absolute difference (HR). Welchs method was deemed useful as it allowed for heart rate evaluation in the frequency domain and demonstrated the predictive capability of each channels rPPG signal.

The model was trained using data sourced from the PURE dataset. The input data contains information from 10 participants. Each participant was captured across 6 distinct videos, engaging in activities categorized as Steady, Talk, Slow Translation, Fast Translation, Small Rotation, and Medium Rotation. This accounts for a total of 60 videos, with an approximate average duration of 1min. Each video was transformed to RGB signals. Then, every RGB set of signals representing a video was subdivided into 10-s fragments, with each fragment serving as a unit for training data. The dataset used to train the model contains a total of 339 such samples.

Because the duration of each video is 10 seconds and the frame rate is 30, each sample is represented by three RGB signals composed of 300 time-steps. The RGB signals, serving as training inputs, underwent a transformation process resulting in the derivation of four distinct signals through the application of the POS, CHROM, LGI, and ICA methods. Consequently, each 10-s segment yielded four transformed signals, which were intended for subsequent utilization as input for the model. Before being fed to the model, data preprocessing was applied to the signals. Then, a 5-fold cross-validation (CV) procedure was conducted. During this procedure, the dataset was partitioned into five subsets, with a distribution ratio of 80% for training data and 20% for testing data within each fold.

The models architecture was composed of four blocks of LSTM and dropout, followed by a dense layer. The model architecture is shown in Fig.2. To reduce the number of features of the model in each layer, the number of cells in each block decreases from 90 to 1. The learning rate scheduler implemented was ReduceLROnPlateau and the optimizer was Adam30. Finally, the metrics root mean squared error (RMSE) and Pearson correlation coefficient (r) were set as loss function.

The model architecture generates a remote photoplethysmogram (rPPG) signal from three regions of interest: the forehead (R1), left cheek (R2), and right cheek (R3). The average value from each region is calculated, and these averages are then combined to produce the overall rPPG signal. The model is composed of four blocks of LSTM and dropout, followed by a dense layer. The methods ICA, LGI, CHROM, and POS were used as input to the model. rPPG remote photoplethysmogram, RGB red, green, and blue channels, LSTM long short-term memory.

To evaluate the signals, we applied four criteria: Dynamic Time Warping (DTW), Pearsons r correlation coefficient, RMSE, and HR. We computed each criterion for every window in each video. We then took the average of the values of all the windows to obtain the final results. This helped us to analyze the results of every model from different points of view.

DTW31 is a useful algorithm for measuring the similarity between two time series, especially when they have varying speeds and lengths. The use of DTW is also relevant for this case because the rPPG and its ground truth may not be aligned sometimes, so metrics that rely on matching timestamps are less appropriate. The metric was implemented using the Python package DTAIDistance32.

The equation below shows how the r coefficient calculates the strength of the relationship between rPPG and cPPG.

$$r=frac{mathop{sum }nolimits_{i = 1}^{N}({x}_{i}-hat{x})({y}_{i}-hat{y})}{sqrt{mathop{sum }nolimits_{i = 1}^{N}{({x}_{i}-hat{x})}^{2}}sqrt{mathop{sum }nolimits_{i = 1}^{N}{({y}_{i}-hat{y})}^{2}}}$$

(1)

In this equation, xi and yi are the values of the rPPG and PPG signals at lag i, respectively. (hat{x}) and (hat{y}) are their mean values. N is the number of values in the discrete signals.

The equation below shows how RMSE calculates the prediction error, which is the difference between the ground truth values and the extracted rPPG signals.

$${{{{{{mathrm{RMSE}}}}}}},=sqrt{frac{mathop{sum }nolimits_{i = 1}^{N}{left({x}_{i}-{y}_{i}right)}^{2}}{N}}$$

(2)

In this equation, N is the number of values and xi, yi are the values of the rPPG and contact PPG signals at lag i, respectively.

HR was estimated using Welchs method, which computes the power spectral density of a signal and finds the highest peak in the frequency domain. The peak was searched within a range of 39240 beats-per-minute (BPM), which is the expected range of human BPMs. HR is obtained as the absolute difference between the HR estimated from rPPG and the HR estimated from cPPG.

To evaluate the models performance, we applied non-parametric statistical tests, which have fewer assumptions about the data distribution than parametric ones. Some comparisons involved small sample sizes, such as those with a limited number of subjects.

The Friedman Test33 is appropriate for this study because it evaluates the means of three or more groups. Every group is represented by a model. If the p-value is significant, the means of the groups are not equal. The Nemenyi Test34 was used to calculate the difference in the average ranking values and then to compare the difference with a critical distance (CD). The general procedure is to apply the Friedmann test to each group and if the p-value is significant, the means of the groups are not equal. In that case, the Nemenyi test is performed to compare the methods pairwise. The Nemenyi test helps to identify which methods are similar or different in terms of their average ranks. The Bonferroni correction was applied for multiple-comparison correction.

Read the original:
A machine learning-based approach for constructing remote photoplethysmogram signals from video cameras ... - Nature.com

Arteris Selected by Esperanto Technologies to Integrate RISC-V Processors for High-Performance AI and Machine … – Design and Reuse

Posted on June 12, 2024 by Danzig

CSRCompiler, an SoC integration automation software solution, enables faster time-to-market and energy-efficient designs for AI inference and HPC applications at a fraction of the cost

CAMPBELL, Calif., June 11, 2024 -- Arteris, Inc. (Nasdaq: AIP), a leading provider of system IP which accelerates system-on-chip (SoC) creation, today announced that Esperanto Technologies, a leading developer of high-performance, energy-efficient artificial intelligence (AI) and high-performance computing (HPC) solutions based on the RISC-V instruction set, has chosen Arteris because of design familiarity with CSRCompiler for automation efficiency, error reduction and its integration capabilities. Esperanto will continue to utilize the SoC software to develop its next generation of energy-efficient solutions for AI inference and HPC workloads for data center and enterprise-edge applications.

At the heart of Esperantos solutions are its ET-SoC-1 supercomputer-on-a-chip which is designed to run Generative AI and HPC workloads with reduced total cost of ownership (TCO) due to its highly energy-efficient operation. This complex SoC integrates over 1,000 energy-efficient 64-bit RISC-V cores with custom vector/tensor units optimized for machine learning (ML) applications and can run the latest large language models (LLMs) with just a fraction of the power required by a GPU, making it versatile for use in the data center and at the enterprise edge. The massive processor parallelism and configurations for registers and memory maps make the integration automation software, CSRCompiler, essential for achieving higher-quality silicon and faster time to market.

Arteris is a leader in SoC integration automation technology and their software is an important part of our silicon design flow for managing complexity, said Art Swift, president and CEO of Esperanto Technologies. Arteris CSRCompiler software is a key enabler for achieving our silicon performance and power efficiency goals that will address the needs of the expanding data center and enterprise edge markets.

We are pleased to continue to play a key role in the development of Esperantos extremely complex SoCs, said K. Charles Janac, president and CEO of Arteris. CSRCompilers ability to streamline the design workflow and accelerate turnaround times empowers our customers to keep pace with rapidly changing market demands.

CSRCompiler from Arteris streamlines the hardware/software interface (HSI) foundation creation. The SoC integration automation software automates HSI design, verification, firmware and documentation, providing multi-language support without additional scripting. The method supports an agile flow to ensure best practices and early engagement by the entire team through collaborative management from a single source specification. CSRCompiler provides a complete, correct and up-to-date design ecosystem, turning address map sharing into a smooth and integrated process, preventing design mistakes and enabling faster turnaround times.

About Arteris

Arteris is a leading provider of system IP for the acceleration of system-on-chip (SoC) development across today's electronic systems. Arteris network-on-chip (NoC) interconnect IP and SoC integration automation technology enable higher product performance with lower power consumption and faster time to market, delivering better SoC economics so its customers can focus on dreaming up what comes next. Learn more at arteris.com.

About Esperanto Technologies

Esperanto Technologies Inc. delivers massively parallel, high-performance, energy-efficient computing solutions that offer a compelling choice for the most demanding Generative AI and non-AI applications. The changing, computationally intensive workloads of the machine learning era mandate a new clean-sheet solution, shedding the baggage of existing legacy architectures, and the programmability limitations of overspecialized hardware. Esperanto leverages the simple, elegant, open standard RISC-V instruction set architecture (ISA) to deliver flexibility, scalability, performance and energy-efficiency advantages. For more information, please visit https://www.esperanto.ai/

Read the original post:
Arteris Selected by Esperanto Technologies to Integrate RISC-V Processors for High-Performance AI and Machine ... - Design and Reuse

Build a FedRAMP compliant generative AI-powered chatbot using Amazon Aurora Machine Learning and Amazon … – AWS Blog

Posted on June 12, 2024 by Danzig

In this post, we explore how to use Amazon Aurora PostgreSQL-Compatible Edition and Amazon Bedrock to build Federal Risk and Authorization Management Program (FedRAMP) compliant generative artificial intelligence (AI) applications using Retrieval Augmented Generation (RAG). FedRAMP is a US government-wide program that delivers a standard approach to security assessment, authorization, and monitoring for cloud products and services. Cloud service providers must demonstrate FedRAMP compliance in order to offer services to the U.S. Government. RAG is often used in generative AI to enhance user queries and responses by augmenting the training of a large language model (LLM) with data from a companys internal business systems.

The solution we demonstrate uses Amazon Aurora Machine Learning which enables builders to create machine learning (ML) or generative AI applications using familiar SQL programming. Aurora ML provides access to foundation models (FMs) in Amazon Bedrock for creating embeddings (vector representations of text, images, and audio) and generating natural language text responses for generative AI applications directly with SQL. With Aurora ML, you can create text embeddings, perform similarity search using the pgvector extension, and generate text responses within the same Aurora SQL function. This reduces latency for text generation because document embeddings may be stored in the same table as the text, minimizing the need to return a search response to applications.

Amazon Bedrock is a fully managed service that offers a choice of high-performing FMs from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon through a single API, along with a broad set of capabilities to help you build generative AI applications. Amazon Aurora PostgreSQL is a fully managed, PostgreSQLcompatible, and ACIDcompliant relational database engine that combines the speed, reliability, and manageability of Amazon Aurora with the simplicity and cost-effectiveness of open-source databases.

In this post, we demonstrate how to build an AI-powered chatbot with Aurora ML and Amazon Bedrock, both of which are FedRAMP compliant. The solution includes the following AWS services:

We also use Streamlit to construct the interactive chatbot application running on AWS Cloud9.

The following diagram is a common solution architecture pattern you can use to integrate your Aurora PostgreSQL database with Amazon Bedrock using Aurora ML.

This architecture implements a RAG workflow. The first part involves ingesting the knowledge dataset, which contains unstructured data like PDFs and documents. This data is broken down into smaller chunks and embedded into vector representations using an embedding model such as Amazon Titan. The embeddings are stored alongside the original text chunks in Aurora, which serves as our vector database (Steps A and B).

The second part involves generating a response. First, the users question is converted into an embedding vector using the same embedding model . This question embedding is then used to perform semantic search over the database embeddings to determine relevant text chunks called the context. The context along with a prompt are formatted into a model input, which is fed to the text generation model to produce a natural language response to the question (Steps 14).

This implementation uses Amazon Aurora PostgreSQL with aws_ml (version 2) and the pgvector extension to store embeddings, Amazon Bedrock FMs (amazon.titan-embed-g1-text-02 for embeddings and anthropic.claude-instant-v1 for text generation), and Streamlit for the chatbot frontend. This can be deployed in three main steps:

For this walkthrough, complete the following prerequisites:

We selected PostgreSQL 15.5 for this example; we recommend using PostgreSQL 15.5 or higher so you can use the latest version of the open source pgvector extension

We selectedAurora I/O-Optimized, which provides improved performance with predictable pricing for I/O-intensive applications.

We opted to useAmazon Aurora Serverless v2, which automatically scales your compute based on your application workload, so you only pay based on the capacity used.

This step ingests your documents from an S3 bucket using the Boto3 library. Next, the function splits the documents into chunks using LangChains RecursiveCharacterTextSplitter. Lastly, the function uploads the chunks into an Aurora PostgreSQL table that you specify. The clean_chunk() function escapes special characters in the data to properly clean it before loading into the Aurora PostgreSQL table, which is a best practice because SQL functions can struggle with certain special characters.

Use the following Python code:

A SQL procedure is created to select the text contents, generate embeddings from the text using the Amazon Titan Embeddings G1 Text model (amazon.titan-embed-g1-text-02), and insert the embeddings into the table containing the original text. This model creates 1536-dimensional vector representations (embeddings) of unstructured text like documents, paragraphs, and sentences, taking up to 8,000 tokens as input.

The following code creates a PostgreSQL procedure that generates embeddings using the aws_bedrock.invoke_model_get_embeddings function. The model input is a JSON document. As a best practice, we recommend using the SQL format function to format the model input as given in the code.

The SQL function for generating responses to user questions performs the following tasks:

The similarity search and response generation are combined into a single function. This removes the need to call the similarity search separately from the application, reducing the overall latency of producing responses.

Use the following code:

This AI-powered chatbot application offers multiple ways for users to ask questions. One option is for SQL developers to directly use SQL functions. We also developed a Python chatbot application using the Streamlit frontend framework. The code for this chatbot application is available on GitHub.

We download the dataset for our knowledge base and upload it to an S3 bucket. This dataset will feed and power the chatbot. Complete the following steps:

The solution presented in this post is available in the following GitHub repo. You need to clone the GitHub repository to your local machine. Complete the following steps to configure the environment:

The following command allows you to connect to Amazon Aurora PostgreSQL using the psql client to ask a question and receive a response:

The following command launches a Streamlit-powered, web-based interactive application. The application allows you to ask questions and receive answers using a user-friendly interface.

The configuration to open port 8080 is for demonstration only. For your production environment, refer to Protecting data in transit for best practices to securely expose your application.

The UI will look like the following screenshot. You can ask a question by entering the text in the Enter your question here field.

To clean up your resources, complete the following steps:

In this post, we demonstrated how you can use Aurora, Amazon Bedrock, and other AWS services to build an end-to-end generative AI chatbot application using SQL functions and Python. By storing text embeddings directly in Aurora, you can reduce latency and complexity compared to traditional architectures. The combination of Aurora ML, Amazon Bedrock FMs, and AWS compute like Amazon SageMaker provides a powerful environment for rapidly developing next-generation AI applications.

For more information, see Using Amazon Aurora machine learning with Aurora PostgreSQL.

Naresh Dhiman is a Sr. Solutions Architect at Amazon Web Services supporting US federal customers. He has over 25 years of experience as a technology leader and is a recognized inventor with six patents. He specializes in containers, machine learning, and generative AI on AWS.

Karan Lakhwani is a Customer Solutions Manager at Amazon Web Services supporting US federal customers. He specializes in generative AI technologies and is an AWS Golden Jacket recipient. Outside of work, Karan enjoys finding new restaurants and skiing.

More:
Build a FedRAMP compliant generative AI-powered chatbot using Amazon Aurora Machine Learning and Amazon ... - AWS Blog

AI better predicts back surgery outcomes – Futurity: Research News

Posted on June 12, 2024 by Danzig

Share this Article

You are free to share this article under the Attribution 4.0 International license.

Researchers who had been using Fitbit data to help predict surgical outcomes have a new method to more accurately gauge how patients may recover from spine surgery.

Using machine-learning techniques, researchers worked to develop a way to more accurately predict recovery from lumbar spine surgery.

The results, published in the journal Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, show that their model outperforms previous models to predict spine surgery outcomes.

This is important because in lower back surgery and many other types of orthopedic operations, outcomes vary widely depending on the patients structural disease but also on varying physical and mental health characteristics across patients.

Surgical recovery is influenced by both physical and mental health before the operation. Some people may have excessive worry in the face of pain that can make pain and recovery worse. Others may suffer from physiological problems that worsen pain. If physicians can get a heads-up on the various pitfalls a patient faces, they can better tailor treatment plans.

By predicting the outcomes before the surgery, we can help establish some expectations and help with early interventions and identify high risk factors, says first author Ziqi Xu, a PhD student in the lab of Chenyang Lu, a professor in the McKelvey School of Engineering at Washington University in St. Louis.

Previous work in predicting surgery outcomes typically used patient questionnaires given once or twice in clinics, capturing a static slice of time.

It failed to capture the long-term dynamics of physical and psychological patterns of the patients, Xu says. Prior work training machine-learning algorithms focused on just one aspect of surgery outcome but ignored the inherent multidimensional nature of surgery recovery, she adds.

Researchers have used mobile health data from Fitbit devices to monitor and measure recovery and compare activity levels over time. But the new research has shown that activity data, plus longitudinal assessment data, is more accurate in predicting how the patient will do after surgery, says Jacob Greenberg, an assistant professor of neurosurgery at the School of Medicine.

The current work offers a proof of principle showing that, with multimodal machine learning, doctors can see a more accurate big picture of the interrelated factors that affect recovery. Before beginning this work, the team first laid out the statistical methods and protocol to ensure they were feeding the artificial intelligence system the right balanced diet of data.

Previously, the team had published work in the journal Neurosurgery showing for the first time that patient-reported and objective wearable measurements improve predictions of early recovery compared to traditional patient assessments.

In addition to Greenberg and Xu, Madelynn Frumkin, a PhD student studying psychological and brain sciences in Thomas Rodebaughs laboratory, was a co-first author on that work. Wilson Zack Ray, a professor of neurosurgery at the School of Medicine, was co-senior author, along with Rodebaugh and Lu. Rodebaugh is now at the University of North Carolina at Chapel Hill.

In that research, they show that Fitbit data can be correlated with multiple surveys that assess a persons social and emotional state. They collected that data via ecological momentary assessments (EMAs) that employ smartphones to give patients frequent prompts to assess mood, pain levels, and behavior multiple times throughout day.

We combine wearables, EMA, and clinical records to capture a broad range of information about the patients, from physical activities to subjective reports of pain and mental health, and to clinical characteristics, Lu says.

Greenberg adds that state-of-the-art statistical tools that Rodebaugh and Frumkin have helped advance, such as Dynamic Structural Equation Modeling, were key in analyzing the complex, longitudinal EMA data.

For the most recent study, they took all those factors and developed a new machine-learning technique of Multi-Modal Multi-Task Learning to effectively combine these different types of data to predict multiple recovery outcomes.

In this approach, the AI learns to weigh the relatedness among the outcomes while capturing their differences from the multimodal data, Lu adds.

This method takes shared information on interrelated tasks of predicting different outcomes and then leverages the shared information to help the model understand how to make an accurate prediction, according to Xu.

It all comes together in the final package, producing a predicted change for each patients post-operative pain interference and physical function score.

Greenberg says the study is ongoing as the researchers continue to fine-tune their models so they can take more detailed assessments, predict outcomes and, most notably, understand what types of factors can potentially be modified to improve longer-term outcomes.

Funding for the study came from AO Spine North America, the Cervical Spine Research Society, the Scoliosis Research Society, the Foundation for Barnes-Jewish Hospital, Washington University/BJC Healthcare Big Ideas Competition, the Fullgraf Foundation, and the National Institute of Mental Health.

Source: Washington University in St. Louis

Go here to see the original:
AI better predicts back surgery outcomes - Futurity: Research News

Developing a prognostic model using machine learning for disulfidptosis related lncRNA in lung adenocarcinoma … – Nature.com

Posted on June 12, 2024 by Danzig

Identification of prognostically relevant DRLs and construction of prognostic models

In our investigation of the LUAD landscape, we analyzed 16,882 lncRNAs derived from the TCGA-LUAD database. This comprehensive evaluation led to the identification of 708 DRLs, which demonstrate significant interactions with DRGs, as depicted in a sankey diagram (Fig.2A). Through further analysis incorporating data from three GEO databases, we narrowed these DRLs down to 199 lncRNAs consistently present across datasets, suggesting a pivotal role in LUAD pathogenesis (Fig.2B). Our prognostic assessment using univariate cox regression analysis revealed 37 lncRNAs with significant implications for LUAD patient outcomes (Fig.2C). Leveraging these lncRNAs, we constructed a predictive model employing an ensemble of machine learning techniques, with the ensemble model (Supplementary Table 2) achieving a notably high C-index of 0.677[95% confidence interval (CI) 0.63 to 0.73], suggesting robust predictive performance (Fig.2D). This model's effectiveness was further validated through a risk stratification system, categorizing patients into high and low-risk groups based on their lncRNA expression profiles. This stratification was substantiated by principal component analysis (PCA), which confirmed the distinct separation between the risk groups, underscoring the potential of our model in clinical risk assessment (Fig.2E).

Construction of prognostic model composed of 27 DRLs. (A) Sankey diagram illustrating the relationship between DRGs and associated lncRNAs. (B) The intersection of DRLs sourced from the TCGA database and GEO database. (C) 27 lncRNAs after univariate Cox regression. (D) 101 prediction models evaluated, with C-index calculated for each across all validation datasets. (E) Principal Component Analysis of the low-risk and high-risk cohorts based on 27 DRLs.

Our survival analysis using the TCGA-LUAD dataset revealed a significant distinction in OS between the high- and low-risk groups identified through our model (p<0.001, log-rank test) (Fig.3A). This finding was consistently replicated across three independent GEO datasets, demonstrating significant differences in both OS (GSE31210, p=0.001; GSE30219, p=0.019; GSE50081, p=0.025) (Fig.3BD) and DFS (GSE31210, p<0.001; GSE30219, p=0.009; GSE50081, p=0.023) (Supplementary Fig. S1AC). The predictive power of the risk score was superior to that of traditional prognostic factors such as age, gender, and staging, as evidenced by the C-index comparison (Supplementary Fig. S1D). The risk score also emerged as an independent prognostic indicator in our univariate and multivariate cox analyses (p<0.001) (Supplementary Table 3). Multicollinearity within the model was assessed using the variance inflation factor, which was below 10 for all variables (Supplementary Table 4). The AUC analysis further validated the robustness of our model, with one-year, two-year, and three-year AUCs of 0.76, 0.72, and 0.74, respectively, in the TCGA-LUAD dataset (Fig.3F). The external validation using GEO datasets underscored the model's accuracy, particularly notable in GSE30219, GSE50081 and GSE31210 for the evaluated intervals (Fig.3G,I).

Efficacy of the DRLs Survival Prognostic Risk Model. KaplanMeier (KM) analysis for high-risk and low-risk groups are exhibited in (A) TCGA-LUAD, (B) GSE31210, (C) GSE30219 and (D)GSE50081. (E) KaplanMeier (KM) survival curves for mutant and non-mutant groups. Analysis of 1-, 2-, and 3-year ROC curves for (F) TCGA-LUAD, (G) GSE30219, (H) GSE50081, and (I) GSE31210.

Further analysis showed gender-specific differences in risk scores across various pathological stages. In early stages (I and II), men exhibited significantly higher risk scores compared to women (Stage I: p=0.015; Stage II: p=0.006; Wilcoxon test) (Supplementary Fig. S2A,B). However, these differences were not observed in later stages (III/IV) (p=0.900, Wilcoxon test) (Supplementary Fig. S2C), suggesting stage-specific risk dynamics. In addition, our study uncovered notable disparities in risk scores among patients with mutations in EGFR, ALK, and KRAS genes in the GSE31210 dataset (p<0.001, KruskalWallis test) (Supplementary Fig. S2D). Patients harboring these mutations also exhibited better OS compared to those without (p=0.018, log-rank test) (Fig.3E), highlighting the potential prognostic relevance of genetic profiles in LUAD. The impact of smoking, a known risk factor for LUAD, was evident as significant differences in risk scores between smokers and non-smokers were observed in analyses of the GSE30210 and GSE50081 datasets (GSE31210, p=0.003; GSE50081, p=0.027; Wilcoxon test) (Supplementary Fig. S2E,F).

To enhance our model's utility in clinical decision-making, we developed a nomogram that incorporates the identified risk scores alongside essential clinical parametersage, gender, and TNM staging. This integration aims to provide a more comprehensive tool for predicting the prognosis of LUAD patients (Fig.4A). We rigorously validated the nomogram's predictive accuracy using calibration curves, which compare the predicted survival probabilities against the observed outcomes. The results demonstrated a high degree of concordance, indicating that our nomogram accurately reflects patient survival rates (Fig.4B). Further assessment through DCA (Fig.4C-E) confirmed that the nomogram provides substantial clinical benefit. Notably, the analysis showed that the nomogram significantly outperforms the predictive capabilities of the risk score alone, particularly in terms of net benefit across a wide range of threshold probabilities.

Development of a Nomogram for Risk Prediction & Analysis of Mutation Patterns in Both Risk Groups. (A) Nomogram that combines model and clinicopathological factors. (B) Calibration curves in 1-, 3-, and 5-year for the nomogram. (CE) The decision curves analysis (DCA) of the nomogram and clinical characteristics in 1-, 3-, and 5-year. (F) TMB levels between the high-risk and low-risk groups. (G) Gene mutation waterfall chart of the low-risk group. (H) Gene mutation waterfall chart of the high-risk group.

A marked difference in TMB was discerned between the high- and low-risk cohorts (p<0.001 by wilcoxon test) (Fig.4F). The waterfall plot delineates the mutational landscape of the ten most prevalent genes across both risk strata. In the low-risk cohort, approximately 84.53% of specimens exhibited gene mutations (Fig.4G), whereas in the high-risk stratum, mutations were observed in roughly 95.33% of specimens (Fig.4H). Predominant mutations within the high-risk category included TP53, TTN, and CSMD3.

The differential expression analysis revealed a total of 1474 DEGs between the low-risk and high-risk cohorts. Among these, 568 genes were upregulated and 906 genes were downregulated. The volcano plot (Supplementary Fig. S2G) illustrates the distribution of these DEGs. These results indicate that specific genes are significantly associated with risk stratification in our study cohort. In the GO analysis (Fig.5A,D), DEGs showed predominant enrichment in terms of molecular functions such as organic anion transport, carboxylic acid transport. Regarding cellular components, the main enrichment was observed in the apical plasma membrane (Fig.5C). Figure5E demonstrates the GSEA results, highlighting significant enrichment of specific gene sets related to metabolic processes, DNA binding, and hyperkeratosis. The KEGG result highlighted a significant enrichment of DEGs in neuroactive ligand-receptor interaction and the cAMP signaling pathway (Fig.5B).

Biological function analysis of the DRLs risk score model. The top 5 significant terms of (A) GO function enrichment and (B) KEGG function enrichment. (C,D) System clustering dendrogram of cellular components. (E) Gene set enrichment analysis.

To validate the precision of our results, we employed seven techniques: CIBERSORT, EPIC, MCP-counter, xCell, TIMER, quanTIseq, and ssGSEA, to assess immune cell penetration in both high-risk and low-risk categories (Fig.6A). With the ssGSEA data, we explored the connection between TME and several characteristics of lung adenocarcinoma patients, such as age, gender, and disease stage (Fig.6B). We then visualized this data with box plots for both CIBERSORT and ssGSEA (Fig.6C,D). These plots showed that the infiltration levels of B cells memory, T cells CD4 memory resting, and Monocyte was notably lower in the high-risk group compared to the low-risk group. With the help of the ESTIMATE algorithm, we evaluated the stromal (Fig.6F), immune (Fig.6E), and ESTIMATE scores (Supplementary Fig. S3A) across the different risk groups. This allowed us to gauge tumor purity. Our study suggests that the high-risk group has reduced stromal, ESTIMATE, and immune scores. Conversely, the score of tumor purity in the low-risk group is less than that in the high-risk group (Supplementary Fig. S3B).

The tumor microenvironment between high-risk and low-risk groups based on DRLs. (A) Comparing the levels of immune cell infiltration for different immune cell types in the CIBERSORT, EPIC, MCP-counter, xCell, TIMER and quanTIseq algorithm for low-risk and high-risk groups. (B) Immune infiltration of different lung adenocarcinoma patient characteristics. Box plot of the difference in immune cell infiltration between the high-risk and low-risk score groups based on (C) CIBERSORT and (D) ssGSEA. *p-value<0.05, **p-value<0.01, ***p-value<0.001, ns=no significance. (E) Immune score, and (F)stromal score were lower in the high-risk group than in the low-risk group.

We calculated the TIDE score and forecasted the immunotherapy response in both groups of the high risk and low risk (Fig.7A). Based on results from both datasets, patients in low-risk group seem more inclined to show a positive reaction to immunotherapy. Additionally, IPS for the combination of anti-CTLA4 and anti-PDL1 treatment, as well as for anti-CTLA4 alone, was consistently higher in the low-risk group (Fig.7B,C). However, the analysis of anti-PDL1 treatment alone (P=0.170) did not reach statistical significance (Fig.7D). This suggests that low-risk patients may respond better to anti-CTLA4 and/or anti-PDL1 immunotherapy. Recently, research has found a link between tumor TLS and outcomes in several tumor types. In line with these discoveries, our review of TCGA-LUAD dataset showed that LUAD patients with high TLS scores had more favorable outcomes than those with low scores (Fig.7F). We also noticed that the TLS score was higher in the low-risk group compared to the high-risk group (Fig.7E).

Immunotherapeutic sensitivity between high-risk and low-risk groups based on DRLs. (A) Differences in risk scores between the TIDE responsive and nonresponsive groups. (BD) Sensitivity of high- and low-risk groups to combination therapy, anti-CTLA4, and anti-PDL1 by different IPS scores. (E) Differences in tumor tertiary lymphoid structure (TLS) scores in high-risk and low-risk groups in TCGA-LUAD. (F) KM analysis of high-TLS and low-TLS groups.

In our assessment of the relationship between risk scores and sensitivity to chemotherapy, we measured the IC50 for some widely used chemotherapeutic medicine. Our findings showed that the high-risk group was more sensitive to drugs like Cisplatin, Vinblastine, Cytarabine, Vinorelbine, Bexarotene, Cetuximab, Docetaxel, and Doxorubicin than the low-risk group (Fig.8AP).

Immunotherapy sensitivity analysis and in-depth study of LINC00857. (AP) Differences in drug sensitivity between high-risk and low-risk groups. (Q) Volcano plot for GTEX_Lung vs. TCGA_Lung_ Adenocarcinoma.

Through differential gene analysis of tumor tissues and normal tissues, 13,995 DEGs (|logFC|>1.5, p-value<0.050) (Fig.8Q, Supplementary Fig. S3C) were identificated. By cross-referencing with the 27 lncRNAs that form our prognostic model, we pinpointed LINC01003. Supplementary Fig. S4A presents a heatmap demonstrating the expression levels of LINC01003 across different NSCLC datasets and cell types. The results indicate that LINC01003 is differentially expressed, with notable high expression in monocytes/macrophages and endothelial cells across several datasets, suggesting its potential involvement in these cell types within the NSCLC tumor microenvironment. Supplementary Figure S4B further illustrates the expression profile of LINC01003 in different cell populations from the GSE143423 dataset. The violin plot shows significant expression of LINC01003 in malignant cells, compared to other cell types, indicating its potential role in tumor progression.

To decipher the LINC00857 related regulatory mechanisms, we constructed a lncRNA-miRNA-mRNA network (Supplementary Fig. S4C). This network illustrates the intricate interactions between LINC00857 and various miRNAs and mRNAs. In this network, LINC00857 acts as a central regulatory hub, potentially influencing gene expression by sequestering multiple miRNAs, such as hsa-miR-4709-5p, hsa-miR-760, and hsa-miR-340-5p. These miRNAs, in turn, are connected to a wide array of target genes, including YWHAZ, BCL2L2, PTEN, and MYC, which are critical in cellular processes such as cell cycle regulation, apoptosis, and signal transduction.

See the original post:
Developing a prognostic model using machine learning for disulfidptosis related lncRNA in lung adenocarcinoma ... - Nature.com

Assessing calibration and bias of a deployed machine learning malnutrition prediction model within a large healthcare … – Nature.com

Posted on June 12, 2024 by Danzig

The primary training cohort used to recalibrate the model included 49,652 patients (median [IQR] age = 66.0 [26.0]), of which 49.9% self-identified as female, 29.6% self-identified as Black or African American, 54.8% were on Medicare and 27.8% on Medicaid. 11,664 (24%) malnutrition cases were identified. Baseline characteristics are summarized in Table 1 and malnutrition event rates are summarized in Supplementary Table 2. The validation cohort used to test the model included 17,278 patients (median [IQR] age = 66.0 [27.0]), of which 49.8% self-identified as female, 27.1% self-identified as Black or African American, 52.9% were on Medicare, and 28.2% on Medicaid. 4,005 (23%) malnutrition cases were identified.

Although the model overall had a c-index of 0.81 (95% CI: 0.80, 0.81), it was miscalibrated according to both weak and moderate calibration metrics, with a Brier score of 0.26 (95% CI: 0.25, 0.26) (Table 2), indicating that the model is relatively inaccurate17. It also overfitted the risk estimate distribution, as evidenced by the calibration curve (Supplementary Fig. 1). Logistic recalibration of the model successfully improved calibration, bringing the calibration intercept to 0.07 (95% CI: 0.11, 0.03), calibration slope to 0.88 (95% CI: 0.86, 0.91), and significantly decreasing Brier score (0.21, 95% CI: 0.20, 0.22), Emax (0.03, 95% CI: 0.01, 0.05), and Eavg (0.01, 95% CI: 0.01, 0.02). Recalibrating the model improved specificity (0.74 to 0.93), PPV (0.47 to 0.60), and accuracy (0.74 to 0.80) while decreasing sensitivity (0.75 to 0.35) and NPV (0.91 to 0.83) (Supplementary Tables 2 and 3).

Weak and moderate calibration metrics between Black and White patients significantly differed prior to recalibration (Table 3, Supplementary Fig. 2A, B), with the model having a more negative calibration intercept for White patients on average compared to Black patients (1.17 vs. 1.07), and Black patients having a higher calibration slope compared to White patients (1.43 vs. 1.29). Black patients had a higher Brier score of 0.30 (95% CI: 0.29, 0.31) compared to White patients with 0.24 (95% CI: 0.23, 0.24). Logistic recalibration significantly improved calibration for both Black and White patients (Table 4, Fig. 1ac). For Black patients within the hold-out set, the recalibrated calibration intercept was 0 (95% CI: -0.07, 0.05), calibration slope was 0.91 (95% CI: 0.87, 0.95), and Brier score improved from 0.30 to 0.23 (95% CI: 0.21, 0.25). For White patients within the hold-out set, the recalibrated calibration intercept was -0.15 (95% CI: -0.20, -0.10), calibration slope was 0.82 (95% CI: 0.78, 0.85), and Brier score improved from 0.24 to 0.19 (95% CI: 0.18, 0.21). Post-recalibration, calibration for Black and White patients still differed significantly according to weak calibration metrics, but not so according to moderate calibration metrics and the strong calibration curves (Table 4, Fig. 1). Calibration curves of the recalibrated model showed good concordance between actual and predicted event probabilities, although the predicted risks for Black and White patients differed between the 30th and 60th risk percentiles. Logistic recalibration also improved the specificity, PPV, and accuracy, but decreased the sensitivity and NPV of the model across both White and Black patients (Supplementary Tables 2and 3). Discriminative ability was not significantly different for White and Black patients before and after recalibration. We also found calibration statistics to be relatively similar in Asian patients (Supplementary Table 4).

Columns from left to right are curves for a, No Recalibration b, Recalibration-in-the-Large and c, Logistic Recalibration for Black vs. White patients d, No Recalibration e, Recalibration-in-the-Large and f, Logistic Recalibration for male vs. female patients.

Calibration metrics between male and female patients also significantly differed prior to recalibration (Table 3, Supplementary Fig. 2C, D). The model had a more negative calibration intercept for female patients on average compared to male patients (1.49 vs. 0.88). Logistic recalibration significantly improved calibration for both male and female patients (Table 4, Fig. 1df). In male patients within the hold-out set, the recalibrated calibration intercept was 0 (95% CI: 0.05, 0.03), calibration slope was 0.88 (95% CI: 0.85, 0.90), and Brier score improved from 0.29 to 0.23 (95% CI: 0.22, 0.24). In female patients within the hold-out set, the recalibrated calibration intercept was 0.11 (95% CI: 0.16, 0.06), calibration slope was 0.91 (95% CI: 0.87, 0.94), but the Brier score did not significantly improve. After logistic recalibration, only calibration intercepts differed between male and female patients. Calibration curves of the recalibrated model showed good concordance, although the predicted risks for males and females differed between the 10th and 30th risk percentiles. Discrimination metrics for male and female patients were significantly different before recalibration. The model had a higher sensitivity and NPV for females than males, but a lower specificity, PPV, and accuracy (Supplementary Table 2). The recalibrated model had the highest sensitivity (0.95, 95% CI: 0.94, 0.96), NPV (0.84, 95% CI: 0.83, 0.85) and accuracy (0.82, 95% CI: 0.81, 0.83) for female patients, at the cost of substantially decreasing sensitivity (0.27, 95% CI: 0.25, 0.30) (Supplementary Table 3).

We also assessed calibration by payor type and hospital type as sensitivity analyses. In the payor type analysis, we found that malnutrition predicted risk was more miscalibrated in patients with commercial insurance with more extreme calibration intercepts, Emax, and Eavg suggesting overestimation of risk (Supplementary Tables 5 and 6, Supplementary Fig. 3A, B). We did not observe substantial differences in weak or moderate calibration across hospital type (community, tertiary, quaternary) except that tertiary acute care centers had a more extreme calibration intercept, suggesting an overestimation of risk (Supplementary Tables 7 and 8, Supplementary Fig. 3C, D). Across both subgroups, logistic recalibration significantly improved calibration across weak, moderate, and strong hierarchy tiers (Supplementary Table 5, Supplementary Table 7, Supplementary Figs. 4 and 5).

Read this article:
Assessing calibration and bias of a deployed machine learning malnutrition prediction model within a large healthcare ... - Nature.com

Want an AI Job? Check out these new AWS AI certifications – ZDNet

Posted on June 12, 2024 by Danzig

zf L/Getty Images

I know many people who are twitchy about AI taking their jobs away. I hear you, and I understand. But AI is also opening the door to new jobs. While the top positions demand esoteric skills, such as knowing how to program with OpenCV, PyTorch, and TensorFlow, there are also jobs out there for people who aren't computer scientists. Amazon Web Services (AWS) is opening the doors for you and me with a suite of training courses and new certifications.

At first, prompt engineering -- a fancy way of asking AI chatbots nicely for the best answers -- seemed like the way for non-experts to get AI jobs. It turns out that large language models (LLM) can write and optimize their own prompts just fine.

Also: The best free AI courses

However, there are many other AI-related jobs where you can make good money. Indeed, according to AWS, employers are willing to pay you up to 47% higher salaries if you've got AI skills. While top dollar goes to IT professionals, AWS has also found that sales and marketing workers can get 43% higher salaries, while finance professionals can make up to 42% higher. That's not hay!

To help workers build and prove their AI expertise, Amazon Web Services (AWS) has announced a new AI training portfolio to equip you with the necessary AI skills and two new certifications to show that you know your stuff. These include:

AWS Certified AI Practitioner: This foundational-level certification is designed for non-technical workers. It helps you to demonstrate that you know your way aroundAI, machine learning (ML), and generative AI concepts and tools. This certification is ideal for professionals in marketing, sales, project and product management, HR, finance, and other non-IT roles who want to identify AI opportunities and collaborate effectively with technical teams.

AWS Certified Machine Learning Engineer -- Associate : This certification is for a technical worker. It's meant to show that you have the skills needed to build, deploy, and maintain AI and ML solutions. It covers essential aspects like optimizing model performance, managing computational resources, updating model versions, and securing AI solutions. This certification is crucial for IT professionals leveraging AI to meet business objectives.

To prepare for these certifications, AWS has launched eleven new training courses on AWS Skill Builder, its digital learning center. Individual subscriptions to this site are $29 a month or $449 a year.

The AI courses include foundational topics such as:

For those aiming for the higher-level AWS Certified Machine Learning Engineer Associate certification, additional courses cover advanced topics like data transformation techniques, feature engineering, bias mitigation strategies, and data security.

The AWS Certified AI Practitioner training resources include eight free courses. In these, you'll learn about real-world use cases for AI, ML, and generative AI, how to select a foundation model (FM), the concepts and techniques involved in crafting effective prompts, and more.

Also:Want to work in AI? How to pivot your career in 5 steps

Altogether, AWS offers more than 100 AI, ML, and generative AI courses and learning resources on AWS Skill Builder and AWS Educate to help you prepare for the future of work.

This follows Amazon's "AI Ready" initiative, which was meant to provide free AI skills training to 2 million people globally by 2025. It's also, of course, part of the AWS Certification system .

On August 13, you can register for the beta exams of AWS Certified AI Practitioner and AWS Certified Machine Learning Engineer Associate. The complete certifications should be available by the end of the year.

Also: I spent a weekend with Amazon's free AI courses, and highly recommend you do too

Certifications are valuable for getting jobs. That's especially true in a field like AI, which is so new that it's difficult for employers to know if you really know your way around AI. I strongly suggest that if you think your future lies in AI work, or if just having AI skills can help you with your current job, you give these programs a try. The financial benefits speak for themselves.

See the original post here:
Want an AI Job? Check out these new AWS AI certifications - ZDNet

Enhancing customer retention in telecom industry with machine learning driven churn prediction | Scientific Reports – Nature.com

Posted on June 12, 2024 by Danzig

Kimura, T. Customer churn prediction with hybrid resampling and ensemble learning. J. Manag. Inform. Decis. Sci. 25(1), 123 (2022).

MathSciNet Google Scholar

Lalwani, P., Mishra, M.K., Chadha, J.S. and Sethi, P. Customer churn prediction system: a machine learning approach.Computing, pp.124 (2022).

Hadden, J., Tiwari, A., Roy, R. & Ruta, D. Computer assisted customer churn management: State-of- the-art and future trends. Comput. Oper. Res. 34(10), 29022917 (2007).

Article Google Scholar

Rajamohamed, R. & Manokaran, J. Improved credit card churn prediction based on rough clustering and supervised learning techniques. Clust. Comput. 21(1), 6577 (2018).

Article Google Scholar

Backiel, A., Baesens, B. & Claeskens, G. Predicting time-to-churn of prepaid mobile telephone customers using social network analysis. J. Operat. Res. Soc. 67(9), 11351145. https://doi.org/10.1057/jors.2016.8 (2016).

Article Google Scholar

Zhu, B., Baesens, B. & Vanden Broucke, S. K. An empirical comparison of techniques for the class imbalance problem in churn prediction. Inform. Sci. 408, 8499. https://doi.org/10.1016/j.ins.2017.04.015 (2017).

Article Google Scholar

Vijaya, J. & Sivasankar, E. Computing efficient features using rough set theory combined with ensemble classification techniques to improve the customer churn prediction in telecommunication sector. Computing 100(8), 839860 (2018).

Article Google Scholar

Ahmad, S. N. & Laroche, M. S. Analyzing electronic word of mouth: A social commerce construct. Int. J. Inform. Manag. 37(3), 202213 (2017).

Article Google Scholar

Gaurav Gupta, S. A critical examination of different models for customer churn prediction using data mining. Int. J. Eng. Adv. Technol. 6(63), 850854 (2019).

Google Scholar

Abbasimehr, H., Setak, M. & Tarokh, M. A neuro-fuzzy classifier for customer churn prediction. Int. J. Comput. Appl. 19(8), 3541 (2011).

Google Scholar

Kumar, S. & Kumar, M. Predicting customer churn using artificial neural network. In Engineering Applications of Neural Networks: 20th International Conference, EANN 2019, Xersonisos, Crete, Greece, May 24-26, 2019, Proceedings (eds Macintyre, J. et al.) 299306 (Springer International Publishing, 2019). https://doi.org/10.1007/978-3-030-20257-6_25.

Chapter Google Scholar

Sharma, T., Gupta, P., Nigam, V. & Goel, M. Customer churn prediction in telecommunications using gradient boosted trees. In International Conference on Innovative Computing and Communications: Proceedings of ICICC 2019 Vol. 2 (eds Khanna, A. et al.) 235246 (Springer Singapore, 2020). https://doi.org/10.1007/978-981-15-0324-5_20.

Chapter Google Scholar

Umayaparvathi, V. & Iyakutti, K. A survey on customer churn prediction in telecom industry: Datasets, methods and metrics. Int. Res. J. Eng. Technol. 4(4), 10651070 (2016).

Google Scholar

Ahmad, A. K., Jafar, A. & Aljoumaa, K. Customer churn prediction in telecom using machine learning in big data platform. J. Big Data 6(1), 28 (2019).

Article Google Scholar

Extracted from: https://www.kaggle.com/competitions/customer-churn-prediction-2020/data?select=test.csv

Mishra, A. & Reddy, U. S. A comparative study of customer churn prediction in telecom industry using ensemble based classifiers. In 2017 International Conference on Inventive Computing and Informatics (ICICI). IEEE, 721725. (2017)

Coussement, K., Lessmann, S. & Verstraeten, G. A comparative analysis of data preparation algorithms for customer churn prediction: A case study in the telecommunication industry. Decis. Support Syst. 95, 2736 (2017).

Article Google Scholar

Wang, Q. F., Xu, M. & Hussain, A. Large-scale ensemble model for customer churn prediction in search ads. Cogn. Comput. 11(2), 262270 (2019).

Article Google Scholar

Hashmi, N., Butt, N. A. & Iqbal, M. Customer churn prediction in telecommunication a decade review and classification. Int. J. Comput. Sci. Issues 10(5), 271 (2013).

Google Scholar

Eria, K. & Marikannan, B. P. Systematic review of customer churn prediction in the telecom sector. J. Appl. Technol. Innovat. 2(1), 714 (2018).

Google Scholar

Brnduoiu, I., Toderean, G. & Beleiu, H. Methods for churn prediction in the pre-paid mobile telecommunications industry. In 2016 International conference on communications (COMM), 97100. IEEE. (2016)

Singh, M., Singh, S., Seen, N., Kaushal, S., & Kumar, H. Comparison of learning techniques for prediction of customer churn in telecommunication. In 2018 28th International Telecommunication Networks and Applications Conference (ITNAC) IEEE, pp. 15. (2018)

Lee, E. B., Kim, J. & Lee, S. G. Predicting customer churn in the mobile industry using data mining technology. Ind. Manag. Data Syst. 117(1), 90109 (2017).

Article Google Scholar

Bharadwaj, S., Anil, B. S., Pahargarh, A., Pahargarh, A., Gowra, P. S., & Kumar, S. Customer Churn Prediction in Mobile Networks using Logistic Regression and Multilayer Perceptron (MLP). In 2018 Second International Conference on Green Computing and Internet of Things (ICGCIoT), IEEE. pp. 436438, (2018)

Chen, T. & Guestrin, C. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining, 785794. (2016)

Dhaliwal, S. S., Nahid, A. A. & Abbas, R. Effective intrusion detection system using XGBoost. Information 9(7), 149 (2018).

Article Google Scholar

Baesens, B., Hppner, S. & Verdonck, T. Data engineering for fraud detection. Decis. Support Syst. 150, 113492 (2021).

Article Google Scholar

Zhou, H., Chai, H. F. & Qiu, M. L. Fraud detection within bankcard enrollment on mobile device based payment using machine learning. Front. Inform. Technol. Electron. Eng. 19(12), 15371545 (2018).

Article Google Scholar

Pamina, J., Raja, B., SathyaBama, S. & Sruthi, M. S. An effective classifier for predicting churn in telecommunication. J. Adv. Res. Dyn. Control Syst. 11, 221229 (2019).

Google Scholar

Kuhn, M. & Johnson, K. Applied Predictive Modeling 26th edn. (Springer, 2013).

Book Google Scholar

Yijing, L., Haixiang, G., Xiao, L., Yanan, L. & Jinling, L. Adapted ensemble classification algorithm based on multiple classifier system and feature selection for classifying multi-class imbalanced data. Knowl. -Based Syst. 94, 88104 (2016).

Article Google Scholar

Verbeke, W., Martens, D., Mues, C. & Baesens, B. Building comprehensible customer churn prediction models with advanced rule induction techniques. Expert Syst. Appl. 38(3), 23542364 (2011).

Article Google Scholar

Burez, J. & Van den Poel, D. Handling class imbalance in customer churn prediction. Expert Syst. Appl. 36(3), 46264636 (2009).

Article Google Scholar

Lpez, V., Fernndez, A., Garca, S., Palade, V. & Herrera, F. An insight into classification with imbalanced data: Empirical results and current trends on using data intrinsic characteristics. Inform. Sci. 250, 113141 (2013).

Article Google Scholar

Kaur, H., Pannu, H. S. & Malhi, A. K. A systematic review on imbalanced data challenges in machine learning: Applications and solutions. ACM Comput. Surv. (CSUR) 52(4), 136 (2019).

Google Scholar

Salunkhe, U. R. & Mali, S. N. A hybrid approach for class imbalance problem in customer churn prediction: A novel extension to under-sampling. Int. J. Intell. Syst. Appl. 11(5), 7181 (2018).

Google Scholar

Galar, M., Fernandez, A., Barrenechea, E., Bustince, H. & Herrera, F. A review on ensembles for the class imbalance problem: Bagging-, boosting-, and hybrid-based approaches. IEEE Trans. Syst. Man Cybern. Part C 42(4), 463484. https://doi.org/10.1109/TSMCC.2011.2161285 (2012).

Article Google Scholar

Singh, A. & Purohit, A. A survey on methods for solving data imbalance problem for classification. Int. J. Comput. Appl. 127(15), 3741 (2015).

Google Scholar

Schaefer, G., Krawczyk, B., Celebi, M. E. & Iyatomi, H. An ensemble classification approach for melanoma diagnosis. Memetic Comput. 6(4), 233240 (2014).

Article Google Scholar

Salunkhe, U. R. & Mali, S. N. Classifier ensemble design for imbalanced data classification: A hybrid approach. Proc. Comput. Sci. 85, 725732 (2016).

Article Google Scholar

Liu, X. Y., Wu, J. & Zhou, Z. H. Exploratory undersampling for class-imbalance learning. IEEE Trans. Syst. Man Cybern. Part B Cybern. 39(2), 539550 (2008).

Google Scholar

Haixiang, G., Yijing, L., Shang, J. & Mingyun, G. Learning from class-imbalanced data: Review of methods and applications. Expert Syst. Appl. 73, 220239 (2017).

Article Google Scholar

Douzas, G., Bacao, F. & Last, F. Improving imbalanced learning through a heuristic oversampling method based on k-means and SMOTE. Inform. Sci. 465, 120. https://doi.org/10.1016/j.ins.2018.06.056 (2018).

Article Google Scholar

Mahesh, B. Machine learning algorithms-a review. Int. J. Sci. Res. 9, 381386 (2020).

Google Scholar

Bonaccorso, G. Machine Learning Algorithms (Packt Publishing Ltd., 2017).

Google Scholar

Ray, S. A quick review of machine learning algorithms. In2019 International conference on machine learning, big data, cloud and parallel computing (COMITCon). IEEE. pp. 3539, (2019)

Singh, A., Thakur, N. and Sharma, A., A review of supervised machine learning algorithms. In2016 3rd International Conference on Computing for Sustainable Global Development (INDIACom), pp. 13101315. 2016

Ayodele, T. O. Types of machine learning algorithms. New Adv. Mach. Learn. 3, 1948 (2010).

Google Scholar

Sagi, O. & Rokach, L. Ensemble learning: A survey. Wiley Interdisciplin. Rev.: Data Min. Knowled. Discov. 8(4), e1249 (2018).

Google Scholar

Zhang, C. & Ma, Y. (eds) Ensemble Machine Learning: Methods and Applications (Springer Science & Business Media, 2012).

Google Scholar

Amin, A., Adnan, A. & Anwar, S. An adaptive learning approach for customer churn prediction in the telecommunication industry using evolutionary computation and Nave Bayes. Appl. Soft Comput. 137, 110103 (2023).

Article Google Scholar

Amin, A. et al. Customer churn prediction in the telecommunication sector using a rough set approach. Neurocomputing 237, 242254 (2017).

Article Google Scholar

Amin, A., Shah, B., Khattak, A. M., Baker, T., & Anwar, S. Just-in-time customer churn prediction: With and without data transformation. In2018 IEEE congress on evolutionary computation (CEC), IEEE, pp. 16. (2018).

Amin, A., Shah, B., Abbas, A., Anwar, S., Alfandi, O., & Moreira, F. Features weight estimation using a genetic algorithm for customer churn prediction in the telecom sector. InNew Knowledge in Information Systems and Technologies: Vol. 2. Springer International Publishing. pp. 483491, (2019)

Chaubey, G. et al. Customer purchasing behavior prediction using machine learning classification techniques. J. Ambient Intell. Hum. Comput. https://doi.org/10.1007/s12652-022-03837-6 (2022).

Article Google Scholar

Thomas, W. E., & David, O. M. Chapter 4exploratory study.Research methods for cyber security, Syngress, 95130 (2017).

See the article here:
Enhancing customer retention in telecom industry with machine learning driven churn prediction | Scientific Reports - Nature.com

Apples AI leaders talk Siri, Apple Intelligence, and more – The Verge

Posted on June 12, 2024 by Danzig

The WWDC keynote may be over, but we still have lots of questions about the state and future of Apple Intelligence. And in a somewhat unusual move, Apple is here to answer some of them: Craig Federighi and John Giannandrea, two of the executives in charge of all of Apples AI efforts, are taking the stage in the Steve Jobs Theater to talk about everything Apple announced on Monday.

Federighi, of course, is Apples software chief, overseeing nearly all the platforms and features we saw at WWDC. Giannandrea is Apples SVP of machine learning and AI strategy and has been an important figure in the AI world for a long time. (Before coming to Apple, he was an engineering VP at Google, overseeing a lot of the same stuff.) Together, theyve overseen Apples work revamping Siri, partnering with OpenAI, bringing machine learning to devices and the cloud, and practically everything else Apple announced.

Were in the Steve Jobs Theater at Apple Park and will be live-blogging the event as it happens. Rest assured: there will be Siri questions.

AuthenticID Releases Deep Fake and Generative AI Detection Solution for Businesses – AiThority

Posted on June 12, 2024 by Danzig

New Capability Addresses Surge in Injection Attacks Fueled by AI-Based Technology

AuthenticID, a global leader in identity verification and fraud prevention solutions, announced the release of a new solution to detect deep fake and generative AI injection attacks. This new enhancement to their identity verification technology, developed by AuthenticIDs Product and Applied Research team, uses proprietary algorithms to prevent the majority of digital injection attacks that leverage AI-generated content.

Read:Impel adds WhatsApp messaging to AI-Powered Customer Lifecycle Management Platform

The widespread availability of inexpensive, easy-to-use tools allows bad actors to create highly convincing fake identity documents and biometrics

Deep fakes are synthetic documents, videos or biometrics such as facial or audio that are artificially generated using machine learning, specifically deep learning techniques. Digital injection attacks occur when a bad actor injects these deep fakes into an identity verification workflow to spoof the system and bypass traditional fraud detection and identity verification methods.

The widespread availability of inexpensive, easy-to-use tools allows bad actors to create highly convincing fake identity documents and biometrics, said Alex Wong, VP Product Management at AuthenticID. Recentnews storieshave shown just how devastating these attacks can be to any organization. Ourdeep fake injection attack solutionmeets a critical need to determine the legitimacy of a user in this new era of technology.

AiThority.com Latest News:Arista Unveils Etherlink AI Networking Platforms

AuthenticIDs Injection Attack Solution uses three methods with targeted, proprietary algorithms.

The AuthenticID identity verification solution is 100% automated, driven by machine learning technology. The solutions automation means there is no human bias or lag time introduced in the detection and decisioning process. Stopping injection attacks and deep fake attacks can be done in a workflow in a matter of milliseconds.

Weve observed fraudsters making fewer mistakes when they create fake documents, noted Stephen Thwaits, AuthenticIDs SVP Global Solutions. Traditional identity verification methods cant keep up with both the sophistication and ease at which bad actors can circumvent security measures with the use of new tools. Thats why continuous innovation is necessary to meet fraudsters at the front line.

Importantly, these algorithms and the overall solution are not a silver bullet in the fight against injection attacks and other identity fraud methods. Fraudsters are evolving their methods daily to develop new ways to circumvent identify verification and security measures. That is why AuthenticIDsIdentity Fraud Taskforceis committed to continuously developing new algorithms and improving the identity verification decisioning engine to ensure new fraud vectors can be identified and stopped.

AuthenticID will continue to drive innovation forward in its technology to ensure companies can stay ahead of changing fraud techniques and regulatory requirements while delivering best-in-class customer experience.

AiThority.com Latest News:Quantexa Partners with Databricks to Accelerate Customer Data and AI Initiatives

[To share your insights with us as part of editorial or sponsored content, please write to psen@itechseries.com]

Study on Climate Change Control with ML Named Best Paper – AZoCleantech

Posted on June 12, 2024 by Danzig

The Tackling Climate Change with Machine Learning Workshop at the International Conference on Learning Representations (ICLR), held in Vienna, Austria, and remotely in May, awarded a best paper award to a group exploring and utilizing machine learning to predict weather.

Romit Maulik, aPenn State Institute for Computational and Data Sciences (ICDS) co-hire and assistant professor in the College of Information Sciences and Technology co-authored the winning article Scaling transformer neural networks for skillful and reliable medium-range weather forecasting.

This has been a year-long collaboration between the University of California-Los Angeles, Carnegie Mellon University, Argonne National Laboratory, and Penn State.

Romit Maulik, Study Co-Author and Assistant Professor, College of Information Sciences and Technology, Penn State University

The study looked at the utilization of modern artificial intelligence (AI) techniques for forecasting compared to traditional approaches currently used for operational forecasts.

Maulik added, It is a paradigm shift from looking at the classical forecasts provided by several agencies. Those forecasts are typically obtained with very large computing resources, and it can be computationally costly. We thought, what if we took an alternative route?

According to Maulik, the AI model learns weather patterns using data from historical sources like satellite images and archival forecasts. It is built on computer vision techniques.

Then, a trained model can make forecasts in real time, without requiring access to very large computational resources. Once the neural networks are trained and released, the model deployment can be done effectively on a laptop and, eventually, on increasingly smaller resources such as cell phones, Maulik further added.

According to Maulik, the ICLR hosts seminars on AI-related subtopics where researchers can present their work and get feedback.

Maulik stated, Getting accepted into a workshop, which is quite competitive, maximizes the papers visibility. It helps us get good feedback from both the AI and the domain sciences community and significantly improve our methods. The award itself is great; it validates our hard work. However, our long-term goal remains the same. We want to find ways to improve our current models and provide a viable competitor to classical weather forecasting approaches.

One of the objectives, according to Maulik, is for the researchers to be able to predict weather extremes more accurately than existing models may be able to.

Maulik concluded, Our eyes are set on grander challenges. That being said, as computational scientists, we want to solve the problem and we think of the tool after. We try to balance classical and machine learning methods and are not partial to either.

Maulik, Tung Nguyen, Rohan Shah, Hritik Bansal, Troy Arcomano, Veerabhadra Rao Kotamarthi, Ian Foster, Sandeep Madireddy, and Aditya Grover are among the article's authors and collaborators.

See the original post:
Study on Climate Change Control with ML Named Best Paper - AZoCleantech

Machine learning-guided realization of full-color high-quantum-yield carbon quantum dots – Nature.com

Posted on June 12, 2024 by Danzig

Workflow of ML-guided synthesis of CQDs

Synthesis parameters have great impacts on the target properties of resulting samples. However, it is intricate to tune various parameters for optimizing multiple desired properties simultaneously. Our ML-integrated MOO strategy tackles this challenge by learning the complex correlations between hydrothermal/solvothermal synthesis parameters and two target properties of CQDs in a unified MOO formulation, thus recommending optimal conditions that enhance both properties simultaneously. The overall workflow for the ML-guided synthesis of CQDs is shown in Fig.1 and Supplementary Fig.1. The workflow primarily consists of four key components: database construction, multi-objective optimization formulation, MOO recommendation, and experimental verification.

It consists of four key components: database construction, multi-objective optimization (MOO) formulation, MOO recommendation, and experimental verification.

Using a representative and comprehensive synthesis descriptor set is of vital importance in achieving the optimization of synthesis conditions36. We carefully selected eight descriptors to comprehensively represent the hydrothermal system, one of the most common methods to prepare CQDs. The descriptor list includes reaction temperature (T), reaction time (t), type of catalyst (C), volume/mass of catalyst (VC), type of solution (S), volume of solution (VS), ramp rate (Rr), and mass of precursor (Mp). To minimize human intervention, the bounds of synthesis parameters are determined primarily by the constraints of the synthesis methods and equipment used, instead of expert intuition. For instance, in employing hydrothermal/solvothermal method to prepare CQDs, as the reactor inner pot is made of polytetrafluoroethylene material, the usage temperature should be 220oC. Moreover, the capacity of the reactor inner pot used in the experiment is 25mL, with general guidance of not exceeding 2/3 of this volume for reactions. Therefore, in this study, the main considerations of experimental design are to ensure experimental safety and accommodate the limitations of equipment. These practical considerations naturally led to a vast parameter space, estimated at 20 million possible combinations, as detailed in Supplementary Table1. Briefly, the 2,7-naphthalenediol molecule along with catalysts such as H2SO4, HAc, ethylenediamine (EDA) and urea, were adopted in constructing the carbon skeleton of CQDs during the hydrothermal or solvothermal reaction process (Supplementary Fig.2). Different reagents (including deionized water, ethanol, N,N-dimethylformamide (DMF), toluene, and formamide) were used to introduce different functional groups into the architectures of CQDs, combined with other synthesis parameters, resulting in tunable PL emission. To establish the initial training dataset, we collected 23 CQDs synthesized under different randomly selected parameters. Each data sample is labelled with experimentally verified PL wavelength and PLQY (see Methods).

To account for the varying importance of multiple desired properties, an effective strategy is needed to quantitatively evaluate candidate synthesis conditions in a unified manner. A MOO strategy has thus been developed that prioritizes full-color PL wavelength over PLQY enhancement, by assigning an additional reward when maximum PLQY of a color surpassing the predefined threshold for the first time. Given (N) explored experimental conditions, {(({x}_{i},,{y}_{i}^{c},,{y}_{i}^{gamma }{|; i}=(1,2,ldots,N))}, ({x}_{i}) indicates the (i)-th synthesis condition defined by 8 synthesis parameters, ({y}_{i}^{c}) and ({y}_{i}^{gamma }) indicate the corresponding color label and yield (i.e., PLQY) given ({x}_{i}); ({y}_{i}^{c}in left{{c}_{1},,,{c}_{2},ldots,{c}_{M}right}) for (M) possible colors, ({y}_{i}^{gamma }in left[0,,1right]). The unified objective function is formulated as the sum of maximum PLQY for each color label, i.e.,

$$mathop{sum}nolimits_{{c}_{j}}{Y}_{{c}_{j}}^{max },$$

(1)

where (jin left{1,,2,,ldots,,Mright}) and ({Y}_{{c}_{j}}^{max }) is 0 if (nexists {y}_{i}^{c}={c}_{j}); otherwise

$${Y}_{{c}_{j}}^{max }={max }_{i}left[Big({y}_{i}^{gamma }+R{{cdot }}{mathbb{1}}left({y}_{i}^{gamma }ge alpha right)Big){{cdot }}{mathbb{1}}left({y}_{i}^{c}={c}_{j}right)right].$$

(2)

({mathbb{1}}({{cdot }})) is an indicator function that output 1 if true, otherwise outputs 0. The term (Rcdot {mathbb{1}}({y}_{i}^{gamma }ge alpha )) enforces a higher priority of full-color synthesis, where PLQY for each color shall be at least (alpha) ((alpha=0.5) in our case) to have an additional reward of (R) ((R=10) in our settings). (R) can be any real value larger than 1 (i.e., maximum possible improvement of PLQY for one synthesis condition), to ensure the higher priority of exploring synthesis conditions for colors in which yield has not achieved (alpha). We set (R) to 10, such that the tens digit of unified objective functions value clearly indicates the number of colors with maximum PLQYs exceeding (alpha), and the units digit reflects the sum of maximum PLQYs (without the additional reward) for all colors. As defined by the ranges of PL wavelength in Supplementary Table2, seven primary colors considered in this work are purple (<420nm), blue (420 and <460nm), cyan (460 and <490nm), green (490 and <520nm), yellow (520 and <550nm), orange (550 and <610nm), and red (610nm), i.e., (M=7). Notably, the proposed MOO formulation unifies the two goals of achieving full color and high PLQY into a single objective function, providing a systematical approach to tune synthesis parameters for the desired properties.

The MOO strategy is premised on the prediction results of ML models. Due to the high-dimensional search space and limited experimental data, it is challenging to build models that generalize well on unseen data, especially considering the nonlinear nature of the condition-property relationship37. To address this issue, we employed a gradient boosting decision tree-based model (XGBoost), which has proven advantageous in handling related material datasets (see Methods and Supplementary Fig.3)30,38. In addition, its capability to guide hydrothermal synthesis has been proven in our previous work (Supplementary Fig.4)21. Two regression models, optimized with the best hyperparameters through grid search, were fitted on the given dataset, one for PL wavelength and the other for PLQY. These models were then deployed to predict all unexplored candidate synthesis conditions. The search space for candidate conditions is defined by the Cartesian product of all possible values of eight synthesis parameters, resulting in ~20 million possible combinations (see Supplementary Table1). The candidate synthesis conditions, i.e., unexplored regions of the search space, are further ranked by MOO evaluation strategy with the prediction results.

Finally, the PL wavelength and PLQY values of the CQDs synthesized under the top two recommended synthesis conditions are verified through experiments and characterization, whose results are then augmented to the training dataset for the next iteration of the MOO design loop. The iterative design loops continue until the objectives are fulfilled, i.e., when the achieved PLQY for all seven colors surpasses 50%. In prior studies on CQDs, its worth noting that only a limited number of CQDs with short-wavelength fluorescence (e.g., blue and green), have reached PLQYs above 50%39,40,41. On the other hand, their long-wavelength counterparts, particularly those with orange and red fluorescence, usually demonstrate PLQYs under 20%42,43,44. Underlining the efficacy of our ML-powered MOO strategy, we have set an ambitious goal for all fluorescent CQDs: the attainment of PLQYs exceeding 50%. The capacity to modulate the PL emission of CQDs holds significant promise for various applications, spanning from bioimaging and sensing to optoelectronics. Our four-stage workflow is crafted to forge an ML-integrated MOO strategy that can iteratively guide hydrothermal synthesis of CQDs for multiple desired properties, while also constantly improving the models prediction performance.

To assess the effectiveness of our ML-driven MOO strategy in the hydrothermal synthesis of CQDs, we employed several metrics, which were specifically chosen to ascertain whether our proposed approach not only meets its dual objectives but also enhances prediction accuracy throughout the iterative process. The unified objective function described above measures how well the two desired objectives have been realized experimentally, and thus can be a quantitative indicator of the effectiveness of our proposed approach in instructing the CQD synthesis. The evaluation output of the unified objective function after a specific ML-guided synthesis loop is termed as objective utility value. The MOO strategy improves the objective utility value by a large margin of 39.27% to 75.44, denoting that the maximum PLQY in all seven colors exceeds the target of 0.5 (Fig.2a). Specifically, at iterations 7 and 19, the number of color labels with maximum PLQY exceeding 50% increases by one, resulting in an additional reward of 10 each time. Even on the seemingly plateau, the two insets illustrate that the maximally achieved PLQY is continuously enhanced. For instance, during iterations 8 to 11, the maximum PLQY for cyan emission escalates from 59% to 94%, and the maximum PLQY for purple emission rises from 52% to 71%. Impressively, our MOO approach successfully fulfilled both objectives within only 20 iterations (i.e., 40 guided experiments).

a MOOs unified objective utility versus design iterations. b Color explored with new synthesized experimental conditions. Value ranges of colors defined by PL wavelength: purple (PL<420nm), blue (420nm PL<460nm), cyan (460nm PL<490nm), green (490nm PL<520nm), yellow (520nm PL<550nm), orange (550nm PL<610nm), and red (610nm PL). It shows that while high PLQY has been achieved for red, orange, and blue in the initial dataset, the MOO strategy purposefully enhances PLQYs for yellow, purple, cyan, green respectively in subsequent synthesized conditions in a group of five. c MSE between the predicted and real target properties. d Covariance matrix for correlation among the 8 synthesis parameters (i.e., reaction temperature T, reaction time t, type of catalyst C, volume/mass of catalyst VC, type of solution S, volume of solution VS, ramp rate Rr, and mass of precursor Mp) and 2 target properties, i.e., PLQY and PL wavelength (PL ). e Two-dimensional t-distributed stochastic neighbor embedding (t-SNE) plot for the whole search space, including unexplored (circular points), training (star-shaped points), and explored (square points) conditions, where the latter two sets are colored by real PL wavelengths.

Figure2b reveals that the MOO strategy systematically explores the synthesis conditions for each color, addressing those that have not yet achieved the designed PLQY threshold, starting with yellow in the first 5 iterations and ending with green in the last 5 iterations. Notably, within each quintet of 5 iterations, a singular color demonstrates an enhancement in its maximum PLQY. Initially, the PLQY for yellow surges to 65%, which is then followed by a significant rise in purples maximum PLQY from 44% to 71% during the next set of 5 iterations. This trend continues with cyan and green, where the maximum PLQY escalates to 94% and 83% respectively. Taking into account both the training set (i.e., the first 23 samples) and the augmented dataset, the peak PLQY for all colors exceeds 60%. Several colors approach 70% (including purple, blue, and red), and some are near 100% (including cyan, green, and orange). This further underscores the effectiveness of our proposed ML technique. A more detailed visualization of the PL wavelength and PLQY along each iteration is provided in Supplementary Fig.5.

The MOO strategy ranks candidate synthesis conditions based on ML prediction; thus, it is vital to evaluate the ML models performance. Mean squared error (MSE) is employed as the evaluation metric, commonly used for regression, which is computed based on the predicted PL wavelength and PLQY from the ML models versus the experimentally determined values45. As shown in Fig.2c, the MSE of PLQY drastically decreases from 0.45 to approximately 0.15 within just four iterations a notable error reduction of 64.5%. The MSE eventually stabilizes around 0.1 as the iterative loops progress. Meanwhile, the MSE of PL wavelength remains consistently low, always under 0.1. MSE of PL wavelength is computed after normalizing all values to the range of zero to one for a fair comparison, thus MSE of 0.1 signifies a favorable deviation within 10% between the ML-predicted values and the experimental verifications. This indicates that the accuracies of our ML models for both PL wavelength and PLQY consistently improve, with predictions closely aligning with actual values after enhanced learning from augmented data. This not only demonstrates the efficacy of our MOO strategy in optimizing multiple desired properties but also in refining ML models.

To unveil the correlation between synthesis parameters and target properties, we further calculated the covariance matrix. As illustrated in Fig.2d, the eight synthesis parameters generally exhibit low correlation among each other, indicating that each parameter contributes unique and complementary information for the optimization of the CQDs synthesis conditions. In terms of the impact of these synthesis parameters on target properties, factors such as reaction time and temperature are found to influence both PL wavelength and PLQY. This underscores the importance for both experimentalists and data-driven methods to adjust them with higher precision. Besides reaction time and temperature, PL wavelength and PLQY are determined by distinct sets of synthesis parameters with varying relations. For instance, the type of solution affects PLQY with a negative correlation, while solution volume has a stronger positive correlation with PLQY. This reiterates that, given the high-dimensional search space, the complex interplay between synthesis parameters and multiple target properties can hardly be unfolded without capable ML-integrated methods.

To visualize how the MOO strategy has navigated in the expansive search space (~20 million) using only 63 data samples, we have compressed the initial training, explored, and unexplored space into two dimensions by projecting them into a new reduced embedding space using t-distributed stochastic neighbor embedding (t-SNE)46. As shown in Fig.2e, discerning distinct clustering patterns by color proves challenging, which emphasizes the intricate task of uncovering the relationship between synthesis conditions and target properties. This complexity further underscores the critical role of a ML-driven approach in deciphering the hidden intricacies within the data. The efficacy of ML models is premised on the quality of training data. Thus, selecting training data that span as large search space as possible is particularly advantageous to models generalizability37. As observed in Fig.2e, our developed ML models benefit from the randomly and sparsely distributed training data, which in turn encourage the models to further generalize to previously unseen areas in the search space, and effectively guide the searching of optimal synthesis conditions within this intricate multi-objective optimization landscape.

With the aid of ML-coupled MOO strategy, we have successfully and rapidly identified the optimal conditions giving rise to full-color CQDs with high PLQY. The ML-recommended synthesis conditions that produced the highest PLQY of each color are detailed in the Methods section. Ten CQDs with the best optical performance were selected for in-depth spectral investigation. The resulting absorption spectra of the CQDs manifest strong excitonic absorption bands, and the normalized PL spectra of the CQDs displayed PL peaks ranging from 410nm of purple CQDs (p-CQDs) to 645nm of red CQDs (r-CQDs), as shown in Fig.3a and Supplementary Fig.6. This encompasses a diverse array of CQD types, including p-CQDs, blue CQDs (b-CQDs, 420nm), cyan CQDs (c-CQDs, 470nm), darkcyan CQDs (dc-CQDs, 485nm), green CQDs (g-CQDs, 490nm), yellow-green CQDs (yg-CQDs, 530nm), yellow CQDs (y-CQDs, 540nm), orange CQDs (o-CQDs, 575nm), orange red CQDs (or-CQDs, 605nm), and r-CQDs. Importantly, PLQY of most of these CQDs were above 60% (Supplementary Table3), exceeding the majority of CQDs reported to date (Supplementary Table4). Corresponding photographs of full-color fluorescence ranging from purple to red light under UV light irradiation are provided in Fig.3b. Excellent excitation-independent behaviors of the CQDs have been further revealed by the three-dimensional fluorescence spectra (Supplementary Fig.7). Furthermore, a comprehensive investigation of the time-resolved PL spectra revealed a notable trend. The monoexponential lifetimes of CQDs progressively decreased from 8.6ns (p-CQDs) to 2.3ns (r-CQDs) (Supplementary Fig.8). This observation signified that the lifetimes of CQDs diminished as their PL wavelength experiences a shift towards the red end of the spectrum47. Moreover, the CQDs also demonstrate long-term photostability (>12hours), rendering them potential candidates for applications in optoelectronic devices that require stable performance over extended periods of time (Supplementary Fig.9). All the results together demonstrate the high quality and great potential of our synthesized CQDs.

a Normalized PL spectra of CQDs. b Photographs of CQDs under 365 nm-UV light irradiation. c Dependence of the HOMO and LUMO energy levels of CQDs.

To gain further insights into the properties of the synthesized CQDs, we calculated their bandgap energies using the experimentally obtained absorption band values (Supplementary Fig.10 and Table5). It is revealed that the calculated bandgap energies gradually decrease from 3.02 to 1.91eV from p-CQDs to r-CQDs. In addition, we measured the highest occupied molecular orbital (HOMO) energy levels of the CQDs using ultraviolet photoelectron spectroscopy. As shown in the energy diagram in Fig.3c, the HOMO values exhibit wave-like variations without any discernible pattern. This result further suggests the robust predictive and optimizing capability of our ML-integrated MOO strategy, which enabled the successful screening of these high-quality CQDs from vast and complex search space using only 40 sets of experiments.

To uncover the underlying mechanism of the tuneable optical effect of the synthesized CQDs, we have carried out a series of characterizations to comprehensively investigate their morphologies and structures (see Methods). X-ray diffraction (XRD) patterns with a single graphite peak at 26.5 indicate a high-degree graphitization in all CQDs (Supplementary Fig.11)15. Raman spectra exhibit a stronger signal intensity for the ordered G band at 1585cm1 compared to the disordered D band at 1397cm1, further confirming the high-degree graphitization (Supplementary Fig.12)48. Fourier-transform infrared (FT-IR) spectroscopy was then performed to detect the functional groups in CQDs, which clearly reveals the NH2 and NC stretching at 3234 and 1457cm1, respectively, indicating the presence of abundant NH2 groups on the surface of CQDs, except for orange CQDs (o-CQDs) and yellow CQDs (y-CQDs) (Supplementary Fig.13)49. The C=C aromatic ring stretching at 1510cm1 confirms the carbon skeleton, while three oxide-related peaks, i.e., OH, C=O, and CO stretching, were observed at 3480, 1580, and 1240cm1, respectively, due to abundant hydroxyl groups of the precursor. The FT-IR spectrum also shows a stretching vibration band SO3 at 1025cm1, confirming the additional functionalization of y-CQDs by SO3H groups.

X-ray photoelectron spectroscopy (XPS) was adopted to further probe the functional groups in CQDs (Supplementary Fig.14 to 23). XPS survey spectra analysis reveals three main elements in CQDs, i.e., C, O, and N, except o-CQDs and y-CQDs. Specifically, o-CQDs and y-CQDs lack the N element and y-CQDs contains S element. The high-resolution C1s spectrum of CQDs can be deconvoluted into three peaks, including a dominant CC/C=C graphitic carbon bond (284.8eV), CO/CN (286eV), and carboxylic C=O (288eV), revealing the structures of CQDs. The N1s peak at 399.7eV indicates the presence of NC bonds, verifying the successful N-doping in the basal plane network structure of CQDs, except o-CQDs and y-CQDs. The separated peaks of O1s at 531.5 and 533eV indicate the two forms of oxyhydrogen functional groups with C=O and CO, respectively, consistent with the FT-IR spectra50. The S2p band of y-CQDs can be decomposed into two peaks at 163.5 and 167.4eV, representing SO3/2P3/2 and SO3/2P1/2, respectively47,51. Combining the results of structure characterization, the excellent fluorescence properties of the CQDs are attributed to the presence of N-doping, which reduces non-radiative sites of CQDs and promotes the formation of C=O bonds. The C=O bonds play a crucial role in radiation recombination and can increase the PLQY of the CQDs.

To gain deeper insights into the morphology and microstructures of the CQDs, we have then conducted transmission electron microscopy (TEM). The TEM images demonstrate uniformly shaped and monodisperse nanodots, with the gradual increase of average lateral sizes ranging from 1.85nm for p-CQDs to 2.3nm for r-CQDs (Fig.4a and Supplementary Fig.24), which agrees with the corresponding PL wavelength, providing further evidence for the quantum size effect of CQDs (Fig.4a)47. High-resolution TEM images further reveal the highly crystalline structures of CQDs with well-resolved lattice fringes (Fig.4b-c). The measured crystal plane spacing of 0.21nm corresponds to the (100) graphite plane, further corroborating the XRD data. Our analysis suggests that the synthesized CQDs possess a graphene-like high-crystallinity characteristic, thereby giving rise to their superior fluorescence performance.

a The lateral size and color of full-color fluorescent CQDs (inset: dependence of the PL wavelength and the lateral size of full-color fluorescent CQDs). Data correspond to meanstandard deviation, n=3. b, c High-resolution TEM images and the fast Fourier transform patterns of p-, b-, c-, g-, y-, o- and r-CQDs, respectively. d Boxplots of PL wavelength (left)/PLQY (right) and 7 synthesis parameters of CQDs. VC is excluded here as its value range is dependent on C, whose relationships with other parameters are not directly interpretable. The labels at the bottom indicate the minimum value (inclusive) for the respective bins, whereas the bins on the left are the same as the discretization of colors in Supplementary Table2, the bins on the right are uniform. Each box spans vertically from the 25th percentile to the 75th percentile, with the horizontal line marking the median and the triangle indicating the mean values. The upper and lower whiskers extend from the ends of the box to the minimum and maximum data values.

Following the effective utilization of ML in thoroughly exploring the entire search space, we proceeded to conduct a systematic examination of 63 samples using box plots, aiming to elucidate the complex interplay between various synthesis parameters and the resultant optical properties of CQDs. As depicted in Fig.4d, the synthesis under conditions of high reaction temperature, prolonged reaction time, and low-polarity solvents, tends to result in CQDs with a larger PL wavelength. These findings are consistent with the general observations in the literature, which suggest that the parameters identified above can enhance precursor molecular fusion and nucleation growth, thereby yielding CQDs with increased particle size and high PL wavelength47,52,53,54,55. Moreover, a comprehensive survey of existing literature implies that precursors and catalysts, typically including electron donation and acceptance, aid in producing long-wavelength CQDs56,57. Interestingly, diverging from traditional findings, we successfully synthesized long-wavelength red CQDs under ML guidance, with 2,7-naphthalenediol containing electron-donating groups as the precursor and EDA is known for its electron-donating functionalities as the catalyst. This significant breakthrough questions existing assumptions and offers new insights into the design of long-wavelength CQDs.

Concerning PLQY, we found that catalysts with stronger electron-donating groups (e.g., EDA) led to enhanced PLQY in CQDs, consistent with earlier observations made by our research team16. Remarkably, we uncovered the significant impact of synthesis parameters on CQDs PLQY. In the high PLQY regime, strong positive correlations were discovered between PLQY and reaction temperature, reaction time, and solvent polarity, previously unreported in the literature58,59,60,61. This insight could be applied to similar systems for PLQY improvement.

Aside from the parameters discussed above, other factors such as ramp rate, the amount of precursor, and solvent volume also influence the properties of CQDs. Overall, the emission color and PLQY of CQDs are governed by complex, non-linear trends resulting from the interaction of numerous factors. Its noteworthy to mention that the traditional methods used to adjust CQDs properties often result in a decrease in PLQY as the PL wavelength redshifts4,47,51,54. However, utilizing AI-assisted synthesis, we have successfully increased the PLQY of the resulting full-color CQDs to over 60%. This significant achievement highlights the unique advantages offered by ML-guided CQDs synthesis and confirms the powerful potential of ML-based methods in effectively navigating the complex relationships among diverse synthesis parameters and multiple target properties within a high-dimensional search space.

Predictive approach for liberation from acute dialysis in ICU patients using interpretable machine learning | Scientific … – Nature.com

Posted on June 12, 2024 by Danzig

We constructed a machine-learning algorithm to forecast the early likelihood of dialysis liberation in critically ill patients with AKI, incorporating both the variability and trends of dynamic parameters from routine data within the first 72h of dialysis. These models were developed, cross-validated, and temporal tested, thus demonstrating good discrimination in predicting renal recovery at hospital discharge. Furthermore, we applied class weighting to address data imbalance, used LASSO to develop models with few variables, and predicted short time windows (the first 24 or 48h), all showing good discrimination. These results suggest the potential clinical utility of integration into EMR for clinical decision-support systems. Finally, using SHAP value and PDP, we identified critical features influencing the predictions of the model, with early vital signs and inputs/outputs domains being the vital drivers of the model. Explainable machine learning-based prediction for AKI-D recovery using existing EMR data hold potential for improving risk stratification and gaining insights into patient outcomes.

Nonrecovery of renal function after AKI-D is associated with increased morbidity and mortality and high health care cost25,26. Consistent with the findings of previous epidemiology studies1,26, the present study showed higher short-term (3-month) and long-term (1-year) post-discharge mortality rates for patients dependent on dialysis compared to those liberated from acute dialysis (Supplemental Figure S3). Early prediction of recovery from AKI-D in critically ill patients has significant implications regarding patient-centered care27. Currently, the prediction solely relies on clinical experience. The most commonly used dialysis cessation indicator is the increase in urinary output5. However, urinary outputs accuracy in predicting successful RRT discontinuation remains controversial, with reported AUROCs ranging from 0.63 to 0.91 and varying cut-off values5,28. Additionally, urinary output is typically employed as an indicator of renal recovery later upon RRT discontinuation rather than a marker in the early stages. Traditional functional biomarkers (serum/urine Cr or cystatin C) and novel biomarkers (kidney injury molecule-1, neutrophil gelatinase-associated lipocalin, osteopontin, tissue inhibitor of metalloprotease-2/insulin-like growth factor binding protein-7, proenkephalin A 119159, etc.) have been explored as predictors of AKI-D recovery5,8,11,27,29. Current biomarkers for renal function recovery after AKI-D, which require additional samples and have limited conclusive evidence, have not been used widely to identify patients with a high probability of renal recovery in the early stages. The urgent need for precision guide to liberate from RRT was also recognized in the recent Acute Dialysis Quality Initiative (ADQI) consensus conference30. The experts emphasized on the integration of big data analysis and single case EMR evaluation to allow personalized RRT for every single individual. To address this gap, there is a clinical unmet need to integrate EMR to assess their predictive value for RRT discontinuation and prognosis in AKI-D.

Machine-learning models developed in critical nephrology can harness the data collected in EMR for important renal outcome predictions13,31,32. As data accumulates, these models will also offer the additional advantage of early prediction or enhanced accuracy. However, validated machine-learning models for predicting acute dialysis discontinuation in critical setting are less studied. To our knowledge, one prior research has employed a machine-learning approach to predict freedom from RRT in patients with AKI-D. Pattharanitima et al. utilized the Medical Information Mart for Intensive Care (MIMIC-III) database to predict RRT-free survival in critically ill patients with AKI requiring continuous renal replacement therapy (CRRT)16. Out of 684 patients, 30% had stopped from RRT successfully. Models using 81 features extracted between hospital admission and CRRT initiation yielded AUROC values ranging 0.430.7. In our study including 1,381 AKI-D individuals, we used 90 variables from the initial three days post-dialysis, including all vital signs and inputoutput records. Thus, variability and trends across multiple time points of these data were incorporated into our models. The prediction models in our study exhibited good performance, with AUROC of 0.770.81 in the development cohort and 0.820.85 in the temporal testing cohort. Aside from candidate predictors, the differences in model performance between our study and the prior study may also be due to differences in the study populations character, number of participants, and different feature window. The first three days are considered as acute phase of ICU patients, as exemplified by septic shock, where shock reversal often occurs within the first 3days33. Importantly, providing additional prognostic information after initial intensive treatment period can aid in subsequent medical decisions, including the consideration of clinical trials for high-risk groups, or the potential withdrawal of life-sustaining medical care. Moreover, we trained models at 24 and 48h in addition to the 72h model, both maintaining good predictive performance.

As shown in Table 3 and Table S6, a proposed threshold of 0.5 for predicting renal recovery provided good specificity, whereas a threshold of 0.3 enhanced sensitivity. Decision curve analysis revealed the net benefits of using these models in clinical decision-making by considering the trade-offs between sensitivity and specificity at various threshold probabilities. The models use would yield more benefit than harm at both threshold of 0.5 and 0.3. Consequently, a lower threshold, such as 0.3, allowed for the identification of a broader subset of patients likely to recover renal function. Meanwhile, a threshold of 0.5 resulted in fewer false positives and would reduce alarm fatigue, a major concern in ICU alarm systems34,35. Therefore, the selection of a threshold in practical applications should be based on whether a healthcare provider requires assistance in accurately identifying patients who can recover or cannot recover after AKI-D, while effectively managing resources.

Using the interpretable machine-learning algorithm, nonsurprisingly, the single most influential variables in renal function recovery after initiating dialysis for AKI was urine output. Patients who successfully liberated from RRT demonstrated significantly higher urine output. According to PDP (Fig.4), patients with urine volume>1570ml over the 72h period post-dialysis were more likely to achieve dialysis independence at discharge. Figure4 demonstrates the PDP of top predictors by SHAP value and the cut-off value in favor of renal recovery. In our study, the top 20 variables include previously well-studies factors for renal recovery, including urine volume or BUN5,36,37, along with less-explored variables (ex: enteral diet intake within the first 3days after dialysis). Moreover, we categorized the top 20 variables identified by the XGBoost model by clinical domains, including comorbidity, vital signs, laboratory data, Inputs/outputs domain, and others. Besides urine volume, most of the early predictors were related to the vital signs domain (Fig.3B). A general consensus exists that hemodynamic instability caused by excessive fluid removal during dialysis hinders renal recovery38. However, traditional prediction models for renal recovery have often overlooked vital signs due to their complexity and dynamic nature. Bellomo et al. conducted a retrospective study of critically ill patients with shock and found that higher levels of relative hypotension during the first few hours of vasopressor support were significantly associated with an increased risk of adverse kidney-related outcomes39. In line with the current evidence, our data suggests that early vital signs, not only the variance of systolic BP, but also SpO2, trend of respiratory rate, were significantly associated with the renal prognosis of critically ill patients. Additionally, the use of LASSO model with more limited variables and the incorporation of routinely collected laboratory data offer a practical means of rapid integration into EMR (Supplemental Table S8). An illustrate the interpretability of the models and the evolving of the key features over time using two separate individuals is presented in Supplemental Figure S4. Altogether, explainable machine-learning models can be deconvoluted to unveil new insights of how ICU patient features at the early stage interact with patient future events.

Strengths of our study include its size, comprising 1,381 patients with AKI-D among 26,593 ICU admissions. We also have complete data on vital signs and inputs/outputs with very low missing rates (<1%). Furthermore, we linked the NHIRD cause-of-death data to mitigate withdrawal bias risk. This is especially important in ICU studies, as 40%60% of critically ill patients with AKI-D have their treatment discontinued due to life support withdrawal or death4.

Our study has several limitations. First, recovery status was determined at hospital discharge; however, we recognized that dialysis liberation may proceed further. Nevertheless, the median hospital stay of critically ill patients with AKI-D was long (25.0days for the entire cohort and 30.5days for the recovery group), with dialysis-dependent catastrophic illness certificates verified before discharge by nephrologist among AKI-D nonrecovers. Thus, the kidney prognosis is clinically relevant. Second, our models made one-time early predictions of renal recovery on the basis of data obtained within 3days of dialysis initiation. Events after that may drive the patient outcome away from the predictions. Though, we developed additional models at various time horizons (1 and 2days), a continuously updating prediction is more appropriate for such cases. Third, limitations of our retrospective database include lack of other predictors of interest, including the degree of urine proteinuria, timed creatinine clearance, and novel kidney biomarkers, which may influence renal and patient recovery. Last, we used temporal testing, which is considered as an in-between validation of internal and external validation40. Although the recovery and mortality rates in this cohort was comparable to those reported in the literature4,26, the results should be further validated in other settings.

See the original post:
Predictive approach for liberation from acute dialysis in ICU patients using interpretable machine learning | Scientific ... - Nature.com

5 Key Ways AI and ML Can Transform Retail Business Operations – InformationWeek

Posted on June 12, 2024 by Danzig

Odds are youve heard more about artificial intelligence and machine learning in the last two years than you had in the previous 20. Thats because advances in the technology have been exponential, and many of the worlds largest brands, from Walmart and Amazon to eBay and Alibaba, are leveraging AI to generate content, power recommendation engines, and much more.

Investment in this technology is substantial, with exponential growth projected -- the AI in retail market was valued at $7.14 billion in 2023, with the potential to reach $85 billion by 2032.

Brands of all sizes are eyeing this technology to see how it fits into their retail strategies. Lets take a look at some of the impactful ways AI and ML can be leveraged to drive business growth.

One of the major hurdles for retailers -- particularly those with large numbers of SKUs -- is creating compelling, accurate product descriptions for every new product added to their assortment. When you factor in the ever-increasing number of platforms on which a product can be sold, from third-party vendors like Amazon to social selling sites to a brands own website, populating that amount of content can be unsustainable.

One of the areas in which generative AI excels is creating compelling product copy at scale. Natural language generation (NLG) algorithms can analyze vast amounts of product data and create compelling, tailored descriptions automatically. This copy can also be adapted to each channel, fitting specific parameters and messaging towards focused audiences. For example, generative AI engines understand the word count restrictions for a particular social channel. They can focus copy to those specifications, tailored to the demographic data of the person who will encounter that message. This level of personalization at scale is astonishing.

Related:Is an AI Bubble Inevitable?

This use of AI has the potential to help brands achieve business objectives through product discoverability and conversion by creating compelling content optimized for search.

Another area in which AI and ML excel is in the cataloging and organizing of data. Again, when brands deal with product catalogs with hundreds of thousands of SKUs spread across many channels, it is increasingly difficult to maintain consistency and clarity of information. Product, inventory, and eCommerce managers spend countless hours attempting to keep all product information straight and up-to-date, and they still make mistakes.

Related:Is Innovation Outpacing Responsible AI?

Brands can leverage AI to automate tasks such as product categorization, attribute extraction, and metadata tagging, ensuring accuracy and scalability in data management across all channels. This use of AI takes the guesswork and labor out of meticulous tasks and can have wide-ranging business implications. More accurate product information means a reduction in returns and improved product searchability and discoverability through intuitive data architecture.

As online shopping has evolved over the past decade, consumer expectations have shifted. Customers rarely go to company websites and browse endless product pages to discover the product theyre looking for. Rather, customers expect a curated and personalized experience, regardless of the channel through which theyre encountering the brand. A report from McKinsey showed that 71% of customers expect personalization from a brand, and 76% get frustrated when they dont encounter it.

Brands have been offering personalized experiences for decades, but AI and ML unlock entirely new avenues for personalization. Once again, AI enables an unprecedented level of scale and nuance in personalized customer interactions. By analyzing vast amounts of customer data, AI algorithms can connect the dots between customer order history, preferences, location and other identifying user data and create tailored product recommendations, marketing messages, shopping experiences, and more.

Related:Overcoming AIs 5 Biggest Roadblocks

This focus on personalization is key for business strategy and hitting benchmarks. Personalization efforts lead to increases in conversion, higher customer engagement and satisfaction, and better brand experiences, which can lead to long-term loyalty and customer advocacy.

Search functionalities are in a constant state of evolution, and the integration of AI and ML is that next leap. AI-powered search algorithms are better able to process natural language, enabling a brand to understand user intent and context, which improves search accuracy and relevance.

Whats more, AI-driven search can provide valuable insights into customer behavior and preferences, enabling brands to optimize product offerings and marketing strategies. By analyzing search patterns and user interactions, brands can identify emerging trends, optimize product placement, and tailor promotions to specific customer segments. Ultimately, this enhanced search experience improves customer engagement while driving sales growth and fostering long-term customer relationships.

At its core, the main benefit of AI and ML tools is that theyre always working and never burn out. This fact is felt strongest when applied to customer support. Tools like chatbots and virtual assistants enable brands to provide instant, personalized assistance around the clock and around the world. This automation reduces wait times, improves response efficiency, and frees staff to focus on higher-level tasks.

Much like personalization engines used in sales, AI-powered customer support tools can process vast amounts of customer data to tailor responses based on a customers order history and preferences. Also, like personalization, these tools can be deployed to radically reduce the amount of time customer support teams spend on low-level inquiries like checking order status or processing returns. Leveraging AI in support allows a brand to allocate resources in more impactful ways without sacrificing customer satisfaction.

Brands are just scratching the surface of the capabilities of AI and ML. Still, early indicators show that this technology can have a profound impact on driving business growth. Embracing AI can put brands in a position to transform operational efficiency while maintaining customer satisfaction.

See the original post here:
5 Key Ways AI and ML Can Transform Retail Business Operations - InformationWeek

Exploring the relationship between heavy metals and diabetic retinopathy: a machine learning modeling approach … – Nature.com

Posted on June 12, 2024 by Danzig

Characteristics of the study population

Table 1 demonstrated the general characteristics of the participants with and without diabetic retinopathy in this study. A total of 1,042 American adults were included, of whom 212 were diagnosed with DR and 830 with non-DR. The mean age of the total population was about 62years. Women comprised 53% of the included people, slightly more than men (47%). Non-Hispanic whites (40%), married (61%), those with high school or above levels of education (60%), never smoking (52%), former alcohol drinking (34%), self-reported history of hypertension (75%) and hyperlipidemia (88%), and those with PIR1.00 (78%), respectively, accounted for the largest proportions of the total population. Smoking status, urinary creatinine level (-87.5 vs 103) and mean concentrations of heavy metal, such as Sb (-7.30 vs -7.42), Tl (-6.71 vs -6.56) and Pt (-9.41 vs -9.58) differed significantly between DR and non-DR patients.

Figure1 showed the correlation between each of the 13 heavy metals and the baseline characteristics of the population. The results indicated that most of the metals are correlated with each other to varying degrees, with a relatively strong correlation between TI and Cs (r=0.54) and the similar relationship between Co and Ba (r=0.44). Curves based on correlations of heavy metals with baseline population characteristics and other details are shown in Fig. S1. We also assessed multicollinearity between all selected metals and covariates using variance inflation factors (VIFs), which showed that there was no multicollinearity.

The results of Pearson's correlation analysis among the metal factors and baseline variables.

Figure2 showed the efficacy of the 11 machine learning models included to predict DR risk based on the testing set, and the results of the training set are shown in Fig. S2, presented as ROC analysis curves. The AUC value of the KNN model is 1.000, of the GBM model is 0.991, of the RF model is 0.988, of the C5.0 model is 0.987, of the NN model is 0.966, of the XGBoost model is 0.961, of the SVM model is 0.939, of the MLP model is 0.911, of the NB model is 0.831, of the GP model is 0.800, of the LR model is 0.622. Tables S1 and Tables 2 provide the perfomance indicators of the 11 models used in this study in the training set and validation set, respectively, and show the confusion matrices used by 11 machine learning algorithms. The results show that among these machine learning models, the KNN model exhibits the best prediction performance. As a result, the prediction model based on the KNN model was finally selected for subsequent analyses.

The ROC of the 11 machine learning models in testing set.

PFI analysis provided insights into the relative importance of all variables in the KNN model. We used the IML method to assess the contribution weights of heavy metal exposure (Ba, Cd, Co, Cs, Pb, Sb, etc.) and people's baseline characteristics (age, sex, BMI, education level, ethnicity, smoking and drinking status, etc.) in the prediction model, which is presented in Fig.3A. The results of the analyses showed that the first five variables (Sb, Ba , Pt, Ur, As) were relatively more important variables in the prediction model. Among them, Sb level contributed a weight of 1.7306321.791722 in predicting DR risk, which was significantly higher than all other included variables. The critical variables only compared to Sb level also include Ba, Pt, Ur, As, which are also relatively sensitive metals in predicting the development of DR. The contribution weights of Ba, Pt, Ur, As were 1.5604741.602271, 1.5660631.633790, 1.5113661.540538, 1.4563521.496473 respectively. It is worth noting that the contribution weights of demographic characteristics and lifestyle-related variables in the prediction of DR risk in our results were lower compared to heavy metal exposure, and all baseline characteristics variables except age were weaker than heavy metal exposure.

The contribution of metal factors and baseline variables in predictive model. (A) The forest map based on PFI analysis displays the corresponding contribution weights of heavy metals and baseline variables and their corresponding standard deviations; (B) The SHAP summary plot of all variables and DR risk. The width of the range of the horizontal bars can be interpreted as the effect on the model predictions, with the wider the range, the greater the effect. The direction on the x-axis represented the likelihood of developing DR (right) or not developing (left); (C) The SHAP features importance plot of heavy metals and DR risk. The magnitude of the effect of each feature on the model output was measured by the average of the absolute values of the SHAP values for all samples, ranked from top to bottom by their magnitude of effect; D) The SHAP summary plot of heavy metals and DR risk.

Furthermore, we further validated the relationship of each variable with the predicted DR risk by the SHAP method after screening the KNN model. The SHAP summary plot (Fig.3B) showed the overall effect of heavy metals and baseline variables on DR risk, and was ranked in descending order according to the importance of the feature. In this case, a positive SHAP value indicates that the value of the feature is positively associated with DR risk, and the larger the value, the greater the contribution. The results showed that the top five potentially critical factors influencing higher predicted DR risk were, in descending order, Sb, Pt, As, Tl, Ba. Moreover, Sb had higher contribution weight in the prediction model than any other heavy metal or baseline variable under two different analysis methods, which is in line with the results of the SHAP summary plot between heavy metals and predicted DR risk (Fig.3C,D).

The predictive performance of the selected KNN model was further explained by PDP analysis, and the relationships between six key heavy metals (Sb, Ba, Pt, As, Tl, Cd) and the predicted values of DR are shown in Fig.4, while the results for the remaining heavy metals are shown in Fig. S3. The results show that some of the heavy metals, including As, Co, Sb, and Tu, showed a significant trend of increasing predicted risk of DR with elevated levels of these heavy metals in the log-transformed interval of the relatively high concentrations. The predicted risk of DR was significantly increased when the log-transformed levels of some heavy metals, including As, Co, Sb, and Tu, were elevated at relatively high concentration, but there was no significant correlation between Pt and the predicted risk of DR at high concentration. However, there was no significant correlation between increasing or decreasing levels of Cs, Hg, and Pb and DR risk. These findings suggest that timely detection of key metal levels in vivo may play an essential role in predicting the development of DR.

Relationships between key metal including (A) Sb, (B) Ba, (C) Pt, (D) As, (E) Tl, (F) Cd and predictive DR risk. The x-axis of the plot represented the log-transformed values of each metal.

We performed the analysis of heavy metal exposure interaction properties by PDPs model. The results in Fig.5A show that the corresponding variables with overall interaction strength greater than 0.2 were Sb, age, Tu, Pt, As, Cd, and Ur, with Sb having the most significant interaction effect. The interaction performance of the baseline variables for the prediction of DR risk remained weaker than that of heavy metals. Therefore, we further performed the interaction analysis of Sb with other variables. Figure5B revealed that the interaction between Sb and age ranked the highest among all metal pairs, with overall interaction strength greater than 0.4. In addition to the strong synergistic effect of Sb with As, Tl, and Cs, ethnicity had an effect on the prediction of DR risk by Sb, with overall interaction strength greater than 0.2. The results suggest that monitoring Sb levels, especially in older populations, may be more critical in controlling the development of DR.

Interaction effects of variables on DR. (A) Interactions between heavy metals and baseline variables on DR; (B) Interactions between Sb levels and other variables on DR. The range of the straight line represents the overall interaction strength, the wider the range, the greater the effect.

More here:
Exploring the relationship between heavy metals and diabetic retinopathy: a machine learning modeling approach ... - Nature.com

TikTok moves toward ‘performance automation vision’ with latest machine learning ad tools – Digiday

Posted on May 25, 2024 by Danzig

TikToks latest machine learning ad solutions are proof that the platform wants to automate as much of its advertising as possible.

The product, dubbed Performance Automation, was announced at the platforms fourth annual TikTok World product summit today its first official summit since Biden signed the TikTok divest or sell bill last month, and subsequently the entertainment app took the U.S. government to court to appeal.

Its safe to say TikTok wants advertisers to believe its not entertaining the idea of being booted out of the U.S. anytime soon. If that wasnt already obvious during its NewFront earlier this month, this latest announcement makes it clearer that its business as usual for the platform right now. Or at least trying to make it as clear as possible that advertisers can park their contingency plans and keep spending on TikTok.

TikTok is actively working to keep marketers engaged and on the platform despite the legislative challenges, said Traci Asbury, social investment lead at Goodway Group. They [TikTok] have complete confidence in their upcoming legal appeals and are actively encouraging marketers to keep adopting best practices and usage of the platforms capabilities to make positive impacts on their businesses.

Well, you probably already know about TikToks Smart Performance Campaign, which was launched last year. The campaign uses semi-automation capabilities including auto-targeting, auto-bidding and auto-creative.

But Performance Automation, which is still in early testing, goes one step further, by automating more of the process, including the creative. With this campaign, advertisers input the necessary assets, budget and goals, and TikToks predictive AI and machine learning will select the best creative asset, to ensure the best campaign is put in front of the right customer at the right time. As a TikTok spokesperson confirmed, the platform is moving toward a performance automation vision and this latest product is the next step on that journey.

And thats not all. The platform has also launched a similar capability for its TikTok Shop, dubbed TikTok Shop Marketing Automation. Like Performance Automation, this works by automating bidding, budgeting, ad management and creative for TikTok Shop products. Since TikTok Shop is only available in select regions, this latest product is currently rolled out in South-East Asia, and in testing in the U.S.

Ohio-based health and wellness brand Triquetra Health is one of those early testers. According to Adolfo Fernandez, global product strategy and operations at TikTok, the brand already achieved 4x their return on investment in TikTok Shop within the first month of using this new automation product, and increased sales on the platform by 136%. He did not provide exact figures.

To be clear, Performance Automation and TikTok Shop Marketing Automation arent their official names. These are just temporary identities the platform is using until they roll out the products officially.

Still, all sounds familiar? Thats because it is. Performance Automation is similar to what the other tech giants have been doing for a while now, and what TikTok started to dabble in with its Smart Performance Campaign last year. Think Googles Performance Max, Metas Advantage+ and now even Amazons Performance+ they all play a similar role for their respective platforms. TikTok just joining the pack simply confirms that automation is the direction that advertising as an industry is heading.

In many ways, this was inevitable. Meta, Google et al have amassed billions of ad dollars over the years by making it as easy as possible for marketers to spend on their ads. From programmatic bidders to attribution tools, the platforms have tried to give marketers fewer reasons to spend elsewhere. Machine learning technologies that essentially oversee campaigns are the latest manifestation of this. Sooner or later TikTok was always going to make a move.

Still, there are concerns aplenty over how these technologies work they are, after all, the ultimate set it and forget it type of campaign. Marketers hand over the assets and data they want the platform to work with, and the technology takes it from there. Thats it. Marketers have no way of knowing whether these campaigns are doing what the platform says theyre doing because theyre unable to have them independently verified. It remains to be seen whether TikToks own effort will take a similar stance or break with tradition.

Speaking of measurement, TikTok is also launching unified lift a new product which measures TikTok campaign performance across the entire decision journey, using brand and conversion lift studies. KFC Germany has already tried it out and drove a 25% increase in brand recall and saw an 81% increase in app installs, according to Fernandez, without providing exact figures.

Among the other announcements were:

Well for now, nothing much has changed. Marketers have contingency plans in place, but thats just standard business practice. Beyond that, everything as far as TikTok goes is pretty much business as usual.

Colleen Fielder, group vp of social and partner marketing solutions at Basis Technologies said her team is not actively recommending any of their clients discontinue spending on TikTok. Theyre continuing to include the platform on proposals.

We knew TikTok was going to sue the U.S. government, and that may push this 9-12 month timeline even further back, which gives us a longer lead time to continue running on TikTok and / or identify alternative platforms as needed, she said.

For Markacy, its a similar state of play. We have a loose partnership with digital media company Attn, which is heavily invested in TikTok, said Tucker Matheson, co-CEO of the company. Theyre still getting big proposals for work, which is a positive sign.

See the article here:
TikTok moves toward 'performance automation vision' with latest machine learning ad tools - Digiday

iPadOS 18’s Smart Script uses machine learning to make your handwriting less horrible – Yahoo Movies Canada

Posted on June 12, 2024 by Danzig

Last month, Apple's tablets got a major revamp with the arrival of the M4 chip, two size options for the iPad Air, updates to the Magic Keyboard and a new iPad Pro packing a fancy Tandem OLED display. And now at WWDC 2024, Apple is looking to flesh out the iPad's software with the introduction of Apple Intelligence and a number of fresh features heading to iPadOS 18, which is due out sometime later this year.

To start, iPadOS is getting deeper customization options for your home screen including the ability to put app icons pretty much wherever you want. Apple's Control Center has also been expanded with support for creating multiple lists and views, resizing and rearranging icons and more. There's also a new floating tab bar that makes it easy to navigate between apps, which can be further tuned to remember your favorites. Next, SharePlay is getting the ability to draw diagrams on someone else's iPad or control someone else's device remotely (with permission) for times like when you need to help troubleshoot.

After years of requests, the iPad is also getting its own version of the Calculator app, which includes a new Math Notes feature that supports the Apple Pencil and the ability to input handwritten formulas. Math Notes will even update formulas in real time or you can save them in case you want to revisit things later. Alternatively, the Smart Script tool in the Notes app uses machine learning to make your notes less messy and easier to edit.

General privacy is also being upgraded with a new feature that lets you lock an app. This allows a friend or family member to borrow your device without giving them full access to everything on your tablet. Alternatively, theres also a new hidden apps folder so you can stash sensitive software in a more secretive way.

In Messages, Tapbacks are now compatible with all your emoji. Furthermore, you'll be able to schedule messages or send texts via satellite in case you aren't currently connected to Wi-Fi or a cellular network. Apple even says messages sent using satellite will feature end-to-end encryption.

The Mail and Photos apps are also getting similarly big revamps. Mail will feature new categorizations meant to make it easier to find specific types of offers or info (like plane flights). Meanwhile, the Photos app will sport an updated UI that will help you view specific types of images while hiding things like screenshots. And to better surface older photos and memories, there will be new categories like Recent Days and People and Pets to put similar types of pics all in a single collection.

Audio controls on iPads is also getting a boost with a new ability for Siri to understand gestures for Yes and No by either shaking or nodding your head while wearing AirPods. This should make it easier to provide Apple's digital assistant with simple responses in areas like a crowded bus or quiet waiting room where you might be uncomfortable talking aloud.

However, the biggest addition this year is that alongside all the iPad-specific features, Apples tablet OS is also getting Apple Intelligence. This covers many of the companys new AI-powered features like the ability to create summaries of websites, proofread or rewrite emails or even generate new art based on your prompts.

Apple says that to make its AI more useful, features will be more personalized and contextual. That said, to help protect your privacy and security, the company claims it wont build profiles or sell data to outside parties. Generally, Apple says it will use on-device processing for most of its tools, though some features require help from the cloud.

As its iconic digital assistant, Siri is getting a big refresh via Apple Intelligence too. This includes better natural language recognition and the ability to understand and remember context from one query to another. Siri will also be able to help you use your device, allowing you to ask your tablet how to perform certain tasks, search for files or control apps and features using your voice.

Some examples of what Apple Intelligence can do is highlight priority emails and put them at the top of your inbox so you don't miss important messages or events. Or if you're feeling more creative, you can use AI to create unique emoji (called Genmoji). And in photos, Apple Intelligence can help you edit images with things like the Clean Up tool. And for those who want the freedom to use other AI models, Apple is adding the option to integrate other services, the first of which will be Chat GPT.

Finally, other minor updates including a new Passwords app for stashing credentials across apps and websites, a new dedicated Game Mode with personalized spatial audio, expanded hiking results in Apple Maps and a new eye-tracking feature for improved accessibility.

Catch up here for all the news out of Apple's WWDC 2024.

Study uses AI and machine learning to accelerate protein engineering process – Dailyuw

Posted on May 25, 2024 by Danzig

In recent months, the process of protein design at UW has been revolutionized by the implementation of a machine learning computational approach. In a new paper published in the journal Nature Computational Science, the UW molecular design Berndt Lab reports its findings.

Machine learning, recently applied to the realm of protein engineering, has been effective in reducing the amount of time needed to design proteins that can efficiently perform a biochemical task. The current trial-and-error method of mutating an amino acid sequence can take anywhere from several months to upward of years of tedious analysis. However, with the recent use of machine learning at the Berndt Lab, the future of protein engineering appears promising.

The application of machine learning was used to analyze how mutations to GCaMP, a biosensor that tracks calcium in cells, would affect its behavior. Collaborators provided empirical knowledge of GCaMP, which was then combined with an AI algorithm that could predict the effects of the protein mutations. Well-developed proteins can provide valuable insight to disease and a patients response to treatment.

The machine learning model achieved the equivalent of several years worth of lab mutations in a single night, with a very high rate of success. Of the 17 mutations implemented in real biological cells, five or six were absolute successes. According to Andre Berndt, assistant professor in the department of bioengineering and senior author on the paper, out of 10 mutations you are typically lucky if just one provides a gain of function.

A lot of the mutations that were predicted to be better were indeed better at a much, much faster pace from a much larger pool of virtually tested mutations, Berndt said. So this was a very efficient process just based on the trained model.

Berndts team was comprised of graduate and undergraduate students who collaborated on the study. Lead author Sarah Wait, a Ph.D. candidate in molecular engineering,spearheaded the research by undertaking various roles such as testing mutation variants, engineering data, establishing the machine learning framework, and analyzing the results.

Computational programs can discover all of the really hard-to-observe patterns that, maybe, we wouldnt be able to observe ourselves, Wait said. It's just a really great tool to help us as the researcher[s] discover these really small patterns that may be hidden to us given the amount of data we have to look at in order to actually see them.

Reach contributing writer Ashley Ingalsbe at news@dailyuw.com X: @ashleyiing

Like what youre reading? Support high-quality student journalism by donating here.

View post:
Study uses AI and machine learning to accelerate protein engineering process - Dailyuw

Reinforcement learning AI might bring humanoid robots to the real world – Science News Magazine

Posted on May 25, 2024 by Danzig

ChatGPT and other AI tools are upending our digital lives, but our AI interactions are about to get physical. Humanoid robots trained with a particular type of AI to sense and react to their world could lend a hand in factories, space stations, nursing homes and beyond. Two recent papers in Science Robotics highlight how that type of AI called reinforcement learning could make such robots a reality.

Weve seen really wonderful progress in AI in the digital world with tools like GPT, says Ilija Radosavovic, a computer scientist at the University of California, Berkeley. But I think that AI in the physical world has the potential to be even more transformational.

The state-of-the-art software that controls the movements of bipedal bots often uses whats called model-based predictive control. Its led to very sophisticated systems, such as the parkour-performing Atlas robot from Boston Dynamics. But these robot brains require a fair amount of human expertise to program, and they dont adapt well to unfamiliar situations. Reinforcement learning, or RL, in which AI learns through trial and error to perform sequences of actions, may prove a better approach.

We wanted to see how far we can push reinforcement learning in real robots, says Tuomas Haarnoja, a computer scientist at Google DeepMind and coauthor of one of the Science Robotics papers. Haarnoja and colleagues chose to develop software for a 20-inch-tall toy robot called OP3, made by the company Robotis. The team not only wanted to teach OP3 to walk but also to play one-on-one soccer.

Soccer is a nice environment to study general reinforcement learning, says Guy Lever of Google DeepMind, a coauthor of the paper. It requires planning, agility, exploration, cooperation and competition.

The toy size of the robots allowed us to iterate fast, Haarnoja says, because larger robots are harder to operate and repair. And before deploying the machine learning software in the real robots which can break when they fall over the researchers trained it on virtual robots, a technique known as sim-to-real transfer.

Training of the virtual bots came in two stages. In the first stage, the team trained one AI using RL merely to get the virtual robot up from the ground, and another to score goals without falling over. As input, the AIs received data including the positions and movements of the robots joints and, from external cameras, the positions of everything else in the game. (In a recently posted preprint, the team created a version of the system that relies on the robots own vision.) The AIs had to output new joint positions. If they performed well, their internal parameters were updated to encourage more of the same behavior. In the second stage, the researchers trained an AI to imitate each of the first two AIs and to score against closely matched opponents (versions of itself).

To prepare the control software, called a controller, for the real-world robots, the researchers varied aspects of the simulation, including friction, sensor delays and body-mass distribution. They also rewarded the AI not just for scoring goals but also for other things, like minimizing knee torque to avoid injury.

Real robots tested with the RL control software walked nearly twice as fast, turned three times as quickly and took less than half the time to get up compared with robots using the scripted controller made by the manufacturer. But more advanced skills also emerged, like fluidly stringing together actions. It was really nice to see more complex motor skills being learned by robots, says Radosavovic, who was not a part of the research. And the controller learned not just single moves, but also the planning required to play the game, like knowing to stand in the way of an opponents shot.

In my eyes, the soccer paper is amazing, says Joonho Lee, a roboticist at ETH Zurich. Weve never seen such resilience from humanoids.

But what about human-sized humanoids? In the other recent paper, Radosavovic worked with colleagues to train a controller for a larger humanoid robot. This one, Digit from Agility Robotics, stands about five feet tall and has knees that bend backward like an ostrich. The teams approach was similar to Google DeepMinds. Both teams used computer brains known as neural networks, but Radosavovic used a specialized type called a transformer, the kind common in large language models like those powering ChatGPT.

Instead of taking in words and outputting more words, the model took in 16 observation-action pairs what the robot had sensed and done for the previous 16 snapshots of time, covering roughly a third of a second and output its next action. To make learning easier, it first learned based on observations of its actual joint positions and velocity, before using observations with added noise, a more realistic task. To further enable sim-to-real transfer, the researchers slightly randomized aspects of the virtual robots body and created a variety of virtual terrain, including slopes, trip-inducing cables and bubble wrap.

After training in the digital world, the controller operated a real robot for a full week of tests outside preventing the robot from falling over even a single time. And in the lab, the robot resisted external forces like having an inflatable exercise ball thrown at it. The controller also outperformed the non-machine-learning controller from the manufacturer, easily traversing an array of planks on the ground. And whereas the default controller got stuck attempting to climb a step, the RL one managed to figure it out, even though it hadnt seen steps during training.

Reinforcement learning for four-legged locomotion has become popular in the last few years, and these studies show the same techniques now working for two-legged robots. These papers are either at-par or have pushed beyond manually defined controllers a tipping point, says Pulkit Agrawal, a computer scientist at MIT. With the power of data, it will be possible to unlock many more capabilities in a relatively short period of time.

And the papers approaches are likely complementary. Future AI robots may need the robustness of Berkeleys system and the dexterity of Google DeepMinds. Real-world soccer incorporates both. According to Lever, soccer has been a grand challenge for robotics and AI for quite some time.

More here:
Reinforcement learning AI might bring humanoid robots to the real world - Science News Magazine

Futurist Transhuman News Blog

Category Archives: Machine Learning

Get started quickly with AWS Trainium and AWS Inferentia using AWS Neuron DLAMI and AWS Neuron DLC | Amazon … – AWS Blog

A machine learning-based approach for constructing remote photoplethysmogram signals from video cameras … – Nature.com

Arteris Selected by Esperanto Technologies to Integrate RISC-V Processors for High-Performance AI and Machine … – Design and Reuse

Build a FedRAMP compliant generative AI-powered chatbot using Amazon Aurora Machine Learning and Amazon … – AWS Blog

AI better predicts back surgery outcomes – Futurity: Research News

Developing a prognostic model using machine learning for disulfidptosis related lncRNA in lung adenocarcinoma … – Nature.com

Assessing calibration and bias of a deployed machine learning malnutrition prediction model within a large healthcare … – Nature.com

Want an AI Job? Check out these new AWS AI certifications – ZDNet

Enhancing customer retention in telecom industry with machine learning driven churn prediction | Scientific Reports – Nature.com

Apples AI leaders talk Siri, Apple Intelligence, and more – The Verge

AuthenticID Releases Deep Fake and Generative AI Detection Solution for Businesses – AiThority

Study on Climate Change Control with ML Named Best Paper – AZoCleantech

Machine learning-guided realization of full-color high-quantum-yield carbon quantum dots – Nature.com

Predictive approach for liberation from acute dialysis in ICU patients using interpretable machine learning | Scientific … – Nature.com

5 Key Ways AI and ML Can Transform Retail Business Operations – InformationWeek

Exploring the relationship between heavy metals and diabetic retinopathy: a machine learning modeling approach … – Nature.com

TikTok moves toward ‘performance automation vision’ with latest machine learning ad tools – Digiday

iPadOS 18’s Smart Script uses machine learning to make your handwriting less horrible – Yahoo Movies Canada

Study uses AI and machine learning to accelerate protein engineering process – Dailyuw

Reinforcement learning AI might bring humanoid robots to the real world – Science News Magazine