Machine and deep learning are a MUST at the North-West… – Daily Maverick

The last century alone has seen a meteoric increase in the accumulation of data and we are able to store unfathomable quantities of information to help us solve problems known and unknown. At some point the ability to optimally utilise these vast amounts of data will be beyond our reach, but not beyond that of the tools we have made. At the North-West University (NWU), Professor Marelie Davel, director of the research group MUST Deep Learning, and her team are ensuring that our ever-growing data repositories will continue to benefit society.

The team's focus on machine learning and, specifically, deep learning, produces what looks like magic to the untrained eye. Here is why.

"Machine learning is a catch-all term for systems that learn in an automated way from their environment. These systems are not programmed with the steps to solve a specific task, but they are programmed to know how to learn from data. In the process, the system uncovers the underlying patterns in the data and comes up with its own steps to solve the specific task," explains Professor Davel.

According to her, machine learning is becoming increasingly important as more and more practical tasks are being solved by machine learning systems: From weather prediction to drug discovery to self-driving cars. Behind the scenes we see that many of the institutions we interact with, like banks, supermarket chains and hospitals, all nowadays incorporate machine learning in aspects of their business. Machine learning makes everyday tools from internet searches to every smartphone photo we take work better.

The NWU and MUST go a step beyond this by doing research on deep learning. This is a field of machine learning that was originally inspired by the idea of artificial neural networks, which were simple models of how neurons were thought to interact in the human brain. This was conceived in the early forties! Modern networks have come a long way since then, with increasingly complex architectures creating large, layered models that are particularly effective at solving human-like tasks, such as processing speech and language, or identifying what is happening in images.

She explains that, although these models are very well utilised, there are still surprisingly many open questions about how they work and when they fail.

We work on some of these open questions, specifically on how the networks perform when they are presented with novel situations that did not form part of their training environment. We are also studying the reasons behind the decisions the networks make. This is important in order to determine whether the steps these models use to solve tasks are indeed fair and unbiased, and sometimes it can help to uncover new knowledge about the world around us. An example is identifying new ways to diagnose and understand a disease.

The uses of this technology are nearly boundless and will continue to grow, and that is why Professor Davel encourages up-and-coming researchers to consider focusing their expertise in this field.

By looking inside these tools, we aim to be better users of the tools as well. We typically apply the tools with industry partners, rather than on our own. Speech processing for call centres, traffic prediction, art authentication, space weather prediction, even airfoil design. We have worked in quite diverse fields, but all applications build on the availability of large, complex data sets that we then carefully model. This is a very fast-moving field internationally. There really is a digital revolution that is sweeping across every industry one can think of, and machine learning is a critical part of it. The combination of practical importance and technical challenge makes this an extremely satisfying field to work in.

She confesses that, while some of the ideas of MUST's collaborators may sound far-fetched at first, the team has repeatedly found that if the data is there, it is possible to build a tool to use it.

One can envision a future where human tasks such as speech recognition and interaction have been so well mimicked by these machines, that they are indistinguishable from their human counterparts. The famed science fiction writer Arthur C Clarke once remarked that any sufficiently advanced technology is indistinguishable from magic. At the NWU, MUST is doing their part in bringing this magic to life. DM

Author: Bertie Jacobs


Senior Lecturer / Associate Professor in Fairness in Machine Learning and AI Planning job with UNIVERSITY OF MELBOURNE | 307051 – Times Higher…

Location: Parkville
Role type: Full time; Continuing
Faculty: Faculty of Engineering and Information Technology
Department/School: School of Computing and Information Systems
Salary: Level C $135,032 to $155,698 or Level D $162,590 to $179,123 p.a. plus 17% super

The University of Melbourne would like to acknowledge and pay respect to the Traditional Owners of the lands upon which our campuses are situated, the Wurundjeri and Boon Wurrung Peoples, the Yorta Yorta Nation, the Dja Dja Wurrung People. We acknowledge that the land on which we meet and learn was the place of age-old ceremonies, of celebration, initiation and renewal, and that the local Aboriginal Peoples have had and continue to have a unique role in the life of these lands.

About the School of Computing and Information Systems (CIS)

We are international research leaders with a focus on delivering impact and making a real difference in three key areas: data and knowledge, platforms and systems, and people and organisations.

At the School of Computing and Information Systems, you'll find curious people, big problems, and plenty of chances to create a real difference in the world.

To find out more about CIS, visit: http://www.cis.unimelb.edu.au/

About the Role

The Faculty of Engineering and Information Technology (FEIT) is seeking an aspiring academic leader with expertise in algorithms and their fairness in machine learning and/or AI (artificial intelligence) planning, or related fields, for a substantive position within the School of Computing and Information Systems (CIS).

You will join a world-class computer science research group, which has strong links to the Centre for AI & Digital Ethics (CAIDE), and will be expected to collaborate with both, alongside other internationally respected groups across artificial intelligence, human-computer interaction and information systems.

You are highly ambitious and eager to demonstrate world-leading research through publications in key conferences (typified by, but not limited to, FAccT, The Web Conference, KDD, NeurIPS, ICAPS, AAAI, IJCAI, ITCS, EC, CHI, CSCW) and in high-quality journals (typified by, but not limited to, ACM TKDD, AIJ, ACM Transactions on Economics and Computation, Proceedings of the National Academy of Sciences, Big Data and Society, AI and Society, AI and Ethics, TCS). You will make a valuable contribution to the School and broader academic community through mentorship, contributions to teaching in various Masters programs related to algorithms, theory, digital ethics and related areas, and provide critical leadership in engagement activities, including securing grant funding to support your program of research.

This is an exciting opportunity to further develop your academic and leadership profile and be supported to achieve your goals across all pillars of an academic career.

Responsibilities include:

About You

You are an aspiring leader with the ability to build a highly respected reputation in Machine Learning and/or AI Planning, as demonstrated through a significant track record of publications in high-impact peer-reviewed and refereed venues, and invitations to speak at national and international meetings. You are experienced in mentoring students, colleagues and research teams, and demonstrate great initiative in establishing and nurturing research projects. Your highly developed communication and relationship-building skills enable you to engage with a diverse range of people and institutions to develop partnerships that positively contribute to strategic initiatives.

You will also have:

For full details of responsibilities and selection criteria, including criteria for a Level D appointment, please refer to the attached position description.

To ensure the University continues to provide a safe environment for everyone, this position requires the incumbent to hold a current and valid Working with Children Check.

About - The Faculty of Engineering and Information Technology (FEIT)

The Faculty of Engineering and Information Technology (FEIT) has been the leading Australian provider of engineering and IT education and research for over 150 years. We are a multidisciplinary School organised into three key areas: Computing and Information Systems (CIS), Chemical and Biomedical Engineering (CBE) and Electrical, Mechanical and Infrastructure Engineering (EMI). FEIT continues to attract top staff and students with a global reputation and has a commitment to knowledge for the betterment of society.

https://eng.unimelb.edu.au/about/join-feit

About the University

The University of Melbourne is consistently ranked amongst the leading universities in the world. We are proud of our people, our commitment to research and teaching excellence, and our global engagement.

Benefits of Working with Us

In addition to having the opportunity to grow and be challenged, and to be part of a vibrant campus life, our people enjoy a range of rewarding benefits:

To find out more, visit https://about.unimelb.edu.au/careers/staff-benefits.

Be Yourself

We value the unique backgrounds, experiences and contributions that each person brings to our community and encourage and celebrate diversity. First Nations people, those identifying as LGBTQIA+, females, people of all ages, with disabilities and culturally and linguistically diverse people are encouraged to apply. Our aim is to create a workforce that reflects the community in which we live.

Join Us!

If you feel this role is right for you, please apply with your CV and cover letter outlining your interest and experience. Please note that you are not required to provide responses against the selection criteria in the Position Description.

We are dedicated to ensuring barrier free and inclusive practices to recruit the most talented candidates. If you require any reasonable adjustments with the recruitment process, please contact us at hr-talent@unimelb.edu.au.

Position Description: 0054173_PD_C D in Fairness.pdf

Applications close: Monday 26 September 2022, 11:55 PM AUS Eastern Standard Time


Stable Diffusion Goes Public and the Internet Freaks Out – DevOps.com

Welcome to The Long View, where we peruse the news of the week and strip it to the essentials. Let's work out what really matters.

Unless you've been living under a rock for the past week, you'll have seen something about Stable Diffusion. It's the new open source machine learning model for creating images from text and even other images.

Like DALL-E and Midjourney, you give it a textual prompt and it generates amazing images (or sometimes utter garbage). Unlike those other models, it's open source, so we're already seeing an explosion of innovation.

Mark Hachman calls it "the new killer app":

Fine-tune your algorithmic art: AI art is fascinating. Enter a prompt, and the algorithm will generate an image to your specifications. Generally, this all takes place on the Web, with algorithms like DALL-E. [But] Stability.Ai and its Stable Diffusion model broke that mold with a model that is publicly available and can run on consumer GPUs. For now, Stability.Ai recommends that you have a GPU with at least 6.9GB of video RAM. Unfortunately, only Nvidia GPUs are currently supported. [But] if you own a powerful PC, you can take all the time you'd like to fine-tune your algorithmic art and come up with something truly impressive.
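For readers who want to try this, the snippet below is a minimal sketch of running Stable Diffusion locally with Hugging Face's diffusers library; it is not from Hachman's article, and the model ID, precision setting and licence/authentication step are assumptions to check against the current model card.

```python
# A minimal sketch, assuming the Hugging Face diffusers library, a CUDA-capable
# Nvidia GPU, and that you have accepted the model licence on the Hugging Face
# Hub (authentication may be required). The model ID is an assumption.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,   # half precision to fit in roughly 7GB of VRAM
)
pipe = pipe.to("cuda")

prompt = "a watercolor painting of a lighthouse at dawn"
image = pipe(prompt).images[0]   # a PIL image
image.save("lighthouse.png")
```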

From the horse's mouth, it's Emad Mostaque: Stable Diffusion Public Release

Use this in an ethical, moral and legal manner: It is our pleasure to announce the public release of Stable Diffusion. Over the last few weeks we all have been overwhelmed by the response and have been working hard to ensure a safe and ethical release, incorporating data from our beta model tests and community for the developers to act on. As these models were trained on image-text pairs from a broad internet scrape, the model may reproduce some societal biases and produce unsafe content, so open mitigation strategies as well as an open discussion about those biases can bring everyone to this conversation. We hope everyone will use this in an ethical, moral and legal manner and contribute both to the community and discourse around it.

Yeah, right. Have you ever been on the Internet? Kyle Wiggers sounds worried: Deepfakes for all

90% are of women: Stable Diffusion is now in use by art generator services like Artbreeder, Pixelz.ai and more. But the model's unfiltered nature means not all the use has been completely above board. Other AI art-generating systems, like OpenAI's DALL-E 2, have implemented strict filters for pornographic material. Moreover, many don't have the ability to create art of public figures. Women, unfortunately, are most likely by far to be the victims of this. A study carried out in 2019 revealed that, of the 90% to 95% of deepfakes that are non-consensual, about 90% are of women.

Why is it such a big deal? Just ask Simon Willison:

Science fiction is real: Stable Diffusion is a really big deal. If you haven't been paying attention to what's going on, you really should be. It's similar to models like OpenAI's DALL-E, but with one crucial difference: they released the whole thing. In just a few days, there has been an explosion of innovation around it. The things people are building are absolutely astonishing. Generating images from text is one thing, but generating images from other images is a whole new ballgame. Imagine having an on-demand concept artist that can generate anything you can imagine, and can iterate with you towards your ideal result. Science fiction is real now. Machine learning generative models are here, and the rate with which they are improving is unreal. It's worth paying real attention to.

How does it compare to DALL-E? Just ask Beyondo:

Personally, Stable Diffusion is better. OpenAI makes it sound like they created the holy grail of image generation models, but their images don't impress anyone who has used Stable Diffusion.

@fabianstelzer did a bunch of comparative tests:

These image synths are like instruments … it's amazing we'll get so many of them, each with a unique sound. DALL-E's really great for facial expressions. [Midjourney] wipes the floor with the others when it comes to prompts aiming for textural details. DALL-E's usually my go-to for scenes involving 2 or more clear actors. … DALL-E and SD being better at photos: Stable Diffusion can do incredible photos, but you need to be careful to not overload the scene. The moment you put art into a prompt, Midjourney just goes nuts. DALL-E's imperfections look very digital, unlike MJ's. When it comes to copying specific styles, SD is absolutely [but] DALL-E won't let you do a Botticelli painting of Trump.

And what of the training data? Here's Andy Baio:

One of the biggest frustrations of text-to-image generation AI models is that they feel like a black box. We know they were trained on images pulled from the web, but which ones? The team behind Stable Diffusion have been very transparent about how their model is trained. Since it was released publicly last week, Stable Diffusion has exploded in popularity, in large part because of its free and permissive licensing. Simon Willison [and I] grabbed the data for over 12 million images used to train Stable Diffusion. [It] was trained off three massive datasets collected by LAION. All of LAION's image datasets are built off of Common Crawl, [which] scrapes billions of webpages monthly and releases them as massive datasets. Nearly half of the images, about 47%, were sourced from only 100 domains, with the largest number of images coming from Pinterest. WordPress-hosted blogs on wp.com and wordpress.com represented 6.8% of all images. Other photo, art, and blogging sites included SmugMug, Blogspot, Flickr, DeviantArt, Wikimedia, 500px, and Tumblr.
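The domain breakdown Baio describes is the kind of thing you can reproduce with a few lines of pandas once you have image metadata in hand; the sketch below assumes a hypothetical CSV with a "url" column, not the actual LAION release format.

```python
# A rough sketch of the domain-share analysis described above. The file name
# and "url" column are hypothetical stand-ins for the real LAION metadata.
from urllib.parse import urlparse

import pandas as pd

df = pd.read_csv("laion_sample.csv")
df["domain"] = df["url"].map(lambda u: urlparse(u).netloc)

counts = df["domain"].value_counts()
top100_share = counts.head(100).sum() / len(df)
print(f"Top 100 domains account for {top100_share:.0%} of images")
print(counts.head(10))
```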

Meanwhile, how does it work? Over to Letitia Parcalabescu (easy for her to say):

How do Latent Diffusion Models work? If you want answers to these questions, we've got you covered!

You have been reading The Long View by Richi Jennings. You can contact him at @RiCHi or [email protected].

Image: Stable Diffusion, via Andy Baio (Creative ML OpenRAIL-M; leveled and cropped)


Model-Agnostic Interpretation: Beyond SHAP and LIME – Geektime

Since machine learning models are statistical models, they naturally leave themselves open to potential errors. For example, Apple Card's fair lending fiasco brought into question the inherent discrimination in loan approval algorithms, while a project funded by the UK government that used AI to predict gun and knife crime turned out to be wildly inaccurate.

For people to trust machine learning models, we need explanations. It makes sense for a loan to be rejected due to low income, but if a loan gets rejected based on an applicant's zip code, this might indicate there's bias in the model, i.e., it can favour more wealthy areas.

When choosing a machine learning algorithm, there's usually a tradeoff between the algorithm's interpretability and its accuracy. Traditional methods like decision trees and linear regression can be directly explained, but their ability to provide accurate predictions is limited. More modern methods such as Random Forests and Neural Networks give better predictions but are more difficult to interpret.

In the last few years, we've seen great advances in the interpretation of machine learning models with methods like LIME and SHAP. While these methods do require some background, analyzing the underlying data can offer a simple and intuitive interpretation. For this, we first need to understand how humans reason.

Let's think about the common example of the rooster's crow: If you grew up in the countryside, you might know that roosters always crow before the sun rises. Can we infer that the rooster's crow makes the sun rise? It's clear that the answer is no. But why?

Humans have a mental model of reality. We know that if the rooster doesn't crow, the sun rises anyway. This type of reasoning is called counterfactual.

This is the common way in which people make sense of reality. Counterfactual reasoning cannot be scientifically proven. Descartes' demon, or the idea of methodological skepticism, illustrates this: According to this concept, if Event B happens right after Event A, you can never be sure that there isn't some demon that causes B to happen right after A. The scientific field historically refrained from formalizing any discussion on causality. But, more recently, efforts have been made to create a scientific language that helps us better understand cause and effect. For additional information, be sure to read The Book of Why by Judea Pearl, a prominent computer science researcher and philosopher.

At my company, we have predictive models aimed at assessing customers' risk when they apply for a loan. The model uses historical data in a tabular format, in which each customer has a list of meaningful features like payment history, income and incorporation date. Using this data, we predict the customer's level of risk and divide it into six different risk groups (or buckets). We interpret the model's predictions using both local and global explanations, then we use counterfactual analysis to explain our predictions to the business stakeholders.

Local explanations aim to explain a single prediction. We replace each feature's value with the median in the representative population and display, through text, the feature that caused the largest change in score. In the following example, the third feature is successful repayments, and its median is 0. We calculate new predictions while replacing the original feature's value with the new value (the median).

Customer_1 had their prediction changed to a reduced risk, and we can devise a short explanation: a higher number of successful repayments improved the customer's risk level. Or, in its more detailed version: the customer had 3 successful repayments compared to a median of 0 in the population. This caused the risk level to improve from level D to E.
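A minimal sketch of the median-replacement idea is shown below. It assumes a scikit-learn-style model with a predict_proba method and a pandas DataFrame of the representative population; the names are illustrative, not Bluevine's production code.

```python
# A minimal sketch of the local explanation described above: swap each feature
# for the population median and report the feature that moves the score most.
# `model` is assumed to expose a scikit-learn-style predict_proba interface.
import pandas as pd

def local_explanation(model, population: pd.DataFrame, customer: pd.Series):
    medians = population.median()
    base = model.predict_proba(customer.to_frame().T)[0, 1]

    deltas = {}
    for feature in population.columns:
        counterfactual = customer.copy()
        counterfactual[feature] = medians[feature]      # replace with the median
        cf = model.predict_proba(counterfactual.to_frame().T)[0, 1]
        deltas[feature] = cf - base

    top = max(deltas, key=lambda f: abs(deltas[f]))
    return top, deltas[top]
```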

Global explanations aim to explain a feature's direction in the model as a whole. An individual feature value is replaced with one extreme value. For example, this value can be the 95th percentile, i.e., almost the largest value in the sample (95% of the values are smaller than it).

The changes in the score distribution are calculated and visualized in the chart below. The figure shows the change in the customers' risk level when increasing the value to the 95th percentile.
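The corresponding global check can be sketched the same way: raise one feature to its 95th percentile for every customer and measure how the predicted risk distribution shifts. Again, the model interface and names below are assumptions for illustration.

```python
# A sketch of the global explanation step described above. The model interface
# (predict_proba) and variable names are illustrative assumptions.
import numpy as np
import pandas as pd

def global_effect(model, population: pd.DataFrame, feature: str) -> dict:
    p95 = population[feature].quantile(0.95)
    base = model.predict_proba(population)[:, 1]

    shifted = population.copy()
    shifted[feature] = p95                  # extreme-but-plausible value
    new = model.predict_proba(shifted)[:, 1]

    return {
        "worsened": float(np.mean(new > base)),
        "improved": float(np.mean(new < base)),
        "unchanged": float(np.mean(new == base)),
    }
```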

When increasing the first listed feature (length of delay in payments) to the 95th percentile, a large portion of the customers have their risk level deteriorate by one or more levels. A person who reviews this behaviour can easily accept that a delay in payments is expected to cause a worse risk level.

The second feature, monthly balance increase, has a combined effect: a small percentage of the customers have their risk level deteriorate, while a larger percentage have their risk level improve. This combined effect might indicate there's some interaction between features, although that is not something that can be directly explained through this method.

The third feature, years since incorporation, has a positive effect on the customers' risk level when increasing it to the 95th percentile. Here too, it can be easy to accept that businesses that have been around for longer periods are likely to be more stable and therefore present less risk.

Unlike many other reasoning methods, the counterfactual approach allows for simple and intuitive data explanations that anyone can understand, which can increase the trust we have in machine learning models.

Written by Nathalie Hauser, Manager, Data Science at Bluevine


Ray, the machine learning tech behind OpenAI, levels up to Ray 2.0 – VentureBeat


Over the last two years, one of the most common ways for organizations to scale and run increasingly large and complex artificial intelligence (AI) workloads has been with the open-source Ray framework, used by companies from OpenAI to Shopify and Instacart.

Ray enables machine learning (ML) models to scale across hardware resources and can also be used to support MLops workflows across different ML tools. Ray 1.0 came out in September 2020 and has had a series of iterations over the last two years.
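At its core, Ray's programming model is the familiar remote-task pattern: decorate a Python function, fan calls out across the cluster, and gather the results. The snippet below is a minimal sketch of that core API, not the higher-level AIR or Serve layers discussed later in the article.

```python
# A minimal sketch of Ray's core task API: functions decorated with @ray.remote
# are scheduled across available resources and their results gathered with ray.get().
import ray

ray.init()  # starts a local Ray runtime, or connects to an existing cluster

@ray.remote
def score_batch(batch):
    # stand-in for real model training or inference work
    return sum(batch) / len(batch)

batches = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
futures = [score_batch.remote(b) for b in batches]   # runs in parallel
print(ray.get(futures))                              # [2.0, 5.0, 8.0]
```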

Today, the next major milestone was released, with the general availability of Ray 2.0 at the Ray Summit in San Francisco. Ray 2.0 extends the technology with the new Ray AI Runtime (AIR) that is intended to work as a runtime layer for executing ML services. Ray 2.0 also includes capabilities designed to help simplify building and managing AI workloads.

Alongside the new release, Anyscale, which is the lead commercial backer of Ray, announced a new enterprise platform for running Ray. Anyscale also announced a new $99 million round of funding co-led by existing investors Addition and Intel Capital with participation from Foundation Capital.


"Ray started as a small project at UC Berkeley and it has grown far beyond what we imagined at the outset," said Robert Nishihara, cofounder and CEO at Anyscale, during his keynote at the Ray Summit.

"It's hard to understate the foundational importance and reach of Ray in the AI space today."

Nishihara went through a laundry list of big names in the IT industry that are using Ray during his keynote. Among the companies he mentioned is ecommerce platform vendor Shopify, which uses Ray to help scale its ML platform that makes use of TensorFlow and PyTorch. Grocery delivery service Instacart is another Ray user, benefitting from the technology to help train thousands of ML models. Nishihara noted that Amazon is also a Ray user across multiple types of workloads.

Ray is also a foundational element for OpenAI, which is one of the leading AI innovators, and is the group behind the GPT-3 Large Language Model and DALL-E image generation technology.

"We're using Ray to train our largest models," Greg Brockman, CTO and cofounder of OpenAI, said at the Ray Summit. "So, it has been very helpful for us in terms of just being able to scale up to a pretty unprecedented scale."

Brockman commented that he sees Ray as a developer-friendly tool and the fact that it is a third-party tool that OpenAI doesn't have to maintain is helpful, too.

"When something goes wrong, we can complain on GitHub and get an engineer to go work on it, so it reduces some of the burden of building and maintaining infrastructure," Brockman said.

For Ray 2.0, a primary goal for Nishihara was to make it simpler for more users to be able to benefit from the technology, while providing performance optimizations that benefit users big and small.

Nishihara commented that a common pain point in AI is that organizations can get tied into a particular framework for a certain workload, but realize over time they also want to use other frameworks. For example, an organization might start out just using TensorFlow, but realize they also want to use PyTorch and HuggingFace in the same ML workload. With the Ray AI Runtime (AIR) in Ray 2.0, it will now be easier for users to unify ML workloads across multiple tools.

Model deployment is another common pain point that Ray 2.0 is looking to help solve, with the Ray Serve deployment graph capability.

"It's one thing to deploy a handful of machine learning models. It's another thing entirely to deploy several hundred machine learning models, especially when those models may depend on each other and have different dependencies," Nishihara said. "As part of Ray 2.0, we're announcing Ray Serve deployment graphs, which solve this problem and provide a simple Python interface for scalable model composition."

Looking forward, Nishihara's goal with Ray is to help enable a broader use of AI by making it easier to develop and manage ML workloads.

"We'd like to get to the point where any developer or any organization can succeed with AI and get value from AI," Nishihara said.



Machine Learning Gives Cats One More Way To Control Their Humans – Hackaday

For those who choose to let their cats live a more or less free-range life, there are usually two choices. One, you can adopt the role of servant and run for the door whenever the cat wants to get back inside from their latest bird-murdering jaunt. Or two, install a cat door and let them come and go as they please, sometimes with a present for you in their mouth. Heads you win, tails you lose.

There's another way, though: just let the cat ask to be let back in. That's the approach that [Tennis Smith] took with this machine-learning kitty doorbell. It's based on a Raspberry Pi 4, which lives inside the house, and a USB microphone that's outside the front door. The Pi uses TensorFlow Lite to classify the sounds it picks up outside, and when one of those sounds fits the model of a cat's meow, a message is dispatched to AWS Lambda. From there a text message is sent to alert [Tennis] that the cat is ready to come back in.

There's a ton of useful information included in the repo for this project, including step-by-step instructions for getting Amazon Web Services working on the Pi. If you're a dog person, fear not: changing from meows to barks is as simple as tweaking a single line of code. And if you'd rather not be at the beck and call of a cat but still want to avoid the evidence of a prey event on your carpet, machine learning can help with that too.
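For a sense of the moving parts, here is a rough sketch of the general pattern the project describes: classify microphone audio with TensorFlow Lite and notify via AWS Lambda. It is not [Tennis Smith]'s actual code (that lives in the linked repo), and the model path, label index, threshold and Lambda function name are all assumptions.

```python
# Not the project's code (see the linked repo for that): a sketch of the
# pattern it describes. Model path, label index, threshold and Lambda name are
# hypothetical; a real model will also expect specific audio preprocessing.
import json

import boto3
import sounddevice as sd
from tflite_runtime.interpreter import Interpreter

interpreter = Interpreter(model_path="meow_classifier.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

lambda_client = boto3.client("lambda")
MEOW_INDEX, THRESHOLD, SAMPLE_RATE = 0, 0.8, 16000

while True:
    # record one second of audio from the USB microphone
    audio = sd.rec(SAMPLE_RATE, samplerate=SAMPLE_RATE, channels=1, dtype="float32")
    sd.wait()
    interpreter.set_tensor(inp["index"], audio.reshape(inp["shape"]))
    interpreter.invoke()
    scores = interpreter.get_tensor(out["index"])[0]
    if scores[MEOW_INDEX] > THRESHOLD:        # swap the label here for barks
        lambda_client.invoke(
            FunctionName="notify-cat-owner",  # hypothetical Lambda function
            Payload=json.dumps({"event": "meow", "score": float(scores[MEOW_INDEX])}),
        )
```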

[via Tom's Hardware]


Solve the problem of unstructured data with machine learning – VentureBeat


We're in the midst of a data revolution. The volume of digital data created within the next five years will total twice the amount produced so far, and unstructured data will define this new era of digital experiences.

Unstructured data, information that doesn't follow conventional models or fit into structured database formats, represents more than 80% of all new enterprise data. To prepare for this shift, companies are finding innovative ways to manage, analyze and maximize the use of data in everything from business analytics to artificial intelligence (AI). But decision-makers are also running into an age-old problem: How do you maintain and improve the quality of massive, unwieldy datasets?

With machine learning (ML), that's how. Advancements in ML technology now enable organizations to efficiently process unstructured data and improve quality assurance efforts. With a data revolution happening all around us, where does your company fall? Are you saddled with valuable yet unmanageable datasets, or are you using data to propel your business into the future?

There's no disputing the value of accurate, timely and consistent data for modern enterprises; it's as vital as cloud computing and digital apps. Despite this reality, however, poor data quality still costs companies an average of $13 million annually.


To navigate data issues, you may apply statistical methods to measure data shapes, which enables your data teams to track variability, weed out outliers, and reel in data drift. Statistics-based controls remain valuable to judge data quality and determine how and when you should turn to datasets before making critical decisions. While effective, this statistical approach is typically reserved for structured datasets, which lend themselves to objective, quantitative measurements.
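As a concrete illustration, the sketch below shows the kind of statistics-based control described above: z-score outlier flagging and a simple mean-shift drift check on a numeric column. File names, column names and thresholds are illustrative assumptions.

```python
# A small sketch of statistics-based quality controls for structured data:
# flag outliers with z-scores and measure drift of a batch's mean. Names and
# thresholds are illustrative.
import numpy as np
import pandas as pd

def zscore_outliers(series: pd.Series, threshold: float = 3.0) -> pd.Series:
    z = (series - series.mean()) / series.std(ddof=0)
    return series[np.abs(z) > threshold]

def mean_drift(reference: pd.Series, current: pd.Series) -> float:
    """Shift of the current batch's mean, in reference standard deviations."""
    return abs(current.mean() - reference.mean()) / reference.std(ddof=0)

reference = pd.read_csv("orders_last_quarter.csv")["order_value"]  # hypothetical data
current = pd.read_csv("orders_this_week.csv")["order_value"]

print(zscore_outliers(current))
print(f"drift: {mean_drift(reference, current):.2f} sigma")
```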

But what about data that doesn't fit neatly into Microsoft Excel or Google Sheets, including:

When these types of unstructured data are at play, it's easy for incomplete or inaccurate information to slip into models. When errors go unnoticed, data issues accumulate and wreak havoc on everything from quarterly reports to forecasting projections. A simple copy-and-paste approach from structured data to unstructured data isn't enough and can actually make matters much worse for your business.

The common adage, "garbage in, garbage out", is highly applicable in unstructured datasets. Maybe it's time to trash your current data approach.

When considering solutions for unstructured data, ML should be at the top of your list. That's because ML can analyze massive datasets and quickly find patterns among the clutter, and with the right training, ML models can learn to interpret, organize and classify unstructured data types in any number of forms.

For example, an ML model can learn to recommend rules for data profiling, cleansing and standardization, making efforts more efficient and precise in industries like healthcare and insurance. Likewise, ML programs can identify and classify text data by topic or sentiment in unstructured feeds, such as those on social media or within email records.
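As a hedged illustration of the sentiment case, the sketch below classifies short text snippets with a TF-IDF representation and logistic regression in scikit-learn; the tiny labelled sample is invented, and a real system would need far more training data.

```python
# A minimal sketch of classifying unstructured text by sentiment with
# scikit-learn. The labelled examples are made up for illustration only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "Love the new checkout flow, so fast!",
    "Support never answered my emails.",
    "Delivery arrived a day early, great service.",
    "The app keeps crashing on login.",
]
labels = ["positive", "negative", "positive", "negative"]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)

print(clf.predict(["checkout crashed twice and nobody replied"]))
```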

As you improve your data quality efforts through ML, keep in mind a few key dos and don'ts:

Your unstructured data is a treasure trove for new opportunities and insights. Yet only 18% of organizations currently take advantage of their unstructured data, and data quality is one of the top factors holding more businesses back.

As unstructured data becomes more prevalent and more pertinent to everyday business decisions and operations, ML-based quality controls provide much-needed assurance that your data is relevant, accurate, and useful. And when you arent hung up on data quality, you can focus on using data to drive your business forward.

Just think about the possibilities that arise when you get your data under control, or better yet, let ML take care of the work for you.

Edgar Honing is senior solutions architect at AHEAD.



Migraine classification by machine learning with functional near-infrared spectroscopy during the mental arithmetic task | Scientific Reports -…

The experimental process can be divided into five steps, as shown in Fig. 6. First, we need to recruit subjects to take a Mental Arithmetic Task (MAT). While doing the task, blood oxygen information is measured by the device. Secondly, in order to obtain a more valuable signal, signal filtering is required. Afterwards, the signal is segmented according to the three stages of the task and features are extracted. Eventually, the features can be imported into the machine learning models. The details are left to the following subsections for more explanation.

The experimental process of this study. After the fNIRS signal is obtained, it is filtered, segmented and feature-extracted, then imported into machine learning to get classification results; finally, the credibility of the results is confirmed with cross-validation.

Nowadays, more and more techniques have been investigated to explore the relationship between migraine and cerebrovascular reactivity or cerebral hemodynamics. Some studies have used positron emission tomography (PET) to scan the prefrontal cortex (PFC) and assess whether the suboccipital stimulator is effective [5]. Others have found that the ventromedial prefrontal cortex is more active in MOH than in CM subjects through functional magnetic resonance imaging (fMRI) [6]. Both PET and fMRI are non-invasive imaging modalities, but the former requires the application of radioactive imaging agents, which leads to concerns about ionizing radiation. Although the latter does not involve radioactive agents, the use of a strong magnetic field excludes patients with an artificial pacemaker or any metal implants.

As early as 2007, there was a study using near-infrared spectroscopy (NIRS) to evaluate the difference in regional cerebral blood flow (rCBF) changes of the middle cerebral artery between migraine patients and the healthy control group during a breath-holding task [7]. In recent years, NIRS has gradually emerged in the pain field [8,9,10,11]. Moreover, NIRS has the advantages of being non-invasive, non-radioactive and instant, with low system cost, portability and easy operability. Therefore, NIRS has an extremely high potential as a tool for investigating migraine.

The continuous-wave NIRS system used in this experiment is a self-developed instrument in our laboratory, as shown in Fig. 7. The optode is the core of the system, consisting of three light detectors and two near-infrared light emitters staggered with a spacing of 3 cm. The four channels of the system cover the PFC, approximately at the positions of F7, Fp1, Fp2, and F8 in the international 10-20 system, shown in Fig. 8. The photodetector uses the OPT101 (Texas Instruments Inc), which has the advantages of small size and high sensitivity to the near-infrared light band. The multi-wavelength LEDs (Epitex Inc, L4*730/4*805/4*850-40Q96-I) contain three wavelengths of 730 nm, 805 nm, and 850 nm. In this study, we use only 730 nm and 850 nm. The sampling frequency is about 17 Hz. The rear end of the device is equipped with an adjustment knob, which can make the device fit properly and reduce the influence of external light. The power supply of the hardware uses a rechargeable 7.4 V battery, composed of two 3.7 V lithium batteries in series, and is directly connected to the microcontroller unit (MCU), an Arduino Pro Mini. The other components (including light detectors, a Bluetooth module, and a current regulator) are powered by the output pin of the MCU. The current regulator uses the TLC5916 (Texas Instruments Inc), which can provide a constant current for the LEDs in the circuit. The MCU converts the original light intensity signal into the hemoglobin value and sends these data back to the computer through Bluetooth for storage. Finally, the computer displays the hemoglobin value in real time.

The wearable functional near-infrared spectroscopy system. (a) OPT101 (b) LED (c) Power source (d) MCU (e) Bluetooth module (f) Regulator knob.

Schematic positions of the fNIRS optodes in the international 10-20 system.

The MAT is a common and effective stress task. Research has confirmed that the MAT can produce mental stress in healthy subjects [13,14] or migraine subjects [15]. Subjects were arranged in a quiet space to avoid interference from the outside world, informed of the process, and given a short practice opportunity to eliminate experimental deviation due to unfamiliarity with the operation. The MAT architecture was divided into three stages (Rest, Task, and Recovery) with a total duration of 555 s [16], as shown in Fig. 9. At the rest stage, subjects were asked to close their eyes and relax in the seat for 1 minute. At the task stage, subjects were asked to watch the questions and answer through a touch screen. At the recovery stage, subjects had to do the same things as in the rest stage for 3 minutes. The computer saved the data in the form of comma-separated values after the completion of the MAT.

The MAT architecture. (a) A two-/three-digit addition/subtraction question will be displayed at the center of the screen for 1 second. (b) A countdown circle will be displayed on the screen for 4 seconds to remind the subject of the remaining time to think. (c) The screen will be divided into two areas to display an answer separately. Subjects had 1 second to select the correct answer. (d) The screen shows feedback for the result for 1 second. If the answer was correct, a green circle would be displayed; if the answer was wrong, a red cross would be displayed; if the correct answer was not selected in time, a white question mark would be displayed. Performing (a)-(d) once is a cycle, and the task stage includes 45 cycles.

Recruitment was started only after the approval of the Institutional Review Board (IRB) of the Taipei Veterans General Hospital (No.: 2017-01-010C). All methods in this research were performed in accordance with the relevant guidelines and regulations. The inclusion criteria are subjects from 20 to 60 years old, meeting the diagnostic criteria of the third edition of the International Headache Classification (ICHD-3), and able to fully record the migraine attack pattern and basic personal data. Exclusion criteria are those with any major mental or neurological diseases (including brain damage and brain tumors), smoking habits or alcohol abuse. HC include 13 medical staff of Taipei Veterans General Hospital with an average age of 44.9 ± 8.7 years old. Both CM and MOH are patients of the Neurology Clinic of Taipei Veterans General Hospital; there are 9 and 12 patients with an average age of 34.8 ± 10.9 years old and 45.8 ± 11.2 years old, respectively. Informed consent was obtained from all subjects.

The signal of fNIRS can be divided into three aspects: (i) source (intracerebral vs. extracerebral), (ii) stimulus/task relation (evoked vs. non-evoked), and (iii) cause (neuronal vs. systemic) [17]. In our study, task-evoked neurovascular coupling and spontaneous neurovascular coupling are of primary interest. In order to obtain different types of fNIRS signals for subsequent feature extraction, two different filters were used in parallel in this procedure. The first was a low-pass filter, a fourth-order Butterworth filter with a cutoff frequency of 0.1 Hz [18], which could filter out systemic noise such as breathing, heartbeat, and the Mayer wave, at about 1 Hz, 0.3 Hz, and 0.1 Hz respectively. The changes of the neurovascular coupling signal caused by the entire MAT can then be obtained. The second was a band-pass filter with a frequency band of 0.01 Hz to 0.3 Hz [19]. The hemodynamic response of the PFC, the signal change after every stimulation, could then be observed.
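The two filters can be expressed compactly with SciPy; the sketch below mirrors the description above (a fourth-order Butterworth low-pass at 0.1 Hz and a 0.01-0.3 Hz band-pass at roughly 17 Hz sampling) but is not the authors' code, and the band-pass order and input file are assumptions.

```python
# A sketch of the filtering step described above, not the authors' code.
# Assumes a single fNIRS channel sampled at about 17 Hz; the band-pass order
# and the input file are assumptions.
import numpy as np
from scipy.signal import butter, filtfilt

FS = 17.0  # approximate sampling frequency, Hz

def lowpass(signal, cutoff=0.1, order=4):
    b, a = butter(order, cutoff, btype="low", fs=FS)
    return filtfilt(b, a, signal)            # zero-phase filtering

def bandpass(signal, low=0.01, high=0.3, order=4):
    b, a = butter(order, [low, high], btype="band", fs=FS)
    return filtfilt(b, a, signal)

hbo = np.loadtxt("ch2_hbo.csv", delimiter=",")   # hypothetical exported channel
task_coupling = lowpass(hbo)      # slow, task-evoked neurovascular coupling
hemodynamics = bandpass(hbo)      # per-stimulus hemodynamic response
```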

As the purpose of the MAT was to stimulate the PFC, the corresponding two channels, Ch2 and Ch3, were focused on. The collected signals included oxygenated hemoglobin (HbO) and deoxygenated hemoglobin (HHb). In addition, two different signals could be obtained by adding or subtracting these two signals: total hemoglobin (HbT) and brain oxygen exchange (COE), respectively. These data were divided into three parts by the different stages of the MAT (rest, task, recovery).

Feature extraction is a method of sorting out available features from a large range of data. Proper feature extraction will improve the quality of model training. The features used in the experiment, shown in Fig. 10, are introduced one by one below.

Low-pass filter

Stage mean difference: The average difference of hemoglobin at each stage, in order to observe the average change of the fNIRS signal of the subject at different stages.

Transition slope: Referring to the article published by Coyle et al. [22] in 2004, which mentions that the maximum value of light intensity can be detected by fNIRS at about five to eight seconds after stimulation, we took a window of eight seconds. This is the slope of the fNIRS signal during the first eight seconds after entering a new stage: the values in the interval are fitted with a linear formula, and the coefficient of the first-order term is the slope. It is used to observe the changes of the fNIRS signal under different stimulation.

Transition slope difference: The difference of transition slopes, in order to observe the difference in the changes of the fNIRS signal under different stimulation.

Normalization: Normalization is a procedure for moving and rescaling data. Features 1-3 were calculated again after this process. The normalized data fall between zero and one, which makes it possible to compare the differences in the ratio of the characteristics of the fNIRS signal among the subjects to the changes in their own signal amplitude.

Band-pass filter

Stage standard deviation: The standard deviation of the fNIRS signal at each stage, in order to observe the dispersion level of the data.

Stage skewness: The skewness of the fNIRS signal at each stage, in order to observe the asymmetry of the distribution of the signal values.

Stage kurtosis: The kurtosis of the fNIRS signal at each stage, which describes the tail length of the distribution of the signal values [23]. Compared with values near the average, outliers had a greater impact on the value of kurtosis.

Combining the above-mentioned features, a total of 144 features were obtained. These features were the inputs of linear discriminant analysis (LDA) and quadratic discriminant analysis (QDA).

Logistic regression is a model commonly used for classification, but it has some disadvantages. First, logistic regression can only deal with two-class problems and becomes awkward with multiple classes; second, it does not handle a large number of features or variables well. Most importantly, if the amount of data is too small, the results will be unstable due to a lack of basis for optimizing parameters. LDA can offset these disadvantages, especially in multi-group settings. LDA has two basic hypotheses. First, the algorithm assumes that each group of data follows a Gaussian distribution. Second, in order for the decision boundary to have a clear geometric meaning, the covariance matrices of the groups must be equal. QDA, on the other hand, does not have the covariance-matrix limitation. In addition, the credibility of the model was evaluated by leave-one-out cross-validation (LOOCV), which is often used with small data sets and makes the performance of the fNIRS diagnostic ability more convincing.
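The classification and validation step maps directly onto scikit-learn; the sketch below is a hedged illustration, not the authors' code, evaluating LDA and QDA with leave-one-out cross-validation on an assumed 144-column feature matrix.

```python
# A compact sketch of the classification step described above, assuming the
# 144 features per subject have already been extracted into X with labels y.
import numpy as np
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)
from sklearn.model_selection import LeaveOneOut, cross_val_score

X = np.load("features.npy")   # hypothetical (n_subjects, 144) feature matrix
y = np.load("labels.npy")     # hypothetical labels: HC, CM, MOH

loo = LeaveOneOut()
for name, clf in [("LDA", LinearDiscriminantAnalysis()),
                  ("QDA", QuadraticDiscriminantAnalysis())]:
    scores = cross_val_score(clf, X, y, cv=loo)
    print(f"{name}: {scores.mean():.2%} leave-one-out accuracy")
```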


Tesla wants to take machine learning silicon to the Dojo – The Register

To quench the thirst for ever larger AI and machine learning models, Tesla has revealed a wealth of details at Hot Chips 34 on their fully custom supercomputing architecture called Dojo.

The system is essentially a massive composable supercomputer, although unlike what we see on the Top 500, it's built from an entirely custom architecture that spans the compute, networking, and input/output (I/O) silicon to instruction set architecture (ISA), power delivery, packaging, and cooling. All of it was done with the express purpose of running tailored, specific machine learning training algorithms at scale.

"Real world data processing is only feasible through machine learning techniques, be it natural-language processing, driving in streets that are made for human vision to robotics interfacing with the everyday environment," Ganesh Venkataramanan, senior director of hardware engineering at Tesla, said during his keynote speech.

However, he argued that traditional methods for scaling distributed workloads have failed to accelerate at the rate necessary to keep up with machine learning's demands. In effect, Moore's Law is not cutting it and neither are the systems available for AI/ML training at scale, namely some combination of CPU/GPU or in rarer circumstances by using speciality AI accelerators.

"Traditionally we build chips, we put them on packages, packages go on PCBs, which go into systems. Systems go into racks," said Venkataramanan. The problem is each time data moves from the chip to the package and off the package, it incurs a latency and bandwidth penalty.

So to get around the limitations, Venkataramanan and his team started over from scratch.

"Right from my interview with Elon, he asked me what can you do that is different from CPUs and GPUs for AI. I feel that the whole team is still answering that question."

Tesla's Dojo Training Tile

This led to the development of the Dojo training tile, a self-contained compute cluster occupying a half-cubic foot capable of 556 TFLOPS of FP32 performance in a 15kW liquid-cooled package.

Each tile is equipped with 11GB of SRAM and is connected over a 9TB/s fabric using a custom transport protocol throughout the entire stack.

"This training tile represents unparalleled amounts of integration from computer to memory to power delivery, to communication, without requiring any additional switches," Venkataramanan said.

At the heart of the training tile is Tesla's D1, a 50 billion transistor die, based on TSMC's 7nm process. Tesla says each D1 is capable of 22 TFLOPS of FP32 performance at a TDP of 400W. However, Tesla notes that the chip is capable of running a wide range of floating point calculations including a few custom ones.

Tesla's Dojo D1 die

"If you compare transistors for millimeter square, this is probably the bleeding edge of anything which is out there," Venkataramanan said.

Tesla then took 25 D1s, binned them for known good dies, and then packaged them using TSMC's system-on-wafer technology to "achieve a huge amount of compute integration at very low latency and very-high bandwidth," he said.

However, the system-on-wafer design and vertically stacked architecture introduced challenges when it came to power delivery.

According to Venkataramanan, most accelerators today place power directly adjacent to the silicon. And while proven, this approach means a large area of the accelerator has to be dedicated to those components, which made it impractical for Dojo, he explained. Instead, Tesla designed their chips to deliver power directly through the bottom of the die.

"We could build an entire datacenter or an entire building out of this training tile, but the training tile is just the compute portion. We also need to feed it," Venkataramanan said.

Tesla's Dojo Interface Processor

For this, Tesla also developed the Dojo Interface Processor (DIP), which functions as a bridge between the host CPU and training processors. The DIP also serves as a source of shared high-bandwidth memory (HBM) and as a high-speed 400Gbit/sec NIC.

Each DIP features 32GB of HBM and up to five of these cards can be connected to a training tile at 900GB/s for an aggregate of 4.5TB/s to the host for a total of 160GB of HBM per tile.

Tesla's V1 configuration pairs six of these tiles, or 150 D1 dies, in an array supported by four host CPUs, each equipped with five DIP cards, to achieve a claimed exaflop of BF16 or CFP8 performance.

Tesla's V1 Arrangement

Put together, Venkataramanan says the architecture detailed in depth here by The Next Platform enables Tesla to overcome the limitations associated with traditional accelerators from the likes of Nvidia and AMD.

"How traditional accelerators work, typically you try to fit an entire model into each accelerator. Replicate it, and then flow the data through each of them," he said. "What happens if we have bigger and bigger models? These accelerators can fall flat because they run out of memory."

This isn't a new problem, he noted. Nvidia's NV-switch for example enables memory to be pooled across large banks of GPUs. However, Venkataramanan argues this not only adds complexity, but introduces latency and compromises on bandwidth.

"We thought about this right from the get go. Our compute tiles and each of the dies were made for fitting big models," Venkataramanan said.

Such a specialized compute architecture demands a specialized software stack. However, Venkataramanan and his team recognized that programmability would either make or break Dojo.

"Ease of programmability for software counterparts is paramount when we design these systems," he said. "Researchers won't wait for your software folks to write a handwritten kernel for adapting to a new algorithm that we want to run."

To do this, Tesla ditched the idea of using kernels, and designed Dojo's architecture around compilers.

"What we did was we used PiTorch. We created an intermediate layer, which helps us parallelize to scale out hardware beneath it.Underneath everything is compiled code," he said. "This is the only way to create software stacks that are adaptable to all those future workloads."

Despite the emphasis on software flexibility, Venkataramanan notes that the platform, which is currently running in their labs, is limited to Tesla use for the time being.

"We are focused on our internal customers first," he said. "Elon has made it public that over time, we will make this available to researchers, but we don't have a time frame for that.


The latest in applications of machine learning and artificial intelligence, in one place – EurekAlert

Image: Photo of "Handbook on Computer Learning and Intelligence (in 2 Volumes)"

Credit: World Scientific

A new two-volume publication, edited by Prof Plamen Angelov, IEEE Fellow and Director of Research at the School of Computing and Communications and Director of the Lancaster Intelligent, Robotic and Autonomous systems (LIRA) Research Centre, both at Lancaster University, UK, has been published.

Containing 26 chapters, the Handbook on Computer Learning and Intelligence (in 2 Volumes) explores different aspects of Explainable AI, Supervised Learning, Deep Learning, Intelligent Control and Evolutionary Computation providing a unique range of aspects of the Computer (or Machine) Learning and Intelligence. This is complemented by a set of applications to practical problems.

This handbook is a one-stop-shop compendium covering a wide range of aspects of Computer (or Machine) Learning and Intelligence. The chapters detail the theory, methodology and applications of computer (machine) learning and intelligence, and are authored by some of the leading experts in the respective areas. It is a must-have read and tool for early career researchers, graduate students and specialists alike.

The Handbook on Computer Learning and Intelligence (in 2 Volumes) retails for US$298 / £240 (hardcover set) and is also available in electronic formats. To order or know more about the book, visit http://www.worldscientific.com/worldscibooks/10.1142/12498.

###

About the Editor

Plamen Angelov holds a Personal Chair in Intelligent Systems and is Director of Research at the School of Computing and Communications at Lancaster University, UK. He obtained his PhD in 1993 and his DSc (Doctor of Sciences) degree in 2015, when he also became a Fellow of the IEEE. Prof. Angelov is the founding Director of the Lancaster Intelligent, Robotic and Autonomous systems (LIRA) Research Centre, which brings together over 70 researchers across fifteen different departments of Lancaster University. Prof. Angelov is a Fellow of the European Laboratory for Learning and Intelligent Systems (ELLIS) and of the Institution of Engineering and Technology (IET), as well as a Governor-at-large of the International Neural Networks Society (INNS) for a third consecutive three-year term, following two consecutive terms holding the elected role of Vice President. In the last decade, Prof. Angelov founded two research groups (the Intelligent Systems Research group in 2010 and the Data Science group in 2014) and was a founding member of the Data Science Institute and of the CyberSecurity Academic Centre of Excellence at Lancaster.

About World Scientific Publishing Co.

World Scientific Publishing is a leading international independent publisher of books and journals for the scholarly, research and professional communities. World Scientific collaborates with prestigious organisations like the Nobel Foundation and US National Academies Press to bring high quality academic and professional content to researchers and academics worldwide. The company publishes about 600 books and over 140 journals in various fields annually. To find out more about World Scientific, please visit http://www.worldscientific.com.

For more information, contact WSPC Communications at communications@wspc.com.

Disclaimer: AAAS and EurekAlert! are not responsible for the accuracy of news releases posted to EurekAlert! by contributing institutions or for the use of any information through the EurekAlert system.
