Five practical issues in machine learning and the business implications – ITProPortal

Businesses today are dealing with huge amounts of data, and it's arriving faster than ever before. At the same time, the competitive landscape is changing rapidly, and it's critical to be able to make decisions fast.

As Jason Jennings and Laurence Haughton put it: "It's not the big that eat the small, it's the fast that eat the slow."

Business success comes from making fast decisions using the best possible information.

Machine learning (ML) is powering that evolution. Whether a business is trying to make recommendations to customers, hone its manufacturing processes or anticipate changes to a market, ML can assist by processing large volumes of data to better support companies as they seek a competitive advantage.

However, while machine learning offers great opportunities, there are some challenges. ML systems rely on lots of data and the ability to execute complex computations. External factors, such as shifting customer expectations or unexpected market fluctuations, mean ML models need to be monitored and maintained.

In addition, there are several practical issues in machine learning that need to be solved. Here we will take a close look at five of the key practical issues and their business implications.

Machine learning systems rely on data. That data can be broadly classified into two groups: features and labels.

Features are the inputs to the ML model. For example, this could be data from sensors, customer questionnaires, website cookies or historical information.

The quality of these features can be variable. For example, customers may not fill in questionnaires correctly or may omit responses. Sensors can malfunction and deliver erroneous data, and website cookies may give incomplete information about a user's precise actions on a website. Dataset quality is important so that models can be correctly trained.

Data can also be noisy, filled with unwanted information that can mislead a machine learning model into making incorrect predictions.

The outputs of ML models are labels. The sparsity of labels, where we know the inputs to a system but are unsure of what outputs occurred, is also an issue. In such cases, it can be extremely challenging to detect the relationships between the features and labels of a model. Resolving this is often labour intensive, as it requires human intervention to associate labels with inputs.

Without accurate mapping of inputs to outputs, the model might not be able to learn the correct relationship between the inputs and outputs.

Machine learning relies on the relationships between input and output data to create generalisations that can be used to make predictions and provide recommendations for future actions. When the input data is noisy, incomplete or erroneous, it can be extremely difficult to understand why a particular output, or label, occurred.
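To make the data-quality point concrete, here is a minimal, hypothetical sketch in Python; the column names, values, and thresholds are all invented for illustration. It flags omitted questionnaire responses and implausible sensor readings before they reach training:

```python
import pandas as pd

# Hypothetical feature table: questionnaire answers plus a sensor reading.
df = pd.DataFrame({
    "satisfaction": [4, 5, None, 3, 5],                # one omitted response
    "sensor_temp_c": [21.3, 20.9, 350.0, 21.1, 20.8],  # one faulty reading
})

# Measure how often customers omitted the questionnaire response.
missing_rate = df["satisfaction"].isna().mean()
print(f"Missing questionnaire responses: {missing_rate:.0%}")

# Flag sensor values implausibly far from the median, using the median
# absolute deviation so the faulty reading cannot hide its own outlier.
med = df["sensor_temp_c"].median()
mad = (df["sensor_temp_c"] - med).abs().median()
suspect = df[(df["sensor_temp_c"] - med).abs() > 10 * mad]
print("Suspect sensor readings:")
print(suspect)
```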

Building robust machine learning models requires substantial computational resources to process the features and labels. Coding a complex model requires significant effort from data scientists and software engineers. Complex models can require substantial computing power to execute and can take longer to derive a usable result.

This represents a trade-off for businesses. They can choose a faster response but a potentially less accurate outcome, or they can accept a slower response and receive a more accurate result from the model. These compromises aren't all bad news: the decision of whether to go for a costlier, more accurate model over a faster response comes down to the use case.

For example, making recommendations to shoppers on a retail shopping site requires real-time responses, but can accept some unpredictability in the result. On the other hand, a stock trading system requires a more robust result. So, a model that uses more data and performs more computations is likely to deliver a better outcome when a real-time result is not needed.
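One way to see the trade-off in practice is to benchmark a small, fast model against a larger, slower one. The sketch below uses scikit-learn on synthetic data and is purely illustrative; exact accuracy and timing figures will vary by machine and dataset:

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a business dataset.
X, y = make_classification(n_samples=10_000, n_features=30, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# A small, fast model versus a larger, slower one.
for n_trees in (10, 500):
    model = RandomForestClassifier(n_estimators=n_trees, random_state=0)
    start = time.perf_counter()
    model.fit(X_tr, y_tr)
    elapsed = time.perf_counter() - start
    print(f"{n_trees:>3} trees: accuracy={model.score(X_te, y_te):.3f}, "
          f"train time={elapsed:.2f}s")
```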

As Machine Learning as a Service (MLaaS) offerings enter the market, these complexity and quality trade-offs will get greater attention. Researchers from the University of Chicago looked at the effectiveness of MLaaS offerings and found that they can achieve results comparable to standalone classifiers if users have sufficient insight into key decisions such as classifier choice and feature selection.

Many companies use machine learning algorithms to assist them in recruitment. Amazon, for example, discovered that the algorithm it used to assist with selecting candidates was biased. Researchers from Princeton likewise found that other systems favoured European names, mimicking some human biases.

The problem here isn't the model specifically. The problem is that the data used to train the model carries its own biases. However, when we know the data is biased, there are ways to debias it or to reduce the weighting given to that data.

The first challenge is determining if there is inherent bias in the data. That means conducting some pre-processing. And while it may not be possible to remove all bias from the data, its impact can be minimised by injecting human knowledge.

In some cases, it may also be necessary to limit the number of features in the data. For example, omitting traits such as race or gender can help limit the impact of biased data on the results from a model.
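As a hedged sketch of that last point (the file and column names are invented for illustration), the sensitive columns can simply be excluded from the training features. Note that omitting protected traits alone does not remove proxy features, such as postcode, that correlate with them; it is one mitigation among several:

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical applicant data with a known outcome column, "hired".
applicants = pd.read_csv("applicants.csv")  # assumed all-numeric features
SENSITIVE = ["gender", "race"]

# Omit protected traits so the model cannot use them directly.
features = applicants.drop(columns=SENSITIVE + ["hired"])
labels = applicants["hired"]

model = LogisticRegression(max_iter=1000).fit(features, labels)
```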

Machine learning models operate within specific contexts. For example, ML models that power recommendation engines for retailers operate at a specific time when customers are looking at certain products. However, customer needs change over time, and that means the ML model can drift away from what it was designed to deliver.

Models can decay for a number of reasons. Drift can occur when the statistical properties of the incoming data change over time; this is called data drift. It can also occur when our interpretation of the data, the relationship between inputs and outputs, changes; this is concept drift.

To accommodate this drift, you need a model that continuously updates and improves itself using data that comes in. That means you need to keep checking the model.

That requires collecting features and labels and reacting to changes so the model can be updated and retrained. While some aspects of retraining can be conducted automatically, some human intervention is needed. It's critical to recognise that deploying a machine learning tool is not a one-off activity.

Machine learning tools require regular review and update to remain relevant and continue to deliver value.
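A minimal sketch of one common monitoring check appears below: it compares the live distribution of a single feature against the distribution seen at training time, using a two-sample Kolmogorov-Smirnov test. The data is synthetic and the significance threshold is an assumption; this catches data drift only, while concept drift additionally requires tracking model performance against fresh labels:

```python
import numpy as np
from scipy.stats import ks_2samp

def drift_detected(train_feature: np.ndarray, live_feature: np.ndarray,
                   alpha: float = 0.01) -> bool:
    """Return True when the live feature distribution has shifted
    significantly from the one the model was trained on."""
    _, p_value = ks_2samp(train_feature, live_feature)
    return p_value < alpha

rng = np.random.default_rng(0)
train = rng.normal(loc=0.0, scale=1.0, size=5_000)  # training-time data
live = rng.normal(loc=0.4, scale=1.0, size=5_000)   # shifted live data

print("Drift detected:", drift_detected(train, live))  # True -> retrain?
```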

Creating a model is easy. Building a model can be automatic. However, maintaining and updating the models requires a plan and resources.

Machine learning models are part of a longer pipeline that starts with the features that are used to train the model. Then there is the model itself, which is a piece of software that can require modification and updates. That model requires labels so that the results of an input can be recognised and used by the model. And there may be a disconnect between the model and the final signal in a system.

In many cases, when an unexpected outcome is delivered, it's not the machine learning that has broken down but some other part of the chain. For example, a recommendation engine may have offered a product to a customer, but the connection between the sales system and the recommendation engine may be broken, and it can take time to find the bug. In such a case, it would be hard to tell the model whether the recommendation was successful. Troubleshooting issues like this can be quite labour intensive.

Machine learning offers significant benefits to businesses. The abilities to predict future outcomes, to anticipate and influence customer behaviour, and to support business operations are substantial. However, ML also brings challenges. By recognising these challenges and developing strategies to address them, companies can ensure they are prepared and equipped to get the most out of machine learning technology.

Dr. Shou-De Lin, Chief Machine Learning Scientist, Appier


University of Derby to collaborate with Aquis on machine learning research project – Institutional Asset Manager

The University of Derby has secured government funding for a new knowledge transfer partnership with Aquis Exchange based around machine learning, with the aim of enhancing Aquis' ability to monitor trading behaviour and identify abuses within financial markets.

Machine learning, a form of artificial intelligence, is the study of computer algorithms that adapt and improve through experience.

The University's role is to recruit recent graduates with the skills required by the company to undertake the project and ensure that Aquis remains a specialist provider of innovative technology to the finance sector.

The partnership is backed by a GBP151,000 government grant delivered by Innovate UK, which enables businesses to use KTPs to improve productivity and performance by providing the funding to take forward new ideas.

Jackie Edwards, Knowledge Exchange Manager at the University of Derby, said: "We are delighted to have secured this partnership with Aquis, and to be supporting the company's objective of establishing itself as one of the leading suppliers of market trading surveillance."

Alasdair Haynes, CEO of Aquis Exchange, adds: "We are very excited to be working with the University of Derby to develop further our ML and AI capabilities for surveillance. We firmly believe that this is where the future of surveillance lies and our philosophy at Aquis is to be always at the cutting edge of innovation."

The University has worked on KTPs with companies in a variety of different industries, from engineering to tourism, including one to measure the economic impact of Chatsworth House, the Derbyshire home of the Duke of Devonshire.

Edwards adds: "KTPs have been a very successful means of collaboration between the University and business, not-for-profit organisations and the public sector. Whether it is developing new products or improving management practices, our task is to identify and recruit graduates with the skills and knowledge to make a real difference to the employers they will work for.

"As well as bringing that expertise with them, it is also crucial that they embed that knowledge within the organisation they are working with, so it can build on the work that the graduates have carried out during their time with the company."


If your data science rollout is failing, this may be why – VentureBeat

I've seen some advanced analytics transformations succeed and many others fail. It's well known that most of these transformations end in failure, wasting huge amounts of time and money. What is worse, these failures typically sour the organization on data science and machine learning for the future. So failures in advanced analytics transformations cause longer-term damage than is typically appreciated.

I was recently challenged by a colleague of mine to write down what I thought our business would look like in five years if our own analytics transformation was successful. A few items I wrote were typical: we'd have the ability to deploy machine learning models into production quickly; we'd have data science embedded across the org chart, and so on. But what I quickly realized was that although all of those goals are good ones, the real difference I'd like to see in five years is not technical, but cultural. We should be aiming first and foremost at cultural change. I now believe that this is the right way to think about how to add data science and machine learning to your business.

Think about how typical business decisions are made. In a well-run business, experts are consulted and possible solutions to business problems are debated in an open and collaborative way. In those discussions, there is a lot of conventional wisdom and many assumptions about potential risks and rewards. Decision makers adopt a "measure twice, cut once" approach that is risk-averse and intended to commit to a reasonable course of action over the long run.

In contrast, think about how science is done. I happen to have worked alongside some excellent researchers in my academic career at universities and in the national laboratory system. An effective scientist, in my experience, prioritizes generating and testing hypotheses over commitment to a solution. The most brilliant researchers don't adopt a "measure twice, cut once" mentality. Instead, they generate many hypotheses and they focus on testing them rapidly. If this sounds like a startup mentality, that's because it is. The best methodologies in the tech startup world, in my opinion, are successful because they're scientific.

Lets consider an example. Suppose your business wants to do a better job reaching high-value customers. A common, but unscientific, approach is to bring in a team of consultants who will develop personas of your customers over the course of weeks or months. Some of those personas will be of more desirable customers. The assumptions built into the desirable personas are used to develop marketing campaigns. It is understood that the company is committing over the long term to using those personas to segment its customers and measure performance.

If this story seems natural, and you can easily imagine it happening in your business, then you don't have a scientific culture. No scientist worth their salt would be willing to commit to a theory in such a manner. A scientific approach would be to come up with lots of hypotheses and a wide variety of potential personas. To a good scientist, those hypotheses could come from anywhere. They could be conventional wisdom, or they could have occurred to someone in a fever dream. They might be the result of some exploratory analysis, or not. But they are all provisional and tentative. All the hypotheses would be parked somewhere until they've been tested. Discussion quickly turns from generating hypotheses to the more important question of how to test them.

Of course, this is a simplistic view of what a scientific culture looks like. But the main elements are there, the most important of which is a focus on generating and testing hypotheses.

When my colleague asked me to describe what our organization would look like after our analytics transformation, this sort of scientific culture is what I described. I will judge our transformation as successful if the default way to make decisions is scientific. And most importantly, this is not limited to the technical workers. We'll only be successful if this mentality is adopted across the entire org chart.

All the technical and organizational goals of an analytics transformation fall out of this overarching goal. For example, it's unfair to ask your marketing team to generate and test hypotheses if they don't have access to data and the ability to rapidly roll out marketing campaigns. You can't ask your designers to be scientific if you don't provide the ability to A/B test their designs.
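At its simplest, an A/B test reduces to a significance check on conversion counts. The sketch below uses a chi-squared test on invented numbers; it is an illustration of the idea, not a prescription for how your team should analyse experiments:

```python
from scipy.stats import chi2_contingency

# Hypothetical results: conversions vs. non-conversions for two designs.
table = [
    [120, 1880],  # design A: 120 of 2,000 visitors converted
    [158, 1842],  # design B: 158 of 2,000 visitors converted
]

chi2, p_value, dof, expected = chi2_contingency(table)
print(f"p-value: {p_value:.4f}")
if p_value < 0.05:
    print("Evidence the designs convert at different rates.")
else:
    print("No significant difference yet; keep collecting data.")
```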

Adopting a scientific culture entails all the technical and organizational changes that we're used to hearing about, and it places those changes into context, making them understandable. For example, it's commonly suggested that an organization should integrate its data science team with the other teams across the business. This is perfectly true, but why should we do this?

A common but inadequate answer is that the data scientists need to understand the business context of their work, and the other people need to be able to take advantage of data science expertise. This is true, so far as it goes. But it misses the real point. You won't get any benefits if the team doesn't adopt a scientific mindset. If they don't change their methods to emphasize the quick generation and testing of hypotheses, the business will not enjoy the benefits that data science has to offer. With this in mind, there's a new reason for integrating data science into other teams: to infect teams with the data-driven, scientific mindset that you (hopefully) have within the data science team.

When we think about the goal of creating a scientific culture, we can see some of the more subtle changes that have to occur. The most important of these changes, in my opinion, is that incentives have to be realigned. Returning to our previous example, if we're developing customer personas, we have to reward people for testing their ideas and gathering data. They'll be doing a good job if they incrementally improve these personas over time based on measurements. Contrast that with the usual way of doing things: people would be rewarded for rolling out marketing campaigns based on polished and detailed descriptions of customer personas, a distinctly unscientific approach.

Most analytics transformations fail because a new data science team is simply bolted onto the organization without being given the support it needs. In other cases, the data science team gets the right level of technical and non-technical support, but the effort still fails because of a lack of understanding of the business. But the transformation can also fail even when it seems like the business has made all the changes necessary for success.

In my experience, this latter kind of failure is especially mysterious and frustrating for everyone. The failure seems inexplicable because it feels like the business did everything right. But thinking about your analytics transformation in terms of effecting a cultural change helps put into focus the rest of the changes that the business is undergoing. All the rest, including organizational, technological, and other changes should be understood as means toward the end of creating a more scientific culture.

Zac Ernst is Head of Data Science at insurance tech startup Clearcover.


AI Fueling The Oil And Gas Industry: Interview With Tim Custer At Apache – Forbes

In industries where data is key to gaining competitive advantage, artificial intelligence and machine learning have become necessities. This is most definitely the case in the oil and gas industry, which ebbs and flows over time as market demand waxes and wanes for critical resources we've come to depend on.

In a recent AI Today podcast episode, Tim Custer, Senior Vice President of North America land, business development and real estate at Apache, a major energy firm, shared how AI is impacting the way the energy business operates. Having served as a land manager for the past ten years, Custer described how closely tied the oil and gas sector is to real estate and traditional non-energy businesses, and the role that machine learning and AI are playing in greatly changing the way the energy industry deals with documents.

According to Custer, AI and machine learning are extracting valuable data from unstructured data. The oil and gas industry is particularly dependent on an intricate set of processes and document-centric workflows for land leases. Gas leases are vital to the energy industry, as they determine legal rights and claims to an oil or gas deposit while regulating the trade and extraction of those resources. At Apache, Custer notes, there are around 60,000 paper and document-centric leases, which can vary in length from just two pages to over fifty. Moreover, there are provisions on each page that must be located and interpreted every time an inquiry is made on a lease. This task can prove quite laborious, with the added task of finding the correct hardcopy lease to begin with.

The first step to wrangling control of these leases is to digitize the documents so that machines can understand them. Apache has succeeded in digitizing the majority of its gas leases using optical character recognition (OCR) and natural language processing (NLP). The company can now search these documents for not only the required lease but the provision within it in a matter of seconds. This not only speeds up searching at Apache but also saves time for teams in need of specific provisions for their projects. Custer describes the optimization of the process as grouping provisions with like wording across vast amounts of data. These digitization and NLP systems ensure higher data integrity by increasing accuracy and removing human interpretation.
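Custer does not detail Apache's implementation, but grouping provisions with like wording is the kind of task a standard text-similarity technique can illustrate. In the hypothetical sketch below (the clause text is invented), provisions are scored by TF-IDF cosine similarity, so the two near-duplicate consent clauses score far higher with each other than with the unrelated royalty clause:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Invented OCR'd clauses; real input would come from the digitization pipeline.
provisions = [
    "Lessee shall not assign this lease without written consent of Lessor.",
    "This lease may not be assigned absent the Lessor's written consent.",
    "Lessee shall pay a royalty of one-eighth of all oil produced and saved.",
]

vectors = TfidfVectorizer(stop_words="english").fit_transform(provisions)
similarity = cosine_similarity(vectors)

# High off-diagonal scores mark provisions with "like wording".
print(similarity.round(2))
```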

One curious particularity of these leases is that they are often old, written by hand in dated calligraphic styles; some of the later documents were typewritten. As such, there is a lot of variability in legibility, fonts, spacing, and overall document quality. Apache has applied machine learning to complete data organizing and searching more effectively and quickly than would otherwise be possible with humans having to read and process each document. Custer notes that there are a variety of document types that must be considered, such as letters, correspondence, or internal memos attached to the gas lease itself. The AI-enabled systems allow for significantly improved organization of this additional information thanks to their ability to classify and categorize the documents. In addition to higher efficiency and effectiveness, a digitization and ML-based approach also eliminates the need to store documents on hand in file cabinets. Instead, these documents, once scanned, can be moved to long-term archival storage for use only when necessary as backup.

The energy industry is heavily regulated, and this can pose a challenge when implementing new technology. Custer, however, sees AI realistically being applied to many unique use cases he envisions for the future. In particular, Custer notes the relative inefficiency of logistics and management in the energy industry. Technological advancements have already been made within the industry, he says, giving examples such as seismic imaging that can scan for underground reservoirs and drills that can bore both vertically and horizontally into the ground. Custer recognizes these applications as huge technology advancements. However, he notes that there has been minimal advancement in his own domain of land management and business development over the years, and that people were apprehensive to begin with but have slowly warmed up to the idea of AI within the energy industry. He adds that this acceptance is particularly prominent where provisions and leases are involved, thanks to the technology's time-saving and file-organizing abilities. Apache, as well as other energy firms, is increasingly welcoming these continual technological advancements and enjoying the cost- and time-saving benefits.

Custer also elaborates on the time-saving aspect of these AI systems, in particular when applied to a provision known as the "consent to assign." This provision sits within a contractual obligation that determines whether ownership can be transferred. Custer notes that a consent-to-assign provision can take hours to review manually, while AI-enabled systems can shorten the process to a matter of minutes.

In general, Custer believes that this is just the tip of the iceberg with regard to the ways AI can dramatically impact the energy industry. He states that there are many possible advancements to be made that can be learned from other industries, such as finance, healthcare, and manufacturing, by looking at similar use cases where documents, processes, and data can be effectively leveraged and optimized. Custer notes just how much more efficient data analysis will become and that our ability to extract valuable information from data will only improve as time goes on.


Want To Learn Keras? Here Are 8 Free Resources – Analytics India Magazine

A deep learning library in Python, Keras is an API designed to minimise the number of user actions required for common use cases. It is one of the most used deep learning frameworks among developers, and owes its popularity to how easy it makes running new experiments, its speed, and the freedom it gives developers to explore a lot of ideas. Built on top of TensorFlow 2.0, Keras is an industry-strength framework that can scale to large clusters of GPUs or an entire TPU pod.

There is no denying that Keras is used extensively across the machine learning workflow, from data management to hyperparameter tuning to deployment. Its ease of use makes it a deep learning solution of choice among researchers, professionals and students alike.
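To give a flavour of the code these resources teach, here is a minimal sketch of defining and training a small Keras model on synthetic data. It is deliberately simple and illustrates only the basic Sequential API:

```python
import numpy as np
from tensorflow import keras

# Toy data: 1,000 samples with 20 features and a binary label.
X = np.random.rand(1000, 20)
y = (X.sum(axis=1) > 10).astype(int)

# A small fully connected network built with the Sequential API.
model = keras.Sequential([
    keras.layers.Dense(32, activation="relu", input_shape=(20,)),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=32, validation_split=0.2)
```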

Here we list 8 free resources which will help you get hands-on exposure to one of the most popular libraries.

1| Introduction to Keras for Engineers: This online blog by F Chollet, the creator of Keras, is the best way to get started with the library. In this blog, he explains the nitty-gritty of using Keras to build real-world machine learning solutions. It is highly useful for machine learning engineers who are looking to use Keras to build real deep-learning powered products. The guide serves as the first introduction to core Keras API concepts. A similar guide is available for machine learning researchers which helps with more complex applications in computer vision and NLP.

2| Learn through code on GitHub: One of the best options for learning Keras free of charge is to reverse engineer the sample code on GitHub. This directory of tutorials and open-source code repositories by F Chollet helps in working with Keras, the Python deep learning library.

3| Deep Learning Fundamentals with Keras by edX: This free course offered by IBM is best for someone new to deep learning. It gives an introduction to the field, eventually helping you develop your first deep learning model using Keras. It covers some of the exciting applications of deep learning, the basics of neural networks, building deep learning models and more. It gives detailed insight into how to use Keras to build, train and test deep learning models. While the course is free, it provides an option to get a verified certificate, which is paid.

4| Advanced Deep Learning with Keras by Datacamp: This course provides an overview of solving a wide range of problems using the Keras functional API. Starting with simple, multi-layer networks, it progresses to more complicated architectures. It covers how to build models with multiple inputs and a single output, as well as advanced topics such as category embeddings and multiple-output networks. The first session, which covers the basics of the Keras functional API, is free. It includes building a simple functional network using functional building blocks, fitting it to data, and making predictions.

5| Introduction to Deep Learning & Neural Networks with Keras by Coursera: This free course covers an introduction to the field of deep learning and building deep learning models. On completion of this course, learners will be able to describe neural networks, understand unsupervised deep learning models such as autoencoders, understand supervised deep learning models, build deep learning models and networks using the Keras library, and more.

6| Applied AI with DeepLearning by Coursera: This advanced course, offered as part of the IBM Advanced Data Science Certificate, gives insights into deep learning models used by experts in NLP, computer vision, time series analysis, and many other disciplines. After covering the fundamentals of linear algebra and neural networks, the course takes you through popular deep learning frameworks such as Keras, TensorFlow, PyTorch, DeepLearning4J and Apache SystemML, with Keras making up most of the course.

7| Learn Keras: Build 4 Deep Learning Applications by Udemy: This free course by Udemy covers the implementation of CNNs and deep neural networks, Keras syntax, different deep learning algorithms and more. It is designed to get you acquainted with deep learning using Keras. The course covers different machine learning algorithms and their use cases.

8| YouTube tutorial by Edureka: This free tutorial on YouTube helps you get started with Keras. Aimed at beginners, the video runs through creating deep learning models using Keras in Python. This quick and insightful tutorial covers the basics of how Keras works along with interesting use cases.



Even the Best AI Models Are No Match for the Coronavirus – WIRED

The stock market appears strangely indifferent to Covid-19 these days, but that wasn't true in March, as the scale and breadth of the crisis hit home. By one measure, it was the most volatile month in stock market history; on March 16, the Dow Jones average fell almost 13 percent, its biggest one-day decline since 1987.

To some, the vertigo-inducing episode also exposed a weakness of quantitative (or quant) trading firms, which rely on mathematical models, including artificial intelligence, to make trading decisions.

Some prominent quant firms fared particularly badly in March. By mid-month, some Bridgewater Associates funds had fallen 21 percent for the year to that point, according to a statement posted by the company's co-chairman, Ray Dalio. Vallance, a quant fund run by DE Shaw, reportedly lost 9 percent through March 24. Renaissance Technologies, another prominent quant firm, told investors that its algorithms misfired in response to the month's market volatility, according to press accounts. Renaissance did not respond to a request for comment. A spokesman for DE Shaw could not confirm the reported figure.

The turbulence may reflect a limit of modern-day AI, which is built around finding and exploiting subtle patterns in large amounts of data. Just as the algorithms grocers use to stock shelves were flummoxed by consumers' sudden obsession with hand sanitizer and toilet paper, those that help hedge funds wring profit from the market were confused by the sudden volatility of panicked investors.

In finance, as in all things, the best AI algorithm is only as good as the data it's fed.

Andrew Lo, a professor at MIT and the founder and chairman emeritus of AlphaSimplex, a quantitative hedge fund based in Cambridge, Massachusetts, says quantitative trading strategies have a simple weakness. "By definition a quantitative trading strategy identifies patterns in the data," he says.

Lo notes that March bears similarities to a meltdown among quantitative firms in 2007, in the early days of the financial crisis. In a paper published shortly after that mini-crash, Lo concluded that the synchronized losses among hedge funds betrayed a systemic weakness in the market. "What we saw in March of 2020 is not unlike what happened in 2007, except it was faster, it was deeper, and it was much more widespread," Lo says.


Zura Kakushadze, president of Quantigic Solutions, describes the March episode as a "quant bust" in an analysis of the events posted online in April.

Kakushadze's paper looks at one form of statistical arbitrage, a common method of mining market data for patterns that quant funds exploit through many frequent trades. He points out that even quant funds employing a dollar-neutral strategy, meaning they bet equally on stocks rising and falling, did poorly in the rout.
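For readers unfamiliar with the term, a dollar-neutral book balances the money bet on rising stocks against the money bet on falling ones. The toy sketch below, with invented model scores, shows one simple way such weights can be constructed; real statistical-arbitrage systems are vastly more elaborate:

```python
import numpy as np

# Invented model scores: positive = expected to rise, negative = to fall.
scores = np.array([0.9, 0.4, -0.1, -0.3, -0.4])

# Demean so long and short dollars offset, then scale gross exposure to 1.
weights = scores - scores.mean()
weights /= np.abs(weights).sum()

print("Net exposure  :", round(weights.sum(), 10))          # ~0 (neutral)
print("Gross exposure:", round(np.abs(weights).sum(), 10))  # 1.0
```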

In an interview, Kakushadze says the bust shows AI is no panacea during extreme market volatility. "I don't care whether you're using AI, ML, or anything else," he says. "You're gonna break down no matter what."

In fact, Kakushadze suggests that quant funds using overly complex and opaque AI models may have suffered worse than others. Deep learning, a form of AI that has taken the tech world by storm in recent years, involves feeding data into neural networks that are difficult to audit. "Machine learning, and especially deep learning, can have a large number of often obscure (uninterpretable) parameters," he writes.

Ernie Chan, managing member of QTS Capital Management, and the author of several books on machine trading, agrees that AI is no match for a rare event like the coronavirus.

"It's easy to train a system to recognize cats in YouTube videos because there are millions of them," Chan says. In contrast, only a few such large swings in the market have occurred before. "You can count [these huge drops] on one hand. So it's not possible to use machine learning to learn from those signals."

Still, some quant funds did a lot better than others during March's volatility. The Medallion Fund operated by Renaissance Technologies, which is restricted to employees' money, has reportedly seen 24 percent gains for the year to date, including a 9 percent lift in March.


OpenAIs fiction-spewing AI is learning to generate images – MIT Technology Review

At its core, GPT-2 is a powerful prediction engine. It learned to grasp the structure of the English language by looking at billions of examples of words, sentences, and paragraphs, scraped from the corners of the internet. With that structure, it could then manipulate words into new sentences by statistically predicting the order in which they should appear.

So researchers at OpenAI decided to swap the words for pixels and train the same algorithm on images in ImageNet, the most popular image bank for deep learning. Because the algorithm was designed to work with one-dimensional data (i.e., strings of text), they unfurled the images into a single sequence of pixels. They found that the new model, named iGPT, was still able to grasp the two-dimensional structures of the visual world. Given the sequence of pixels for the first half of an image, it could predict the second half in ways that a human would deem sensible.
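OpenAI's code is not reproduced in the article, but the data-preparation idea is simple enough to sketch. The toy example below flattens a tiny image into a one-dimensional pixel sequence and frames the task the way the researchers describe: given the first half of the sequence, predict the rest, one pixel at a time:

```python
import numpy as np

# A tiny 4x4 grayscale "image" with pixel intensities 0-255.
image = np.arange(16, dtype=np.uint8).reshape(4, 4)

# Unfurl the 2D grid into a 1D sequence, just as text models see tokens.
sequence = image.flatten()  # shape (16,)

# Autoregressive framing: the first half is the "prompt"; the model is
# trained to predict each remaining pixel in order.
context = sequence[:8]   # top half of the image
targets = sequence[8:]   # pixels to be predicted
print(context, "->", targets)
```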

In the examples OpenAI has published, the left-most column is the input, the right-most column is the original, and the middle columns are iGPT's predicted completions.

The results are startlingly impressive and demonstrate a new path for using unsupervised learning, which trains on unlabeled data, in the development of computer vision systems. While early computer vision systems in the mid-2000s trialed such techniques before, they fell out of favor as supervised learning, which uses labeled data, proved far more successful. The benefit of unsupervised learning, however, is that it allows an AI system to learn about the world without a human filter, and significantly reduces the manual labor of labeling data.

The fact that iGPT uses the same algorithm as GPT-2 also shows its promising adaptability. This is in line with OpenAI's ultimate ambition to achieve more generalizable machine intelligence.

At the same time, the method presents a concerning new way to create deepfake images. Generative adversarial networks, the most common category of algorithms used to create deepfakes in the past, must be trained on highly curated data. If you want to get a GAN to generate a face, for example, its training data should only include faces. iGPT, by contrast, simply learns enough of the structure of the visual world across millions and billions of examples to spit out images that could feasibly exist within it. While training the model is still computationally expensive, offering a natural barrier to its access, that may not be the case for long.

OpenAI did not grant an interview request, but in an internal policy team meeting that MIT Technology Review attended last year, its policy director, Jack Clark, mused about the future risks of GPT-style generation, including what would happen if it were applied to images. "Video is coming," he said, projecting where he saw the field's research trajectory going. "In probably five years, you'll have conditional video generation over a five- to 10-second horizon." He then proceeded to describe what he imagined: you'd feed in a photo of a politician and an explosion next to them, and it would generate a likely output of that politician being killed.

Update: This article has been updated to remove the name of the politician in the hypothetical scenario described at the end.


Weird AI illustrates why algorithms still need people – The Next Web

These days, it can be very hard to determine where to draw the boundaries around artificial intelligence. What it can and can't do is often not very clear, nor is where its future is headed.

In fact, there's also a lot of confusion surrounding what AI really is. Marketing departments have a tendency to somehow fit AI into their messaging and rebrand old products as AI and machine learning. The box office is filled with movies about sentient AI systems and killer robots that plan to conquer the universe. Meanwhile, social media is filled with examples of AI systems making stupid (and sometimes offensive) mistakes.

"If it seems like AI is everywhere, it's partly because artificial intelligence means lots of things, depending on whether you're reading science fiction or selling a new app or doing academic research," writes Janelle Shane in You Look Like a Thing and I Love You, a book about how AI works.

Shane runs the famous blog AI Weirdness, which, as the name suggests, explores the weirdness of AI through practical and humorous examples. In her book, Shane taps into her years-long experience and takes us through many examples that eloquently show what AI, or more specifically deep learning, is and what it isn't, and how we can make the most of it without running into the pitfalls.

While the book is written for the layperson, it is definitely a worthy read for people who have a technical background, and even for machine learning engineers who don't know how to explain the ins and outs of their craft to less technical people.

In her book, Shane does a great job of explaining how deep learning algorithms work. From stacking up layers of artificial neurons, feeding in examples, backpropagating errors, using gradient descent, and finally adjusting the network's weights, Shane takes you through the training of deep neural networks with humorous examples such as rating sandwiches and coming up with knock-knock "who's there?" jokes.

All of this helps in understanding the limits and dangers of current AI systems, which have nothing to do with super-smart terminator bots that want to kill all humans or software systems planning sinister plots. "[Those] disaster scenarios assume a level of critical thinking and a humanlike understanding of the world that AIs won't be capable of for the foreseeable future," Shane writes. She uses the same context to explain some of the common problems that occur when training neural networks, such as class imbalance in the training data, algorithmic bias, overfitting, interpretability problems, and more.

Instead, the threat of current machine learning systems, which she rightly describes as narrow AI, is to consider them too smart and rely on them to solve problems broader than their scope of intelligence. "The mental capacity of AI is still tiny compared to that of humans, and as tasks become broad, AIs begin to struggle," she writes elsewhere in the book.

AI algorithms are also very unhuman and, as you will see in You Look Like a Thing and I Love You, they often find ways to solve problems that are very different from how humans would do it. They tend to ferret out the sinister correlations that humans have left in their wake when creating the training data. And if there's a sneaky shortcut that will get them to their goals (such as pausing a game to avoid dying), they will use it unless explicitly instructed to do otherwise.

"The difference between successful AI problem solving and failure usually has a lot to do with the suitability of the task for an AI solution," Shane writes in her book.

As she delves into AI weirdness, Shane sheds light on another reality of deep learning systems: they can sometimes be a needlessly complicated substitute for a commonsense understanding of the problem. She then takes us through a lot of other overlooked disciplines of artificial intelligence that can prove to be equally efficient at solving problems.

In You Look Like a Thing and I Love You, Shane also takes care to explain some of the problems created by the widespread use of machine learning in different fields. Perhaps the best known is algorithmic bias, the intricate imbalances in AI's decision-making that lead to discrimination against certain groups and demographics.

There are many examples where AI algorithms, in their own weird ways, discover the racial and gender biases of humans and replicate them in their decisions. What makes this more dangerous is that they do it unknowingly and in an uninterpretable fashion.

"We shouldn't see AI decisions as fair just because an AI can't hold a grudge. Treating a decision as impartial just because it came from an AI is known sometimes as mathwashing or bias laundering," Shane warns. "The bias is still there, because the AI copied it from its training data, but now it's wrapped in a layer of hard-to-interpret AI behavior."

This mindless replication of human biases becomes a self-reinforcing feedback loop that can become very dangerous when unleashed in sensitive fields such as hiring decisions, criminal justice, and loan applications.

"The key to all this may be human oversight," Shane concludes. "Because AIs are so prone to unknowingly solving the wrong problem, breaking things, or taking unfortunate shortcuts, we need people to make sure their brilliant solution isn't a head-slapper. And those people will need to be familiar with the ways AIs tend to succeed or go wrong."

Shane also explores several examples in which failing to acknowledge the limits of AI has resulted in humans being enlisted to solve problems that AI can't. Also known as "The Wizard of Oz effect," this invisible use of often-underpaid human bots is becoming a growing problem as companies try to apply deep learning to anything and everything and are looking for an excuse to put an AI-powered label on their products.

"The attraction of AI for many applications is its ability to scale to huge volumes, analyzing hundreds of images or transactions per second," Shane writes. "But for very small volumes, it's cheaper and easier to use humans than to build an AI."

All the egg-shell-and-mud sandwiches, the cheesy jokes, the senseless cake recipes, the mislabeled giraffes, and all the other weird things AI does bring us to a very important conclusion. "AI can't do much without humans," Shane writes. A far more likely vision for the future, even one with the widespread use of advanced AI technology, is one in which AI and humans collaborate to solve problems and speed up repetitive tasks.

While we continue the quest toward human-level intelligence, we need to embrace current AI as what it is, not what we want it to be. "For the foreseeable future, the danger will not be that AI is too smart but that it's not smart enough," Shane writes. "There's every reason to be optimistic about AI and every reason to be cautious. It all depends on how well we use it."

This article was originally published by Ben Dickson on TechTalks, a publication that examines trends in technology, how they affect the way we live and do business, and the problems they solve. But we also discuss the evil side of technology, the darker implications of new tech and what we need to look out for. You can read the original article here.



The Difference Between AI ML and Deep Learning And Why They Matter – AiThority

Technology is developing today at a pace that's never been seen before. New advancements and breakthroughs happen far more readily than at any time in the past. One of the most talked-about areas of cutting-edge tech is that of artificial intelligence (AI).

AI is driving the digital transformation of organizations in all manner of niches. So wide-ranging are the applications of AI that you've probably already interacted with an example of the tech today. Despite AI's growing ubiquity, though, it's still not an area that's readily understood.

One of the main reasons that it's tricky to get your head around AI is that the field has its own lexicon of phrases. Video conferencing software may get described as "AI-driven" or as "using machine or deep learning." That could tempt you to choose the solution. But do you truly understand what it all means?

If the answer's no, you'll want to read on. You'll also be by no means alone in giving that response. The terminology of AI is far from straightforward. Get to grips with AI, machine learning, and deep learning, though, and you're well on your way.

When many people hear about AI, their first thoughts may still be of science fiction films of years gone by. There was even a blockbuster named, simply, AI. The truth is, though, that artificial intelligence has been a part of real life for years now.

As a phrase, AI refers to any technology which works to mimic human intelligence. Some of the hallmarks of human intelligence that AI aims to replicate include:

A common misconception is that a solution or piece of software can either be or use AI. Such tools are better described as displaying or exhibiting AI. They're artificial (they're machines, after all) and are displaying intelligence.

Aside from migration to the Cloud, adoption of processes that exhibit AI is probably the most widespread tech trend of recent years. Everything from a web meeting platform to an analytics suite may adopt some elements of AI.


Two of the most widespread AI processes are machine learning and deep learning. These are often the innovations that so-called AI-driven solutions leverage. That raises the obvious question: what is machine learning?

As mentioned, the first thing to understand about machine learning is that it is an example of AI. It's one particular process by which an artificial system can display intelligence. Put simply, and as the name suggests, it's when a machine can learn.

By "machine" in this context, we mean an algorithm. By "learning," we mean taking an (ideally large) volume of data and using it for specific pre-defined tasks. Typically, machine learning algorithms analyze sets of data and identify patterns. They then use those patterns to generate conclusions or take defined actions.

Machine learning algorithms get smarter as they go along. The more data the algorithms analyze, the better their predictions, conclusions, or actions. A straightforward example is an algorithm used by video or music streaming services.

Those algorithms collect data on the choices that users make. That means things like which artists people listen to or the genre of programs they watch. They then use the data to predict and recommend new bands or shows that users may like. The more data the algorithms process, the better they can forecast what a user will enjoy.
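As a toy illustration of the idea (not how production streaming services actually work), the sketch below recommends artists based on how often they co-occur in a handful of invented listening histories:

```python
from collections import Counter
from itertools import combinations

# Invented listening histories: the artists each user plays.
histories = [
    {"Artist A", "Artist B", "Artist C"},
    {"Artist A", "Artist B"},
    {"Artist B", "Artist C", "Artist D"},
]

# Count how often each pair of artists appears in the same history.
pair_counts = Counter()
for history in histories:
    for pair in combinations(sorted(history), 2):
        pair_counts[pair] += 1

def recommend(artist: str, top_n: int = 2) -> list:
    """Suggest the artists most often co-listened with `artist`."""
    scores = Counter()
    for (a, b), count in pair_counts.items():
        if artist == a:
            scores[b] += count
        elif artist == b:
            scores[a] += count
    return [name for name, _ in scores.most_common(top_n)]

print(recommend("Artist A"))  # ['Artist B', 'Artist C']
```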

The applications of machine learning go far beyond streaming and entertainment, but we'll talk about that later. First, we need to discuss deep learning and how it's different.

We must begin our definition of deep learning in a similar way to that of machine learning. In this case, it's vital to understand that deep learning is machine learning AND an example of AI. In many ways, it's the next evolution of machine learning.

Machine learning algorithms deal with structured and labeled data.

They analyze it, create more data, and use that to generate conclusions or predictions. When outcomes aren't as desired, the algorithms can get retrained via human intervention. Compared to the human brain, machine learning algorithms are simplistic.

Deep learning is the process that's attempting to close the gap between artificial and human intelligence further.

Rather than a hard-coded algorithm, deep learning utilizes many layers of interconnected artificial neurons. That's to better replicate our brains, which combine billions of neurons.

Systems of deep learning algorithms are known as artificial neural networks. They're literal, if far simpler, copies of the human brain. Being built as they are allows deep learning networks to do far more than machine learning algorithms.

Such networks don't need structured data to operate. They're able to make sense of unstructured and unlabeled data sets. What's more, with enough training they can make sense of far more complex information, and can do so at the first time of asking.

One of the areas to which deep learning is crucial is that of self-driving cars. It's the ability of artificial neural networks to assess and process complex information that makes such things possible. They help cars understand the environment around them.

Autonomous vehicles can recognize road signs, pedestrians, other road users, and more. They can also spot patterns in how those things are behaving. Doing so is crucial in the case of pedestrians and other vehicles. It allows the cars to react accordingly and is all down to deep learning.

Fascinating as AI and its elements undoubtedly are, why should you care? It's a fair question with a definitive answer. You must care about AI, machine, and deep learning because they're impacting marketing in a big way.

The MarTech solutions of the future and the present will almost all exhibit elements of AI.

If you're in the SaaS niche or involved in other marketing, you'll soon come across AI-enabled solutions. That's assuming you haven't already.

Machine learning is already getting leveraged in a broad array of MarTech solutions. From webinar services to chatbots, the element of AI is making all kinds of software or tools smarter.

We've already talked about how machine learning gets utilized for recommendations. That extends to products, as well as artists or TV shows. It's the basis of the kind of "you might also like" sections that you often see on e-commerce websites.

That's far from the only marketing application of machine learning. With a machine learning algorithm, you can make better sense of your customer data, and do so in a fraction of the time.

Say, for instance, that you want to run a targeted campaign. One focused on a specific sector of your target audience. Machine learning allows you to segment the audience with greater accuracy.

An algorithm can use the data from your current CRM or other sources to produce sample personas. It can then ID those leads in your email list or database who match the characteristics of the personas. This can be like a silver bullet for customer acquisition.
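A hedged sketch of that segmentation step appears below. The file name, columns, and the choice of four clusters are assumptions for illustration; it uses k-means, one common way to derive candidate personas from CRM data:

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical CRM export with purely numeric behavioural features.
customers = pd.read_csv("crm_export.csv")
cols = ["avg_order_value", "orders_per_year", "days_since_last_order"]

# Scale the features so no single one dominates, then form 4 personas.
scaled = StandardScaler().fit_transform(customers[cols])
customers["persona"] = KMeans(n_clusters=4, random_state=0).fit_predict(scaled)

# Inspect each persona's averages to decide which segment to target.
print(customers.groupby("persona")[cols].mean())
```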

Chatbots, too, can use machine learning to aid an organization's marketing. Many companies now implement a chatbot on their website. They're those little chat windows that pop up to see if you need help when you load a webpage.

Thanks to machine learning, visitors to a site can hold a full and useful conversation with the chatbot. The algorithms behind the tool get trained to recognize queries. And to provide the right responses. From a marketing point of view, that may mean pointing a site visitor to the correct product or service.

Chatbots and other channels of written communication are also hotbeds for deep learning in MarTech. That's thanks to the possibilities afforded by natural language processing (NLP). NLP is a further aspect of AI and is made possible by the artificial neural networks of deep learning.

Human language is incredibly complex. Just think back to the maddening grammar rules and exceptions of high school English. For computers or machines, the nuances of language have long been impossible to decipher. That, though, is no longer the case.

Via deep learning and NLP, algorithms can now grasp meaning and context in language. That has profound implications for marketing as well as customer service, where the tech is more often applied. Take, for instance, the two vital aspects of marketing that are SEO and content creation.

Keyword research is a crucial element of SEO. It's about recognizing the words and phrases your target audience searches for. NLP can supercharge this process. You can use algorithms to generate more accurate keywords, all through using existing written communications from your customers.

In content marketing, it's vital that all content is useful and interesting to its audience. NLP can help in this regard, too. With NLP, you can better recognize what's important to your customers. You might, for instance, use an algorithm to identify common topics on a company forum. That shows you your customers' interests and surfaces useful topic ideas for content.
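As an illustrative sketch of that forum example, the code below applies a simple topic model (latent Dirichlet allocation) to a handful of invented posts to surface common themes; a real analysis would use far more text and more careful preprocessing:

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Invented forum posts; real input would be scraped from the forum.
posts = [
    "How do I reset my password after the latest update?",
    "Password reset email never arrives, any fix?",
    "Loving the new dashboard layout in the update!",
    "The updated dashboard makes reporting so much faster.",
]

vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(posts)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)

# Show the most heavily weighted words for each discovered topic.
terms = vectorizer.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top = [terms[j] for j in topic.argsort()[-4:][::-1]]
    print(f"Topic {i}: {', '.join(top)}")
```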

That's purely the tip of the iceberg in terms of deep learning's potential applications. The area remains a comparatively young one, and it's sure to be explored and utilized further in years to come.

Artificial intelligence is making its presence felt across industries and disciplines. The broad array of processes under the umbrella of AI are revolutionizing fields. That includes, but is by no means limited to, MarTech.

If you're going to keep up with the modern trends of marketing, then, you must understand all things AI. Hopefully, now you've read this guide, you've got a good grounding. You should at least know your machine from your deep learning, and understand how they're both examples of AI.

(To share your insights on AI in Martech, please write to us at sghosh@martechseries.com)



New stellar stream, born outside the Milky Way, discovered with machine learning – Penn: Office of University Communications

Researchers have discovered a new cluster of stars in the Milky Way disk, the first evidence of this type of merger with another dwarf galaxy. Named after Nyx, the Greek goddess of night, the discovery of this new stellar stream was made possible by machine learning algorithms and simulations of data from the Gaia space observatory. The finding, published in Nature Astronomy, is the result of a collaboration between researchers at Penn, the California Institute of Technology, Princeton University, Tel Aviv University, and the University of Oregon.

The Gaia satellite is collecting data to create high-resolution 3D maps of more than one billion stars. From its position at the L2 Lagrange point, Gaia can observe the entire sky, and these extremely precise measurements of star positions have allowed researchers to learn more about the structures of galaxies, such as the Milky Way, and how they have evolved over time.

In the five years that Gaia has been collecting data, astronomer and study co-author Robyn Sanderson of Penn says, it has shown that galaxies are much more dynamic and complex than previously thought. With her interest in galaxy dynamics, Sanderson is developing new ways to model the Milky Way's dark matter distribution by studying the orbits of stars. For her, the massive amount of data generated by Gaia is both a unique opportunity to learn more about the Milky Way and a scientific challenge that requires new techniques, which is where machine learning comes in.

"One of the ways in which people have modeled galaxies has been with hand-built models," says Sanderson, referring to the traditional mathematical models used in the field. "But that leaves out the cosmological context in which our galaxy is forming: the fact that it's built from mergers between smaller galaxies, or that the gas that ends up forming stars comes from outside the galaxy." Now, using machine learning tools, researchers like Sanderson can instead recreate the initial conditions of a galaxy on a computer to see how structures emerge from fundamental physical laws, without having to specify the parameters of a mathematical model.

The first step in being able to use machine learning to ask questions about galaxy evolution is to create mock Gaia surveys from simulations. These simulations include details on everything that scientists know about how galaxies form, including the presence of dark matter, gas, and stars. They are also among the largest computer models of galaxies ever attempted. The researchers used three different simulations of galaxies to create nine mock surveys, three from each simulation, with each mock survey containing 2-6 billion stars generated using 5 million particles. The simulations took months to complete, requiring 10 million CPU hours to run on some of the world's fastest supercomputers.

The researchers then trained a machine learning algorithm on these simulated datasets to learn how to recognize stars that came from other galaxies based on differences in their dynamical signatures. To confirm that their approach was working, they verified that the algorithm was able to spot other groups of stars that had already been confirmed as coming from outside the Milky Way, including the Gaia Sausage and the Helmi stream, two dwarf galaxies that merged with the Milky Way several billion years ago.
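The team's actual pipeline is far more sophisticated, but the core supervised-learning step can be sketched with toy data. Below, a classifier is trained on invented kinematic features in which accreted stars lag the disk's rotation and have hotter (wider) velocity distributions; the setup is physically motivated but entirely synthetic:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 10_000

# Synthetic stand-in for a mock Gaia catalog (velocities in km/s).
v_phi = np.concatenate([rng.normal(220, 25, n // 2),   # in-situ disk stars
                        rng.normal(120, 60, n // 2)])  # accreted stars
v_r = np.concatenate([rng.normal(0, 30, n // 2),
                      rng.normal(0, 90, n // 2)])
X = np.column_stack([v_phi, v_r])
y = np.concatenate([np.zeros(n // 2), np.ones(n // 2)])  # 1 = accreted

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)

# On real data, high-scoring stars become accretion candidates for
# follow-up, e.g. chemical-abundance checks.
scores = clf.predict_proba(X_te)[:, 1]
print("Held-out accuracy:", round(clf.score(X_te, y_te), 3))
print("Fraction flagged as likely accreted:", (scores > 0.9).mean())
```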

In addition to spotting these known structures, the algorithm also identified a cluster of 250 stars rotating with the Milky Way's disk towards the galaxy's center. The stellar stream, named Nyx by the paper's lead author Lina Necib, would have been difficult to spot using traditional hand-crafted models, especially since only 1% of the stars in the Gaia catalog are thought to originate from other galaxies. "This particular structure is very interesting because it would have been very difficult to see without machine learning," says Necib.

But machine learning approaches also require careful interpretation in order to confirm that any new discoveries aren't simply bugs in the code. This is why the simulated datasets are so crucial, since algorithms can't be trained on the same datasets that they are evaluating. The researchers are also planning to confirm Nyx's origins by collecting new data on the stream's chemical composition to see if this cluster of stars differs from ones that originated in the Milky Way.

For Sanderson and her team members who are studying the distribution of dark matter, machine learning also provides new ways to test theories about the nature of the dark matter particle and where it's distributed. It's a tool that will become especially important with the upcoming third Gaia data release, which will provide even more detailed information, allowing her group to more accurately model the distribution of dark matter in the Milky Way. And, as a member of the Sloan Digital Sky Survey consortium, Sanderson is also using the Gaia simulations to help plan future star surveys that will create 3D maps of the entire universe.

"The reason that people in my subfield are turning to these techniques now is because we didn't have enough data before to do anything like this. Now, we're overwhelmed with data, and we're trying to make sense of something that's far more complex than our old models can handle," says Sanderson. "My hope is to be able to refine our understanding of the mass of the Milky Way, the way that dark matter is laid out, and compare that to our predictions for different models of dark matter."

Despite the challenges of analyzing these massive datasets, Sanderson is excited to continue using machine learning to make new discoveries and gain new insights about galaxy evolution. "It's a great time to be working in this field. It's fantastic; I love it," she says.

Robyn Sanderson is an assistant professor in the Department of Physics and Astronomy in the School of Arts & Sciences at the University of Pennsylvania.

Gaia is a space observatory of the European Space Agency whose mission is to make the largest, most precise three-dimensional map of the Milky Way Galaxy by measuring the positions, distances, and motions of stars with unprecedented precision.

Supercomputers used for this research included Blue Waters at the National Center for Supercomputing Applications, NASA's High-End Computing facilities, and Stampede2 at the Texas Advanced Computing Center.
