Facebook, AWS team up to produce open-source PyTorch AI libraries, grad student says he successfully used GPT-2 to write his homework… – The Register

Roundup Hello, El Reg readers. If you're stuck inside and need some AI news to soothe your soul, here's our weekly machine-learning roundup.

Nvidia GTC virtual keynote coming to YouTube: Nvidia cancelled its annual GPU Technology Conference in Silicon Valley in March over the ongoing coronavirus pandemic. The keynote speech was promised to be screened virtually, and then that got canned, too. Now, it's back.

CEO Jensen Huang will present his talk on May 14 on YouTube at 0600 PT (1300 UTC). Yes, that's early for people on the US West Coast. And no, Jensen isn't doing it live at that hour: the video is prerecorded.

Still, graphics hardware and AI fans will probably want to keep an eye on the presentation. Huang is expected to unveil specs for a new GPU architecture, reportedly named the A100, which is expected to be more powerful than its Tesla V100 chips. You'll be able to watch the keynote when it comes out on Nvidia's YouTube channel, here.

Also, Nvidia has partnered up with academics at King's College London to release MONAI, an open-source AI framework for medical imaging.

The framework packages together tools to help researchers and medical practitioners process image data for computer vision models built with PyTorch. These include things like segmenting features in 3D scans or classifying objects in 2D.

"Researchers need a flexible, powerful and composable framework that allows them to do innovative medical AI research, while providing the robustness, testing and documentation necessary for safe hospital deployment," said Jorge Cardoso, chief technology officer of the London Medical Imaging & AI Centre for Value-based Healthcare. "Such a tool was missing prior to Project MONAI."

You can play with MONAI on GitHub here, or read more about it here.

New PyTorch libraries for ML production: Speaking of PyTorch, Facebook and AWS have collaborated to release a couple of open-source goodies for deploying machine-learning models.

There are now two new libraries: TorchServe and TorchElastic. TorchServe provides tools to manage and perform inference with PyTorch models. It can be used in any cloud service, and you can find the instructions on how to install and use it here.
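For the curious, here is a rough sketch of what querying a running TorchServe instance looks like once a model has been packaged and the server started; the model name "mymodel" and the image file below are hypothetical placeholders, not anything Facebook or AWS ship.

```python
# A rough sketch of querying a running TorchServe instance. It assumes a model
# has already been packaged with torch-model-archiver and the server started,
# roughly: torchserve --start --model-store model_store --models mymodel.mar
# The model name "mymodel" and the image file are hypothetical placeholders.
import requests

with open("kitten.jpg", "rb") as f:
    image_bytes = f.read()

# TorchServe's inference API listens on port 8080 by default.
response = requests.post(
    "http://127.0.0.1:8080/predictions/mymodel",
    data=image_bytes,
)
print(response.json())  # whatever the model's handler returns, e.g. labels and scores
```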

TorchElastic allows users to train large models over a cluster of compute nodes with Kubernetes. The distributed training means that even if some servers go down for maintenance or random network issues, the service isn't completely interrupted. It can be used on any cloud provider that supports Kubernetes. You can read how to use the library here.

"These libraries enable the community to efficiently productionize AI models at scale and push the state of the art on model exploration as model architectures continue to increase in size and complexity," Facebook said this week.

MIT stops working with blacklisted AI company: MIT has discontinued its five-year research collaboration with iFlyTek, a Chinese AI company the US government flagged as being involved in the ongoing persecution of Uyghur Muslims in China.

Academics at the American university made the decision to cut ties with the controversial startup in February. iFlyTek is listed alongside 27 other names on the US Bureau of Industry and Security's Entity List, which forbids American organizations from doing business with them without Uncle Sam's permission. Breaking the rules will result in sanctions.

"We take very seriously concerns about national security and economic security threats from China and other countries, and human rights issues," Maria Zuber, vice president of research at MIT, said, as first reported by Wired.

MIT entered a five-year deal with iFlyTek in 2018 to collaborate on AI research focused on human-computer interaction, speech recognition, and computer vision.

The relationship soured when it was revealed iFlyTek was helping the Chinese government build a mass automated voice recognition and monitoring system, according to the non-profit Human Rights Watch. That technology was sold to police bureaus in the provinces of Xinjiang and Anhui, where the majority of the Uyghur population in China resides.

OpenAI's GPT-2 writes university papers: A cheeky master's degree student admitted this week to using OpenAI's giant language model GPT-2 to help write his essays.

The graduate student, named only as Tiago, was interviewed by Futurism. We're told that although he passed his assignments using the machine-learning software, he said the achievement was down to failings within the business school rather than to the prowess of state-of-the-art AI technology.

In other words, his science homework wasn't too rigorously marked in this particular unnamed school, allowing him to successfully pass off machine-generated write-ups of varying quality as his own work. And GPT-2's output does vary in quality, depending on how you use it.

"You couldn't write an essay on science that could be anywhere near convincing using the methods that I used," he said. "Many of the courses that I take in business school wouldn't make it possible as well.

"However, some particular courses are less information-dense, and so if you can manage to write a few pages with some kind of structure and some kind of argument, you can get through. Its not that great of an achievement, I would say, for GPT-2.

Thanks to the Talk to Transformer tool, anyone can use GPT-2 on a web browser. Tiago would feed opening sentences to the model, and copy and paste the machine-generated responses to put in his essay.

GPT-2 is pretty convincing at first: it has a good grasp of grammar, and there is some level of coherency in its opening paragraphs when responding to a statement or question. Its output quality begins to fall apart, becoming incoherent or absurd, as it rambles in subsequent paragraphs. It also doesn't care about facts, which is why it won't be good as a collaborator for subjects such as history and science.
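If you want to poke at the model yourself without the web tool, a minimal sketch using the open-source Hugging Face transformers library (an assumption on our part; it is not what Tiago used) looks something like this:

```python
# A minimal sketch of prompting GPT-2 locally with the open-source Hugging Face
# transformers library (pip install transformers torch). This mirrors the
# workflow described above: feed an opening sentence, take the continuation.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "The main drivers of consumer behaviour in emerging markets are"
input_ids = tokenizer.encode(prompt, return_tensors="pt")

# Sampling keeps the output varied; longer continuations tend to drift, which
# matches the quality drop-off described above.
output = model.generate(
    input_ids,
    max_length=200,
    do_sample=True,
    top_k=50,
    top_p=0.95,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```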


Researchers say deep learning will power 5G and 6G cognitive radios – VentureBeat

For decades, amateur two-way radio operators have communicated across entire continents by choosing the right radiofrequency at the right time of day, a luxury made possible by having relatively few users and devices sharing the airwaves. But as cellular radios multiply in both phones and Internet of Things devices, finding interference-free frequencies is becoming more difficult, so researchers are planning to use deep learning to create cognitive radios that instantly adjust their radio frequencies to achieve optimal performance.

As explained by researchers with Northeastern University's Institute for the Wireless Internet of Things, the increasing varieties and densities of cellular IoT devices are creating new challenges for wireless network optimization; a given swath of radio frequencies may be shared by a hundred small radios designed to operate in the same general area, each with individual signaling characteristics and variations in adjusting to changed conditions. The sheer number of devices reduces the efficacy of fixed mathematical models when predicting what spectrum fragments may be free at a given split second.

That's where deep learning comes in. The researchers hope to use machine learning techniques embedded within the wireless devices' hardware to improve frequency utilization, such that the devices can develop AI-optimized spectrum usage strategies by themselves. Early studies suggest that deep learning models average 20% higher classification accuracy than traditional systems when dealing with noisy radio channels, and will be able to scale to hundreds of simultaneous devices, rather than dozens. Moreover, the deep learning architecture developed for this purpose will be usable for multiple other tasks, as well.
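The article does not spell out the team's architecture, but as a hedged illustration of the kind of model involved, a small 1D convolutional network classifying raw I/Q samples by modulation type is a common starting point in this literature:

```python
# A generic sketch (not the Northeastern team's model) of a spectrum-sensing
# classifier: a small 1D CNN over raw I/Q samples that predicts a modulation
# class. Input length and class count are illustrative.
import torch
import torch.nn as nn

class IQClassifier(nn.Module):
    def __init__(self, num_classes: int = 11):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(2, 64, kernel_size=7, padding=3),   # 2 input channels: I and Q
            nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(64, 128, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.classifier = nn.Linear(128, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, 2, samples), e.g. 128 complex samples split into I/Q
        return self.classifier(self.features(x).squeeze(-1))

model = IQClassifier()
dummy_batch = torch.randn(8, 2, 128)      # eight short signal snippets
print(model(dummy_batch).shape)           # torch.Size([8, 11])
```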

One key challenge in implementing deep learning for this application is the massive amount of data that will need to be processed rapidly to do continuous analysis. Deep learning can rely on tens of millions of parameters, and here might require measurements of over a hundred megabytes per second of data at a millisecond level. This is beyond the capability of even the most powerful embedded devices currently available, the researchers note, and low latency demands that the results not be processed in the cloud.

So the goal will be to help shrink deep learning models to the point where they can run on small devices, and use complex testing facilities ("wireless data factories") to improve the software as hardware improves, including raising its resilience against adversarial attacks. The researchers expect to use the learning in both 5G millimeter wave and future 6G terahertz hardware, which are expected to become even more ubiquitous than 4G devices over the next two decades, despite their ultra-high-frequency signals' susceptibility to physical interference.


AI used to predict Covid-19 patients’ decline before proven to work – STAT

Dozens of hospitals across the country are using an artificial intelligence system created by Epic, the big electronic health record vendor, to predict which Covid-19 patients will become critically ill, even as many are struggling to validate the tool's effectiveness on those with the new disease.

The rapid uptake of Epic's deterioration index is a sign of the challenges imposed by the pandemic: Normally hospitals would take time to test the tool on hundreds of patients, refine the algorithm underlying it, and then adjust care practices to implement it in their clinics.

Covid-19 is not giving them that luxury. They need to be able to intervene to prevent patients from going downhill, or at least make sure a ventilator is available when they do. Because it is a new illness, doctors don't have enough experience to determine who is at highest risk, so they are turning to AI for help and in some cases cramming a validation process that often takes months or years into a couple of weeks.


"Nobody has amassed the numbers to do a statistically valid test of the AI," said Mark Pierce, a physician and chief medical informatics officer at Parkview Health, a nine-hospital health system in Indiana and Ohio that is using Epic's tool. "But in times like this that are unprecedented in U.S. health care, you really do the best you can with the numbers you have, and err on the side of patient care."

Epic's index uses machine learning, a type of artificial intelligence, to give clinicians a snapshot of the risks facing each patient. But hospitals are reaching different conclusions about how to apply the tool, which crunches data on patients' vital signs, lab results, and nursing assessments to assign a 0 to 100 score, with a higher score indicating an elevated risk of deterioration. It was already used by hundreds of hospitals before the outbreak to monitor hospitalized patients, and is now being applied to those with Covid-19.


At Parkview, doctors analyzed data on nearly 100 cases and found that 75% of hospitalized patients who received a score in a middle zone between 38 and 55 were eventually transferred to the intensive care unit. In the absence of a more precise measure, clinicians are using that zone to help determine who needs closer monitoring and whether a patient in an outlying facility needs to be transferred to a larger hospital with an ICU.

Meanwhile, the University of Michigan, which has seen a larger volume of patients due to a cluster of cases in that state, found in an evaluation of 200 patients that the deterioration index is most helpful for those who scored on the margins of the scale.

For about 9% of patients whose scores remained on the low end during the first 48 hours of hospitalization, the health system determined they were unlikely to experience a life-threatening event and that physicians could consider moving them to a field hospital for lower-risk patients. On the opposite end of the spectrum, it found 10% to 12% of patients who scored on the higher end of the scale were much more likely to need ICU care and should be closely monitored. More precise data on the results will be published in coming days, although they have not yet been peer-reviewed.

Clinicians in the Michigan health system have been using the score thresholds established by the research to monitor the condition of patients during rounds and in a command center designed to help manage their care. But clinicians are also considering other factors, such as physical exams, to determine how they should be treated.

"This is not going to replace clinical judgement," said Karandeep Singh, a physician and health informaticist at the University of Michigan who participated in the evaluation of Epic's AI tool. "But it's the best thing we've got right now to help make decisions."

Stanford University has also been testing the deterioration index on Covid-19 patients, but a physician in charge of the work said the health system has not seen enough patients to fully evaluate its performance. "If we do experience a future surge, we hope that the foundation we have built with this work can be quickly adapted," said Ron Li, a clinical informaticist at Stanford.

Executives at Epic said the AI tool, which has been rolled out to monitor hospitalized patients over the past two years, is already being used to support care of Covid-19 patients in dozens of hospitals across the United States. They include Parkview, Confluence Health in Washington state, and ProMedica, a health system that operates in Ohio and Michigan.

"Our approach as Covid was ramping up over the last eight weeks has been to evaluate: does it look very similar to (other respiratory illnesses) from a machine learning perspective, and can we pick up that rapid deterioration?" said Seth Hain, a data scientist and senior vice president of research and development at Epic. "What we found is yes, and the result has been that organizations are rapidly using this model in that context."

Some hospitals that had already adopted the index are simply applying it to Covid-19 patients, while others are seeking to validate its ability to accurately assess patients with the new disease. It remains unclear how the use of the tool is affecting patient outcomes, or whether its scores accurately predict how Covid-19 patients are faring in hospitals. The AI system was initially designed to predict deterioration of hospitalized patients facing a wide array of illnesses. Epic trained and tested the index on more than 100,000 patient encounters at three hospital systems between 2012 and 2016, and found that it could accurately characterize the risks facing patients.

When the coronavirus began spreading in the United States, health systems raced to repurpose existing AI models to help keep tabs on patients and manage the supply of beds, ventilators and other equipment in their hospitals. Researchers have tried to develop AI models from scratch to focus on the unique effects of Covid-19, but many of those tools have struggled with bias and accuracy issues, according to a review published in the BMJ.

The biggest question hospitals face in implementing predictive AI tools, whether to help manage Covid-19 or advanced kidney disease, is how to act on the risk score it provides. Can clinicians take actions that will prevent the deterioration from happening? If not, does it give them enough warning to respond effectively?

In the case of Covid-19, the latter question is the most relevant, because researchers have not yet identified any effective treatments to counteract the effects of the illness. Instead, they are left to deliver supportive care, including mechanical ventilation if patients are no longer able to breathe on their own.

Knowing ahead of time whether mechanical ventilation might be necessary is helpful, because doctors can ensure that an ICU bed and a ventilator or other breathing assistance is available.

Singh, the informaticist at the University of Michigan, said the most difficult part about making predictions based on Epic's system, which calculates a score every 15 minutes, is that patients' ratings tend to bounce up and down in a sawtooth pattern. A change in heart rate could cause the score to suddenly rise or fall. He said his research team found that it was often difficult to detect, or act on, trends in the data.

"Because the score fluctuates from 70 to 30 to 40, we felt like it's hard to use it that way," he said. "A patient who's high risk right now might be low risk in 15 minutes."

In some cases, he said, patients bounced around in the middle zone for days but then suddenly needed to go to the ICU. In others, a patient with a similar trajectory of scores could be managed effectively without need for intensive care.
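As a purely hypothetical illustration of the sawtooth problem Singh describes (not Epic's or Michigan's actual method), a rolling median over the 15-minute scores is one simple way to separate sustained trends from single-reading spikes:

```python
# A hypothetical illustration (not Epic's or Michigan's method) of damping the
# sawtooth: a rolling median over the 15-minute scores, plus a simple check on
# whether the smoothed score has been trending upward.
import pandas as pd

scores = pd.Series(
    [70, 30, 40, 65, 35, 45, 72, 38, 44, 75, 80, 78],
    index=pd.date_range("2020-04-20 08:00", periods=12, freq="15min"),
)

smoothed = scores.rolling(window=4, min_periods=1).median()
rising = smoothed.diff().rolling(window=4, min_periods=1).mean() > 0

print(pd.DataFrame({"score": scores, "smoothed": smoothed, "rising": rising}))
```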

But Singh said that in about 20% of patients it was possible to identify threshold scores that could indicate whether a patient was likely to decline or recover. In the case of patients likely to decline, the researchers found that the system could give them up to 40 hours of warning before a life-threatening event would occur.

"That's significant lead time to help intervene for a very small percentage of patients," he said. As to whether the system is saving lives, or improving care in comparison to standard nursing practices, Singh said the answers will have to wait for another day. "You would need a trial to validate that question," he said. "The question of whether this is saving lives is unanswerable right now."


One Supercomputer's HPC And AI Battle Against The Coronavirus – The Next Platform

Normally, supercomputers installed at academic and national laboratories get configured once, acquired as quickly as possible before the money runs out, installed and tested, qualified for use, and put to work for a tour of duty of four or five years, possibly longer. It is a rare machine that is upgraded even once, much less a few times.

But that is not the case with the Corona system at Lawrence Livermore National Laboratory, which was commissioned in 2017, when North America had a total solar eclipse, hence its nickname. While this machine, procured under the Commodity Technology Systems (CTS-1) contract not only to do useful work but also to assess the CPU and GPU architectures provided by AMD, was not named after the coronavirus pandemic that is now spreading around the Earth, the machine is being upgraded one more time to be put into service as a weapon against the SARS-CoV-2 virus, which causes the COVID-19 illness that has infected at least 2.75 million people (confirmed by test, with the number very likely being higher) and killed at least 193,000 people worldwide.

The Corona system was built by Penguin Computing, which has a long-standing relationship with Lawrence Livermore National Laboratory, Los Alamos National Laboratory, and Sandia National Laboratories, the so-called Tri-Labs that are part of the US Department of Energy and that coordinate on their supercomputer procurements. The initial Corona machine installed in 2018 had 164 compute nodes, each equipped with a pair of Naples Epyc 7401 processors, which have 24 cores each running at 2 GHz with an all-core turbo boost of 2.8 GHz. The Penguin Tundra Extreme servers that comprise this cluster have 256 GB of main memory and 1.6 TB of PCI-Express flash. When the machine was installed in November 2018, half of the nodes were equipped with four of AMD's Radeon Instinct MI25 GPU accelerators, which had 16 GB of HBM2 memory each and which had 768 gigaflops of FP64 performance, 12.29 teraflops of FP32 performance, and 24.6 teraflops of FP16 performance. The 7,872 CPU cores in the system delivered 126 teraflops at FP64 double precision all by themselves, and the Radeon Instinct MI25 GPU accelerators added another 251.9 teraflops at FP64 double precision. The single precision performance for the machine was obviously much higher, at 4.28 petaflops across both the CPUs and GPUs. Interestingly, this machine was equipped with 200 Gb/sec HDR InfiniBand switching from Mellanox Technologies, which was obviously one of the earliest installations of this switching speed.

In November last year, just before the coronavirus outbreak (or at least we think that was before the outbreak; that may turn out not to be the case), AMD and Penguin worked out a deal to install four of the much more powerful Radeon Instinct MI60 GPU accelerators, based on the 7 nanometer Vega GPUs, in the 82 nodes in the system that didn't already have GPU accelerators in them. The Radeon Instinct MI60 has 32 GB of HBM2 memory, and has 6.6 teraflops of FP64 performance, 13.3 teraflops of FP32 performance, and 26.5 teraflops of FP16 performance. Now the machine has 8.9 petaflops of FP32 performance and 2.54 petaflops of FP64 performance, which is a much more balanced 64-bit to 32-bit ratio, and it makes these nodes more useful for certain kinds of HPC and AI workloads. Which turns out to be very important to Lawrence Livermore in its fight against the COVID-19 disease.
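As a back-of-the-envelope check of where that 2.54 petaflops FP64 figure comes from, using only the per-device numbers quoted above:

```python
# Back-of-the-envelope arithmetic only; sustained application performance will differ.
cpu_fp64_tf = 126.0                    # all 7,872 Epyc cores combined

mi25_gpus = 82 * 4                     # half the nodes got four MI25s in 2018
mi25_fp64_tf = mi25_gpus * 0.768       # 768 gigaflops FP64 per MI25, ~251.9 TF total

mi60_gpus = 82 * 4                     # the other 82 nodes got four MI60s in the upgrade
mi60_fp64_tf = mi60_gpus * 6.6         # 6.6 teraflops FP64 per MI60

total_fp64_pf = (cpu_fp64_tf + mi25_fp64_tf + mi60_fp64_tf) / 1000
print(f"{total_fp64_pf:.2f} petaflops FP64")   # ~2.54, matching the figure above
```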

To find out more about how the Corona system and others are being deployed in the fight against COVID-19, and how HPC and AI workloads are being intertwined in that fight, we talked to Jim Brase, deputy associate director for data science at Lawrence Livermore.

Timothy Prickett Morgan: It is kind of weird that this machine was called Corona. Foreshadowing is how you tell the good literature from the cheap stuff. The doubling of performance that just happened late last year for this machine could not have come at a better time.

Jim Brase: It pretty much doubles the overall floating point performance of the machine, which is great because what we are mainly running on Corona is both the molecular dynamics calculations of various viral and human protein components and then machine learning algorithms for both predictive models and design optimization.

TPM: That's a lot more oomph. So what specifically are you doing with it in the fight against COVID-19?

Jim Brase: There are two basic things we're doing as part of the COVID-19 response, and this machine is almost entirely dedicated to this, although several of our other clusters at Lawrence Livermore are involved as well.

We have teams that are doing both antibody and vaccine design. They are mainly focused on therapeutic antibodies right now. They are basically designing proteins that will interact with the virus or with the way the virus interacts with human cells. That involves hypothesizing different protein structures and computing what those structures actually look like in detail, then computing, using molecular dynamics, the interaction between those protein structures and the viral proteins or the viral and human cell interactions.

With this machine, we do this iteratively to basically design a set of proteins. We have a bunch of metrics that we try to optimize on (binding strength, the stability of the binding, stuff like that), and then we do detailed molecular dynamics calculations to figure out the effective energy of those binding events. These metrics determine the quality of the potential antibody or vaccine that we design.

TPM: To wildly oversimplify, this SARS-CoV-2 virus is a ball of fat with some spikes on it that wreaks havoc as it replicates using our cells as raw material. This is a fairly complicated molecule at some level. What are we trying to do? Stick goo to it to try to keep it from replicating or tear it apart or dissolve it?

Jim Brase: In the case of antibodies, which is what we're mostly focusing on right now, we are actually designing a protein that will bind to some part of the virus, and because of that the virus then changes its shape, and the change in shape means it will not be able to function. These are little molecular machines; they depend on their shape to do things.

TPM: There's not something that will physically go in and tear it apart like a white blood cell eats stuff.

Jim Brase: No. That's generally done by biology, which comes in after this and cleans up. What we are trying to do is what we call neutralizing antibodies. They go in and bind and then the virus can't do its job anymore.

TPM: And just for a reference, what is the difference between a vaccine and an antibody?

Jim Brase: In some sense, they are the opposite of each other. With a vaccine, we are putting in a protein that actually looks like the virus but it doesn't make you sick. It stimulates the human immune system to create its own antibodies to combat that virus. And those antibodies produced by the body do exactly the same thing we were just talking about. Producing antibodies directly is faster, but the effect doesn't last. So it is more of a medical treatment for somebody who is already sick.

TPM: I was alarmed to learn that for certain coronaviruses, immunity doesn't really last very long. With the common cold, the reason we get them is not just because they change every year, but because if you didn't have a bad version of it, you don't generate a lot of antibodies and therefore you are susceptible. If you have a very severe cold, you generate antibodies and they last for a year or two. But then you're done and your body stops looking for that fight.

Jim Brase: The immune system is very complicated and for some things it creates antibodies that remember them for a long time. For others, it's much shorter. It's sort of a combination of what we call the antigen (the thing, the virus or whatever, that triggers it) and the immune system's memory function together that causes the immunity not to last as long. It's not well understood at this point.

TPM: What are the programs you're using to do the antibody and protein synthesis?

Jim Brase: We are using a variety of programs. We use GROMACS, we use NAMD, we use OpenMM stuff. And then we have some specialized homegrown codes that we use as well that operate on the data coming from these programs. But it's mostly the general, open source molecular mechanics and molecular dynamics codes.

TPM: Let's contrast this COVID-19 effort with something like the SARS outbreak in 2003. Say you had the same problem. Could you have even done the things you are doing today with SARS-CoV-2 back then with SARS? Was it even possible to design proteins and do enough of them to actually have an impact to get the antibody therapy or develop the vaccine?

Jim Brase: A decade ago, we could do single calculations. We could do them one, two, three. But what we couldn't do was iterate it as a design optimization. Now we can run enough of these fast enough that we can make this part of an actual design process where we are computing these metrics, then adjusting the molecules. And we have machine learning approaches now that we didn't have ten years ago that allow us to hypothesize new molecules, and then we run the detailed physics calculations against this, and we do that over and over and over.

TPM: So not only do you have a specialized homegrown code that takes the output of these molecular dynamics programs, but you are using machine learning as a front end as well.

Jim Brase: We use machine learning in two places. Even with these machines (and we are using our whole spectrum of systems on this effort), we still can't do enough molecular dynamics calculations, particularly the detailed molecular dynamics that we are talking about here. What does the new hardware allow us to do? It basically allows us to do a higher percentage of detailed molecular dynamics calculations, which give us better answers as opposed to more approximate calculations. So you can decrease the granularity size and we can compute whole molecular dynamics trajectories as opposed to approximate free energy calculations. It allows us to go deeper on the calculations, and do more of those. So ultimately, we get better answers.

But even with these new machines, we still can't do enough. If you think about the design space on, say, a protein that is a few hundred amino acids in length, and at each of those positions you can put in 20 different amino acids, you are looking at on the order of 20^200 possible proteins to evaluate by brute force. You can't do that.

So we try to be smart about how we select where those simulations are done in that space, based on what we are seeing. And then we use the molecular dynamics to generate datasets that we then train machine learning models on so that we are basically doing very smart interpolation in those datasets. We are combining the best of both worlds and using the physics-based molecular dynamics to generate data that we use to train these machine learning algorithms, which allows us to then fill in a lot of the rest of the space because those can run very, very fast.
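A toy sketch of that train-a-surrogate-on-simulation-data pattern, with random numbers standing in for molecular descriptors and computed binding energies (an illustration of the idea, not Lawrence Livermore's workflow), might look like this:

```python
# A toy version of the surrogate idea: random numbers stand in for molecular
# descriptors (X) and for binding energies computed by molecular dynamics (y).
# The cheap learned model then screens a large candidate pool so only the most
# promising candidates go back to the expensive physics calculation.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 32))
y = X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.1, size=2000)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
surrogate = GradientBoostingRegressor().fit(X_train, y_train)
print("surrogate R^2 on held-out simulations:", round(surrogate.score(X_test, y_test), 3))

candidates = rng.normal(size=(100_000, 32))
predicted_energy = surrogate.predict(candidates)
shortlist = np.argsort(predicted_energy)[:100]   # lowest predicted energies
print("candidates sent back to molecular dynamics:", shortlist[:5], "...")
```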

TPM: You couldn't do all of that stuff ten years ago? And SARS did not create the same level of outbreak that SARS-CoV-2 has done.

Jim Brase: No, these are all fairly new ideas.

TPM: So, in a sense, we are lucky. We have the resources at a time when we need them most. Did you have the code all ready to go for this? Were you already working on this kind of stuff and then COVID-19 happened or did you guys just whip up these programs?

Jim Brase: No, no, no, no. We've been working on this kind of stuff for a few years.

TPM: Well, thank you. I'd like to personally thank you.

Jim Brase: It has been an interesting development. It has been both in the biology space and the physics space, and those two groups have set up a feedback loop back and forth. I have been running a consortium called Advanced Therapeutic Opportunities in Medicine, or ATOM for short, to do just this kind of stuff for the last four years. It started up as part of the Cancer Moonshot in 2016 and focused on accelerating cancer therapeutics using the same kinds of ideas, where we are using machine learning models to predict the properties, using mechanistic simulations like molecular dynamics combined with data, but then also using it the other way around. We also use machine learning to actually hypothesize new molecules: given a set of molecules that we have right now, with computed properties that aren't quite what we want, how do we tweak those molecules a little bit to adjust their properties in the directions that we want?

The problem with this approach is scale. Molecules are atoms that are bonded with each other. You could just take out an atom, add another atom, change a bond type, or something. The problem with that is that every time you do that randomly, you almost always get an illegal molecule. So we train these machine learning algorithms (these are generative models) to actually be able to generate legal molecules that are close to a set of molecules that we have, but a little bit different and with properties that are probably a little bit closer to what we want. And so that allows us to smoothly adjust the molecular designs to move towards the optimization targets that we want. If you think about optimization, what you want are things with smooth derivatives. And if you do this in sort of the discrete atom-bond space, you don't have smooth derivatives. But if you do it in what we call learned latent spaces that we get from generative models, then you can actually have a smooth response in terms of the molecular properties. And that's what we want for optimization.
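To make the latent-space idea concrete, here is a schematic sketch with untrained placeholder networks standing in for a real generative model and property predictor; the only point is that gradients flow smoothly through a continuous latent vector, unlike edits in discrete atom-bond space:

```python
# Schematic only: untrained placeholder networks stand in for a real generative
# model (decoder) and property predictor. The point is that gradient ascent on
# a continuous latent vector gives the smooth derivatives Brase describes.
import torch
import torch.nn as nn

latent_dim = 64
decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, 128))
property_head = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 1))

z = torch.randn(1, latent_dim, requires_grad=True)   # latent code of some starting molecule
optimizer = torch.optim.Adam([z], lr=0.05)

for step in range(100):
    optimizer.zero_grad()
    molecule_repr = decoder(z)                       # continuous molecular representation
    predicted_property = property_head(molecule_repr)
    loss = -predicted_property.mean()                # ascend the predicted property
    loss.backward()
    optimizer.step()

print("predicted property after optimization:", predicted_property.item())
```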

The other part of the machine learning story here is these new types of generative models. So variational autoencoders, generative adversarial models, the things you hear about that generate fake data and so on. We're actually using those very productively to imagine new types of molecules with the kinds of properties that we want for this. And so that's something we were absolutely doing before COVID-19 hit. We have taken projects like the ATOM cancer project and other work we've been doing with DARPA and other places focused on different diseases and refocused those on COVID-19.

One other thing I wanted to mention is that we haven't just been applying this to biology. A lot of these ideas are coming out of physics applications. One of our big things at Lawrence Livermore is laser fusion. We have 192 huge lasers at the National Ignition Facility to try to create fusion in a small hydrogen deuterium target. There are a lot of design parameters that go into that. The targets are really complex. We are using the same approach. We're running mechanistic simulations of the performance of those targets, and we are then improving those with real data using machine learning. So now we have a hybrid model that has physics in it and machine learning data models, and we are using that to optimize the designs of the laser fusion target. So that's led us to a whole new set of approaches to fusion energy.

Those same methods actually are the things we're also applying to molecular design for medicines. And the two actually go back and forth and sort of feed on each other and support each other. In the last few weeks, some of the teams that have been working on the physics applications have actually jumped over onto the biology side and are using some of the same sort of complex workflows that we're using on these big parallel machines, which they've developed for physics, and applying those to some of the biology applications and helping to speed up the applications on this new hardware that's coming in. So it is a really nice synergy going back and forth.

TPM: I realize that machine learning software uses the GPUs for training and inference, but is the molecular dynamics software using the GPUs, too?

Jim Brase: All of the molecular dynamics software has been set up to use GPUs. The code actually maps pretty naturally onto the GPU.

TPM: Are you using the CUDA variants of the molecular dynamics software, and I presume that it is using the Radeon Open Compute, or ROCm, stack from AMD to translate that code so it can run on the Radeon Instinct accelerators?

Jim Brase: There has been some work to do, but it works. It's getting to be pretty solid now. That's one of the reasons we wanted to jump into the AMD technology pretty early because, you know, any time you do first-in-kind machines it's not always completely smooth sailing all the way.

TPM: It's not like Lawrence Livermore has a history of using novel designs for supercomputers. [Laughter]

Jim Brase: We seldom work with machines that are not Serial 00001 or Serial 00002.

TPM: What's the machine learning stack you use? I presume it is TensorFlow.

Jim Brase: We use TensorFlow extensively. We use PyTorch extensively. We work with the DeepChem group at Stanford University that does an open chemistry package built on TensorFlow as well.

TPM: If you could fire up an exascale machine today, how much would it help in the fight against COVID-19?

Jim Brase: It would help a lot. There's so much to do.

I think we need to show the benefits of computing for drug design, and we are concretely doing that now. Four years ago, when we started up ATOM, everybody thought this was nuts: the general idea that we could lead with computing rather than experiment, and do the experiments to focus on validating the computational models rather than the other way around. Everybody thought we were nuts. As you know, with the growth of data, the growth of machine learning capabilities, more accessibility to sophisticated molecular dynamics, and so on, it's much more accepted that computing is a big part of this. But we still have a long way to go on this.

The fact is, machine learning is not magic. It's a fancy interpolator. You don't get anything new out of it. With the physics codes, you actually get something new out of it. So the physics codes are really the foundation of this. You supplement them with experimental data because they're not necessarily right, either. And then you use the machine learning on top of all that to fill in the gaps, because you haven't been able to sample that huge chemical and protein space adequately to really understand everything at either the data level or the mechanistic level.

So that's how I think of it. Data is truth, sort of, and what you also learn about data is that it is not always the same as you go through this. But data is the foundation. Mechanistic modeling allows us to fill in where we just can't measure enough data: it is too expensive, it takes too long, and so on. We fill in with mechanistic modeling and then above that we fill in with machine learning. We have this stack of experimental truth, mechanistic simulation that incorporates all the physics and chemistry we can, and then we use machine learning to interpolate in those spaces to support the design operation.

For COVID-19, there are a lot of groups doing vaccine designs. Some of them are using traditional experimental approaches and they are making progress. Some of them are doing computational designs, and that includes the national labs. We've got 35 designs done and we are experimentally validating those now and seeing where we are with them. It will generally take two to three iterations of design, then experiment, and then adjust the designs back and forth. And we're in the first round of that right now.

One thing we're all doing, at least on the public side of this, is putting all this data out there openly. So the molecular designs that we've proposed are openly released. Then the validation data that we are getting on those will be openly released. This is so our group, working with other lab groups, university groups, and some of the companies doing this COVID-19 research, can contribute. We are hoping that by being able to look at all the data that all these groups are producing, we can learn faster how to narrow in on the vaccine designs and the antibody designs that will ultimately work.


IBM’s The Weather Channel app using machine learning to forecast allergy hotspots – TechRepublic

The Weather Channel is now using artificial intelligence and weather data to help people make better decisions about going outdoors based on the likelihood of suffering from allergy symptoms.

Amid the COVID-19 pandemic, most people are taking precautionary measures in an effort to ward off coronavirus, which is highly communicable and dangerous. It's no surprise that we gasp at every sneeze, cough, or even sniffle, from others and ourselves. Allergy sufferers may find themselves apologizing awkwardly, quickly indicating they don't have COVID-19, but have allergies, which are often treated with sleep-inducing antihistamines that cloud critical thinking.

The most common culprits and indicators to predict symptoms (ragweed, grass, and tree pollen readings) are often inconsistently tracked across the country. But artificial intelligence (AI) innovation from IBM's The Weather Channel is coming to the rescue of those roughly 50 million Americans who suffer from allergies.

The Weather Channel's new tool shows a 15-day allergy forecast based on ML.


IBM's The Weather Channel is now using machine learning (ML) to forecast allergy symptoms. IBM data scientists developed a new tool on The Weather Channel app and weather.com, "Allergy Insights with Watson," to predict your risk of allergy symptoms.

Weather can also drive allergy behaviors. "As we began building this allergy model, machine learning helped us teach our models to use weather data to predict symptoms," said Misha Sulpovar, product leader, consumer AI and ML, IBM Watson media and weather. Sulpovar's role is focused on using machine learning and blockchain to develop innovative and intuitive new experiences for the users of the Weather Channel's digital properties, specifically, weather.com and The Weather Channel smart phone apps.


Any allergy sufferer will tell you it can be absolutely miserable. "If you're an allergy sufferer, you understand that knowing in advance when your symptom risk might change can help anyone plan ahead and take action before symptoms may flare up," Sulpovar said. "This allergy risk prediction model is much more predictive around users' symptoms than other allergy trackers you are used to, which mostly depend on pollen, an imperfect factor."

Sulpovar said the project has been in development for about a year, and said, "We included the tool within The Weather Channel app and weather.com because digital users come to us for local weather-related information," and not only to check weather forecasts, "but also for details on lifestyle impacts of weather on things like running, flu, and allergy."

He added, "Knowing how patients feel helps improve the model. IBM MarketScan (research database) is anonymized data from doctor visits of 100 million patients."

Daily pollen counts are also available on The Weather Channel app.


"A lot of what drives allergies are environmental factors like humidity, wind, and thunderstorms, as well as when specific plants in specific areas create pollen," Sulpovar said. "Plants have predictable behaviorfor example, the birch tree requires high humidity for birch pollen to burst and create allergens. To know when that will happen in different locations for all different species of trees, grasses, and weeds is huge, and machine learning is a huge help to pull it together and predict the underlying conditions that cause allergens and symptoms. The model will select the best indicators for your ZIP code and be a better determinant of atmospheric behavior."

"Allergy Insights with Watson" anticipates allergy symptoms up to 15 days in advance. AI, Watson, and its open multi-cloud platform help predict and shape future outcomes, automate complex processes, and optimize workers' time. IBM's The Weather Channel and weather.com are using this machine learning Watson to alleviate some of the problems wrought by allergens.

Sulpovar said, "Watson is IBM's suite of enterprise-ready AI services, applications, and tooling. Watson helps unlock value from data in new ways, at scale."

Data scientists have discovered a more accurate representation of allergy conditions. "IBM Watson machine learning trained the model to combine multiple weather attributes with environmental data and anonymized health data to assess when the allergy symptom risk is high," Sulpovar explained. "The model more accurately reflects the impact of allergens on people across the country in their day-to-day lives."

The model is challenged by changing conditions and the impact of climate change, but there has been a 25% to 50% increase in better decision making, based on allergy symptoms.

It may surprise long-time allergy sufferers who often cite pollen as the cause of allergies that "We found pollen is not a good predictor of allergy risk alone and that pollen sources are unreliable and spotty and cover only a small subset of species," Sulpovar explained. "Pollen levels are measured by humans in specific locations, but sometimes those measurements are few and far between, or not updated often. Our team found that using AI and weather data instead of just pollen data resulted in a 25-50% increase in making better decisions based on allergy symptoms."

Available on The Weather Channel app for iOS and Android, you can also find the tool online at www.weather.com. Users of the tool will be given an accurate forecast, be alerted to flare-ups, and be provided with practical tips to reduce seasonal allergies.

This story was updated on April 23, 2020 to correct the spelling of Misha Sulpovar's name.



Bringing Machine Learning To Finance? Don’t Overlook These Three Key Considerations – Forbes

Half of enterprises have adopted machine learning (ML) technologies as part of their enterprise business. The rest are exploring it. Clearly, the age of machine learning is upon us.

Nowhere is this more intriguing than in the office of finance, which is where every organization's financial and operational data comes together. More than merely reporting what has happened, modern finance organizations wield the latest technologies to help their businesses anticipate what will happen.

One of those technologies is ML, which leverages the advantages of automation, scalable cloud computing and data analytics to generate predictions based on historical and real-time data. Over time, you can train your ML engine to improve the accuracy of its predictions by feeding it more data (known as training data). Your ML engine grows even more intelligent through a built-in feedback loop that further teaches the platform by choosing to act (or not) on its predictions.

Predictions Versus Judgment And Why It Matters

Machines are very good at automating and accelerating the act of predicting. ML makes them even better at it. But judgment is very much a human strength, and it's likely to remain so for some time. We can program machines to make limited judgments based on a preprogrammed set of variables and tolerances. If you have assisted driving features on your car, then you're already seeing this in action. These systems are trained to detect potential problems and then take specific actions based on that data.

But it's important to recognize that these systems are designed to operate in relatively contained, discrete scenarios: keeping your automobile in its lane or braking when your car spots an object in your blind spot. For now, at least, they lack the contextual awareness required to make the countless decisions necessary to safely navigate your way.

For that, you need people.

In a larger business context, situational awareness helps us weigh factors that may not have been ingested by the ML engine. We know to question a prediction or proposed action that doesn't fit with our company's values or culture. The numbers might add up, but the action doesn't. We need people to make that call. A well-designed finance platform will leave room for you to make those calls, because in a world awash in data, even the best ML engines can be fooled by spurious data and false correlations. That's why ML complements, rather than replaces, humans.

Is ML A DIY Project?

I've overseen the development and implementation of ML at two companies: the first to spot potentially fraudulent health insurance claims, and the other to model accurate forecasts and develop insightful what-if scenarios. I've learned a lot from the experience.

But my experience may not look like yours, at least not in the details. As a technologist, I was responsible for bringing ML-powered solutions to life, working with development teams to incorporate ML into products for our customers. And there's every chance that ML will enter your environment via a SaaS (software-as-a-service) financial management or planning platform.

If you're a SaaS platform customer, the actual implementation of ML in your finance environment may be relatively transparent: a built-in algorithm that drives powerful next-level features, intuits business drivers, and helps support decision making (at least, that's how it should work). But since every ML engine is so dependent on data, and on the decisions you make around that data, you still must address some considerations.

Here are three important ones:

1. Understand where your data is coming from. Your ML predictions will only be as relevant as the data you use to train them. So one of the first steps is to decide what data you'll want to input into the system. There's general ledger (GL) and operational data, of course. But how much historical data is enough? What other sources do you want to tap? HCM? CRM? Do those platforms integrate with your ML-driven finance management or planning platform? Sit down with your IT team to craft a data ingestion strategy that will set you up for success.

2. Appreciate the cost of anomalies. No system is perfect, and occasionally yours will output outlier data that can skew your predictions. Understanding and acknowledging what these anomalies can cost your business is critical. In fact, one of the first uses we defined for ML in business planning was to detect anomalies that could unwittingly put decision-makers on the wrong track. We designed this feature to flag outliers so managers can determine for themselves if they want to accept or disregard them (a minimal sketch of this kind of flagging follows this list).

3. Acknowledge and avoid bias. This is a big one. Whether we like to admit it or not, bias of all kinds affects much of our decision-making process, and it can threaten the success of your use of ML. Say you want your workforce planning system to model the ideal FP&A hires over the next eight quarters. One reasonable approach is to pick your highest-performing talent, define their key characteristics, and model your future hires after them. But if the previous managers tended to hire men, whether they were high performers or not, you'll be left with a skewed ingest data sample that is unwittingly tainted by historical bias.
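Here is the minimal outlier-flagging sketch promised under point 2 above; it is a generic z-score check, not any particular vendor's implementation:

```python
# A generic z-score check, not any particular vendor's feature: values far from
# the mean get flagged for a human to accept or disregard before they feed the model.
import numpy as np

monthly_revenue = np.array([1.02, 0.98, 1.05, 1.01, 0.97, 3.40, 1.03, 1.00])  # $M, illustrative

z_scores = (monthly_revenue - monthly_revenue.mean()) / monthly_revenue.std()
flagged = np.abs(z_scores) > 2.0

for month, (value, is_outlier) in enumerate(zip(monthly_revenue, flagged), start=1):
    if is_outlier:
        print(f"Month {month}: {value}M flagged for review")
```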

Harnessing the promise and power of machine learning is an exciting prospect for finance executives. Before long, planning systems will operate much like a navigation system for finance teams, a kind of Waze for business. The business specifies its goals, where it would like to go, and the planning system will analyze all available data about past and current business performance, intuit the most important drivers, and offer different potential scenarios along with their relative pros and cons.

Think of ML as a way to make better, smarter use of data at a time when the way forward is increasingly uncertain. For businesses seeking agility, ML offers a way for them to find their true north in the office of finance.


Researchers Rebuild the Bridge Between Neuroscience and Artificial Intelligence – Global Health News Wire

The origin of machine and deep learning algorithms, which increasingly affect almost all aspects of our life, is the learning mechanism of synaptic (weight) strengths connecting neurons in our brain. Attempting to imitate these brain functions, researchers bridged between neuroscience and artificial intelligence over half a century ago. However, since then experimental neuroscience has not directly advanced the field of machine learning, and both disciplines, neuroscience and machine learning, seem to have developed independently.

In an article published today in the journal Scientific Reports, researchers reveal that they have successfully rebuilt the bridge between experimental neuroscience and advanced artificial intelligence learning algorithms. Conducting new types of experiments on neuronal cultures, the researchers were able to demonstrate a new accelerated brain-inspired learning mechanism. When the mechanism was utilized on the artificial task of handwritten digit recognition, for instance, its success rates substantially outperformed commonly-used machine learning algorithms.

To rebuild this bridge, the researchers set out to prove two hypotheses: that the common assumption that learning in the brain is extremely slow might be wrong, and that the dynamics of the brain might include accelerated learning mechanisms. Surprisingly, both hypotheses were proven correct.

"A learning step in our brain is believed to typically last tens of minutes or even more, while in a computer it lasts for a nanosecond, or one million times one million faster," said the study's lead author Prof. Ido Kanter, of Bar-Ilan University's Department of Physics and Gonda (Goldschmied) Multidisciplinary Brain Research Center. "Although the brain is extremely slow, its computational capabilities outperform, or are comparable to, typical state-of-the-art artificial intelligence algorithms," added Kanter, who was assisted in the research by Shira Sardi, Dr. Roni Vardi, Yuval Meir, Dr. Amir Goldental, Shiri Hodassman and Yael Tugendfaft.

The team's experiments indicated that adaptation in our brain is significantly accelerated with training frequency. "Learning by observing the same image 10 times in a second is as effective as observing the same image 1,000 times in a month," said Shira Sardi, a main contributor to this work. "Repeating the same image speedily enhances adaptation in our brain to seconds rather than tens of minutes." "It is possible that learning in our brain is even faster, but beyond our current experimental limitations," added Dr. Roni Vardi, another main contributor to the research. Utilization of this newly-discovered, brain-inspired accelerated learning mechanism substantially outperforms commonly-used machine learning algorithms on tasks such as handwritten digit recognition, especially where small datasets are provided for training.

The reconstructed bridge from experimental neuroscience to machine learning is expected to advance artificial intelligence and especially ultrafast decision making under limited training examples, similar to many circumstances of human decision making, as well as robotic control and network optimization.


Not All AI and ML is Created Equal – Security Boulevard

Throughout the tech community, artificial intelligence has become a blanket term often used to describe any computing process that requires little human input. Tasks like routine database functions, scheduled system scans, and software that adds automation to repetitive actions are regularly referred to as AI.

In truth, AI can play a part in these processes, but there are some major differences between basic machine learning and true AI.

It is vital to consider this distinction when reading blogs and articles that try to outline the key aspects of AI and present an objective analysis of the shortcomings inherent to certain types of AI.

MixMode has made a name for itself as an AI-powered network traffic analysis leader with the most powerful and advanced AI in the cybersecurity industry, and wants to share a few considerations to keep in mind when researching products claiming to use AI and ML.

Not all artificial intelligence is created equal.

Advances in machine learning over the past several years have enhanced processes like facial recognition technologies and revolutionized the self-driving car industry. These supervised learning AI applications are remarkable and signal a societal shift in how humans interact with technology.

However, supervised learning is limited in its ability to handle complex, sprawling tasks like discovering the threats lurking on an organization's network. Supervised AI can only locate specific threats it has seen or labeled before. Unsupervised learning, on the other hand, never ceases in its search for network anomalies.

Supervised learning relies on labeling to understand information. Once a SecOps professional has labeled data, supervised learning can recognize it and respond according to set parameters. A supervised learning platform might automate a message alerting the security team to a concerning data point. However, it cannot label data on its own.

These limitations would be sufficient for securing networks if SecOps teams knew exactly what to tell supervised learning platforms to find. The reality is that cybersecurity doesn't work that way. Bad actors are always a few steps ahead of the game, coming up with new methods of attack all the time.

Unsupervised AI to the rescue.

No matter what tactic a hacker uses, unsupervised machine learning AI seeks out patterns outside the network norm. SecOps teams can immediately focus on issues as they arise and even swat down attacks before they cause damage or lead to data loss.
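As a generic illustration of what unsupervised anomaly detection over network-flow features looks like (this is not MixMode's proprietary algorithm), scikit-learn's IsolationForest can flag flows that fall outside the learned norm:

```python
# A generic example of unsupervised anomaly detection over network-flow
# features with scikit-learn's IsolationForest; an illustration of the concept,
# not MixMode's proprietary algorithm.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
# Feature columns: bytes sent, bytes received, duration (s), distinct ports touched
normal_traffic = rng.normal(loc=[500, 800, 2.0, 3], scale=[50, 80, 0.5, 1], size=(10_000, 4))
detector = IsolationForest(contamination=0.01, random_state=0).fit(normal_traffic)

# New flows are scored without any labels; -1 marks an anomaly.
new_flows = np.array([
    [510.0, 790.0, 2.1, 3.0],       # close to the learned norm
    [50_000.0, 20.0, 0.1, 250.0],   # exfiltration-like outlier
])
print(detector.predict(new_flows))  # typically [ 1 -1]
```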

When it comes to chatter within the VC and Startup community about startups claiming to have a foundation in AI and deep learning (and their opportunity for success in the market), there are four market claims that MixMode CTO and Chief Scientist, Igor Mezic, would like to address.

Mezic developed the patented unsupervised AI that drives the MixMode platform and shares his thoughts here:

The computational cost related to training can quickly offset the potential benefits of deep learning. Mezic explains that this cost is a concern only when deep learning is used exclusively to reach customer data. Mezic developed the MixMode platform as a layered approach: an unsupervised, or semi-supervised, architecture that yields efficiency in computation, widening profit margins.

Additionally, Mezic says, MixMode can train its AI in an efficient seven-day initial period. (Check out our most recent video on Network Baselines here.) From there, the AI keeps learning in an unsupervised manner.

Mezic stresses that MixMode's semi-supervised algorithms require less human touch as well, further driving down expenses. "MixMode requires little labeling and depends on customer interactions with the AI rather than internal MixMode resources," he explains.

Tech differentiation is difficult to achieve in the cluttered AI landscape. Mezic points to MixMode's Third Wave class of AI algorithms, which are not commoditized. "The MixMode algorithm has led to several implementation patents and substantial additional network effects that enhance defensive IP moats," Mezic says.

Some claim that each organization on the customer side might have different data and associated requirements. However, enterprise networks are similar in the type of data they produce. Thus, Mezic argues, a single algorithm can be applied to all such data without substantial modifications on the ingest side.

Many machine learning startups are service-oriented. However, Mezic makes a distinction between the norm and what MixMode brings to the table.

"We have structured the AI here to apply to all the types of data that occur in our narrow, specifically defined problem," he explains. This level of specificity provides much more robust network protection than services that approach data handling generically and only according to a specific set of requirements.

Some industry experts claim large organizations have inherent data and process inefficiencies, which make them ideal targets for meaningful machine learning benefits but that the process of adopting machine learning AI is too arduous and time-consuming. However, Mezic says the security data in many large organizations is well-structured and organized, which contributes to a streamlined system set-up.

In the end, while startup industry experts make valid points about the current state of AI-enhanced services and software, these arguments must be evaluated with an overarching understanding that AI can be a limiting term. MixMode's unsupervised machine-learning AI is distinct from, and far more powerful than, garden-variety supervised AI offerings.

Learn more about how MixMode is helping organizations detect, analyze, visualize, investigate, and respond to threats in real time.

View original post here:
Not All AI and ML is Created Equal - Security Boulevard

Linear Regression: Concepts and Applications with TensorFlow 2.0 – Built In

Linear regression is probably the first algorithm one learns when starting a career in machine or deep learning because it's simple to implement and easy to apply in practice. The algorithm is widely used in data science and statistics to model the relationship between a scalar response (or dependent variable) and one or more explanatory variables (or independent variables). Several types of regression techniques are available depending on the data being used. Although linear regression involves simple mathematical logic, its applications are used across many fields. In this article, we'll discuss linear regression in brief, along with its applications, and implement it using TensorFlow 2.0.

Regression analysis is used to estimate the relationship between a dependent variable and one or more independent variables. The technique is widely applied to predict outputs, forecast data, analyze time series, and find causal dependencies between variables. There are several types of regression techniques, depending on the number of independent variables, the dimensionality of the regression line, and the type of dependent variable. The two most popular are linear regression and logistic regression.

Researchers use regression to indicate the strength of the impact of multiple independent variables on a dependent variable on different scales. Regression has numerous applications. For example, consider a dataset consisting of weather information recorded over the past few decades. Using that data, we could forecast weather for the next couple of years. Regression is also widely used in organizations and businesses to assess risk and growth based on previously recorded data.

In modern machine learning frameworks like TensorFlow and PyTorch, built-in libraries let you implement regression analysis directly as a deployable chunk of code and proceed straight to the application you want to build.

The goal of linear regression is to identify the best fit line passing through continuous data by employing a specific mathematical criterion. This technique falls under the umbrella of supervised machine learning. Prior to jumping into linear regression, though, we first should understand what supervised learning is all about.

Machine learning is broadly classified into three types: supervised learning, unsupervised learning, and reinforcement learning. This classification is based on the data we give to the algorithm. In supervised learning, we train the algorithm with both input and output data. In unsupervised learning, no output data is given to the algorithm, and it has to learn the underlying patterns by analyzing the input data. Finally, reinforcement learning involves an agent taking actions in an environment to maximize a reward in a particular situation, learning the best possible path to traverse. Now, let's look more closely at linear regression itself.

Linear regression assumes that the relationship between the features and the target vector is approximately linear. That is, the effect (also called coefficient, weight, or parameter) of the features on the target vector is constant. Mathematically, linear regression is represented by the equation y = mx + c + ε.

In this equation, y is our target, x is the data for a single feature, m and c are the coefficients identified by fitting the model, and ε is the error.

Now, our goal is to tune the values of m and c to establish a good relationship between the input variable x and the output variable y. The variable m in the equation is called the variance and is defined as the amount by which the estimate of the target function changes if different training data were used. The variable c represents the bias, the algorithm's tendency to consistently learn the wrong things by not taking into account all the information in the data. For the model to be accurate, bias needs to be low. If there are any inconsistencies or missing values in the dataset, bias increases. Hence, we must carry out proper preprocessing of the data before we train the algorithm.

The two main metrics we use to evaluate linear regression models are accuracy and error. For a model to be highly accurate with minimum error, we need to achieve low bias and low variance. We partition the data into training and testing datasets to keep bias in check and ensure accuracy.

Before we build a supervised machine learning model, all we have is data comprising inputs and outputs. To estimate the dependency between them using linear regression, we pick two random values for the variance and the bias. Then we take a tuple from the dataset, feed the input value into the equation y = mx + c, and predict a new value. Later, we calculate the loss incurred by the predicted value using a loss function.

The values of m and c are picked randomly, but they must be updated to minimize the error. We therefore use the loss function as a metric to evaluate the model. Our goal is to obtain a line that best reduces the error.

The most common loss function used is mean squared error. It is mathematically represented as MSE = (1/n) Σ (y_i − ŷ_i)², where y_i is the true value, ŷ_i is the predicted value, and n is the number of data points.

If we don't square the error, the positive and negative errors cancel each other out. Bias and variance each have precise mathematical definitions, and both contribute to the model's overall error.

When we train a network to find the ideal variance and bias, different values can yield different errors. Out of all the values, there will be one point where the error value will be minimized, and the parameters corresponding to this value will yield an optimal solution. At this point, gradient descent comes into the picture.

Gradient descent is an optimization algorithm that finds the values of the parameters (coefficients) of a function f that minimize the cost function. The learning rate defines the rate at which the parameters are updated; it controls how much we adjust the weights of our network with respect to the loss gradient. The lower the value, the more slowly we travel down the slope, updating the weights a little at each step.

Both the m and c values are updated as follows: m = m − α · (∂MSE/∂m) and c = c − α · (∂MSE/∂c), where α is the learning rate.
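As a rough illustration of these updates (not the article's own code), a single gradient-descent step written in plain NumPy might look like this, using the analytical gradients of the mean squared error:

```python
import numpy as np

def gradient_descent_step(m, c, x, y, learning_rate=0.01):
    """One update of m and c using the analytical MSE gradients."""
    error = (m * x + c) - y          # predicted values minus true values
    dm = 2 * np.mean(error * x)      # dMSE/dm
    dc = 2 * np.mean(error)          # dMSE/dc
    return m - learning_rate * dm, c - learning_rate * dc
```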

Once the model is trained and achieves a minimum error, we can fix the values of bias and variance. Plotted against the data points, the resulting line is the best fit line.

So far, we've seen the fundamentals of linear regression, and now it's time to implement one. We could use any of several data science and machine learning libraries to directly import linear regression functions or APIs and apply them to the data. In this section, we will build a model with TensorFlow based on the math we discussed in the previous sections. The code is organized as a sequence of steps. You can implement these chunks of code on your local machine or on a cloud platform like Paperspace or Google Colab. If you're working on your local machine, make sure Python and TensorFlow are installed. If you are using Google Colab notebooks, TensorFlow is preinstalled. To install any other modules, like sklearn or matplotlib, you can use pip; make sure you add an exclamation mark (!) as a prefix to the pip command, which lets you run shell commands from the notebook.

Step 1: Importing the Necessary Modules

First and foremost, we need to import all the necessary modules and packages. In Python, we use the import keyword to do this. We can also alias modules using the keyword as. For example, to create a TensorFlow variable, we import TensorFlow first and then call the class tensorflow.Variable(). If we create an alias for TensorFlow as tf, we can create the variable as tf.Variable(). This saves time and keeps the code clean. We then import a few methods from the __future__ library to help port our code from Python 2 to Python 3. We also import numpy to create a few samples of data, and declare a variable rng as np.random, which is later used to initialize random weights and biases.
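The article's original snippet isn't reproduced here, but a minimal sketch of the imports this step describes might look like the following (the alias rng comes from the text; everything else is standard):

```python
# Step 1 (sketch): imports and aliases described in the text.
from __future__ import absolute_import, division, print_function

import numpy as np
import tensorflow as tf

rng = np.random  # used later to draw random initial weights and biases
```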

Step 2: Creating a Random Dataset

The second step is to prepare the data. Here, we use numpy to initialize both the input and output arrays. We also need to make sure both arrays have the same shape so that each element in the input array corresponds to an element in the output array. Our goal is to identify, using linear regression, the relationship between each pair of corresponding elements in the input and output arrays. Below is the code snippet we would use to load the input values into variable x and the output values into variable y.
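The original data-loading snippet isn't included in this version of the article; a sketch with made-up illustrative values (not the article's own data) could look like this:

```python
# Step 2 (sketch): toy input/output arrays of equal shape.
x = np.array([3.3, 4.4, 5.5, 6.71, 6.93, 4.168, 9.779, 6.182, 7.59,
              2.167, 7.042, 10.791, 5.313, 7.997, 5.654, 9.27, 3.1])
y = np.array([1.7, 2.76, 2.09, 3.19, 1.694, 1.573, 3.366, 2.596, 2.53,
              1.221, 2.827, 3.465, 1.65, 2.904, 2.42, 2.94, 1.3])
```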

Step 3: Setting up the Hyperparameters

Hyperparameters are core components of any neural network architecture because they govern the accuracy of the model. In the code snippet below, we define the learning rate, the number of epochs, and the display step. You can also experiment by tweaking the hyperparameters to achieve greater accuracy.
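As before, the original snippet is missing; illustrative hyperparameter values (assumptions, not the article's) might be:

```python
# Step 3 (sketch): hyperparameters.
learning_rate = 0.01
training_steps = 1000  # number of epochs
display_step = 50      # print progress every 50 steps
```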

Step 4: Initializing Weights and Biases

Now that we have our hyperparameters set, let's initialize the weights and biases with random values. We do this using the rng variable declared previously. We define two TensorFlow variables, W and b, and set them to a random weight and bias, respectively, using the tf.Variable class.
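A sketch of this step, consistent with the description above:

```python
# Step 4 (sketch): random initial weight and bias as TensorFlow variables.
W = tf.Variable(rng.randn(), name="weight")
b = tf.Variable(rng.randn(), name="bias")
```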

Step 5: Defining Linear Regression and Cost Function

Here comes the essential component of our code! We now define linear regression as a simple function, linear_regression. The function takes the input x as a parameter and returns the weighted sum, weights * inputs + bias. This function is later called in the training loop while training the model with data. Next, we define the loss as a function called mean_square. This function takes a predicted value returned by the linear_regression method and a true value picked from the dataset. We then use tf operations to replicate the mean squared error equation discussed above and return the computed value.
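A minimal sketch of the two functions described here, assuming the W and b variables from the previous step:

```python
# Step 5 (sketch): the model and the mean squared error loss.
def linear_regression(x):
    # Weighted sum: weights * inputs + bias
    return W * x + b

def mean_square(y_pred, y_true):
    # Mean of the squared differences between predictions and true values
    return tf.reduce_mean(tf.square(y_pred - y_true))
```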

Step 6: Building Optimizers and Gradients

We now define our optimizer as stochastic gradient descent and pass the learning rate to it as a parameter. Next, we define the optimization process as a function, run_optimization, in which we calculate the predicted values and the loss they incur using the linear_regression() and mean_square() functions defined in the previous step. We then compute the gradients and update the weights. This function is invoked in the training loop that we'll discuss in the next section.
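A sketch of the optimization step, using TensorFlow 2.0's GradientTape to compute gradients (one reasonable way to implement what the text describes):

```python
# Step 6 (sketch): SGD optimizer and one optimization step.
optimizer = tf.optimizers.SGD(learning_rate)

def run_optimization():
    # Record the forward pass so gradients can be computed.
    with tf.GradientTape() as tape:
        pred = linear_regression(x)
        loss = mean_square(pred, y)
    # Compute gradients of the loss w.r.t. W and b, then update both.
    gradients = tape.gradient(loss, [W, b])
    optimizer.apply_gradients(zip(gradients, [W, b]))
```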

Step 7: Constructing the Training Loop

This step ties the whole training process together. We have set all the parameters and declared our model, loss function, and optimization function. In the training loop, we stack all of these together and iterate over the data for a certain number of epochs. The model gets trained, and with every iteration the weights are updated. Once the total number of iterations is complete, we get the final values of W and b.

Let's work through the code chunk below. We write a simple for loop in Python and iterate until the total number of epochs is complete. We then run our optimization by invoking the run_optimization method, where the weights get updated using the previously defined SGD rule. Finally, we display the step number and the loss, along with the current model parameters, using the print function.
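A sketch of the training loop described above, printing the loss and current parameter values every display_step iterations:

```python
# Step 7 (sketch): training loop.
for step in range(1, training_steps + 1):
    run_optimization()
    if step % display_step == 0:
        pred = linear_regression(x)
        loss = mean_square(pred, y)
        print("step: %i, loss: %f, W: %f, b: %f"
              % (step, loss, W.numpy(), b.numpy()))
```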

Step 8: Visualizing Linear Regression

To conclude, we visualize the best fit line using the matplotlib library.
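A sketch of the plotting step with matplotlib:

```python
# Step 8 (sketch): plot the data points and the fitted line.
import matplotlib.pyplot as plt

plt.plot(x, y, 'ro', label='Original data')
plt.plot(x, np.array(W * x + b), label='Fitted line')
plt.legend()
plt.show()
```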

Linear regression is a powerful statistical technique that can generate insights into consumer behavior, help us understand a business better, and identify factors that influence profitability. It can also be used to evaluate trends and forecast data in a variety of fields. We can use linear regression to solve day-to-day problems related to supporting decision making, minimizing errors, increasing operational efficiency, discovering new insights, and creating predictive analytics.

In this article, we reviewed how linear regression works, along with its implementation in TensorFlow 2.0. This method sets a baseline from which to explore other ways of building machine learning algorithms. Now that you have a handle on linear regression and TensorFlow 2.0, you can experiment further with other frameworks and datasets to see how each one fares.

Vihar Kurama is a machine learning engineer who writes regularly about machine learning and data science.

Expert Contributor Network

Built In's expert contributor network publishes thoughtful, solutions-oriented stories written by innovative tech professionals. It is the tech industry's definitive destination for sharing compelling, first-person accounts of problem-solving on the road to innovation.

See the original post:
Linear Regression: Concepts and Applications with TensorFlow 2.0 - Built In

KDD 2020 Invites Top Data Scientists To Compete in 24th Annual KDD Cup – Monterey County Weekly

SAN DIEGO, April 23, 2020 /PRNewswire/ -- The Association for Computing Machinery's Special Interest Group on Knowledge Discovery and Data Mining officially opened registration for its annual KDD Cup, the organization's signature data science competition. This year's competition features four distinct tracks that welcome participants to tackle challenges in e-commerce, generative adversarial networks, automatic graph representation learning (AutoGraph) and mobility-on-demand (MoD) platforms. Winners will be recognized at KDD 2020, the leading interdisciplinary conference in data science, in San Diego on August 23-27, 2020.

"As one of the first competitions of its kind, the KDD Cup has a long history of solving problems by crowd sourcing participation and has given rise to many other popular competition platforms," said Iryna Skrypnyk, co-chair of KDD Cup 2020 and head of the AI Innovation Lab at Pfizer. "Today, KDD Cup is not only an opportunity for data scientists to build their profiles and connect with leading companies but apply their skillset to emerging areas with machine learning on graphs like knowledge graph or drug design, and growth markets like the rideshare industry."

In 2019, more than 2,800 teams registered for the KDD Cup, representing 39 countries and 230 academic or corporate institutions. KDD Cup competition winners are selected by an entirely automated process. In 2020, the KDD Cup features different types of data science, including regular machine learning, automated machine learning, and reinforcement learning, across the four competition tracks described above.

In addition to Iryna Skrypnyk, KDD Cup 2020 is co-chaired by Claudia Perlich, senior data scientist at Two Sigma; Jie Tang, professor of Computer Science at Tsinghua University; and Jieping Ye, vice president of research at Didi Chuxing and associate professor of Computer Science at the University of Michigan. For updates on this year's KDD Cup and links to each challenge, please visit: www.kdd.org.

About ACM SIGKDD: ACM is the premier global professional organization for researchers and professionals dedicated to the advancement of the science and practice of knowledge discovery and data mining. SIGKDD is ACM's Special Interest Group on Knowledge Discovery and Data Mining. The annual KDD International Conference on Knowledge Discovery and Data Mining is the premier interdisciplinary conference for data mining, data science and analytics.

Contact or follow SIGKDD on: Facebook https://www.facebook.com/SIGKDD, Twitter https://twitter.com/kdd_news, LinkedIn https://www.linkedin.com/groups/160888/

Read more:
KDD 2020 Invites Top Data Scientists To Compete in 24th Annual KDD Cup - Monterey County Weekly