The Media’s Coverage of AI is Bogus – Scientific American

Posted: November 25, 2019 at 2:46 pm

Headlines about machine learning promise godlike predictive power. Here are four examples:

With articles like these, the press will have you believe that machine learning can reliably predict whether you're gay, whether you'll develop psychosis, whether you'll have a heart attack and whether you're a criminal, as well as other ambitious predictions such as when you'll die and whether your unpublished book will be a bestseller.

It's all a lie. Machine learning can't confidently tell such things about each individual. In most cases, these things are simply too difficult to predict with certainty.

Here's how the lie works. Researchers report high "accuracy," but then later reveal, buried within the details of a technical paper, that they were actually misusing the word "accuracy" to mean another measure of performance related to accuracy but in actuality not nearly as impressive.

But the press runs with it. Time and again, this scheme succeeds in hoodwinking the media and generating flagrant publicity stunts that mislead.

Now, don't get me wrong; machine learning does deserve high praise. The ability to predict better than random guessing, even if not with high confidence for most cases, serves to improve all kinds of business and health care processes. That's pay dirt. And, in certain limited areas, machine learning can deliver strikingly high performance, such as for recognizing objects like traffic lights within photographs or recognizing the presence of certain diseases from medical images.

But, in other cases, researchers are falsely advertising high performance. Take Stanford University's infamous "gaydar" study. In its opening summary, the 2018 report claims its predictive model achieves 91 percent accuracy distinguishing gay and straight males from facial images. This inspired journalists to broadcast gross exaggerations. The Newsweek article highlighted above kicked off with "Artificial intelligence can now tell whether you are gay or straight simply by analyzing a picture of your face."

This deceptive media coverage is to be expected. The researchers' opening claim has tacitly conveyed (to lay readers, nontechnical journalists and even casual technical readers) that the system can tell who's gay and who isn't and usually be correct about it.

That assertion is false. The model can't confidently "tell" for any given photograph. Rather, what Stanford's model can actually do 91 percent of the time is much less remarkable: It can identify which of a pair of males is gay when it's already been established that one is and one is not.

This "pairing test" tells a seductive story, but it's a deceptive one. It translates to low performance outside the research lab, where there's no contrived scenario presenting such pairings. Employing the model in the real world would require a tough trade-off. You could tune the model to correctly identify, say, two thirds of all gay individuals, but that would come at a price: When it predicted someone to be gay, it would be wrong more than half of the time, a high false positive rate. And if you configure its settings so that it correctly identifies even more than two thirds, the model will exhibit an even higher false positive rate.

The reason for this is that one of the two categories is infrequent: in this case, gay individuals, who amount to about 7 percent of males (according to the Stanford report). When one category is in the minority, that intrinsically makes it more challenging to reliably predict.
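
The arithmetic behind this trade-off can be sketched in a few lines. This is an illustrative calculation, not the Stanford model. It uses only the two figures quoted above (a 7 percent minority share and a setting that catches two thirds of gay individuals) and asks how often the model may wrongly flag a straight individual before its positive predictions are wrong more than half the time. The function names and the 6 percent false-alarm rate in the last line are assumptions made for illustration.

```python
def precision(prevalence, recall, false_alarm_rate):
    """Share of positive predictions that are correct.

    prevalence: fraction of the population in the minority category
    recall: fraction of minority-category individuals the model flags
    false_alarm_rate: fraction of majority-category individuals wrongly flagged
    """
    true_flags = prevalence * recall
    false_flags = (1.0 - prevalence) * false_alarm_rate
    return true_flags / (true_flags + false_flags)


def break_even_false_alarm_rate(prevalence, recall):
    """False-alarm rate at which exactly half of positive predictions are wrong."""
    return prevalence * recall / (1.0 - prevalence)


# Figures quoted in the article: 7 percent prevalence, two thirds recall.
PREVALENCE = 0.07
RECALL = 2.0 / 3.0

threshold = break_even_false_alarm_rate(PREVALENCE, RECALL)
print(f"Break-even false-alarm rate: {threshold:.3f}")

# Hypothetical false-alarm rate of 6 percent, chosen only for illustration.
print(f"Precision at a 6% false-alarm rate: {precision(PREVALENCE, RECALL, 0.06):.3f}")
```

Under these assumptions, wrongly flagging only about 5 percent of the majority category is already enough to make half of all positive predictions wrong.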

Now, the researchers did report on a viable measure of performance, called AUC, albeit mislabeled in their report as "accuracy." AUC (Area Under the receiver operating characteristic Curve) indicates the extent of performance trade-offs available. The higher the AUC, the better the trade-off options offered by the predictive model.

In the field of machine learning, accuracy means something simpler: How often the predictive model is correct, that is, the percent of cases it gets right. When researchers use the word to mean anything else, they're at best adopting willful ignorance and at worst consciously laying a trap to ensnare the media.

But researchers face two publicity challenges: How can you make something as technical as AUC sexy and at the same time sell your predictive model's performance? No problem. As it turns out, the AUC is mathematically equal to the result you get running the pairing test. And so, a 91 percent AUC can be explained with a story about distinguishing between pairs that sounds to many journalists like "high accuracy," especially when the researchers commit the cardinal sin of just baldly (and falsely) calling it "accuracy." Voila! Both the journalists and their readers believe the model can "tell" whether you're gay.
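
The equivalence between AUC and the pairing test can be checked directly. The sketch below computes both quantities on a small set of made-up scores (the numbers are hypothetical, not data from any study) and confirms that they agree. The function names are illustrative, and ties are counted as half a win, which is the usual convention.

```python
def pairing_test(positive_scores, negative_scores):
    """Fraction of (positive, negative) pairs in which the positive case scores higher.

    Ties count as half a win.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


def roc_auc(positive_scores, negative_scores):
    """Area under the ROC curve, computed with the trapezoid rule."""
    thresholds = sorted(set(positive_scores) | set(negative_scores), reverse=True)
    points = [(0.0, 0.0)]  # (false positive rate, true positive rate)
    for t in thresholds:
        tpr = sum(s >= t for s in positive_scores) / len(positive_scores)
        fpr = sum(s >= t for s in negative_scores) / len(negative_scores)
        points.append((fpr, tpr))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical model scores, invented for illustration only.
positives = [0.9, 0.8, 0.6, 0.4]
negatives = [0.7, 0.5, 0.4, 0.3, 0.2, 0.1]

print(f"Pairing test: {pairing_test(positives, negatives):.4f}")
print(f"ROC AUC:      {roc_auc(positives, negatives):.4f}")
```

Both lines print the same value. Neither figure says how often a single prediction is correct.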

This accuracy fallacy scheme is applied far and wide, with overblown claims about machine learning accurately predicting, among other things, psychosis, criminality, death, suicide, bestselling books, fraudulent dating profiles, banana crop diseases and various medical conditions. For an addendum to this article that covers 20 more examples, click here.

In some of these cases, researchers perpetrate a variation on the accuracy fallacy scheme: they report the accuracy you would get if half the cases were positive, that is, if the common and rare categories took place equally often. Mathematically, this usually inflates the reported "accuracy" a bit less than AUC, but it's a similar maneuver and overstates performance in much the same way.
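
A short calculation shows why accuracy measured on an evenly split sample can mislead. The sketch below assumes a hypothetical model that is right about 80 percent of minority cases and 80 percent of majority cases (figures invented for illustration, not taken from any study). It compares a 50/50 sample with the 7 percent minority share cited earlier.

```python
def accuracy(prevalence, sensitivity, specificity):
    """Overall fraction of cases classified correctly at a given prevalence."""
    return prevalence * sensitivity + (1.0 - prevalence) * specificity


def precision_from_rates(prevalence, sensitivity, specificity):
    """Fraction of positive predictions that are correct at a given prevalence."""
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * (1.0 - specificity)
    return true_pos / (true_pos + false_pos)


def majority_baseline(prevalence):
    """Accuracy of always predicting the more common category."""
    return max(prevalence, 1.0 - prevalence)


# Hypothetical model: values invented for illustration only.
SENSITIVITY = 0.80
SPECIFICITY = 0.80

for label, prevalence in [("balanced sample", 0.50), ("7% prevalence", 0.07)]:
    print(label)
    print(f"  accuracy:          {accuracy(prevalence, SENSITIVITY, SPECIFICITY):.3f}")
    print(f"  majority baseline: {majority_baseline(prevalence):.3f}")
    print(f"  precision:         {precision_from_rates(prevalence, SENSITIVITY, SPECIFICITY):.3f}")
```

In this hypothetical, the 80 percent figure looks strong against the 50 percent baseline of an even split. At 7 percent prevalence, the same model is less accurate than always guessing the majority category, and fewer than one in four of its positive predictions are correct.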

In popular culture, "gaydar" refers to an unattainable form of human clairvoyance. We shouldn't expect machine learning to attain supernatural abilities either. Many human behaviors defy reliable prediction. It's like predicting the weather many weeks in advance. There's no achieving high certainty. There's no magic crystal ball. Readers at large must hone a certain vigilance: Be wary about claims of "high accuracy" in machine learning. If it sounds too good to be true, it probably is.
