This article is part of our reviews of AI research papers, a series of posts that explore the latest findings in artificial intelligence.
Human-level performance. Human-level accuracy. Those are terms you hear a lot from companies developing artificial intelligence systems, whether it's facial recognition, object detection, or question answering. And to their credit, recent years have seen many great products powered by AI algorithms, mostly thanks to advances in machine learning and deep learning.
But many of these comparisons only take into account the end result of testing deep learning algorithms on limited data sets. This approach can create false expectations about AI systems and yield dangerous results when they are entrusted with critical tasks.
In a recent study, a group of researchers from various German organizations and universities has highlighted the challenges of evaluating the performance of deep learning in processing visual data. In their paper, titled "The Notorious Difficulty of Comparing Human and Machine Perception," the researchers highlight the problems in current methods that compare deep neural networks and the human vision system.
In their research, the scientists conducted a series of experiments that dig beneath the surface of deep learning results and compare them to the workings of the human vision system. Their findings are a reminder that we must be cautious when comparing AI to humans, even if it shows equal or better performance on the same task.
In the seemingly endless quest to reconstruct human perception, the field that has become known as computer vision, deep learning has so far yielded the most favorable results. Convolutional neural networks (CNNs), an architecture often used in computer vision deep learning algorithms, are accomplishing tasks that were extremely difficult with traditional software.
However, comparing neural networks to human perception remains a challenge. This is partly because we still have a lot to learn about the human vision system and the human brain in general. The complex workings of deep learning systems compound the problem. Deep neural networks work in very complicated ways that often confound their own creators.
In recent years, a body of research has tried to evaluate the inner workings of neural networks and their robustness in handling real-world situations. "Despite a multitude of studies, comparing human and machine perception is not straightforward," the German researchers write in their paper.
In their study, the scientists focused on three areas to gauge how humans and deep neural networks process visual data.
The first test involves contour detection. In this experiment, both humans and AI participants must say whether an image contains a closed contour or not. The goal here is to understand whether deep learning algorithms can learn the concept of closed and open shapes, and whether they can detect them under various conditions.
"For humans, a closed contour flanked by many open contours perceptually stands out. In contrast, detecting closed contours might be difficult for DNNs as they would presumably require a long-range contour integration," the researchers write.
For the experiment, the scientists used ResNet-50, a popular convolutional neural network developed by AI researchers at Microsoft. They used transfer learning to fine-tune the AI model on 14,000 images of closed and open contours.
They then tested the AI on various examples, starting with ones that resembled the training data and gradually shifting away from it. The initial findings showed that a well-trained neural network seems to grasp the idea of a closed contour. Even though the network was trained on a dataset that only contained shapes with straight lines, it also performed well on curved lines.
"These results suggest that our model did, in fact, learn the concept of open and closed contours and that it performs a similar contour integration-like process as humans," the scientists write.
However, further investigation showed that other changes that didn't affect human performance degraded the accuracy of the AI model's results. For instance, changing the color and width of the lines caused a sudden drop in the accuracy of the deep learning model. The model also seemed to struggle with detecting shapes when they became larger than a certain size.
The neural network was also very sensitive to adversarial perturbations, carefully crafted changes that are imperceptible to the human eye but cause disruption in the behavior of machine learning systems.
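To make the idea of an adversarial perturbation concrete, here is a minimal sketch of the fast gradient sign method applied to a toy logistic-regression classifier. It is not the attack or the model used in the paper. The weights, the input, and the step size `eps` are hypothetical values chosen only for illustration.

```python
import math


def predict(weights, bias, x):
    """Probability of class 1 under a toy logistic-regression model."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def sign(value):
    """Return -1, 0 or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def fgsm(weights, bias, x, label, eps):
    """Take one fast-gradient-sign step that increases the loss for `label`.

    For logistic regression with cross-entropy loss, the gradient of the
    loss with respect to input x_i is (p - label) * w_i.
    """
    p = predict(weights, bias, x)
    grad = [(p - label) * w for w in weights]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical model and input: 100 "pixels" with alternating weights.
weights = [0.5, -0.5] * 50
bias = 0.0
x = [0.52, 0.48] * 50

x_adv = fgsm(weights, bias, x, label=1, eps=0.05)
p_clean = predict(weights, bias, x)
p_adv = predict(weights, bias, x_adv)
print(f"clean: {p_clean:.3f}  adversarial: {p_adv:.3f}")
```

No input value moves by more than `eps`, yet the predicted probability crosses the 0.5 decision boundary. Because many small changes all push in the direction the model is most sensitive to, their effects add up, which is the intuition behind why high-dimensional image models can be fragile in this way.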
To further investigate the decision-making process of the AI, the scientists used a Bag-of-Feature network, a technique that tries to localize the bits of data that contribute to the decision of a deep learning model. According to the researchers, the analysis showed that "there do exist local features such as an endpoint in conjunction with a short edge that can often give away the correct class label."
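The following sketch illustrates why a purely local cue can be enough to separate the two classes. It is a hand-written heuristic, not the Bag-of-Feature analysis from the paper. It assumes one-pixel-wide lines on a small binary grid and counts endpoints, meaning line pixels with exactly one 4-connected neighbor. The helper names and the two example drawings are hypothetical.

```python
def parse(picture):
    """Turn a list of strings into a boolean grid ('#' marks a line pixel)."""
    return [[ch == "#" for ch in row] for row in picture]


def count_endpoints(grid):
    """Count line pixels that have exactly one 4-connected line neighbor."""
    rows, cols = len(grid), len(grid[0])
    endpoints = 0
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c]:
                continue
            neighbors = 0
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols and grid[rr][cc]:
                    neighbors += 1
            if neighbors == 1:
                endpoints += 1
    return endpoints


def looks_closed(grid):
    """Local-cue heuristic: no endpoints anywhere means 'closed'."""
    return count_endpoints(grid) == 0


closed_contour = parse([
    ".......",
    ".#####.",
    ".#...#.",
    ".#...#.",
    ".#####.",
    ".......",
])

open_contour = parse([
    ".......",
    ".#####.",
    ".#...#.",
    ".#.....",
    ".#####.",
    ".......",
])

print(looks_closed(closed_contour), looks_closed(open_contour))
```

The check never looks at the shape as a whole. It only inspects each pixel's immediate neighborhood, so a classifier that relies on a cue like this can score well on such images without performing anything resembling long-range contour integration.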
The second experiment tested the abilities of deep learning algorithms in abstract visual reasoning. The data used for the experiment is based on the Synthetic Visual Reasoning Test (SVRT), in which the AI must answer questions that require understanding of the relations between different shapes in the picture. The tests include same-different tasks (e.g., are two shapes in a picture identical?) and spatial tasks (e.g., is the smaller shape in the center of the larger shape?). A human observer would easily solve these problems.
For their experiment, the researchers used ResNet-50 and tested how it performed with training datasets of different sizes. The results show that a pretrained model fine-tuned on 28,000 samples performs well on both same-different and spatial tasks. (Previous experiments trained a very small neural network on a million images.) The performance of the AI dropped as the researchers reduced the number of training examples, but the degradation was faster in same-different tasks.
"Same-different tasks require more training samples than spatial reasoning tasks," the researchers write, adding that "this cannot be taken as evidence for systematic differences between feed-forward neural networks and the human visual system."
The researchers note that the human visual system is naturally pre-trained on large amounts of abstract visual reasoning tasks. This makes it unfair to test the deep learning model in a low-data regime, and it makes it almost impossible to draw solid conclusions about differences in the internal information processing of humans and AI.
"It might very well be that the human visual system trained from scratch on the two types of tasks would exhibit a similar difference in sample efficiency as a ResNet-50," the researchers write.
The recognition gap is one of the most interesting tests of visual systems. Consider the following image. Can you tell what it is without scrolling further down?
Below is the zoomed-out view of the same image. There's no question that it's a cat. If I had shown you a close-up of another part of the image (perhaps the ear), you might have had a greater chance of predicting what was in the image. We humans need to see a certain amount of overall shapes and patterns to be able to recognize an object in an image. The more you zoom in, the more features you're removing, and the harder it becomes to distinguish what is in the image.
Deep learning systems also operate on features, but they work in subtler ways. Neural networks sometimes find minuscule features that are imperceptible to the human eye but remain detectable even when you zoom in very closely.
In their final experiment, the researchers tried to measure the recognition gap of deep neural networks by gradually zooming in on images until the accuracy of the AI model started to degrade considerably.
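The zoom-in procedure can be sketched as a greedy search for the smallest crop a classifier still recognizes. This is a simplified illustration, not the paper's exact protocol. The four corner crops at 80 percent of the parent size, the 0.5 recognition threshold, and the toy `classify` function (which simply reports how much of a hypothetical object is visible) are all assumptions made for this example.

```python
def corner_crops(box, scale=0.8):
    """Return the four corner sub-crops of box = (left, top, right, bottom)."""
    left, top, right, bottom = box
    w = (right - left) * scale
    h = (bottom - top) * scale
    return [
        (left, top, left + w, top + h),
        (right - w, top, right, top + h),
        (left, bottom - h, left + w, bottom),
        (right - w, bottom - h, right, bottom),
    ]


def recognition_gap(classify, box, threshold=0.5, max_steps=50):
    """Greedily shrink `box` while its best sub-crop stays recognizable.

    Returns (smallest_recognized_box, gap), where gap is the drop in
    probability between that box and its best, unrecognized, sub-crop.
    Returns (box, None) if no such point is reached within max_steps.
    """
    prob = classify(box)
    if prob < threshold:
        raise ValueError("the starting crop is not recognized")
    for _ in range(max_steps):
        scored = [(classify(child), child) for child in corner_crops(box)]
        best_prob, best = max(scored, key=lambda pair: pair[0])
        if best_prob < threshold:
            return box, prob - best_prob
        box, prob = best, best_prob
    return box, None


# Hypothetical stand-in for a model: fraction of the object that is visible.
OBJECT = (20.0, 20.0, 60.0, 60.0)


def classify(box):
    left, top, right, bottom = box
    overlap_w = max(0.0, min(right, OBJECT[2]) - max(left, OBJECT[0]))
    overlap_h = max(0.0, min(bottom, OBJECT[3]) - max(top, OBJECT[1]))
    object_area = (OBJECT[2] - OBJECT[0]) * (OBJECT[3] - OBJECT[1])
    return overlap_w * overlap_h / object_area


minimal_box, gap = recognition_gap(classify, (0.0, 0.0, 100.0, 100.0))
print(minimal_box, gap)
```

With a real model, `classify` would crop the image to `box`, resize it, and return the probability assigned to the correct class. The same search can then be run on patches chosen by people or by the model itself, and the resulting gaps compared.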
Previous experiments show a large difference between the image recognition gap in humans and deep neural networks. But in their paper, the researchers point out that most previous tests on neural network recognition gaps are based on human-selected image patches. These patches favor the human vision system.
When they tested their deep learning models on machine-selected patches, the researchers obtained results that showed a similar gap in humans and AI.
"These results highlight the importance of testing humans and machines on the exact same footing and of avoiding a human bias in the experiment design," the researchers write. "All conditions, instructions and procedures should be as close as possible between humans and machines in order to ensure that all observed differences are due to inherently different decision strategies rather than differences in the testing procedure."
As our AI systems become more complex, we will have to develop more complex methods to test them. Previous work in the field shows that many of the popular benchmarks used to measure the accuracy of computer vision systems are misleading. The work by the German researchers is one of many efforts that attempt to measure artificial intelligence and better quantify the differences between AI and human intelligence. And they draw conclusions that can provide directions for future AI research.
"The overarching challenge in comparison studies between humans and machines seems to be the strong internal human interpretation bias," the researchers write. "Appropriate analysis tools and extensive cross checks such as variations in the network architecture, alignment of experimental procedures, generalization tests, adversarial examples and tests with constrained networks help rationalizing the interpretation of findings and put this internal bias into perspective. All in all, care has to be taken to not impose our human systematic bias when comparing human and machine perception."
Original post:
Computer vision: Why it's hard to compare AI and human perception - TechTalks