Advancing AI trustworthiness: Updates on responsible AI research – Microsoft

Posted: February 5, 2022 at 5:05 am

Editors note: This year in review of responsible AI research was compiled by Aether, a Microsoft cross-company initiative on AI Ethics and Effects in Engineering and Research, as outreach from their commitment to advancing the practice of human-centered responsible AI. Although many of the papers authors are participants in Aether, the research presented here expands beyond, encompassing work from across Microsoft, as well as with collaborators in academia and industry.

Inflated expectations around the capabilities of AI technologies may lead people to believe that computers cant be wrong. The truth is AI failures are not a matter of if but when. AI is a human endeavor that combines information about people and the physical world into mathematical constructs. Such technologies typically rely on statistical methods, with the possibility for errors throughout an AI systems lifespan. As AI systems become more widely used across domains, especially in high-stakes scenarios where peoples safety and wellbeing can be affected, a critical question must be addressed: how trustworthy are AI systems, and how much and when should people trust AI?

As part of their ongoing commitment to building AI responsibly, research scientists and engineers at Microsoft are pursuing methods and technologies aimed at helping builders of AI systems cultivate appropriate trustthat is, building trustworthy models with reliable behaviors and clear communication that set proper expectations. When AI builders plan for failures, work to understand the nature of the failures, and implement ways to effectively mitigate potential harms, they help engender trust that can lead to a greater realization of AIs benefits.

Pursuing trustworthiness across AI systems captures the intent of multiple projects on the responsible development and fielding of AI technologies. Numerous efforts at Microsoft have been nurtured by its Aether Committee, a coordinative cross-company council comprised of working groups focused on technical leadership at the frontiers of innovation in responsible AI. The effort is led by researchers and engineers at Microsoft Research and from across the company and is chaired by Chief Scientific Officer Eric Horvitz. Beyond research, Aether has advised Microsoft leadership on responsible AI challenges and opportunities since the committees inception in 2016.

The following is a sampling of research from the past year representing efforts across the Microsoft responsible AI ecosystem that highlight ways for creating appropriate trust in AI. Facilitating trustworthy measurement, improving human-AI collaboration, designing for natural language processing (NLP), advancing transparency and interpretability, and exploring the open questions around AI safety, security, and privacy are key considerations for developing AI responsibly. The goal of trustworthy AI requires a shift in perspective at every stage of the AI development and deployment life cycle. Were actively developing a growing number of best practices and tools to help with the shift to make responsible AI more available to a broader base of users. Many open questions remain, but as innovators, we are committed to tackling these challenges with curiosity, enthusiasm, and humility.

AI technologies influence the world through the connection of machine learning modelsthat provide classifications, diagnoses, predictions, and recommendationswith larger systems that drive displays, guide controls, and activate effectors. But when we use AI to help us understand patterns in human behavior and complex societal phenomena, we need to be vigilant. By creating models for assessing or measuring human behavior, were participating in the very act of shaping society. Guidelines for ethically navigating technologys impacts on societyguidance born out of considering technologies for COVID-19prompt us to start by weighing a projects risk of harm against its benefits. Sometimes an important step in the practice of responsible AI may be the decision to not build a particular model or application.

Human behavior and algorithms influence each other in feedback loops. In a recent Nature publication, Microsoft researchers and collaborators emphasize that existing methods for measuring social phenomena may not be up to the task of investigating societies where human behavior and algorithms affect each other. They offer five best practices for advancing computational social science. These include developing measurement models that are informed by social theory and that are fair, transparent, interpretable, and privacy preserving. For trustworthy measurement, its crucial to document and justify the models underlying assumptions, plus consider who is deciding what to measure and how those results will be used.

In line with these best practices, Microsoft researchers and collaborators have proposed measurement modeling as a framework for anticipating and mitigating fairness-related harms caused by AI systems. This framework can help identify mismatches between theoretical understandings of abstract conceptsfor example, socioeconomic statusand how these concepts get translated into mathematics and code. Identifying mismatches helps AI practitioners to anticipate and mitigate fairness-related harms that reinforce societal biases and inequities. A study applying a measurement modeling lens to several benchmark datasets for surfacing stereotypes in NLP systems reveals considerable ambiguity and hidden assumptions, demonstrating (among other things) that datasets widely trusted for measuring the presence of stereotyping can, in fact, cause stereotyping harms.

Flaws in datasets can lead to AI systems with unfair outcomes, such as poor quality of service or denial of opportunities and resources for different groups of people. AI practitioners need to understand how their systems are performing for factors like age, race, gender, and socioeconomic status so they can mitigate potential harms. In identifying the decisions that AI practitioners must make when evaluating an AI systems performance for different groups of people, researchers highlight the importance of rigor in the construction of evaluation datasets.

Making sure that datasets are representative and inclusive means facilitating data collection from different groups of people, including people with disabilities. Mainstream AI systems are often non-inclusive. For example, speech recognition systems do not work for atypical speech, while input devices are not accessible for people with limited mobility. In pursuit of inclusive AI, a study proposes guidelines for designing an accessible online infrastructure for collecting data from people with disabilities, one that is built to respect, protect, and motivate those contributing data.

When people and AI collaborate on solving problems, the benefits can be impressive. But current practice can be far from establishing a successful partnership between people and AI systems. A promising advance and direction of research is developing methods that learn about ideal ways to complement people with problem solving. In the approach, machine learning models are optimized to detect where people need the most help versus where people can solve problems well on their own. We can additionally train the AI systems to make decisions as to when a system should ask an individual for input and to combine the human and machine abilities to make a recommendation. In related work, studies have shown that people will too often accept an AI systems outputs without question, relying on them even when they are wrong. Exploring how to facilitate appropriate trust in human-AI teamwork, experiments with real-world datasets for AI systems show that retraining a model with a human-centered approach can better optimize human-AI team performance. This means taking into account human accuracy, human effort, the cost of mistakesand peoples mental models of the AI.

In systems for healthcare and other high-stakes scenarios, a break with the users mental model can have severe impacts. An AI system can compromise trust when, after an update for better overall accuracy, it begins to underperform in some areas. For instance, an updated system for predicting cancerous skin moles may have an increase in accuracy overall but a significant decrease for facial moles. A physician using the system may either lose confidence in the benefits of the technology or, with more dire consequences, may not notice this drop in performance. Techniques for forcing an updated system to be compatible with a previous version produce tradeoffs in accuracy. But experiments demonstrate that personalizing objective functions can improve the performance-compatibility tradeoff for specific users by as much as 300 percent.

System updates can have grave consequences when it comes to algorithms used for prescribing recourse, such as how to fix a bad credit score to qualify for a loan. Updates can lead to people who have dutifully followed a prescribed recourse being denied their promised rights or services and damaging their trust in decision makers. Examining the impact of updates caused by changes in the data distribution, researchers expose previously unknown flaws in the current recourse-generation paradigm. This work points toward rethinking how to design these algorithms for robustness and reliability.

Complementarity in human-AI performance, where the human-AI team performs better together by compensating for each others weaknesses, is a goal for AI-assisted tasks. You might think that if a system provided an explanation of its output, this could help an individual identify and correct an AI failure, producing the best of human-AI teamwork. Surprisingly, and in contrast to prior work, a large-scale study shows that explanations may not significantly increase human-AI team performance. People often over-rely on recommendations even when the AI is incorrect. This is a call to action: we need to develop methods for communicating explanations that increase users understanding rather than to just persuade.

The allure of natural language processings potential, including rash claims of human parity, raises questions of how we can employ NLP technologies in ways that are truly useful, as well as fair and inclusive. To further these and other goals, Microsoft researchers and collaborators hosted the first workshop on bridging human-computer interaction and natural language processing, considering novel questions and research directions for designing NLP systems to align with peoples demonstrated needs.

Language shapes minds and societies. Technology that wields this power requires scrutiny as to what harms may ensue. For example, does an NLP system exacerbate stereotyping? Does it exhibit the same quality of service for people who speak the same language in different ways? A survey of 146 papers analyzing bias in NLP observes rampant pitfalls of unstated assumptions and conceptualizations of bias. To avoid these pitfalls, the authors outline recommendations based on the recognition of relationships between language and social hierarchies as fundamentals for fairness in the context of NLP. We must be precise in how we articulate ideas about fairness if we are to identify, measure, and mitigate NLP systems potential for fairness-related harms.

The open-ended nature of languageits inherent ambiguity, context-dependent meaning, and constant evolutiondrives home the need to plan for failures when developing NLP systems. Planning for NLP failures with the AI Playbook introduces a new tool for AI practitioners to anticipate errors and plan human-AI interaction so that the user experience is not severely disrupted when errors inevitably occur.

To build AI systems that are reliable and fairand to assess how much to trust thempractitioners and those using these systems need insight into their behavior. If we are to meet the goal of AI transparency, the AI/ML and human-computer interaction communities need to integrate efforts to create human-centered interpretability methods that yield explanations that can be clearly understood and are actionable by people using AI systems in real-world scenarios.

As a case in point, experiments investigating whether simple models that are thought to be interpretable achieve their intended effects rendered counterintuitive findings. When participants used an ML model considered to be interpretable to help them predict the selling prices of New York City apartments, they had difficulty detecting when the model was demonstrably wrong. Providing too many details of the models internals seemed to distract and cause information overload. Another recent study found that even when an explanation helps data scientists gain a more nuanced understanding of a model, they may be unwilling to make the effort to understand it if it slows down their workflow too much. As both studies show, testing with users is essential to see if people clearly understand and can use a models explanations to their benefit. User research is the only way to validate what is or is not interpretable by people using these systems.

Explanations that are meaningful to people using AI systems are key to the transparency and interpretability of black-box models. Introducing a weight-of-evidence approach to creating machine-generated explanations that are meaningful to people, Microsoft researchers and colleagues highlight the importance of designing explanations with peoples needs in mind and evaluating how people use interpretability tools and what their understanding is of the underlying concepts. The paper also underscores the need to provide well-designed tutorials.

Traceability and communication are also fundamental for demonstrating trustworthiness. Both AI practitioners and people using AI systems benefit from knowing the motivation and composition of datasets. Tools such as datasheets for datasets prompt AI dataset creators to carefully reflect on the process of creation, including any underlying assumptions they are making and potential risks or harms that might arise from the datasets use. And for dataset consumers, seeing the dataset creators documentation of goals and assumptions equips them to decide whether a dataset is suitable for the task they have in mind.

Interpretability is vital to debugging and mitigating the potentially harmful impacts of AI processes that so often take place in seemingly impenetrable black boxesit is difficult (and in many settings, inappropriate) to trust an AI model if you cant understand the model and correct it when it is wrong. Advanced glass-box learning algorithms can enable AI practitioners and stakeholders to see whats under the hood and better understand the behavior of AI systems. And advanced user interfaces can make it easier for people using AI systems to understand these models and then edit the models when they find mistakes or bias in them. Interpretability is also important to improve human-AI collaborationit is difficult for users to interact and collaborate with an AI model or system if they cant understand it. At Microsoft, we have developed glass-box learning methods that are now as accurate as previous black-box methods but yield AI models that are fully interpretable and editable.

Recent advances at Microsoft include a new neural GAM (generalized additive model) for interpretable deep learning, a method for using dropout rates to reduce spurious interaction, an efficient algorithm for recovering identifiable additive models, the development of glass-box models that are differentially private, and the creation of tools that make editing glass-box models easy for those using them so they can correct errors in the models and mitigate bias.

When considering how to shape appropriate trust in AI systems, there are many open questions about safety, security, and privacy. How do we stay a step ahead of attackers intent on subverting an AI system or harvesting its proprietary information? How can we avoid a systems potential for inferring spurious correlations?

With autonomous systems, it is important to acknowledge that no system operating in the real world will ever be complete. Its impossible to train a system for the many unknowns of the real world. Unintended outcomes can range from annoying to dangerous. For example, a self-driving car may splash pedestrians on a rainy day or erratically swerve to localize itself for lane-keeping. An overview of emerging research in avoiding negative side effects due to AI systems incomplete knowledge points to the importance of giving users the means to avoid or mitigate the undesired effects of an AI systems outputs as essential to how the technology will be viewed or used.

When dealing with data about people and our physical world, privacy considerations take a vast leap in complexity. For example, its possible for a malicious actor to isolate and re-identify individuals from information in large, anonymized datasets or from their interactions with online apps when using personal devices. Developments in privacy-preserving techniques face challenges in usability and adoption because of the deeply theoretical nature of concepts like homomorphic encryption, secure multiparty computation, and differential privacy. Exploring the design and governance challenges of privacy-preserving computation, interviews with builders of AI systems, policymakers, and industry leaders reveal confidence that the technology is useful, but the challenge is to bridge the gap from theory to practice in real-world applications. Engaging the human-computer interaction community will be a critical component.

Reliability and safety

Privacy and security

AI is not an end-all, be-all solution; its a powerful, albeit fallible, set of technologies. The challenge is to maximize the benefits of AI while anticipating and minimizing potential harms.

Admittedly, the goal of appropriate trust is challenging. Developing measurement tools for assessing a world in which algorithms are shaping our behaviors, exposing how systems arrive at decisions, planning for AI failures, and engaging the people on the receiving end of AI systems are important pieces. But what we do know is change can happen today with each one of us as we pause and reflect on our work, asking: what could go wrong, and what can I do to prevent it?

The rest is here:

Advancing AI trustworthiness: Updates on responsible AI research - Microsoft

Related Posts