Monday, May 21, 2018

AI SPECIAL: The real-world potential and limitations of artificial intelligence, Part II



David Schwartz: Right. I’m hearing that we’re dealing with very complicated problems, very complex issues. How would someone, outside in, ever understand what may appear to be—may in fact be—almost a black box?
James Manyika: This is the question of explainability, which is: How do we even know that? Think about where we start applying these systems in the financial world—for example, to lending. If we deny your mortgage application, you may want to know why. What is the data point or feature set that led to that decision? If you apply these systems to the criminal-justice system, and somebody’s been let out on bail while somebody else wasn’t, you may want to understand why we came to that conclusion. It may also be an important question for purely research purposes, where you’re trying to discover particular behaviors, and so you’re trying to understand what particular part of the data leads to a particular set of behaviors.
This is a very hard problem structurally. The good news, though, is that we’re starting to make progress on some of these things. One of the ways in which we’re making progress is with so-called GAMs. These are generalized additive models where, as opposed to throwing massive numbers of feature models in at the same time, you take one feature model set at a time, and you build on it.
For example, when you apply the neural network, you’re exploring one particular feature, and then you layer on another feature; so, you can see how the results are changing based on this kind of layering, if you like, of different feature models. You can see, when the results shift, which model feature set seemed to have made the biggest difference. This is a way to start to get some insight into what exactly is driving the behaviors and outcomes you’re getting.
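To make the layering idea concrete, here is a minimal sketch, assuming scikit-learn and a small synthetic tabular data set (both are illustrative choices, not something from the conversation): a simple model is refit with one additional feature at a time, and the change in the validation score shows which feature shifted the results most.

# Minimal sketch of "layering" features one at a time and watching how the score moves.
# The data set and feature names are invented for illustration.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Synthetic lending-style data: 5 features, binary approve/deny label.
X, y = make_classification(n_samples=2000, n_features=5, n_informative=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

feature_names = [f"feature_{i}" for i in range(X.shape[1])]
prev_auc = 0.5  # score of a no-information baseline

# Add one feature at a time and record how much each addition changes the result.
for k in range(1, X.shape[1] + 1):
    model = LogisticRegression(max_iter=1000)
    model.fit(X_train[:, :k], y_train)
    auc = roc_auc_score(y_test, model.predict_proba(X_test[:, :k])[:, 1])
    print(f"+ {feature_names[k - 1]:10s}  AUC = {auc:.3f}  (change: {auc - prev_auc:+.3f})")
    prev_auc = auc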
Michael Chui: One of the other big drivers for explainability is regulation and regulators. If a car decides to make a left turn versus a right turn, and there’s some liability associated with that, the legal system will want to ask the question, “Why did the car make the left turn or the right turn?” In the European Union, the General Data Protection Regulation will require explainability for certain types of decisions that these machines might make. The machines are completely deterministic. You could say, “Here are a million weights that are associated with our simulated neurons. Here’s why.” But that’s not an explanation a human being can make sense of.
Another technique goes by the acronym LIME, for local interpretable model-agnostic explanations. The idea there is to work from the outside in—rather than look at the structure of the model, you perturb certain parts of the inputs and see whether that makes a difference to the outputs. If you’re looking at an image and trying to recognize whether an object is a pickup truck or an ordinary sedan, you might say, “If I change the windscreen in the input, does that give me a different output? On the other hand, if I change the back end of the vehicle, it looks like that makes a difference.” That tells you that what the model is paying attention to, as it determines whether it’s a sedan or a pickup truck, is the back part of the vehicle. It’s basically doing experiments on the model in order to figure out what makes a difference. Those are some of the techniques that people are trying to use in order to explain how these systems work.
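The perturbation idea can be sketched without the LIME library itself. The snippet below is a simplified, occlusion-style illustration of the same outside-in principle; predict_pickup_prob is a hypothetical stand-in for whatever black-box image classifier is being probed, and real LIME goes further by fitting a local surrogate model around the example.

# Occlusion-style sketch of outside-in explanation: grey out one region of the image
# at a time and see how much the prediction changes. predict_pickup_prob is hypothetical.
import numpy as np

def predict_pickup_prob(image: np.ndarray) -> float:
    """Hypothetical black-box classifier: probability that the image is a pickup truck."""
    raise NotImplementedError("plug in your own model here")

def region_importance(image: np.ndarray, patch: int = 32) -> np.ndarray:
    """Occlude one square patch at a time and record how much the score drops."""
    base = predict_pickup_prob(image)
    h, w = image.shape[:2]
    importance = np.zeros((h // patch, w // patch))
    for i in range(0, h - h % patch, patch):
        for j in range(0, w - w % patch, patch):
            perturbed = image.copy()
            perturbed[i:i + patch, j:j + patch] = image.mean()  # occlude this region
            importance[i // patch, j // patch] = base - predict_pickup_prob(perturbed)
    return importance  # high values mark regions the model is "paying attention" to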
David Schwartz: At some level, I’m hearing from the questions, and from what the rejoinder might be, that there’s a very human element. A question would be: Why is the answer such and such? And the answer could be, it’s the algorithm. But somebody—or a team of somebodies—and machines built that algorithm. That brings us to a limitation that is not quite like the others: bias—human predilections. Could you speak a little bit more about what we’re up against, James?
It becomes very, very important to think through what might be the inherent biases in the data, in any direction.
James Manyika: The question of bias is a very important one. And I’d put it into two parts.
Clearly, these algorithms are, in some ways, a big improvement on human biases. This is the positive side of the bias conversation. We know that, for example, sometimes, when humans are interpreting data on CVs [curriculum vitae], they might gravitate to one set of attributes and ignore some other attributes because of whatever predilections they bring. There’s a big part of this in which the application of these algorithms is, in fact, a significant improvement compared to human biases. In that sense, this is a good thing. We want those kinds of benefits.
But I think it’s worth having the second part of the conversation, which is that even when we apply these algorithms, we know they are creatures of the data and the inputs you put in. If those inputs have some inherent biases of their own, you may be introducing different kinds of biases at much larger scale.
The work of people like Julia Angwin and others has actually shown what happens when the data collected is already biased. If you take policing as an example, we know that some communities are more heavily policed; there’s a much larger police presence, and therefore the amount of data collected about those environments is much, much higher. If we then start to compare, say, two neighborhoods—one that’s oversampled, meaning there’s lots and lots of data available for it because there’s a larger police presence, versus another where there isn’t much policing and, therefore, there isn’t much data available—we may draw the wrong conclusions about the heavily policed environment, simply because there’s more data available for it than for the other one.
The biases can also cut the other way. In the case of lending, for example, for populations or segments where we have lots and lots of financial data, we may actually make good decisions because the data is largely available. For a segment of the population we don’t know much about, the little bit that we do know may send the decision off in one direction. That’s an example where undersampling creates a bias.
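The sampling effect described here can be illustrated with a small, entirely invented simulation: two neighborhoods with the same underlying incident rate, one observed far more intensively than the other. The raw counts make the heavily observed neighborhood look much worse, and the estimate for the lightly observed one is also far less certain.

# Invented illustration of sampling bias: identical true rates, very different observation counts.
import numpy as np

rng = np.random.default_rng(0)
true_rate = 0.02  # same underlying rate in both neighborhoods (number is invented)
observations = {"heavily_observed": 50_000, "lightly_observed": 500}  # how often each is sampled

for name, n in observations.items():
    incidents = rng.binomial(n, true_rate)   # incidents actually recorded
    est = incidents / n                      # rate estimated from the recorded data
    se = (est * (1 - est) / n) ** 0.5        # rough standard error of that estimate
    print(f"{name:17s} recorded = {incidents:5d}  estimated rate = {est:.3f} ± {1.96 * se:.3f}")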
The point about this second part is that it becomes very, very important to think through what inherent biases, in any direction, might be in the data set itself—either in the way it’s constructed, the way it’s collected, or the degree of sampling and the granularity of the data. Can we debias that in some fundamental way?
This is why the question of bias, for leaders, is particularly important, because it runs a risk of opening companies up to all kinds of potential litigation and social concern, particularly when you get to using these algorithms in ways that have social implications. Again, lending is a good example. Criminal justice is another example. Provision of healthcare is another example. These become very, very important arenas to think about these questions of bias.
Michael Chui: In some of the difficult cases, the bias in the data, at least in the first instance, isn’t primarily about people’s inherent biases in choosing one thing over another. In many cases it’s about sampling—sampling bias, data-collection bias, et cetera—which, again, is not necessarily unconscious human bias but an artifact of where the data came from.
There’s a very famous case, less AI related, where an American city used an app, in the early days of smartphones, that determined where potholes were based on the accelerometer shaking when you drove over a pothole. Strangely, if you looked at the data, it seemed that there were more potholes in affluent parts of the city. That had nothing to do with there actually being more potholes in that part of the city; you simply had more signals from that part of the city because more affluent people had more smartphones at the time. It’s one of those cases where it wasn’t because of any intention to ignore certain parts of the city. Understanding the provenance of data—understanding what’s being sampled—is incredibly important.
There’s another researcher who has a famous TED Talk, Joy Buolamwini at the MIT Media Lab. She does a lot of work on facial recognition, and she’s a black woman. She says, “Look, a lot of the other researchers are more male and more pale than I am. And as a result, the accuracy of facial recognition for certain populations is far higher than it is for me.” So again, it’s not necessarily because people are trying to exclude populations, although sometimes that happens; it really has to do with understanding the representativeness of the sample that you’re using to train your systems.
So, as a business leader, if you’re going to train machine-learning systems, you need to understand: How representative are the training sets you’re using?
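One concrete way to ask that question is sketched below, under the assumption that the training data sits in a pandas DataFrame with a demographic column; the column name, the groups, and the population shares are all hypothetical. The idea is simply to compare each group’s share of the training set with its share of the population the system is meant to serve.

# Sketch of a representativeness check; column name, groups, and shares are hypothetical.
import pandas as pd

def representation_gap(train_df: pd.DataFrame,
                       population_shares: dict,
                       column: str = "group") -> pd.DataFrame:
    """Compare each group's share of the training set with its share of the population."""
    train_shares = train_df[column].value_counts(normalize=True)
    rows = []
    for group, pop_share in population_shares.items():
        train_share = float(train_shares.get(group, 0.0))
        rows.append({"group": group,
                     "population_share": pop_share,
                     "training_share": train_share,
                     "gap": train_share - pop_share})
    return pd.DataFrame(rows).sort_values("gap")

# Example: a training set where one group is badly undersampled relative to the population.
df = pd.DataFrame({"group": ["A"] * 900 + ["B"] * 100})
print(representation_gap(df, {"A": 0.6, "B": 0.4}))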
People forget that one of the things in the AI machine-deep-learning world is that many researchers are using largely the same data sets that are shared—that are public.
James Manyika: It actually creates an interesting tension. That’s why I described the part one and the part two. In the first instance, when you look at the part-one problem, which is the inherent human biases in normal day-to-day hiring and similar decisions, you get very excited about using AI techniques. You say, “Wow, for the first time, we have a way to get past these human biases in everyday decisions.” But at the same time, we should be thoughtful about where that takes us when we get to these part-two problems, where we are now using large data sets that have inherent biases.
I think people forget that one of the things in the AI machine-deep-learning world is that many researchers are using largely the same data sets that are shared—that are public. Unless you happen to be a company that has these large, proprietary data sets, people are using this famous CIFAR data set, which is often used for object recognition. It’s publicly available. Most people benchmark their performance on image recognition based on these publicly available data sets. So, if everybody’s using common data sets that may have these inherent biases in them, we’re kind of replicating large-scale biases. This tension between part one and part two and this bias question are very important ones to think through. The good news, though, is that in the last couple years, there’s been a growing recognition of the issues we just described. And I think there are now many places that are putting real research effort into these questions about how you think about bias.
David Schwartz: What are best practices for AI, given what we’ve discussed today about the wide range of applications, the wide range of limitations, and the wide range of challenges before us?
Michael Chui: It is early, so to talk about best practices might be a little bit premature. I’ll steal a phrase that I once heard from Gary Hamel: we might be talking about next practices, in a certain sense. That said, there are a few things that we’ve observed from leaders who are pioneers and vanguards.
The first thing is one we’ve described as “get calibrated,” but it’s really just to start to understand the technology and what’s possible. For some of the things that we’ve talked about today, business leaders over the past few years have had to understand technology more. This is really on the tip of the spear, on the cutting edge. So, really try to understand what’s possible in the technology.
Then, try to understand what the potential implications are across your entire business. As we said, these technologies are widely applicable. So, understand where in your business you’re deriving value and how these technologies can help you derive value, whether it’s marketing and sales, whether it’s supply chain, whether it’s manufacturing, whether it’s in human capital or risk.
AI’s impact is likely to be most substantial in marketing and sales, supply-chain management, and manufacturing.
And then, don’t be afraid to be bold. At least experiment. This is a type of technology where there’s a learning curve, and the earlier you start to learn, the faster you’ll go up the curve and the quicker you’ll learn where you can add value, where you can find data, and how you can have a data strategy in order to unlock the data you need to do machine learning. Getting started early—there’s really no substitute for that.
James Manyika: The only other thing I would add is something you’ve been working a lot on, Michael. One of the things that leaders are going to have to understand, or make sure that their teams understand, is this question of which techniques map to which kinds of problems, and also which techniques lead to what kind of value.
We know that the vast majority of these techniques, in the end, are largely classifiers. Knowing that is helpful. Then it’s about knowing whether the kinds of problems in your business system look like classification problems; if so, you have an enormous opportunity. That leads to thinking about where the economic value is and whether you have the data available.
There’s a much more granular understanding that leaders are going to have to have, unfortunately. The reason why this matters, back to Michael’s next-practice point, is that we are already seeing, if you like, a differentiation between those leaders and companies who are at the frontier of understanding and applying these techniques, versus others who are, quite frankly, dabbling—or, at least, paying lip service.
As a leader, I would think it’s worth occasionally visiting or spending time with researchers at the frontier, or at least talking to them, just to understand what’s going on and what’s now possible. Because this field is moving so quickly, things that may have been seen as limitations two years ago may not be limitations anymore. And if you’re still relying on a conversation you had with an AI scientist two years ago, you may already be behind.
David Schwartz: James and Michael, absolutely fascinating. Thank you for joining us.
James Manyika: Thank you.
Michael Chui: Thank you.
https://www.mckinsey.com/featured-insights/artificial-intelligence/the-real-world-potential-and-limitations-of-artificial-intelligence
