
Does this artificial intelligence think like a human? | MIT News

In machine learning, understanding why a model makes certain decisions is often as important as whether those decisions are correct. For example, a machine learning model could correctly predict that a skin lesion is cancerous, but it could have done so using an unrelated signal in a clinical photo.

While there are tools to help experts make sense of a model’s reasoning, often these methods only provide insight into one decision at a time, and each must be manually evaluated. Models are commonly trained using millions of data inputs, making it nearly impossible for a human to evaluate enough decisions to identify patterns.

Now researchers at MIT and IBM Research have created a method that allows a user to aggregate, sort, and rank these individual explanations to quickly analyze the behavior of a machine learning model. Their technique, called Shared Interest, incorporates quantifiable metrics that compare how well a model’s reasoning matches that of a human.

Shared Interest could help a user easily discover worrying trends in a model’s decision-making; for example, perhaps the model is often confused by distracting, irrelevant features, such as background objects in photos. Aggregating these insights could help the user quickly and quantitatively determine whether a model is reliable and ready to be deployed in a real-world setting.

“In developing Shared Interest, our goal is to be able to scale up this analysis process so that you can understand at a more global level what your model’s behavior is,” says lead author Angie Boggust, a graduate student in the Visualization Group at the Computer Science and Artificial Intelligence Laboratory (CSAIL).

Boggust wrote the paper with her adviser, Arvind Satyanarayan, an assistant professor of computer science who heads the Visualization Group, as well as Benjamin Hoover and senior author Hendrik Strobelt, both of IBM Research. The work will be presented at the Conference on Human Factors in Computing Systems.

Boggust began work on this project during a summer internship at IBM, under Strobelt’s mentorship. After returning to MIT, Boggust and Satyanarayan expanded the project and continued collaborating with Strobelt and Hoover, who helped implement the case studies showing how the technique could be used in practice.

Human-AI Alignment

Shared Interest takes advantage of popular techniques that show how a machine learning model made a specific decision, known as saliency methods. If the model is classifying images, saliency methods highlight the areas of an image that were important to the model when it made its decision. These areas are displayed as a type of heat map, called a saliency map, which is often overlaid on the original image. If the model classified the image as a dog and the dog’s head is highlighted, that means those pixels were important to the model when it decided the image contained a dog.
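To make this concrete, a minimal gradient-based saliency map can be sketched in a few lines of PyTorch. This is only an illustration of the general idea; the model, preprocessing, and class index are placeholders, and the saliency methods the researchers rely on (such as Integrated Gradients or Grad-CAM) are more sophisticated than this raw-gradient version.

```python
# Illustrative vanilla-gradient saliency sketch (not the paper's method).
import torch

def vanilla_gradient_saliency(model, image, target_class):
    """Return an (H, W) saliency map for a single image tensor of shape (C, H, W)."""
    model.eval()
    image = image.clone().unsqueeze(0).requires_grad_(True)   # add a batch dimension
    score = model(image)[0, target_class]                      # logit for the class of interest
    score.backward()                                            # gradient of the score w.r.t. the pixels
    saliency = image.grad.abs().max(dim=1).values.squeeze(0)   # max |gradient| across color channels
    return saliency / saliency.max()                            # normalize to [0, 1] for display as a heat map
```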

Shared Interest works by comparing saliency methods with ground-truth data. In an image dataset, ground-truth data is typically a human-generated annotation that surrounds the relevant parts of each image. In the example above, the box would surround the entire dog in the photo. When evaluating an image classification model, Shared Interest compares the model-generated saliency data and the human-generated ground-truth data for the same image to see how well they align.
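A rough sketch of that comparison: threshold the saliency map into a binary mask and measure how it overlaps the human-annotated region. The metric names below are illustrative placeholders rather than the paper’s exact definitions, but they capture the kind of alignment scores Shared Interest computes.

```python
import numpy as np

def alignment_scores(saliency_mask: np.ndarray, ground_truth_mask: np.ndarray) -> dict:
    """Compare a binarized saliency mask with a human-annotated region (both boolean (H, W) arrays)."""
    intersection = np.logical_and(saliency_mask, ground_truth_mask).sum()
    union = np.logical_or(saliency_mask, ground_truth_mask).sum()
    return {
        "iou": intersection / max(union, 1),                                       # overall similarity of the two regions
        "ground_truth_coverage": intersection / max(ground_truth_mask.sum(), 1),   # how much of the annotation the model used
        "saliency_coverage": intersection / max(saliency_mask.sum(), 1),           # how much of the model's focus lies inside the annotation
    }
```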

The technique uses various metrics to quantify that alignment (or misalignment) and then classifies a particular decision into one of eight categories. The categories range from perfectly human-aligned (the model makes a correct prediction and the highlighted area on the saliency map is identical to the human-generated box) to completely distracted (the model makes an incorrect prediction and does not use any of the image features found in the human-generated box).
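Under those assumptions, the categorization step amounts to bucketing each decision by whether the prediction was correct and how the alignment scores fall. The sketch below shows only the two extremes named above, with placeholder thresholds; the actual Shared Interest taxonomy defines eight cases.

```python
def categorize_decision(prediction_correct: bool, scores: dict) -> str:
    """Illustrative bucketing into the two extreme cases described above (placeholder thresholds)."""
    if prediction_correct and scores["iou"] > 0.9:
        return "human_aligned"   # correct prediction, saliency nearly identical to the annotation
    if not prediction_correct and scores["saliency_coverage"] < 0.1:
        return "distracted"      # wrong prediction, saliency almost entirely outside the annotation
    return "intermediate"        # one of the remaining in-between cases
```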

“At one end of the spectrum, your model made the decision for exactly the same reason a human did, and at the other end of the spectrum, your model and the human make this decision for totally different reasons. By quantifying that for all the images in your dataset, you can use that quantification to classify them,” explains Boggust.

The technique works similarly with text-based data, where keywords are highlighted instead of image regions.
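For text, the same kind of scores can be computed over sets of highlighted tokens instead of pixel masks, as in this minimal sketch (again illustrative rather than the paper’s implementation):

```python
def text_alignment_scores(salient_tokens: set, annotated_tokens: set) -> dict:
    """Alignment scores over highlighted words instead of image pixels."""
    overlap = len(salient_tokens & annotated_tokens)
    return {
        "iou": overlap / max(len(salient_tokens | annotated_tokens), 1),
        "ground_truth_coverage": overlap / max(len(annotated_tokens), 1),
        "saliency_coverage": overlap / max(len(salient_tokens), 1),
    }
```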

Quick Analysis

The researchers used three case studies to show how Shared Interest could be useful to both non-experts and machine learning researchers.

In the first case study, they used Shared Interest to help a dermatologist determine whether they should trust a machine learning model designed to help diagnose cancer from photographs of skin lesions. Shared Interest allowed the dermatologist to quickly see examples of the model’s correct and incorrect predictions. Ultimately, the dermatologist decided they could not trust the model because it made too many predictions based on imaging artifacts rather than actual lesions.

“The value here is that by using Shared Interest, we can see these patterns emerge in the behavior of our model. In about half an hour, the dermatologist was able to make a confident decision on whether or not to trust the model and whether or not to implement it,” says Boggust.

In the second case study, they worked with a machine learning researcher to show how Shared Interest can evaluate a particular saliency method by revealing previously unknown errors in the model. The technique allowed the researcher to analyze thousands of correct and incorrect decisions in a fraction of the time required by typical manual methods.

In the third case study, they used Shared Interest to delve into a specific image classification example. By manipulating the ground-truth area of the image, they were able to perform a what-if analysis to see which image features were most important for particular predictions.

The researchers were impressed by how well Shared Interest performed in these case studies, but Boggust cautions that the technique is only as good as the saliency methods it is built on. If those techniques contain bias or are inaccurate, Shared Interest will inherit those limitations.

In the future, the researchers want to apply Shared Interest to different types of data, particularly tabular data used in medical records. They also want to use Shared Interest to help improve existing saliency techniques. Boggust hopes this research will inspire more work that seeks to quantify machine learning model behavior in ways that make sense to humans.

This work is supported, in part, by the MIT-IBM Watson AI Lab, the US Air Force Research Laboratory, and the US Air Force Artificial Intelligence Accelerator.
