The (Extra) Eyes Have It

UCSB researchers investigate the wisdom of crowds in the realm of visual searches

Your doctor is an expert with many years of experience. So when she tells you, upon reviewing all the fancy tomographic imaging you had done, that the tenderness in your breast is just some minor irritation, you want to believe her and leave it at that.

But is she right?

According to researchers at UC Santa Barbara, a second pair of eyes studying those same images looks to be more beneficial than previously thought when searching for hard-to-find objects in a “noisy” field, especially when the searcher is under time pressure and other constraints. The scientists’ findings, detailed in the paper “The Wisdom of Crowds for Visual Search,” are published in the Proceedings of the National Academy of Sciences.

“We show that the benefits in having more people do the task will be larger when individuals cannot exhaustively search the entire image,” said Mordechai Juni, a postdoctoral researcher in UCSB’s Department of Psychological & Brain Sciences and lead author of the study conducted with Professor Miguel Eckstein. In a fast-paced world with increasing amounts of visual information — closed-circuit TV, geospatial imaging and medical tomography, to name a few — tapping into the wisdom of crowds might be especially useful. Each individual is unlikely to look at all regions of all images.

The study builds on a longstanding concept that the aggregated answers of a large group of people are usually more accurate than the response of a single expert. A classic example occurred at an English county fair in 1907, when the averaged estimates of a crowd of people vying to guess the weight of an ox came closer to the true weight than the individual entries, including those of cattle experts.

“It appears then, in this particular instance, that the vox populi is correct to within one per cent of the real value,” concluded Sir Francis Galton, who conducted that study.

The benefit of the “wisdom of crowds” has been found in human judgments in the domains of estimation, detection (where the location of the target is familiar), identification and prediction. However, until now, the value of that phenomenon with regard to visual search had not been well studied.

In this preliminary work, the researchers used an eye-tracking device to record the visual scan paths of a group of undergraduate students. They did so first with a search task (requiring a yes or no response to the presence of a hard-to-find object anywhere on a field), then with a single-location task (requiring a yes or no response to the presence of a hard-to-detect object in a fixed and known position on a field). Their results demonstrated that aggregating the observers’ responses, weighted by their confidence, improved performance more in the search task than in the single-location task, and by more than expected.

“In the single-location task, all observers are looking at the exact location where the hard-to-detect object might be, and so they are all processing the same visual information,” said Juni. “But in the search task, observers’ scan paths take different patterns, and those who happen to gaze directly at the hard-to-find object tend to be highly confident that it is present — because the object is easy to detect when fixated — whereas those who do not gaze directly at the object tend to respond that it is absent, because the object is very difficult to detect in the visual periphery.”

The greater benefits for the search task, Juni explained, are “dependent on tapping into the very high confidence of those in the group who happened to gaze directly at the object.”

In this search scenario, the researchers say, the “wisdom of crowds” is more nuanced than simple majority voting, which is highly effective in single-location tasks when the relevant information is present and accessible to all.

“As long as there’s some element of individual knowledge that we could tap into, you could do really well, and maybe even close to optimal in terms of the group performance,” Juni said. He cited Condorcet’s jury theorem, which indicates that if voters are more likely than not to give correct answers individually, the chance of collectively arriving at the correct answer via majority voting increases with group size.

But in scenarios where individuals are more likely than not to give incorrect answers (due to, say, erroneous or missing information), increasing group size can be detrimental to majority voting, because even experts who tend to give correct answers will be outvoted along with the rest of the minority.
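
The arithmetic behind both cases can be sketched in a few lines of Python. This is an illustrative calculation, not code from the study: it assumes every voter is independent and has the same chance of being right, and the function name and the example accuracies of 0.6 and 0.4 are invented for the illustration.

```python
from math import comb


def majority_correct_probability(n_voters: int, p_correct: float) -> float:
    """Chance that a strict majority of n independent voters is correct.

    Assumes each voter is right with the same probability p_correct
    (a simplifying assumption, not a detail taken from the study).
    """
    if n_voters % 2 == 0:
        raise ValueError("use an odd group size so that ties cannot occur")
    needed = n_voters // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n_voters, k) * p_correct**k * (1 - p_correct) ** (n_voters - k)
        for k in range(needed, n_voters + 1)
    )


if __name__ == "__main__":
    # Hypothetical individual accuracies: one above chance, one below.
    for p in (0.6, 0.4):
        for n in (1, 5, 25):
            prob = majority_correct_probability(n, p)
            print(f"p={p}, group of {n}: majority correct with prob {prob:.3f}")
```

With an individual accuracy above one half, the printed probability climbs as the group grows; with an accuracy below one half, it falls, which is the detrimental case described above.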

In many real-world scenarios where visual search is employed, the tendency toward error is present, whether it’s lack of time to perform exhaustive searches, lack of resolution in the images, too large a search field or too many images to search through. Think: search-and-rescue of planes downed in the ocean, military surveillance during times of conflict, or physicians dealing with an exponential rise in the number of images per exam.

“In those cases, majority voting would be ineffective, whereas the averaged responses of a group of observers could be very beneficial if those who happen to gaze directly at the searched-for-object express very high confidence that they found it,” Juni said.
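
A toy example shows how the two pooling rules can disagree. The numbers are made up for illustration and are not data from the study. It assumes each observer reports a signed confidence rating from -5 (sure the object is absent) to +5 (sure it is present), and that the group answers "present" when the average rating exceeds zero. The rating scale, the zero criterion and the function names are all assumptions of this sketch.

```python
from statistics import mean


def majority_vote(ratings):
    """Return True ("present") if more observers lean present than absent."""
    yes = sum(1 for r in ratings if r > 0)
    no = sum(1 for r in ratings if r < 0)
    return yes > no


def averaged_confidence(ratings, criterion=0.0):
    """Return True ("present") if the mean signed rating exceeds the criterion.

    The criterion of 0.0 is an assumption chosen for this illustration.
    """
    return mean(ratings) > criterion


if __name__ == "__main__":
    # Hypothetical trial with the object present: one observer happened to
    # gaze at it (+5), while four did not and weakly respond "absent" (-1).
    ratings = [5, -1, -1, -1, -1]
    print("majority vote says present:", majority_vote(ratings))
    print("averaged confidence says present:", averaged_confidence(ratings))
```

In this made-up trial the four low-confidence "absent" responses outvote the single observer who saw the object, while averaging lets that observer's high confidence carry the group decision.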

“The next step is to verify and see if this actually happens with the large data volumes of modern 3D medical image technology,” said Eckstein. The results might raise questions over how medical diagnoses are made in the U.S. and other countries that do not typically perform multiple independent readings to find, say, breast cancer, as is done in much of Europe, Australia and New Zealand. According to the researchers, combining responses from multiple readings with technologies that generate many images per exam could result in considerable gains.

“We think some of this work could potentially cause us to rethink how we should do it in the U.S. for newer 3D imaging technologies,” Juni said.
