
Friday, February 28, 2014

Human Stupidity: Historical: Overconfidence

Despite all the ever-mounting evidence on what is really happening, we are still very confident in our intellectual abilities. And some of that confidence seems justified since, as a species, we have been able to send robots to Mars, among many other astonishing achievements. In some sense, our confidence in our abilities should be correct; at least, that seems to make sense. And yet our common sense, as we have seen, is not something we can really rely on. A question that arises naturally from these facts is how sure we can really be of something when we feel confident about it. And, once again, experimental results show that we are in trouble most of the time when we compare our confidence with the accuracy of our judgments.

In 1965, Oskamp performed a series of experiments to measure whether confidence and accuracy in evaluations were connected as they should be. We would like to believe that, when we are more sure about something, the chance of being right should improve. Oskamp tested a group that included clinical psychologists with several years of experience, psychology graduate students, and advanced undergraduate students. The task they had to perform was to evaluate the personality of Joseph Kidd (the judges had access only to written data about him, and more data was provided at each stage of the experiment) as well as to predict his attitudes and typical actions. At the first stage, just a general demographic description of Kidd was provided and, at each later stage, the judges received a page or two about a period of the patient's life (childhood, high school and college years, and military service and later). After each stage, the judges had to provide their best answer to the same series of 25 questions, as well as evaluate how sure they were that they had chosen the right answer. Each question was presented as a multiple-choice problem with five alternatives to choose from.

What Oskamp observed was that the task was actually a hard one, given the amount of data the judges received: none of the judges ever got to 50% correct answers. More than that, the final average level of accuracy was actually 28%, not much different from random chance (20%); statistically, the difference was not significant. This could just be attributed to the lack of data and the difficulty of the questions, of course. What was really disturbing was that, while the accuracy merely oscillated from Stage 1 to Stage 4 (26%, 23%, 28.4%, and 27.8%), the confidence of the judges showed a clear, steady increase (33.2%, 39.2%, 46%, and 52.8%). That is, the extra data didn't help the judges get the right answers, but it did make them more confident in their quite often wrong evaluations!
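Just to make the calibration gap concrete, here is a small sketch (in Python; the stage-by-stage figures are the ones quoted above) that computes the chance level for a five-alternative question and the difference between confidence and accuracy at each stage:

```python
# Oskamp's stage-by-stage results, as quoted above (percentages).
accuracy   = [26.0, 23.0, 28.4, 27.8]   # fraction of the 25 questions answered correctly
confidence = [33.2, 39.2, 46.0, 52.8]   # judges' own estimates of being right

# With five alternatives per question, guessing at random gives 1/5 = 20%.
chance_level = 100 / 5

for stage, (acc, conf) in enumerate(zip(accuracy, confidence), start=1):
    gap = conf - acc  # positive gap = overconfidence
    print(f"Stage {stage}: accuracy {acc:.1f}% (chance {chance_level:.0f}%), "
          f"confidence {conf:.1f}%, overconfidence {gap:+.1f} points")
```

The gap grows from about 7 points at the first stage to 25 points at the last one, even though accuracy never climbs meaningfully above chance.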


Despite the oscillations, the accuracy percentages observed by Oskamp do not rule out that there was a very small improvement with the extra information. But more recent studies have shown that not even that is always true. Hall et al tested the accuracy of predictions about the results of basketball games by dividing the participants in the study into two groups. Both groups received the same statistical information about the teams playing (win record, halftime score). The second group was also told the names of the teams playing, information that was withheld from the people in the first group. What they observed was that the second group, with the extra information, consistently made worse predictions, typically by choosing better-known teams and disregarding the statistical evidence. That result held even when there were monetary bets on which team would win. And yet, people judged that knowing the names actually helped them make those predictions. Clearly, the new information increased the confidence of the people involved, while decreasing their accuracy.

That extra information can lead to overconfidence was confirmed in other experiments, such as the ones by Tsai et al. They also asked participants to predict the outcome of games, this time American football games, and they presented performance statistics of the teams (not identified by name), one at a time. What they observed was that accuracy did get better with the first pieces of information, basically for the first 6 cues that were provided. At the same time, confidence also increased. However, as more cues were provided, up to 30 values, accuracy did not improve, but confidence did. The authors observed that, if all the information had actually been used in an optimal way, accuracy could have improved together with confidence, basically in a way equivalent to the observed increase in confidence. But the extra information, apparently, was not used to make better predictions. One possible explanation from the authors is that people might not correct their confidence estimates to account for their limited capacity for analysis, becoming more and more certain despite the fact that they were no longer improving. Overconfidence has also been observed in other areas, such as how sure teachers are about their evaluations of their students' potential, or consumer knowledge (Alba and Hutchinson).
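The mechanism Tsai et al point to, confidence tracking the amount of information rather than what the analysis can actually extract from it, can be illustrated with a toy simulation (this is only an illustration of the idea, not the authors' data or method; the cue counts and signal strength below are made up). Only the first few cues carry any signal; a simple model is fit with more and more cues; its own certainty about its predictions tends to keep climbing while its accuracy on unseen games flattens out:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Toy setup: 400 games, 30 cues per game, but only the first 6 cues are informative.
n_games, n_cues, n_informative = 400, 30, 6
X = rng.normal(size=(n_games, n_cues))
weights = np.zeros(n_cues)
weights[:n_informative] = 0.8
p_win = 1 / (1 + np.exp(-(X @ weights)))
y = rng.random(n_games) < p_win

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for k in (2, 6, 10, 20, 30):
    model = LogisticRegression().fit(X_train[:, :k], y_train)
    acc = model.score(X_test[:, :k], y_test)                       # accuracy on unseen games
    conf = model.predict_proba(X_train[:, :k]).max(axis=1).mean()  # model's own average certainty
    print(f"{k:2d} cues: held-out accuracy {acc:.2f}, average certainty {conf:.2f}")
```

The exact numbers depend on the random seed, but the pattern is the point: past the informative cues, the extra columns mostly feed the model's certainty, not its accuracy.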

Overconfidence, however, is not something that is observed every time. Lichtenstein and Fischhoff observed that, as accuracy gets higher and higher, that is, when people actually know the subject and therefore get the answer right more often, the overconfidence starts to diminish. As a matter of fact, once people start getting more than 80% of the questions right, overconfidence is often replaced by underconfidence.

This does not mean, however, that when people report their confidence to be higher than 80%, they are likely to be underconfident. There is a subtle but extremely important difference in the conditionals here. The result above conditions on accuracy: when we look at situations where people get more than 80% of the answers right, there is a tendency toward underconfidence. But, in many situations, we might have an expert providing us with her confidence, and we would like to estimate her accuracy from it. Fischhoff et al investigated what happens in the case where people report high certainty about their evaluations. What they observed was that when people stated they were 99% sure of their answers, they actually answered correctly between 73% and 87% of the time, depending on the experiment. Even when people were so certain that they considered there was just one chance in a million that they would be wrong (0.0001%), they were actually wrong from 4% to 10% of the time.
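To put those numbers side by side, here is a small back-of-the-envelope sketch (using only the figures quoted above) that converts the stated certainty into an implied error rate and compares it with the observed one:

```python
# Stated certainty versus observed error rates, using the figures quoted above.
cases = [
    ("99% sure",          0.99,     (0.13, 0.27)),  # correct 73%-87%, so wrong 13%-27%
    ("one in a million",  1 - 1e-6, (0.04, 0.10)),  # wrong 4%-10% of the time
]

for label, stated_confidence, (err_low, err_high) in cases:
    implied_error = 1 - stated_confidence
    ratio = err_low / implied_error  # how much more often wrong than claimed, at best
    print(f"{label}: claims to be wrong {implied_error:.4%} of the time, "
          f"is actually wrong {err_low:.0%} to {err_high:.0%} of the time "
          f"(at least {ratio:,.0f} times more often than claimed)")
```

Even at the most extreme level of stated certainty, people were wrong tens of thousands of times more often than their own estimate allowed.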

Why we are so bad at estimating how much we know is not clear. Dunning et al, in an article aptly named "Why People Fail to Recognize Their Own Incompetence", proposed that, for the case of incompetence, that is, low accuracy, there might be a double curse: incompetent people might be too incompetent both to get the answer right and to realize that they don't know it. But even if this is the case (their idea does bring the names of a few people to mind), it does not really explain the whole range of observations.

Note: In my last entry, I said it might take me just a few more days to post something new. Maybe I was overconfident in predicting my capacity to get it done. I also didn't account for the possibility of rearranging what I had planned: I decided that the text I was working on will fit better as a later entry, in another part of these writings. At least, when I get there, I will already have 80% of a post ready.

Monday, February 10, 2014

I am still working

Hi,

Just to let you people know that I am still working on the next entry. Between being forced to deal with some human stupidity of the bad kind in real life and the fact that the next entry is requiring far more article-hunting work than I expected, I am late. Still, it should be out in a matter of days.