Besides all the errors we have seen so far, humans seem to have an innate tendency to believe they are in control, even when that is not true, nor even possible. In 1975, Langer and Roth tested whether people felt they were able to predict the outcome of random coin tosses. They rigged the outcomes in such a way that all participants would get the same number of correct guesses. The difference between the three groups was the order in which the correct guesses happened. For the first group, the correct predictions were concentrated in the first attempts; the second group experienced a stable rate of success; and the third group got more wrong answers at first and more correct ones at the end. Consistent with the primacy effect, those who had obtained their correct answers sooner considered themselves more skilled than those who had observed more correct answers later. That was despite the fact that the percentage of hits was the same for all involved. Their confidence in their skills was not related to how successful they had been in the overall task, but only to how well they had performed in the beginning.
However, not everyone who participated in the experiment was asked to make predictions. A number of people were just instructed to observe those who were making the predictions and evaluate their skill at the task. The observers rated the overall skill of the guessers lower than the guessers rated themselves. Being the one in control had an effect on how people reported the skill.
Interestingly, despite it being clear that the subjects had no influence on the outcome, those who felt they were more skilled at predicting the outcomes would, after a while, start attributing their correct answers to their ability, while the wrong ones were blamed on chance. (Anyone who has taught courses and graded the exams of their students can probably observe this effect. Many students seem to honestly, and absurdly, believe that any success in the classroom is due to their merit, while failures are to be blamed on the teacher, or the study conditions, anything but themselves.) And their false belief in their merit extended to how they evaluated different aspects of the problem. Both guessers and observers assumed that, if the guesser had the opportunity to train for the task, he would improve his performance. And they seriously felt that the existence of distractions would cause them to obtain a smaller number of correct results.
This illusion that we have some degree of control even when the task is completely random has been observed in several different tasks since these results. Pronin et al. observed how this illusion of control is related to magical thinking, by making people actually believe that they had harmed others through a voodoo hex, especially when they had harboured evil thoughts about the victim, or that they could influence the outcome of a basketball game by positive visualizations of success (it should be unnecessary to say both effects are completely false but, unfortunately, this comment is very much needed). And, while failure at predicting sport events might be harmless for most people (except, of course, for bettors), the same illusion can have serious consequences in other areas. Some of those consequences might even be positive, since being in control can be related to feeling better. But this can also lead to bad decisions in all areas of human enterprise. For example, Odean discusses the consequences, for the behaviour of prices, of the fact that traders are overconfident about their abilities and about the control they actually have over the outcome of their investments. And I have often observed (and I am sure most readers have also) how people believe that their actions, sometimes just their intent, would actually influence outcomes that are mostly random.
But do not despair yet, dear reader. While the number of studies that show our mistakes is staggering, I believe I have been able to convince many of you that we cannot trust our own intuitions (and such a belief is almost certainly my own illusion of control: the idea that I have more influence on how you think than I actually have). As such, we will now proceed to more optimistic waters, first taking a cursory look at the explanations of why it is possible that we are so incompetent (we are not really incompetent, we are just far less competent than we would like to believe). And later, we will ask the important question of how we can actually do better and try to avoid the many pitfalls our brains have in store for us.
Friday, March 28, 2014
Tuesday, March 11, 2014
Human Stupidity: Historical: Calibration
The question of how well a person knows her real chance of getting an answer right is called calibration. In general terms, a person who is 95% sure she got the correct answer is expected to be correct 95% of the time. If she only answers 70% of those questions correctly, we can say that this person is not well calibrated about what she knows. However, for one given question and one specific person, the answer will either be right or wrong. That means that some caution must be taken when measuring actual accuracy. Different studies can provide different answers depending on how the term is defined, so some discrepancy in the results and in the explanations given by each author is to be expected. And, while that is indeed the case, the evidence that we tend to be poorly calibrated is very strong.
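As a minimal sketch of how calibration could be measured, assuming each answer comes with a stated confidence and a right-or-wrong outcome, one can group the answers by confidence and compare it with the observed hit rate. The function name and the data below are made up for illustration (they mirror the 95% versus 70% example above) and do not come from any of the studies.

```python
from collections import defaultdict

def calibration_table(confidences, outcomes):
    """Map each stated confidence level to the observed fraction of correct answers."""
    buckets = defaultdict(list)
    for conf, correct in zip(confidences, outcomes):
        buckets[conf].append(1 if correct else 0)
    return {conf: sum(hits) / len(hits) for conf, hits in sorted(buckets.items())}

# Hypothetical data: 20 answers given with 95% confidence, only 14 of them right.
confidences = [0.95] * 20
outcomes = [True] * 14 + [False] * 6

table = calibration_table(confidences, outcomes)
for conf, hit_rate in table.items():
    # A positive gap means overconfidence, a negative gap underconfidence.
    print(f"stated {conf:.0%}, observed {hit_rate:.0%}, gap {conf - hit_rate:+.0%}")
```

Note that the comparison only makes sense over a group of answers: a single answer is simply right or wrong, which is why the way the answers are grouped can change the measured calibration.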
An important question is, therefore, when we should expect to observe problems in calibration and when not. Griffin and Tversky (see also Chapter 13 in Heuristics and Biases: The Psychology of Intuitive Judgment) observed in 1992 that people seem to account wrongly for two different kinds of statistical information, which they call the weight and the strength of the evidence (personally, I find this terminology confusing, as the statistical meaning of the terms is not very clear from the names, but it is a standard way of speaking in the area). Basically, the strength of the evidence is the proportion that was observed, and the weight is the size of the sample. That is, if you toss a possibly biased coin 20 times and obtain 16 heads, the strength of the observation is the fact that you observed heads 80% of the time, while the weight of the evidence is the fact that this was observed over 20 tosses. Both pieces of information must be used in any attempt to decide whether the coin is actually biased towards heads, as well as how likely it is that we would get heads if we tossed the coin once more. However, what Griffin and Tversky observed was that, while people basically accounted correctly for the observed proportion (strength), they did not take the weight of the data (sample size) into account correctly.
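To see why the weight matters, here is a small worked sketch that compares two samples with the same strength (80% heads) but different weights: 4 heads in 5 tosses and 16 heads in 20 tosses. It computes how likely a fair coin would be to produce at least that many heads. The choice of a fair coin as the reference and the function name are my own assumptions for illustration, not part of Griffin and Tversky's experiments.

```python
from math import comb

def tail_probability(heads, tosses, p=0.5):
    """Probability of at least `heads` heads in `tosses` tosses when P(heads) = p."""
    return sum(
        comb(tosses, k) * p**k * (1 - p) ** (tosses - k)
        for k in range(heads, tosses + 1)
    )

# Same strength (80% heads), different weight (sample size).
small = tail_probability(4, 5)    # 4 heads out of 5 tosses
large = tail_probability(16, 20)  # 16 heads out of 20 tosses

print(f"fair coin gives >= 4 heads in 5 tosses with probability  {small:.4f}")
print(f"fair coin gives >= 16 heads in 20 tosses with probability {large:.4f}")
```

A fair coin produces the smaller sample almost one time in five, but the larger one well under one time in a hundred. Someone who looks only at the 80% would treat both samples as equally convincing evidence of bias.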
Interestingly enough, they comment, among other things, on the problem of the "illusion of validity", a term coined by Kahneman and Tversky in 1973 (also in Chapter 4 in Judgment under Uncertainty: Heuristics and Biases). This effect can be described as the fact that different questions produce different measurements of calibration. More exactly, what was observed was that people have a tendency to be more overconfident about individual cases than about their overall accuracy. For example, Cooper et al., while interviewing almost 3,000 entrepreneurs, observed that they were wildly overconfident about the chance of success of their own business. On the other hand, when asked about the chance of success of a generic enterprise in their area, that overconfidence was much smaller and they proved to be just moderately overconfident. This is something that would actually be expected, due to an observation bias effect. Even if there were no average overconfidence among people about the success of businesses, some amount of random error would be unavoidable. That is, any entrepreneur who was well calibrated, on average, could show some overconfidence in some business areas and underconfidence in others. Of course, entrepreneurs who evaluated an area as more likely to succeed would be expected to invest more in that area, so that the businesses they actually own are, more often than not, the ones they overestimated. And this overconfidence was not associated with those who were better prepared or actually had a better chance to succeed than their competition. What Cooper et al. observed was that the poorly prepared entrepreneurs showed the same optimism as the better prepared ones (perhaps another example of the curse of the incompetent).
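The observation bias argument above can be illustrated with a toy simulation. This is a rough sketch under assumptions of my own, not a model of the Cooper et al. data: every business area has the same true chance of success, each entrepreneur's estimates are unbiased but noisy, and each entrepreneur opens a business in the area they rate highest. All names and parameter values are hypothetical.

```python
import random

def simulate(n_entrepreneurs=10_000, n_areas=5, true_chance=0.5, noise=0.1, seed=0):
    """Return (mean estimate over all areas, mean estimate for the chosen area)."""
    rng = random.Random(seed)
    all_estimates = []
    chosen_estimates = []
    for _ in range(n_entrepreneurs):
        # Unbiased estimates: the true chance plus zero-mean Gaussian error.
        estimates = [true_chance + rng.gauss(0, noise) for _ in range(n_areas)]
        all_estimates.extend(estimates)
        # The entrepreneur invests in the area that looks best to them.
        chosen_estimates.append(max(estimates))
    mean_all = sum(all_estimates) / len(all_estimates)
    mean_chosen = sum(chosen_estimates) / len(chosen_estimates)
    return mean_all, mean_chosen

mean_all, mean_chosen = simulate()
print(f"true chance of success:          {0.5:.3f}")
print(f"average estimate, all areas:     {mean_all:.3f}")
print(f"average estimate, own business:  {mean_chosen:.3f}")
```

In this sketch the estimates are well calibrated on average across all areas, yet the estimate for the business each entrepreneur actually picked is clearly above the true chance, simply because the area was chosen for looking good.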
But not only do calibration problems depend on what people are trying to answer, they are also not observed in every situation. Actually, Lichtenstein and Fischhoff observed that people can be trained. In an experiment where people had to decide whether a phrase had been handwritten by an American or a European, they observed that, after simply receiving some basic initial training, their subjects not only got more questions right, but also showed better calibration in their evaluations. In his book The Psychology of Judgment and Decision Making (McGraw-Hill Series in Social Psychology), Plous reviews and compares the results of studies of calibration in two different areas, one on predicting meteorological events, by Murphy and Winkler, and the other on physicians estimating the probability that a given patient has pneumonia, by Christensen-Szalanski and Bushyhead. And, contrary to popular belief, the meteorologists proved to be quite well calibrated, while the physicians showed an absurd amount of overconfidence. An important part of what seems to be happening is that meteorologists get much more feedback about the accuracy of their predictions than physicians do. As a matter of fact, Lichtenstein and Fischhoff, in another study, observed that, after some training where they provided feedback on how accurate people were in their answers, almost all their subjects improved their calibration. The exception was, actually, the few individuals who were already well calibrated before the training. This makes clear how important it is to get feedback on how accurate one's predictions were.
Thursday, March 6, 2014
TED Talks on Irrationality
I just found a very interesting series of videos from the TED Talks people. It is a playlist entitled "Our brains: predictably irrational".
I haven't watched any of them yet, but they certainly are on my to-do list. I hope we all enjoy them.