
Thursday, May 22, 2014

Human Stupidity: Historical: Heuristics


Humans are considered (by humans) to be the most intelligent species known (to mankind). And, when we observe how much we have been able to accomplish as a species and compare that to every other species on Earth, that statement makes a lot of sense. One could debate whether some big-brained animals might be individually as intelligent as an individual human (the question now seems far less absurd than decades ago, as we learn more and more about the abilities of some animals and our own shortcomings). But there is no denying that what we have achieved as a species is without precedent. We have vehicles exploring the deep ocean and other planets, while others are leaving the Solar System; we can communicate almost instantaneously around the world and we understand the world around us in ways that a few generations ago wouldn't even have been dreamed of. We have been changing the appearance of our planet (for good and also for evil) on a scale probably not matched by any organism since the appearance of the first plants that could photosynthesize (the oxygen they started producing, while vital for us, was certainly a pollutant for most organisms that lived then and must have caused widespread death among the species that didn't adapt to the new environment, much like the widespread death we are causing. Polluting and killing are not exclusively ours at all). And, for the first time since life started on Earth, we have been able to subvert most of the survival rules that apply to other species, changing how evolution applies to us by making it possible for even some of the weakest among our species to survive and reach an old age, safe from a dangerous and often fatal natural environment.


Those are very impressive accomplishments and they do give us the sense that, while we are far from perfect, or even far from good enough, we have been able to do something right. Culturally, we even see ourselves as something apart from the natural world, as if we were somehow superior to nature and not just a very successful species of big apes. While the distinction between natural and artificial makes no sense (one might be tempted to say it is completely artificial), it does reflect the fact that we have, at the local scale, subverted the relation we have with the world around us. And, while there are many reasons to worry about the future, our present is actually almost unbelievably better than our perception of it. Violent deaths have never been so rare and humans have never lived such long lives, all due to the advances in science and in our cultural and political institutions, as shown recently by Pinker. The data that show this to be a fact are not so hard to find; we only feel we are surrounded by violence and disasters because the news focuses on those events. And, since information circulates much better now, we can learn about almost any disaster on the planet. With billions alive, the total number of crimes and disasters is indeed large. Not only can we learn about natural disasters happening on the other side of the globe, it is now very likely that there are people living there who will be affected by them. But what really matters to any of us as individuals is the proportion of people who die or who suffer, not the total number in a larger population and, much less, the total number of cases we can find on the Internet. What matters is the probability that a given tragedy will affect one person. And these probabilities have been steadily going down (with the important exception of the ills associated with old age, which in the old days were quite rare, since basically nobody reached old age), to the point that, even without ever seeing the data, I would personally bet that the life expectancy of an Egyptian pharaoh was much smaller than that of a poor and discriminated-against person today, such as a poor black woman living in a crime-infested slum in Brazil. That this statement can be surprising to so many is just a consequence of the many problems with our reasoning.


So, what is actually happening? Are we completely stupid incompetents or are we incredible geniuses who mastered the secrets of the Universe and changed the world into a utopia? The answer is clearly that we are neither, even though there is some truth to the notion that we are very dumb and also to the notion that we are actually living in a Golden Age of mankind.

One first partial answer to the question of how we (or any other living being) can actually achieve so much while being quite dumb was suggested by Simon, in 1956. In his paper, Simon investigated whether it was actually necessary for a living organism to have a well-defined utility function, as proposed by the EUT, as well as the intellectual capacity to analyze its environment and make the decisions that maximize that utility. Organisms need to find ways to deal with a multitude of different tasks, from feeding to defending themselves and reproducing, if the species is to survive. Actually obtaining and interpreting all available data from observing their surroundings and choosing the way to obtain the best possible outcome, when all those tasks are considered, is basically an impossible problem. It would require a mental capacity far beyond the one we possess, and this basically infinite capacity would also need to work very fast. You really don't want to sit and think about what the best choice is when a lion is closing in on you. Since finding the perfect answer is not achievable, organisms had to settle for less.


Assume there are a number of clues in the environment that you could use in a simple way to make some decision. If this decision gives you a better chance to survive than not using those clues, any organism that uses them will have an advantage when compared to organisms that don't (as long as processing this information does not consume so much energy that the benefit is smaller than the cost, of course). So, an organism does not need to find the optimum or, in economic terms, to optimize its utility. It can actually function competently by finding efficient, but not necessarily error-proof, ways to interpret the information captured by its senses. Simon described this non-optimal behavior as satisficing (evolution does not require any species to be the best in order to survive. Being better than the others would be sufficient, but even being better might not be a good strategy. The real concept is better adapted. Not stronger, or faster, or smarter; sometimes, being weaker can actually mean better adapted. In an environment with scarce resources, being too big and strong might require extra food that is not available. In this case, the weaker organisms, who are able to survive with less, are the best adapted to that environment. This applies to strength, but also to speed, to mental prowess, or any other characteristic.).

That is, if simple rules of thumb make you more likely to survive, it makes sense to use them. For example, if you are looking for the cause of a phenomenon, it makes sense to look for things that happen together with it. After all, if something is the cause, you do expect the two to be related. The fact that many variables can be associated with no causal connection, however, means you will often believe that things are related when they are not.


Suppose you belong to a family of farmers without any of our modern knowledge. You try to plant your seeds and sometimes things go well and the climate seems to be working in your favor. At other times, it gets cold too soon, or there is not enough water for your plants to grow. After a long time observing, your grandfather noticed that if he planted the seeds whenever a specific bright star appeared low in the sky just as the Sun went down, the climate would be right for the plants to grow. Your parents confirmed it, as did your own experience. So, you conclude that this star commands the success of your farming. While this conclusion is wrong, there is no cause there, the observed movement of the stars is indeed associated with the calendar and the seasons. And your decision will indeed be better. If you extend the argument to the belief that the same star will influence your chance of success in war, you will be very wrong. But, without better information, there is no way you can actually determine the better day to go to war. Going when you believe the stars support you is a costless mistake, from an evolutionary point of view, since it neither improves nor decreases your chance of success.


Mistaking association for cause is indeed an incredibly common mistake. My own personal experience with association and causation is actually quite worrisome. I am used to telling my students that their exams are very likely to include a question where variables will be associated and I will ask about causes. And I make it abundantly clear, with examples and theory, that from observational studies (I will define these later in this text) one cannot conclude that there is cause and effect. And yet, a large percentage of these students make this very same mistake during the exams (of course, this might be related to the fact that I tell my students that, if they do not show up for class but succeed at the exam, I will give them the minimum required attendance, so it is possible that the students who make that mistake were not at those three or four classes when I tell them one of the exam questions. But my best guess is that it is not just that). Outside of exams, this can be a low-cost mistake, so it is a reasonable rule of thumb, despite the fact that it is logically wrong.
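To get a feel for how easily association shows up with no cause at all, here is a minimal Python sketch (the data is purely random and made up for illustration): it generates a set of variables that are unrelated by construction and then goes looking for "relationships" between them, and it will usually find several.

```python
import numpy as np

rng = np.random.default_rng(0)

# 40 observations of 20 variables, all independent by construction:
# nothing here causes, or is even related to, anything else.
n_obs, n_vars = 40, 20
data = rng.normal(size=(n_obs, n_vars))

# Scan every pair of variables for an apparently "strong" association.
spurious = []
for i in range(n_vars):
    for j in range(i + 1, n_vars):
        r = np.corrcoef(data[:, i], data[:, j])[0, 1]
        if abs(r) > 0.3:  # looks like a real relationship to the naked eye
            spurious.append((i, j, round(r, 2)))

print(f"Found {len(spurious)} 'associations' among variables that are,")
print("by construction, completely unrelated:")
print(spurious)
```

With 190 pairs to check, a handful of them will look associated by chance alone, which is exactly the trap: the more clues we scan, the more false "causes" we will find.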

Friday, March 28, 2014

Human Stupidity: Historical: Control Issues

Besides all the errors we have seen so far, humans seem to have an innate ability to believe they are in control, even when that is not true, or even possible. In 1975, Langer and Roth tested whether people felt they would be able to predict the outcome of random coin tosses. They rigged the outcomes in such a way that all participants would get the same number of correct guesses. The main difference was the order of the correct outcomes, with three groups: for some of the subjects, the correct predictions happened more often at their first attempts; the second group experienced a stable rate of success; and the third group got more wrong answers at first and more correct ones at the end. Consistent with the primacy effect, those who had obtained their correct answers sooner considered themselves more skilled than those who had observed more correct answers later. That was despite the fact that the percentage of hits was the same for everyone involved. The confidence in their skills was not related to how successful they had been in the overall task, but just to how well they had performed in the beginning.

However, not everyone who participated in the experiment was asked to make predictions. A number of people were just instructed to observe the ones making the predictions and evaluate their skill at the task. Those who just observed rated the overall skill of the guessers as worse than the guessers rated themselves. Being in control had an effect on how people reported the skill.

Interestingly, despite it being clear that the subjects had no influence on the outcome, those who felt they were more skilled at predicting the outcomes would, after a while, start attributing their correct answers to their ability, while the wrong ones were blamed on chance (anyone who has taught courses and graded the exams of their students can probably observe this effect. Many students seem to honestly, and absurdly, believe that any success in the classroom is due to their merit, while failures are to be blamed on the teacher, or study conditions, anything but themselves). And their false belief in their merit extended to how they evaluated different aspects of the problem. Both guessers and observers assumed that, if the guesser had the opportunity to train for that task, he would improve his performance. And they seriously felt that the existence of distractions would cause them to obtain a smaller number of correct results.

This illusion that we have some degree of control, even when the task is completely random, has been observed in several different tasks since these results. Pronin et al observed how this illusion of control is related to magical thinking, by making people actually believe that they had harmed others through a voodoo hex, especially when they had harboured evil thoughts about the victim, or that they could influence the outcome of a basketball game by positive visualizations of its success (it should be unnecessary to say that both effects are completely false, but unfortunately this comment is very much needed). And, while failure at predicting sport events might be harmless for most people (except, of course, for bettors), the same illusion can have serious consequences in other areas. Some of those consequences might even be positive, since feeling in control can be related to feeling better. But this can also lead to bad decisions in all areas of human enterprise. For example, Odean discusses the consequences for the behaviour of prices of the fact that traders are overconfident about their abilities and about the control they actually have over the outcome of their investments. And I have often observed (and I am sure most readers have also) how people believe that their actions, sometimes just their intent, would actually influence outcomes that are mostly random.

But do not despair yet, dear reader. While the number of studies that show our mistakes is staggering, I believe I have been able to convince many of you that we cannot trust our own intuitions (and such a belief is almost certainly my own illusion of control, that I have more influence on how you think than I actually have). As such, we will proceed now to more optimistic waters, first taking a cursory view of the explanations of why it is possible that we are so incompetent (we are not really incompetent, we are just far less competent than we would like to believe). And later ahead, we will ask the important question of how we can actually do better and try to avoid the many pitfalls our brains have in store for us.

Tuesday, March 11, 2014

Human Stupidity: Historical: Calibration

The question of how well a person knows her real chance of getting an answer right is called calibration. In general terms, a person who is 95% sure he got the correct answer is expected to be correct 95% of the time. If he only answers correctly 70% of those questions, we can say that this person is not well calibrated about how well he knows what he knows. However, for one given question and one specific person, the answer will either be right or wrong. That means that some caution must be taken when measuring actual accuracy. Different studies can actually provide different answers depending on how the term is defined. This means that some discrepancy in the results and in the explanations given by each author is to be expected. And, while that is indeed the case, the amount of evidence on the existence of problems with how well calibrated we tend to be is very strong.
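As a concrete illustration of what measuring calibration means, here is a minimal Python sketch (the numbers are hypothetical, chosen only to show the overconfident pattern): it groups answers by the confidence the person stated and compares that with the fraction of answers that were actually correct.

```python
from collections import defaultdict

def calibration_table(records):
    """records: list of (stated_confidence, was_correct) pairs.
    For each stated confidence level, returns the fraction of answers
    that were actually correct and how many answers were given."""
    groups = defaultdict(list)
    for confidence, correct in records:
        groups[confidence].append(correct)
    return {conf: (sum(hits) / len(hits), len(hits))
            for conf, hits in sorted(groups.items())}

# A well calibrated person saying 0.95 should be right about 95% of the time;
# these made-up answers show the overconfident pattern instead.
records = ([(0.95, True)] * 7 + [(0.95, False)] * 3 +
           [(0.60, True)] * 5 + [(0.60, False)] * 5)
print(calibration_table(records))
# {0.6: (0.5, 10), 0.95: (0.7, 10)} -> stated confidence exceeds the hit rate
```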


An important question is, therefore, when we should expect to observe problems in calibration and when not. Griffin and Tversky (see also Chapter 13 in Heuristics and Biases: The Psychology of Intuitive Judgment) observed in 1992 that people seem to account wrongly for two different pieces of statistical information, which they call the strength and the weight of the evidence (personally, I find this terminology confusing, as the statistical meaning of the terms is not very clear from the names. But it is a standard way of speaking in the area). Basically, the strength of the evidence would be the proportion that was observed and the weight, the size of the sample. That is, if you toss a biased coin 20 times and obtain 16 heads, the strength of the observation is the fact that you observed heads 80% of the time, while the weight of the evidence is the fact that this was observed over 20 tosses. Both pieces of information must be used in any attempt to predict whether the coin is actually biased towards heads, as well as how likely it is that we would get heads if we tossed the coin once more. However, what Griffin and Tversky observed was that, while basically accounting correctly for the observed proportion (strength), people did not take the weight of the data (sample size) into account correctly.
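The coin example can be made concrete with a few lines of Python (just a sketch): the strength of the evidence is the same 80% in every case below, but the probability that pure luck would produce it from a fair coin changes enormously with the weight, that is, the number of tosses.

```python
from math import comb

def prob_at_least(k, n, p=0.5):
    """Probability of getting k or more heads in n tosses of a fair coin:
    how easily pure chance could produce evidence this strong."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Same strength (80% heads), very different weight (number of tosses):
for n, k in [(5, 4), (20, 16), (100, 80)]:
    print(f"{k}/{n} heads: probability under a fair coin = {prob_at_least(k, n):.6f}")

# 4/5 heads is unremarkable (about 0.19), 16/20 is already quite unlikely
# (about 0.006), and 80/100 is essentially impossible for a fair coin.
```

A judgment that reacts only to the 80% and not to the 5 versus 100 is exactly the miscalibration Griffin and Tversky describe.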


Interestingly enough, they comment, among other things, on the problem of "illusion of validity", a term coined by Kahneman and Tversky in 1973 (also in Chapter 4 of Judgment under Uncertainty: Heuristics and Biases). This effect can be described as the fact that different questions produce different measurements of calibration. More exactly, what was observed was that people have a tendency to be more overconfident about individual cases than about their overall accuracy. For example, Cooper et al, while interviewing almost 3,000 entrepreneurs, observed that they were wildly overconfident about the chance of success of their own business. On the other hand, when asked about the chance of success of a generic enterprise in their area, that overconfidence was much smaller and they proved to be just moderately overconfident. This is something that would actually be expected, due to an observation bias effect. Even if there were no average overconfidence among people about the success of businesses, some amount of random error would be unavoidable. That is, any entrepreneur who was well calibrated, on average, could show some overconfidence in some business areas and underconfidence in others. Of course, entrepreneurs who evaluated an area as more likely to succeed would be expected to invest more in that area. And this overconfidence was not associated with those who were better prepared or actually had a better chance to succeed than their competition. What they observed was that the poorly prepared entrepreneurs showed the same optimism as the better prepared ones (perhaps another example of the curse of the incompetent).


But calibration problems are not only dependent on what people are trying to answer; they are also not observed in every situation. Actually, Lichtenstein and Fischhoff observed that people can be trained. In an experiment where people had to distinguish whether a phrase had been handwritten by an American or a European, they observed that, simply by providing a basic initial training, their subjects not only got more questions correct, but also showed better calibration about their evaluations. In his book The Psychology of Judgment and Decision Making (McGraw-Hill Series in Social Psychology), Plous reviews and compares the results of studies of calibration in two different areas: one on predicting meteorological events, by Murphy and Winkler, and the other about physicians estimating the probability that a given patient has pneumonia, by Christensen-Szalanski and Bushyhead. And, contrary to popular culture assessments, the meteorologists proved to be quite well calibrated, while the physicians showed an absurd amount of overconfidence. An important part of what seems to be happening is that meteorologists get much more feedback about the accuracy of their predictions than physicians do. As a matter of fact, Lichtenstein and Fischhoff, in another study, observed that, after some training where they provided feedback on how accurate people were in their answers, almost all their subjects improved their calibration. The exception was, actually, the few individuals who were already well calibrated before the training. This makes clear the incredible importance of getting feedback on how precise one's predictions were.

Thursday, March 6, 2014

TED Talks on Irrationality

I just found a very interesting series of videos from the TED Talks people. It is a playlist entitled "Our brains: predictably irrational".

I haven't watched any of them yet, but they are certainly on my to-do list. I hope we all enjoy them.

Friday, February 28, 2014

Human Stupidity: Historical: Overconfidence

Despite all the ever-mounting evidence on what is really happening, we are still very confident in our intellectual abilities. And some of that confidence seems justified since, as a species, we have been able to send robots to Mars, among many other astonishing achievements. In some sense, our confidence in our abilities should be correct; at least, that seems to make sense. And yet, our common sense, as we have seen, is not something we can really rely on. A question that arises naturally from these facts is how sure we can really be of something when we feel confident about it. And, again, experimental results show we are once more in trouble, most of the time, when we compare our confidence with the accuracy of our judgments.

In 1965, Oskamp performed a series of experiments trying to measure whether confidence and accuracy in evaluations were connected as they should be. We would like to believe that, when we are more sure about something, the chance of being right should improve. Oskamp tested a group that included clinical psychologists with several years of experience, psychology graduate students, and advanced undergraduate students. The task they had to perform was to evaluate the personality of Joseph Kidd (the judges had access only to written data about him, and more data was provided at each stage of the experiment), as well as to predict his attitudes and typical actions. At the first stage, just a general demographic description of Kidd was provided and, at each subsequent stage, the judges received a page or two about a period of the patient's life (childhood; high school and college years; and military service and later). After each stage, the judges had to provide their best answer to the same series of 25 questions, as well as to evaluate how sure they were that they had chosen the right answer. Each question was presented as a multiple choice problem with five alternatives to choose from.

What Oskamp observed was that the task was actually a hard one, given the amount of data the judges received, with none of the judges ever reaching 50% correct answers. More than that, the final average level of accuracy was actually 28%, not much different from random chance (20%); statistically, the difference was not significant. This could just be attributed to the lack of data and the difficulty of the questions, of course. What was really disturbing was that, while the accuracy seemed just to oscillate from Stage 1 to Stage 4 (26%, 23%, 28.4%, and 27.8%), the confidence of the judges showed a clear, steady increase (33.2%, 39.2%, 46%, and 52.8%). That is, the extra data didn't help the judges get more answers right, but it did make them more confident in their quite often wrong evaluations!


While the accuracy percentages observed by Oskamp do not show that there was any real increase with the extra information, given the oscillations it is at least possible that there might have been a very small improvement. But more recent studies have shown that not even that is always true. By asking people to predict the results of basketball games, Hall et al tested the accuracy of those predictions by dividing the participants in the study into two groups. Both groups received the same statistical information about the teams playing (win record, halftime score). The second group was also told the names of the teams playing, information that was withheld from the people in the first group. What they observed was that the second group, with the extra information, consistently made worse predictions, typically by choosing better known teams and disregarding the statistical evidence. That result was repeated even when there were monetary bets on which team would win. And yet, people judged that knowing the names actually helped them make those predictions. Clearly, the new information increased the confidence, while decreasing the accuracy, of the people involved.

That extra information can lead to overconfidence was confirmed in other experiments, such as the ones by Tsai et al. They also asked participants to predict the outcome of games, this time American football games. And they presented performance statistics of the teams (not identified by names), one at a time. What they observed was that accuracy did get better with the first pieces of information, basically for the first 6 cues that were provided. At the same time, confidence also increased. However, as more cues were provided, up to 30 values, accuracy did not improve, but confidence did. The authors observed that, if all the information had actually been used in an optimal way, the accuracy could have improved together with the confidence, basically in a way that was equivalent to the observed increase in confidence. But the extra information, apparently, was not used to make a better prediction. One possible explanation from the authors was that people might not correct their confidence estimates to account for their limited capacity of analysis, becoming more and more certain despite the fact that they were no longer improving. Overconfidence has also been observed in other areas, such as how sure teachers are about their evaluations of their students' potential, or consumer knowledge, as discussed by Alba and Hutchinson.

Overconfidence, however, is not something that is observed every time. Lichtenstein and Fischhoff observed that, as accuracy gets higher, that is, when people actually know the subject and therefore get the answer right more often, the overconfidence starts to diminish. And, as a matter of fact, as people start getting more than 80% of the questions right, overconfidence is often replaced by underconfidence.

This does not mean, however, that when people report their confidence to be higher than 80%, they are likely to be underconfident. There is a subtle, but extremely important, difference in the conditionals here. What that result describes is the reverse condition: when we observe situations where people get more than 80% of the answers correct, there is a tendency towards underconfidence. But, in many situations, we might have an expert providing us with her confidence and we would like to have an estimate of its accuracy. Fischhoff et al investigated what happens in the case where people report high certainty about their evaluations. What they observed was that when people stated they were 99% sure of their answers, they actually answered correctly between 73% and 87% of the time, depending on the experiment. Even when people were so certain that they considered there was just one chance in a million that they would be wrong (0.0001%), they were actually wrong from 4% to 10% of the time.
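The difference between the two conditionals is easier to see with a small made-up example (the counts below are hypothetical, chosen only to illustrate the direction of the conditioning, not taken from any of the studies):

```python
# 100 hypothetical topics, cross-classified by how a person actually performs
# and how confident she says she is.
high_acc_high_conf = 30   # accurate (>80% right) and highly confident
high_acc_low_conf  = 20   # accurate but hedging -> underconfident
low_acc_high_conf  = 35   # highly confident but not that accurate -> overconfident
low_acc_low_conf   = 15

# "Given that people are actually accurate, how often are they underconfident?"
p_under_given_accurate = high_acc_low_conf / (high_acc_high_conf + high_acc_low_conf)

# "Given that people report high confidence, how often are they actually accurate?"
p_accurate_given_confident = high_acc_high_conf / (high_acc_high_conf + low_acc_high_conf)

print(p_under_given_accurate)       # 0.40 -> plenty of underconfidence among the accurate
print(p_accurate_given_confident)   # ~0.46 -> yet high confidence is right less than half the time
```

Both statements can be true at once because they condition on different things, which is why observing underconfidence among the highly accurate tells us little about how much to trust someone who merely sounds very sure.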

Why we are so bad at estimating how much we know is not clear. Dunning et al., in an article aptly named "Why People Fail to Recognize Their Own Incompetence", proposed that, in the case of incompetence, that is, low accuracy, there might be a double curse: incompetent people might be too incompetent both to know the answer and to know that they don't know it. But, even if this is the case (their idea does bring the names of a few people to mind), it does not really explain the whole range of observations.







Note: In my last entry, I said it might take me just a few more days to post something new. Maybe I was overconfident in predicting my capacity to get it done. And also, I didn't include the possibility of a rearrangement of what I had planned. The text I was working on, I decided, will fit better as a later entry, in another part of these writings. At least, when I get there, I will already have 80% of a post ready.

Monday, February 10, 2014

I am still working

Hi,

Just to let you people know that I am still working on the next entry. Between being forced to deal with some human stupidity of the bad kind in real life and the fact that the next entry is requiring far more article hunting than I expected, I am late. Still, it should be out in a matter of days.

Wednesday, January 22, 2014

Human Stupidity: Historical: Opinions II


When making decisions in the real world, the situation can easily become far more biased than the natural tendencies shown in those artificial studies. Not only do we tend to keep our initial opinions much longer than we should, we also directly choose the sources of information we will use. And that almost always means looking for the opinions of those we already agree with, while disregarding people who oppose our own views. Of course, this will simply make us more sure of what we already thought, even when that should not be the case. In doing so, we only learn the reasons why our opinion might be right, but we rarely come to know the reasons why it might actually be wrong. Test yourself: can you make a convincing argument for some political or religious idea you oppose? You don't have to believe the argument is enough to change your mind, but it should be a solid argument ("It is the mark of an educated mind to be able to entertain a thought without accepting it", attributed to Aristotle).

Of course, anyone would like to think that their beliefs are reasonable, rational, and well justified. After all, if they weren't, we wouldn't hold them, right? But the evidence, unfortunately, is not on our side. In a very interesting example, Jervis observed an effect he called irrational consistency (Baron uses the term belief overkill). This is the fact that, when people hold a specific belief, for example in a policy, they usually hold many independent ideas, all of which happen to support that policy. And those who oppose the policy tend to defend the opposite set of ideas. However, if those ideas are independent, any rational being could defend some and oppose others, and a consideration of the total effect would lead to the final point of view on the policy. That people are so consistent is a clear sign that reason is not playing the role it should in this problem.

Jervis mentions as an example the case of people who supported or opposed a ban on nuclear tests. Among the issues behind a decision to support or oppose it, he presents three: whether the tests would cause serious medical danger; whether the tests would lead to major weapon improvements; and whether they would be a source of international tension. It is important to notice that it is completely reasonable to believe that the tests would not cause serious medical danger but would cause international tension. These evaluations are independent, and any of the possible combinations of beliefs makes just as much sense as the others. That means that, if people were reasoning in a competent and independent way, no correlation between those beliefs should be observed. And yet, those who were in favor of the ban held all the beliefs that the tests would cause health problems, would lead to more dangerous weapons, and would increase international tension. And, as should be obvious by now, those who opposed the ban disagreed on all three subjects with those who were in favor. Apparently, people felt somehow compelled to have a consistent set of beliefs, even when there was no reason at all for that consistency.
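The "too consistent" point can be illustrated with a quick simulation (purely hypothetical agents, just to show what independence would look like): if the three beliefs were formed independently, their pairwise correlations should hover around zero; what Jervis describes looks instead like a single underlying "side" dictating all three answers.

```python
import numpy as np

rng = np.random.default_rng(42)

# 1,000 hypothetical agents forming each of the three beliefs independently
# (columns: medical danger, weapon improvements, international tension).
independent_beliefs = rng.integers(0, 2, size=(1000, 3))
print(np.round(np.corrcoef(independent_beliefs, rowvar=False), 2))
# Off-diagonal correlations near 0: independent reasoning leaves no pattern.

# What is actually observed is closer to this: one underlying "side"
# determines all three answers at once, so the beliefs align perfectly.
side = rng.integers(0, 2, size=(1000, 1))
aligned_beliefs = np.repeat(side, 3, axis=1)
print(np.round(np.corrcoef(aligned_beliefs, rowvar=False), 2))
# All correlations equal to 1: the pattern Jervis called irrational consistency.
```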

As a matter of fact, when our beliefs seem to conflict with each other, a phenomenon called cognitive dissonance, we have a tendency to change some of those beliefs to avoid the conflict. This was observed in a series of experiments conducted by Festinger. The typical experiment included performing some task and being paid either a very small amount for it ($1.00) or a more reasonable amount ($20.00, in 1962). When the subjects were asked about their feelings about the task, those who had been paid very little gave it a better evaluation than those who had received more. The explanation proposed by Festinger is that people wouldn't perform that task for just one dollar. But they had done it, which created a cognitive dissonance that the subjects resolved by evaluating the task as more entertaining. After all, doing an entertaining task for basically no money makes more sense than doing a boring one.