
Saturday, June 28, 2014

Human Stupidity: Historical: Heuristics II


Heuristics, in the literature on human reasoning, is a fancier name for these rules of thumb. Before that, the term, derived from a Greek word, was used by Pólya in his book How to Solve It to describe his methods (or pieces of advice) for solving mathematical problems. In that sense, he proposed a basic separation into four steps that he considered helpful in finding those solutions. In the context of human reasoning, however, a heuristic is a simple rule (or set of rules) we use to guess an answer. Unlike a mathematical proof, there is no guarantee that the answer will be correct or even good. Of course, Pólya's heuristics provide no certainty that you will arrive at an answer either; they are only intended to improve your chances of finding one. But if you make no mistakes, mathematics ensures that the answer you do find is right, while human reasoning heuristics will often give you an answer even when it is not the correct one.

A classic example of how our heuristics can help us reason is the problem of deciding which of two cities has the larger population. Gigerenzer and Goldstein tested a series of possible procedures for guessing between two cities, using a number of cues about each city, such as whether it had a soccer team in a major league or whether it had a university.
The researchers were interested in how heuristic reasoning would fare against statistical models they considered rational. They tested different methods for making predictions from the cues, namely multiple regression and neural networks. To their surprise, even without using all the information, some of their simulated heuristics often outperformed the supposedly rational models that used all the available information.

The "Take the best" heuristic was particularly successful, despite its simplicity. It orders the cues from most to least informative and then uses the best cue for which information is available. The simulations included the possibility that a simulated agent might not know enough to use the best available cue, forcing the agent to check the next one. As soon as a cue provides any evidence about which city might be the larger, "Take the best" uses that cue and simply ignores all the other cues.
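The decision procedure described above can be sketched in a few lines of code. This is a simplified illustration, not Gigerenzer and Goldstein's exact implementation: the cue names and the encoding of unknown values here are assumptions made for the example, and the sketch omits details of their simulations such as the recognition step and how cue validities are estimated.

```python
def take_the_best(cues_a, cues_b):
    """Judge which of two cities is larger using the "Take the best" idea.

    cues_a and cues_b are lists of cue values for city A and city B,
    aligned cue by cue and already sorted from most to least informative.
    Each value is 1 (the city has the feature, e.g. a major-league soccer
    team), 0 (it does not), or None (the agent lacks that information).
    Returns 'a', 'b', or 'guess'.
    """
    for a, b in zip(cues_a, cues_b):
        # If the agent is missing the value for either city, this cue
        # cannot be used, so move on to the next-best cue.
        if a is None or b is None:
            continue
        # The first cue that discriminates decides the answer; all
        # remaining (less informative) cues are ignored.
        if a != b:
            return 'a' if a > b else 'b'
    # No cue discriminates between the two cities: guess at random.
    return 'guess'

# City A has a major-league team; the agent doesn't know A's second cue.
print(take_the_best([1, None, 0], [0, 1, 0]))  # → 'a'
print(take_the_best([None, 0, 1], [1, 1, 1]))  # → 'b'
```

Note how the loop stops at the first discriminating cue: that one comparison is the entire decision, which is what makes the heuristic so frugal compared with a regression over all the cues.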

We observe here an effect that seems similar to the way human accuracy decreases with more information. However, we cannot actually make that claim, since we do not know the exact reason for the human mistakes. In the case of multiple regression models, on the other hand, the reason is clear. While it is quite surprising at first that using less information might be better in a statistical model, it is a well-known problem that statistical models with many variables can overfit the data. This phenomenon will be discussed further later, when we turn to inductive logic and the use of probabilistic models. We will see how good prediction requires models that both fit the data well and are as simple as possible.