What is decision hygiene? How can following the decision hygiene principles help you filter out noise and bias and make the best choices?
The term “decision hygiene” comes from the book Noise by Daniel Kahneman, Olivier Sibony, and Cass R. Sunstein. The book focuses on how to improve the judgments that affect some of the most important aspects of our lives, including our justice system, medical care, education, and business decisions.
Here’s how to practice good decision hygiene, according to Noise.
Despite the potential advantages of mechanical judgments, the authors of Noise are most interested in finding ways to reduce noise in human judgments. They say that the best way to improve human judgments is by implementing “decision hygiene”—consistent, preventative measures put in place to minimize the chance of noise. Decision hygiene consists of a loose set of suggestions, practices, and principles which we explore below. (Shortform note: With one exception (see the Sample Hiring Procedure below), the authors don’t lay out a specific, systematic course of action. Presumably, organizations should strive to implement as many of the following suggestions as are relevant and practicable.)
Recall that our normal, causal way of thinking is prone to errors and biases that manifest as noise. To make our thinking more accurate, we have to take a statistical view. The authors suggest that instead of treating each case as its own unique item, we should learn to think of it as a member of a larger class of similar things. Then, when predicting how likely an outcome is, we should consider how likely that outcome is across the whole class. Returning to an earlier example, if we’re trying to predict the likelihood that a student will graduate from college, we first need to know what percentage of all incoming college students end up graduating from college.
|How to Think Statistically|
Our failure to think statistically is a major theme in Thinking, Fast and Slow. In that book, Kahneman offers a more detailed look at thinking errors of this type and suggests ways to overcome them. As is also suggested in Noise, the basic idea is to take base probabilities into account.
In The Signal and the Noise, Nate Silver suggests another approach to statistical thinking based on a statistical formula known as Bayes’ Theorem. When making a prediction using Bayes’ Theorem, you start with a preliminary guess about the likelihood of an event. Ideally, this guess is based on hard data, such as a base probability. Then you make some calculations in which you adjust the starting probability in the face of specific evidence relating to the thing you are trying to predict. Finally, you repeat this process as many times as you can, each time starting with your most recently updated probability.
This approach has two advantages. First, it explicitly accounts for the noise in human judgment by building human estimates and predictions into the formula. Second, it calls for repeated testing of a prediction or hypothesis in order to improve accuracy in response to updated evidence. Interestingly, Silver argues that a Bayesian approach would have prevented the replicability crisis that has recently plagued the sciences—including some of the studies in Thinking, Fast and Slow.
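To make the updating loop concrete, here is a minimal Python sketch of repeated Bayesian updating applied to the graduation example. It is an illustration only, not a method taken from either book: the function name `bayes_update`, the 60% base rate, and the likelihood figures are all hypothetical, and the sketch assumes the two pieces of evidence are independent of each other.

```python
def bayes_update(prior: float, p_evidence_if_true: float, p_evidence_if_false: float) -> float:
    """Return the updated probability of a hypothesis after one piece of evidence."""
    numerator = p_evidence_if_true * prior
    denominator = numerator + p_evidence_if_false * (1.0 - prior)
    return numerator / denominator


# Hypothetical numbers, chosen only for illustration.
base_rate = 0.60  # assumed share of all incoming students who graduate

# Each pair is (P(evidence | graduates), P(evidence | does not graduate)).
evidence = [
    (0.80, 0.50),  # e.g., strong first-year grades
    (0.70, 0.40),  # e.g., regular class attendance
]

probability = base_rate  # start from the base rate, not from a gut feeling
for p_if_true, p_if_false in evidence:
    # Each update starts from the most recently updated probability.
    probability = bayes_update(probability, p_if_true, p_if_false)
    print(f"updated probability: {probability:.3f}")
```

The point of the sketch is the order of operations: the estimate starts at the base rate for the whole class of cases, and case-specific evidence only adjusts it from there.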
Choose (and Train) Better Judgers
The authors argue that it’s possible to improve the quality of human judgers. We can do so by finding better judgers in the first place and by helping judgers improve their techniques and processes.
There are two ways to identify good judgers. In fields that deal with objectively right or wrong outcomes, judgers can be measured by their results. However, as the authors point out, other fields rest on expertise that can’t be checked against an objective metric. In any field, though, judgers can be assessed by their overall intelligence, their cognitive style, and their open-mindedness; these traits are correlated with better judgment skills. The authors emphasize, however, that intelligence alone doesn’t make someone a good judger. The other two traits are just as important, if not more so.
The authors also note that some members of the general population are superforecasters, and their predictions are consistently more accurate than those of the average trained expert. Ideally, these are the people we should hire or appoint as judgers. The authors identify several traits exhibited by superforecasters that we can use to choose better judgers, or to better train the judgers already in place:
- They are open-minded.
- They are willing to update their opinions and predictions when new evidence arises.
- They naturally think statistically; unlike most of us, it does occur to them to consider factors like base rates.
- They break down problems and consider elements using probability rather than relying on a holistic “gut feeling” about the answer.
|Hedgehogs and Foxes|
Noise’s discussion of superforecasters draws on the work of Philip E. Tetlock and Dan Gardner. In Superforecasting, Tetlock and Gardner offer a particularly colorful description of what makes superforecasters so super: They tend to be foxes, not hedgehogs. The basic idea is that a person with a hedgehog personality tends to see the world through the lens of one big idea, make snap judgments about things, and be extremely confident in his or her predictions. By contrast, a person with a fox personality tends to collect little bits of information about a lot of things, approach a problem slowly and from multiple angles, and be cautious and qualified about his or her predictions.
As you might guess, Tetlock and Gardner suggest that foxes make better predictors than hedgehogs. Luckily, the rest of us can practice fox skills, too. We can learn to recognize and avoid our own cognitive biases. We can generate multiple perspectives on a problem. And we can learn how to break down problems into smaller questions.
If these techniques feel familiar, that’s because they are essentially the same as many of the recommendations in Noise.
Sequence Information Carefully
Because judgments are subject to influence from information, contextual clues, confirmation bias, and so on, it’s important to carefully control and sequence the information that judgers receive. The authors provide a few guidelines for implementing this strategy:
- As a basic rule, judgers should only be given what they need when they need it.
- We must make sure independent judgments are in fact independent; if the person verifying the result knows the first person’s conclusion, he or she is more likely to verify it.
- Finally, the authors suggest that judgers should document their conclusions at each step in the process; and if new information leads them to change their decisions, they should explain and justify why.
(Shortform note: It’s also important to consider how much information judgers receive. Both Malcolm Gladwell and Nate Silver point out that information overload leads to bad decisions, either because we don’t focus on what’s most important, or because we get overwhelmed and fall back on familiar patterns and preconceived notions.)
Another way to reduce noise, and to actually turn it into a positive, is by aggregating judgments. You can collect several independent judgments and then compare or average them; or you can assemble teams who will reach a judgment together. According to the authors, these techniques harness the wisdom of crowds, a demonstrated effect by which the judgments of a group of people tend, as a whole, to center on the correct answer.
This technique works best if you assemble a team whose strengths, weaknesses, and biases balance each other out. The idea is to get as many different perspectives on a problem as you can in hopes of finding the best answer somewhere in the middle.
(Shortform note: The authors say elsewhere that noise doesn’t average out, but that claim applies to the many separate decisions a noisy system produces. Here, we’re talking about averaging opinions before a final decision is made and before any action is taken.)
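The averaging idea can be sketched in a few lines of Python. This is a simulation under stated assumptions, not data from the book: the true value, the number of judgers, and the size of their errors are hypothetical, and each simulated judger is assumed to be independent and unbiased. Averaging cannot remove a bias that all judgers share.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

TRUE_VALUE = 100.0  # hypothetical quantity the group is trying to estimate

# Assumption: 50 independent judgers who are unbiased but noisy.
estimates = [random.gauss(TRUE_VALUE, 20.0) for _ in range(50)]

# Aggregate by simple averaging.
average_estimate = statistics.mean(estimates)

# Compare the error of the averaged judgment with the typical individual error.
crowd_error = abs(average_estimate - TRUE_VALUE)
typical_individual_error = statistics.mean(abs(e - TRUE_VALUE) for e in estimates)

print(f"error of the averaged estimate: {crowd_error:.2f}")
print(f"average individual error:       {typical_individual_error:.2f}")
```

The error of the averaged estimate can never exceed the average individual error, and with independent judgers it is usually much smaller, which is why the independence of the initial judgments matters so much.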
One practical way to aggregate judgments within a typical meeting setting is the estimate-talk-estimate procedure:
- First, each member of the group privately makes an estimate—some kind of forecast, prediction, or assessment.
- Then each person explains and justifies his or her estimate.
- After the discussion, each member then makes a new estimate based on the discussion. These second-round judgments are aggregated into a final decision.
Because this procedure requires that each person start with an independent judgment, it reduces the noise that comes from information cascades and polarization. At the same time, it balances individual psychological biases by encouraging outlier opinions to move toward the middle. (Shortform note: This estimate-talk-estimate procedure has drawbacks as well. For example, because its goal is to build consensus, it can discourage dissent and lead to a false sense of agreement much like the information cascades it is meant to avoid. Alternative approaches like policy delphi and argument delphi avoid this pitfall by aiming not at consensus, but at generating a wide range of dissenting perspectives.)
|How to Make Better Judgments on Your Own|
Most of the authors’ suggestions for noise reduction are targeted at organizations, but what if you want to improve your own judgments as an individual? Some of the suggestions in this section are simple enough to adopt as an individual. For example, you can practice thinking statistically or breaking down problems on your own. But how can you aggregate judgments if you are working alone instead of in a group?
The trick is to generate as many perspectives as possible before you make a decision or a prediction. One way to do that is to read as much as you can about the problem at hand. Find as many different perspectives and opinions as possible—remember, you are trying to replicate the benefit of crowd wisdom, which only works when you bring together a diversity of viewpoints.
Another way to generate alternate perspectives is to deliberately search for information that would disprove your prediction or your preferred course of action. This technique is called negative empiricism, and it gives you more perspective on a problem while also avoiding some of the logical fallacies you might otherwise fall prey to.
Break Judgments Into Smaller Components
The authors suggest that it’s easier to avoid noise when you break an overall decision into a set of smaller, more concrete subjudgments. Standardized procedures, checklists, guidelines, and assessments help here. For example, educators can reduce noise in essay grading by using rubrics. Asking the grader to assign individual scores to the paper’s originality, logical clarity, organization, and grammar before computing a final grade makes judgment easier. Breaking down a judgment in this way also helps make sure that every judger is following the same procedures and paying attention to the same factors.
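As a rough illustration of the rubric idea, the sketch below scores an essay on the four criteria named above and only then computes a final grade. The 0 to 10 range, the equal weighting, and the function name `grade_essay` are assumptions made for this example; the book does not prescribe a particular formula.

```python
# Criteria from the essay example; the 0-10 range and equal weights are assumptions.
CRITERIA = ("originality", "logical_clarity", "organization", "grammar")


def grade_essay(subscores: dict[str, float]) -> float:
    """Combine per-criterion scores into a final grade (simple average)."""
    missing = [c for c in CRITERIA if c not in subscores]
    if missing:
        # Every grader must score every criterion, so all essays are judged on the same factors.
        raise ValueError(f"missing subscores: {missing}")
    for criterion in CRITERIA:
        if not 0 <= subscores[criterion] <= 10:
            raise ValueError(f"{criterion} must be between 0 and 10")
    return sum(subscores[c] for c in CRITERIA) / len(CRITERIA)


# Hypothetical scores for one essay.
final_grade = grade_essay(
    {"originality": 7, "logical_clarity": 8, "organization": 6, "grammar": 9}
)
print(final_grade)  # 7.5
```

Refusing to compute a grade until every criterion has a score is the design choice that matters here: it forces the smaller judgments to happen before the overall one.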
The authors concede that this strategy isn’t perfect. They point out that in the field of mental health, the DSM—a book meant to aid and standardize mental diagnoses—has hardly reduced diagnostic noise. One reason is that psychiatrists and psychologists are likely to read signs and symptoms through the lens of their training and background. In other words, different theoretical understandings of the mind and of these kinds of disorders shape how different professionals understand the facts with which they are presented.
|Are Some Fields Just Noisy?|
The authors suggest that mental health diagnoses are inconsistent because of the different training and theoretical orientations of different mental health professionals. That’s true, but there’s also reason to think that mental health might be an inherently noisy field.
One reason for this is that mental health conditions overlap and influence each other: if you suffer from depression, there’s a good chance you also suffer from anxiety. Likewise, it can be difficult to separate mental health from physical health. Moreover, professionals disagree on the best practices for diagnosing and treating mental health issues, including basic questions such as whether a given set of symptoms is a disorder or just a difference.
These factors suggest that some fields might be more prone to noise, and more resistant to noise reduction, than others. That’s not to say that mental health care, for instance, can’t be made less noisy. Doing so just might require analysis and reform that is beyond the scope of the decision hygiene techniques we’re exploring here.
Use Rules and Standards
One way to break judgments into smaller parts is to implement rules and/or standards. (Shortform note: The authors introduce rules and standards as part of a larger discussion about the pros and cons of implementing noise reduction. We think it’s worth looking at rules and standards as decision hygiene strategies, which is why we’ve included them here.)
- Rules offer explicit guidance typically tied to objective measures. For example, there is a maximum allowable blood alcohol content above which a driver can be charged with drunk driving.
- Standards are suggestive guidelines that require some amount of subjective interpretation and implementation. For example, law enforcement officers are trained to recognize potential signs of impairment (e.g., erratic driving) and to issue field sobriety tests.
In deciding between rules and standards, the authors say we should first determine which will lead to more errors. They also point out that sometimes it isn’t possible to implement rules because the people making the rules can’t agree (for example, because of political or moral differences) or because the people making the rules don’t have the information needed to write an appropriate rule.
The authors further suggest that in some cases, the best approach is to combine rules and standards. Mandatory sentencing guidelines take this approach, setting a minimum and maximum sentence for a given crime (rule) and otherwise asking judges to determine a just sentence for each individual case (standard).
Rules and standards are examples of what Sunstein and Edna Ullmann-Margalit call second-order decisions—strategies we use to reduce our cognitive burdens when decisions are too numerous, too repetitive, too difficult, or too ambiguous to make one by one. Other second-order decisions include:
- Presumptions, which are rule-like guidelines that allow the possibility of exceptions in some cases.
- Routines, such as always brushing your teeth right before bed.
- Taking small, reversible steps, such as pet-sitting for a neighbor’s dog before making the commitment to adopt a dog of your own.
- Picking at random rather than choosing deliberately, such as throwing a dart at a map to decide where to go on vacation.
- Delegating, such as allowing your partner to choose dinner tonight.
- Heuristics, such as the matching operation described earlier in this guide.
Use Better Scales
As noted earlier, a lot of noise comes from our attempt to judge things using scales. If the scale is unclear, too complex, or inappropriate for the task, there will be noise. If the scale requires judgers to interpret or calibrate the scale themselves, there will be noise. Therefore, in cases where scales are useful or necessary, we need to design better ones.
The authors argue that as a general rule, comparative scales are less noisy than absolute scales. They give the example of job performance ratings, which are noisy in part because the traditional numerical scales are unclear and are interpreted differently from one reviewer to the next. What constitutes a “6” in “communication skills” or in “leadership”? Without explicit guidance about what the numbers mean and how they correspond to the qualities they measure, each person will have a different understanding of how to score an employee.
Instead of evaluating employees in terms of an absolute number, the authors say it’s better to rank employees. For example, when assessing an employee’s communication skills, ask whether those skills fall in the top 20% of the company, or the next 20%, and so on. As noted earlier, we are generally better at comparing things than at quantifying them in the abstract.
(Shortform note: Recall the earlier discussion of matching operations and the way our minds substitute an easier question in place of a more complex one. Without clear guidance, something similar probably happens with a vague rating scale, as we replace the question “How does X’s communication rate out of 10?” with something like “How impressed am I with X’s communication?” or “How clear do I find X?”)
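The ranking approach described above can be sketched as follows. The input is a reviewer’s ordering of employees from strongest to weakest on one skill, so no absolute numbers are ever assigned. The function name `quintile_bands`, the band labels, and the employee names are hypothetical.

```python
def quintile_bands(ranking: list[str]) -> dict[str, str]:
    """Map each person in a best-to-worst ranking to a 20% band."""
    labels = ["top 20%", "second 20%", "third 20%", "fourth 20%", "bottom 20%"]
    n = len(ranking)
    # Position 0 is the strongest performer; integer division picks the band.
    return {name: labels[position * 5 // n] for position, name in enumerate(ranking)}


# Hypothetical ranking on communication skills, strongest first.
communication_ranking = [
    "Ana", "Ben", "Chloe", "Dev", "Elena",
    "Farid", "Grace", "Hugo", "Ines", "Jon",
]

bands = quintile_bands(communication_ranking)
print(bands["Ana"])    # top 20%
print(bands["Elena"])  # third 20%
print(bands["Jon"])    # bottom 20%
```

Because the reviewer only has to decide who is stronger than whom, two reviewers with different ideas of what a “6” means can still land on the same bands.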
A comparative scale also provides concrete anchor points and clear descriptions or markers for each point. A good anchor point correlates a specific value on the scale with a relevant example of the thing being evaluated (if you’re grading a paper and you know that a “C” grade represents average work, that’s your anchor point). To minimize noise, anchor points should be provided ahead of time so that each judger starts with the same frame of reference.
(Shortform note: Anchoring is another concept drawn from Thinking, Fast and Slow. The basic idea of anchoring is that an initial piece of information (for example, a suggested donation amount) has a major influence on the actions we take (in this case, how much we decide to donate). By suggesting that scales come with clear anchor points, the authors of Noise seek to take advantage of this psychological effect by using it to calibrate judgers’ assessments.)