Evidence-Based Management
Module 12 Aggregate - Weigh and pull together the evidence
May 01, 2022 | Season 1, Episode 12

This episode accompanies Module 12 of the course, which is about how we bring together the various sources of evidence that we've gathered.

The purpose of taking an evidence-based approach is to reduce uncertainty in our decision making, using likelihoods and probabilities to guide our thinking and discussions. The episode explores Bayes' rule and Bayesian thinking, so that we continue to protect ourselves from falling prey to bias (particularly confirmation bias) and consider alternative explanations for the evidence we've found - asking how likely that evidence would be if our initial belief were true, and how likely if it were false.

The use of probabilities isn't something our brains take to easily, so there is some challenge inherent in this approach, but it is simply an extension of the overall evidence-based management approach, where we look at each type of evidence and consistently question whether it is trustworthy, robust and reliable. Once we reach the 'Aggregate' stage, it's time to ask how likely it is that the claim or hypothesis we are investigating is true (or false).


Further reading / sources mentioned during the episode:
- Michal Oleszak, "On the importance of Bayesian thinking in everyday life"
- Julia Galef's website and YouTube videos on Bayesian thinking
- Nate Silver, "The Signal and the Noise: Why So Many Predictions Fail but Some Don't"
- Daniel Kahneman, "Thinking Fast and Slow"


Host: Karen Plum

Guests: Eric Barends, Managing Director, Center for Evidence-Based Management; Professor Denise Rousseau, Carnegie Mellon University

Additional material with thanks to: Julia Galef, co-founder of the Center for Applied Rationality, and Michal Oleszak, machine learning engineer


Find out more about the course here: https://cebma.org/resources-and-tools/course-modules/

Transcript

00:00:00 Karen Plum

Hello and welcome to the Evidence-Based Management podcast. This episode accompanies Module 12 of the course, which is all about how we bring together the various sources of evidence that we've been busy gathering, ensuring that we don't allow bias or irrational thinking to creep in at this next critical stage. 

I'm Karen Plum, a fellow student of evidence-based management, and in this episode I'm joined by Eric Barends, Managing Director of the Center for Evidence-Based Management, and Professor Denise Rousseau from Carnegie Mellon University. I'll also be sharing some thoughts from Julia Galef, co-founder of the Center for Applied Rationality, and Michal Oleszak, a machine learning engineer with a background in statistics and econometrics. 

Without further ado, let's enter this world of probability. 

00:01:04 Karen Plum

Having spent a good while exploring the different sources of evidence - how to gather them and how to assess their trustworthiness and reliability - I'd been wondering for a while, OK, so this is all good stuff, but what happens next? What if the sources provide contradictory results? How do we bring all that together and reach a conclusion about whether the evidence supports a claim or not? 

Once I'd done module 12, I was struck by the reminder that everything we've been doing is designed to avoid us going with whatever evidence seems most persuasive or fits with our view of things. But now, with the evidence to hand and duly assessed, there's another chance for us to go off the rails! So in a sense, the Aggregate stage is the next gatekeeping step to stop us letting our biases and beliefs get the better of us. 

I think it's worth reminding ourselves that the purpose of an evidence-based approach isn't to find out whether a claim or hypothesis is true or false. Evidence is always evidence for or against something. Evidence isn't the same as proof, and whereas you can prove something in mathematics - that a statement or equation is true - in our field of management, you can't actually prove anything. 

What we can do is to reach an estimate of a probability that the claim or the hypothesis will result in the outcome predicted. If we do X, there's a high probability that it will affect Y in a certain way. 

As is stated in the module, evidence-based management is about making decisions under conditions of uncertainty, through the use of the best available evidence from multiple sources, to increase the probability of a favorable outcome. And that brings me to the first area that we need to grapple with, and that's probability. 

As Michal Oleszak points out in his article “On the importance of Bayesian thinking in everyday life”, human brains don't process probabilities very well. It's a very interesting piece, and if you'd like to check it out, there's a link in the episode show notes. 

As with other aspects of the way our brains work, like being wired for safety to keep us out of harm's way, our ancestors also had a tendency to be overly cautious when faced with a potentially disastrous outcome, even if the probability of it happening was very small. 

Our brains look for patterns everywhere, looking for causality where there may not be any, and ensuring that we favor a version of events that corresponds to our own beliefs about the world, however biased they may be. 

So we're essentially fighting our own nature here; because we're trying to operate counter to the way we're wired, it takes energy and time - it's metabolically expensive, if you will. I put this point to Eric, who reminded me that the Aggregate step is actually a reinforcement of what we've been doing all the way through the process. 

00:04:09 Eric Barends

The whole approach is actually reducing uncertainty, but also reducing bias. So that's why in the previous chapters or modules we have emphasized collecting evidence from professionals - how do you do that in an unbiased way? Collecting evidence from the organization - how can you collect or capture these data in a reliable way? If you search for research findings relating to your claim or hypothesis or the question you're trying to answer - you need to search in a way that you don't cherry pick. There shouldn't be selection bias that you have a preference and you try to, you know, pay attention to the studies that subscribe or support your point of view and ignore other stuff. 

That's why we emphasize so much in the step Acquire, regardless of whether it's scientific studies, organizational data, professional expertise, or stakeholders' evidence - to do this in as stepwise, unbiased and objective a way as possible. 

So that's something that is a very important part of the evidence-based approach, and here it all comes together but here you make a second judgment on it - OK, we found evidence in favor of this claim, of this hypothesis, well, how strong, actually is the evidence, and could there still be bias? 

And then of course, which is Bayes rule, is that you also try to have a look at the opposite - well is there an alternative explanation, which is actually a better explanation and the evidence fits better to this alternative explanation than the original. So that's what you do here - you sit down and think about the strength of the evidence - well, my colleagues, they all claim that X indeed has an impact on Y. OK, how strong is actually this source of evidence? And that indeed is an important step. 

00:06:25 Karen Plum

It seems to me that this step is a real test of how committed we are to finding the best evidence and getting the best outcome. Taking time to look at alternatives when we've already been on the journey for a while, trying to creep up on an acceptable and authoritative outcome. But again, this is what we've been doing in previous stages, so it isn't a new approach, although it might feel like it because of the way the module presents it. 

And that brings us to the key thinking in the Bayes rule approach. I must admit, being not very mathematically minded, as soon as I saw a formula my brain started to go fuzzy. But if you had that reaction, don't worry, because the math isn't the point. It's there to show that there's a solid mathematical foundation to the approach, but what's more important is how we think about the evidence we have and how we can update our thinking when we have more evidence available. 
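(For readers who do want to see the mechanics, here is a minimal sketch of Bayes' rule in Python. It isn't taken from the module - the numbers are made up purely to show how a prior gets updated by a piece of evidence.)

```python
def bayes_posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """P(hypothesis | evidence) via Bayes' rule."""
    numerator = p_evidence_if_true * prior
    denominator = numerator + p_evidence_if_false * (1 - prior)
    return numerator / denominator

# Made-up example: a claim we initially give a 30% chance of being true,
# and evidence that is twice as likely to appear if the claim is true
# than if it is false.
print(bayes_posterior(0.30, 0.80, 0.40))  # ~0.46 - more likely than before, but far from certain
```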

To illustrate the point, here's Julia Galef, co-founder of the Center for Applied Rationality, speaking about Bayesian thinking in one of her YouTube videos.

00:07:31 Julia Galef

Bayes rule is provably the best way to think about evidence. In other words, Bayes rule is a formalization of how to change your mind when you learn new information about the world, or have new experiences. And I don't think that the math behind Bayes rule is crucial to getting benefit out of it in your own reasoning or decision making. 

In fact, there are plenty of people who use Bayes rule on a daily basis in their jobs, statisticians and scientists for example. But then when they leave the lab and go home, they think like non Bayesians just like the rest of us. So what's really important is internalizing the intuitions behind Bayes rule and some of the general reasoning principles that fall out of the math and being able to use those principles in your own reasoning. 

After you've been steeped in Bayes rule for a little while, it starts to produce some fundamental changes to your thinking. 

00:08:27 Karen Plum

If you haven't discovered Julia before now, it's really worth checking out her website and videos. She has a great way of making this topic really accessible. There are some links in our show notes and the episode transcript to help you find her. 

So let's stick with the Bayes approach for a little, and then we'll come on to how it helps in our search for the best evidence relating to our current belief, theory or claim - as we're always exercising this judgment in relation to a claim. Here's Julia again, talking about how Bayes rule has fundamentally changed the way she thinks. 

00:09:02 Julia Galef

So how has Bayes rule changed the way I think? Well, first of all, it's made me a lot more aware that my beliefs are grayscale. They're not black and white. I have some level of confidence in all of my beliefs between zero and 100%. I may not be consciously aware of it, but implicitly I have some sort of rough confidence level in my belief that my friend likes me, or that it will rain tomorrow, et cetera. 

And it's not 0% and it's not 100%, it's somewhere in between. And more importantly, I'm more aware that that level of confidence should fluctuate over time as I learn new things about the world as I make new observations. And I think that the default approach that I used to have and that most people have towards the world is, you have your beliefs, you're confident in them and you pretty much stick to them until you're forced to change your beliefs, if you see some evidence that absolutely can't be reconciled with them.

But implicitly the question you're asking yourself is, can I reconcile what I'm seeing or what I've learned or heard - can I reconcile that with what I believe to be true about the world? 

00:10:17 Karen Plum

Essentially, Julia argues that our default position is that we pretty much stick to our beliefs unless we're forced to change them because we really just can't reconcile an existing belief with what we've just learned, or heard. 

And as we're learning and acquiring new evidence all the time, if we can challenge our own thinking by looking for alternative explanations, then we won't stay committed to a belief that is essentially dodgy. Both Julia and Eric stress that plugging numbers into a formula isn't the point, and although Eric rarely uses the formula himself, he notes that some disciplines do. 

00:10:54 Eric Barends

Well, I can tell you that in a lot of other disciplines they do use this calculator. I mean in medical discipline, in healthcare, in diagnostics they use Bayes rule to calculate the likelihood that you have a condition or a disease, given the fact that the test was positive or given the fact that the test was negative, so there's a number there. 

So you go for a mammogram to see whether you have breast cancer. What is the prior probability? Well, it's not that high, given your age. Well, the test actually is positive, suggesting you have breast cancer. Well, how strong is this test? How sensitive? Well, 95%. OK, then you can do the math, and the number will, by the way, surprise you - it's probably way lower than you expected. So that's when they use the calculator.
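(To see why the number surprises people, here's a rough worked version of Eric's example. Only the 95% sensitivity comes from the episode; the 1% prior and the 7% false-positive rate are assumptions chosen for illustration.)

```python
# P(cancer | positive test), by Bayes' rule
prior = 0.01                 # assumed: 1% of women in this age group have the condition
sensitivity = 0.95           # from the episode: the test picks up 95% of true cases
false_positive_rate = 0.07   # assumed: 7% of healthy women also test positive

true_positives = sensitivity * prior
false_positives = false_positive_rate * (1 - prior)
posterior = true_positives / (true_positives + false_positives)

print(f"P(cancer | positive test) = {posterior:.0%}")  # about 12% - far lower than most people guess
```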

In criminal justice and forensics they also use the calculator to come to a number, and in a lot of other disciplines there's a calculation somewhere running in the background - when you look at your emails, your spam filter uses Bayes rule to determine whether an e-mail could be spam or not. 
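(The spam filter Eric mentions works on the same principle. A toy sketch, with invented word probabilities rather than anything from a real filter:)

```python
# Toy Bayesian spam check: combine a prior with how likely each word is
# in spam versus legitimate mail. All probabilities are invented.
PRIOR_SPAM = 0.5
WORD_PROBS = {              # word: (P(word | spam), P(word | not spam))
    "winner":  (0.20, 0.01),
    "meeting": (0.02, 0.15),
}

def spam_probability(words):
    p_spam, p_ham = PRIOR_SPAM, 1 - PRIOR_SPAM
    for word in words:
        if word in WORD_PROBS:
            p_if_spam, p_if_ham = WORD_PROBS[word]
            p_spam *= p_if_spam    # naive assumption: words are independent
            p_ham *= p_if_ham
    return p_spam / (p_spam + p_ham)

print(spam_probability(["winner"]))             # ~0.95
print(spam_probability(["winner", "meeting"]))  # ~0.73 - "meeting" pulls it back down
```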

So in many areas there is a calculation, but you do this actually in the back of your mind, not the real numbers, but just putting this together. That's how you apply this in real life. 

00:12:25 Karen Plum

Similarly, in management it's not about plugging numbers into a formula, it's much more about whether you can entertain the possibility that you might be wrong or that you're prepared to acknowledge a level of uncertainty about the position that you hold. 

People typically react to evidence by trying to find a way for it to be consistent with what they already believe, and because we are creative, we can generally do that. But taking a Bayesian approach challenges us to ask ourselves - how likely is the evidence, assuming that my belief is true, versus how likely it is if my belief is false? 

Given that we're typically looking for shortcuts and confirmation and patterns, this last step of thinking about whether our belief is false is a difficult and challenging one to take. 

00:13:16 Julia Galef

When people say to me things like - Bayesian reasoning or Bayesian updating, that's really just a fancy word for paying attention to evidence or a fancy word for changing your mind, right? I always want to say no, no, it's a specific claim about how to change your mind or a specific claim about you know when and how much to change your mind. 

It's that extra step where you're asking yourself - suppose I was wrong, what would that world look like and how consistent is the evidence with that world? That's like the crucial step. That's the active ingredient, because if you're not doing that, if you're just asking yourself, is this evidence consistent with what I already believe, then it's just so easy to stay entrenched in your pre-existing beliefs, because there's always - or almost always - a way to make the evidence consistent with them. 

So that extra step is what sometimes forces you to notice - oh, this evidence doesn't really support what I believe or maybe it does, but only a little bit. 

00:14:18 Karen Plum

So now we've explored Bayes rule and how to think like a Bayesian, how do we apply this within our decision making in relation to a claim that we're investigating? As Eric has emphasized throughout the podcasts, we're trying to save ourselves from falling prey to our biases, particularly confirmation bias. 

00:14:39 Eric Barends

Confirmation bias is maybe the mother of all biases. We have pre-existing beliefs and we look at the world and we look only at those pieces of evidence that confirm our pre-existing beliefs. So, as in the chapter on professional experience and how to deal with confirmation bias, one of the recommendations we give is to actively look for examples that suggest that your pre-existing beliefs are incorrect. And that's the whole idea of research, of the scientific method - to have a look at evidence or information that your hypothesis is not true. 

00:15:27 Karen Plum

If you think about it, this approach is inherent in the investigation of crime. And there are many real life situations which have shown that jumping to conclusions and not seeking alternative explanations, or suspects, has led to miscarriages of justice. Similarly, think about medical diagnosis and the search for alternative explanations for the symptoms and experiences of the patient. 

For those of you familiar with House, the medical TV drama, that's what he and his team do. They experiment, collect evidence, observe and eliminate each potential diagnosis as they go along - hopefully before the patient dies! 

Another warning comes from Daniel Kahneman in his book “Thinking Fast and Slow”, where he illustrates that the brain assumes that ‘what you see is all there is’, and then we look no further. 

So is poor management decision making a crime or like medical malpractice? Given the potential implications and outcomes of decisions, I would argue that perhaps it is. That said, the reason we're here is to make better decisions and to constantly challenge our thinking and the robustness of our evidence. 

00:16:43 Eric Barends

The big idea here is the likelihood of the evidence. How likely is it that this evidence would show up if the hypothesis, or my belief, or this claim were true? But you should also ask - could this evidence also show up, could it also fit with an alternative explanation? And maybe we are somewhat unclear when we say that either the hypothesis is true or the hypothesis is false. What we mean by 'the hypothesis is false' is that there are other plausible hypotheses or explanations that could explain why the evidence is there. 

And I think that is something you have to ask yourself all the time. And of course we present an evidence-based approach very neatly, all these steps sequential - first you ask a question; then you acquire the evidence; then you critically appraise; and you do this four times because there are four sources; and now, here we are, we're going to apply Bayes rule, we're going to put everything together. That's not how it works in real life!

You already work with this Bayes mindset at the beginning - you start with the question that's being posed or the claim being made, with your HR manager saying, oh, we need to invest a lot in job satisfaction because it has a positive impact on employee performance, and then Bayes says, well, how likely is that to be true to start with? 

00:18:21 Karen Plum

So this isn't a new approach. We've been trained to challenge and ask questions and look at alternative explanations all through the process. So if we've done that already with the different sources of evidence, do we need to revisit it again at this stage? 

00:18:36 Eric Barends

At the end you sit down, you slow down, you look at all the evidence, but it was probably already there in your mind when you went through all these steps. So it's not that you're doing it again and again and again - how trustworthy is this evidence, well, you do this in the critical appraisal part obviously. Is this study a good study in terms of research design to answer this question? When the answer is, well, it's kind of weak - OK, that leaves the possibility of alternative explanations, from a Bayesian perspective, it's not very strong, the likelihood of this evidence supporting this is not that strong. 

So you already do this in all the steps that you do before you got to this point of aggregation. 

00:19:24 Karen Plum

And so we come to the part where we assign probabilities to help us to quantify certainty or uncertainty. The whole process is designed to help manage risk and to enable us to be as sure as we can be that the problem we're researching or the intervention we'd like to use, is likely to have the intended outcome. The point of collecting evidence and evaluating it is to tip the scales in our favor. 

We have a greater chance of certainty and it's way better than flipping a coin to make the decision. One of the key aspects of the Bayesian approach is to establish the prior probability of our claim or hypothesis being true. Once we've done this, we review our evidence to determine what's called the posterior probability - the revised probability of the hypothesis being true, given the available evidence. 

Eric says that students often struggle with establishing the prior probability and given what we said about our brains earlier in the episode, I can see why. I certainly did. That said, the prior probability is the thing that provides context - looking at the situation or the organization to see how likely the claim or suggested solution is in the first place. 

00:20:40 Eric Barends

What we say there is that the whole idea of a prior, sometimes students struggle with it, but think of a prior this way. Think of it in terms of context and the context is really important to determine a probability or likelihood. 

Now this sounds kind of unclear, but let's take this example. So we find a body - this person is dead. And I make the claim this is Mr Johnson. Now, how likely is it that this person is indeed Mr. Johnson? Well we found his body in the woods somewhere in the UK. Well, could be anybody right, I mean, how likely is it that this is Mr. Johnson to start with if we only have this information? Well, not likely - there's so many people in the UK and you know why would it be Mr Johnson? 

Now the situation changes. We find a body in the house that belongs to Mr Johnson. Oh, what is now the likelihood that this is Mr Johnson? Well, that likelihood is very high. If it is indeed Mr Johnson, it would be very likely that, if he died, we would find him in his house. And it's not very likely that it would be someone else. Why would someone else be lying dead in the house of Mr Johnson?

So the prior probability gives you context, so it's always good to have a look at your organization and its history and see how likely a claim or a hypothesis or whatsoever is to begin with, in your context, in your organization, in your discipline. 

So we noticed that a lot of students find it helpful to think in terms of context, so this is the situation we're dealing with and how likely is it that given this situation that this is true to start with?
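(A rough sketch of Eric's point that the prior is the context. The witness reliability and the two priors below are invented numbers; the point is only that the same piece of evidence lands very differently in the two situations.)

```python
def bayes_posterior(prior, p_evidence_if_true, p_evidence_if_false):
    numerator = p_evidence_if_true * prior
    return numerator / (numerator + p_evidence_if_false * (1 - prior))

# Same evidence in both cases: a witness who thinks the body looks like
# Mr Johnson, and who is right 90% of the time.
woods_prior = 1 / 1_000_000   # a body in the woods could be almost anyone
house_prior = 0.95            # a body found in Mr Johnson's own house

for label, prior in [("body in the woods", woods_prior),
                     ("body in his house", house_prior)]:
    posterior = bayes_posterior(prior, 0.90, 0.10)
    print(f"{label}: P(it is Mr Johnson) = {posterior:.4f}")
# body in the woods: 0.0000 - still essentially unknown
# body in his house: 0.9942 - all but certain
```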

00:22:49 Karen Plum

In terms of assessing the likelihood of claims, Eric has spoken on other episodes about asking people how certain they are about a claim they've made. Would they be willing to bet a year’s salary, or a month’s salary or a bottle of wine on the outcome? Alternatively, could they put a number on it? 

What's the likelihood, the probability and upon what evidence would they base that judgment, that the outcome of a particular intervention would be as they have suggested? This is something Nate Silver talks about. Nate is author of “The Signal and the Noise: Why So Many Predictions Fail but Some Don't”. His Wikipedia page describes him as an American statistician, writer, and poker player who analyzes baseball, basketball and elections. He talks about the way that asking people to make a bet on something can alter their certainty because they're being asked to quantify it. 

00:23:47 Eric Barends

He points out that if someone is really certain, then this person, if he is indeed really certain, should be willing to make a bet on it. So if you turn this into a discussion - how much money, a bottle of wine or whatsoever - you indirectly quantify the certainty of this person. 

And I notice this helps. It sounds a little bit weird, you say OK, how much are you willing to bet? Or can we bet a bottle of wine or try to quantify it? But asking that question really gives people pause and we think also in black and white - dichotomous. 

So you make a claim - I think if we do this, that will come out. I think you're probably very certain of yourself, and if I ask you how certain are you, you say, well, not completely - I read it in a magazine and I thought it resonated with me and sounds plausible, that's all I have to be honest. That completely changes the whole discussion. 

So my experience is that it's the same as asking for evidence which sometimes can be a little bit invasive or confronting and provocative. Asking how certain someone is, is sometimes an even better question to start with. 

00:25:05 Karen Plum

Whether we believe them as to the level of certainty they feel is another question, but simply asking them to quantify it is going to concentrate their minds. So much of this process is about opening up a dialogue and discussing issues, options, and evidence. In our world of fast and furious decision making, where doing something quickly often wins out over doing the right thing, being able to slow down and have a thoughtful discussion has to be a desirable direction to take. 

But what about when you as an evidence-based practitioner, are challenged about your level of certainty about a decision that you're advising on? 

00:25:43 Eric Barends

I think that if I were on the other end of the table, I would try to put a probability on it. I would say, given the evidence that we collected, given the fact that there are actually several meta-analyses and they all show that X has hardly any impact on Y, I think that the likelihood that this intervention will increase productivity or performance or whatever the outcome is, is not very strong. It's almost like flipping a coin. It's 50:50, or maybe worse - you'd say it's close to 0. All the evidence suggests that this has hardly any impact. The research findings say so; the managers, based on their experience, say it's not very likely to have an impact. If we look at the data from parts of the organization where we already tried this, we see hardly any change. Ergo, putting it all together, close to 0. 

Sometimes people say, oh, but you know, we tried to apply Bayes rule, but we couldn't agree, because I think that the evidence from practitioners - the trustworthiness, or the strength - is 60%, but my colleague says 70%. You will see it does not really matter, it does not really move the needle. If you play around with Bayes rule, with the calculator, it's mostly the prior probability and the strength of the evidence that determine the outcome. When you have really strong evidence, something happens, and when the evidence is, yeah, 50, 60, then it's somewhere in between.

So that's why you don't need the calculator, you just run it in your head. Say, nah, this is not really strong evidence. A little bit in favor, so it makes it a little bit more likely, but it does not change much. 
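(A quick illustration of Eric's point that disagreeing over 60% versus 70% rarely moves the needle, while the prior does. The numbers are invented; the 'strength' of a source is modelled here simply as the probability of seeing that evidence if the claim is true, against a fixed 50% if it is false.)

```python
def bayes_posterior(prior, p_evidence_if_true, p_evidence_if_false=0.5):
    numerator = p_evidence_if_true * prior
    return numerator / (numerator + p_evidence_if_false * (1 - prior))

# Colleagues disagree about the strength of one source: 60% vs 70%...
print(bayes_posterior(prior=0.20, p_evidence_if_true=0.60))  # ~0.23
print(bayes_posterior(prior=0.20, p_evidence_if_true=0.70))  # ~0.26  (barely moves)

# ...while a different prior probability changes the picture far more.
print(bayes_posterior(prior=0.60, p_evidence_if_true=0.60))  # ~0.64
```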

00:27:49 Karen Plum

And as we've discussed elsewhere on the podcast, we also need to think about the biggest bang for our buck. Which brings us back to the effect size. How big an impact is this intervention going to have and how certain are you that it will work in the way you predict? 

But also remember to be aware of another fallacy that people’s brains fall victim to, as explained by Michal Oleszak, who said in his article “when exposed to extreme percentages, be they large or small, people just analyze the magnitude of the percentage and ignore the total number. Our brains typically process the magnitude and ignore the total”. So don't be seduced by big percentages when the case numbers (or whatever) are very small. 
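(A tiny illustration of that point, with invented numbers: a "50% higher risk" headline can amount to one extra case in ten thousand.)

```python
# Invented numbers: relative versus absolute change.
baseline_cases = 2   # cases per 10,000 people without the exposure
exposed_cases = 3    # cases per 10,000 people with the exposure

relative_increase = (exposed_cases - baseline_cases) / baseline_cases
absolute_increase = (exposed_cases - baseline_cases) / 10_000

print(f"Relative increase: {relative_increase:.0%}")   # 50% - sounds dramatic
print(f"Absolute increase: {absolute_increase:.4%}")   # 0.0100% - one extra case per 10,000
```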

00:28:34 Eric Barends

And sometimes it's easier to understand Bayes in terms of the strength of the evidence - how likely is it that the evidence would come up if the hypothesis were not true? Sometimes it's also explained as goodness of fit - does this claim, does this theory fit the evidence, or is there another theory or claim or assumption that fits the evidence even better, or just as well? Because if the answer is, well, it could also be this - all the evidence would also come up if that were the case, it fits the evidence or the claim just as well - then you've got two competing hypotheses and you still go, ah, it could be this, it could be that - the evidence is not strongly in favor of A or B, so that's what you end up with. 

00:29:31 Karen Plum

I sense that in a lot of organizations this sort of outcome wouldn't feel conclusive enough. As humans we search for certainty, but our role as evidence-based managers is to guide others in recognizing the uncertainties, rather than claiming that a particular solution or intervention is guaranteed to be successful. 

As managers, we've been used to people expecting us to give concrete decisions and to be confident that they're the right ones. But as we try to forge our way in creating the conditions for better decision making, the discussion and exploration of the practicality of the evidence is a key factor. 

Denise Rousseau is a great advocate of discussing the evidence with others when bringing all sources together. 

00:30:14 Denise Rousseau

I think one of the critical aspects of the process of building sort of a logic or a framework for - here's what I think will work, here's what my evidence says, here's why I think it works and then doing a test - that one of the most important ways in which people can de-bias themselves is in conversation with others. And to have a, let's call it a logic model or a theory of change. It's kind of a picture of here's what I think is going to happen to these conditions and how we'll get to there and this is the evidence that supports it. 

You are then in a position to ask other people - check my assumptions here. Is this making sense? Check my assumptions here. Does this fit with the data? Don't ask them what they think. Ask them - check my assumptions because that's what we're trying to de-bias and to get a better handle on. 

00:31:00 Karen Plum

As she points out, oftentimes the sources of evidence are on different aspects of the problem, and the evidence tells us different things about different conditions or contexts. But the key thing is to see whether the evidence makes sense when you bring it together. And of course, as you go through to implementation, will the intervention or solution fly with the various interest groups? 

To wrap up this episode, here's a summary from Eric about the importance of the Aggregate stage to the evidence-based management journey and the need to slow down and take stock at this stage. 

Next time we'll be looking at how to incorporate the evidence we've collected into the decision making process. 

00:31:43 Eric Barends

So we would like to know that if the evidence indeed suggests that investing in employee satisfaction will lead to an increased performance of our employees, we want to know well how likely is it? Are we really, really certain or are we still in doubt? Because evidence-based management is about reducing uncertainty. I mean when we start and we don't have any evidence, we just have a claim or hypothesis or a question and when there's no evidence, it's like, you know, flipping a coin. 

So one of the challenging things, and that's where Bayes rule comes in, is that you need to try to quantify the level of uncertainty, the probability or the likelihood (all these terms are more or less similar) that something is true - that X indeed leads to Y, or that this intervention indeed has a positive impact on that outcome.

And that's the moment where you sit down, slow down, look at all the evidence you've collected and try to figure out how likely it is that the claim or the hypothesis is true or not, so this is an important moment.