“Causation" is simultaneously one of the most important and misunderstood ideas in health science. A popular trope in health research is to avoid “causal" language. However, this practice can be misleading.
What makes research “causal” or not is a combination of:
- The requirements of the question we want to answer
- The methods we use to answer the question
- How well the requirements and methods match
Ultimately, what determines whether research is causal is the question of interest and the decisions it informs, not the words we choose to use.
Causation confusion in research
As we found in a systematic review of language and action implications of health research, researchers often intend to inform causal questions but avoid causal words, leading to mismatches and potential confusion. Rather than focusing on what words to use, we need to better understand and communicate our questions and the strength of our methods. Because our research is generally intended to aid decision making, that often means embracing the causal nature of so many of our questions rather than hiding it.
What is causation?
A practical way to think about causality is in terms of influence and attribution: if we were to influence some X, would that in turn influence some outcome Y? And if there is a change in Y, how much of it is attributable to X? Answering questions like these requires knowing much more than whether people who have X also have more Y: we need to know what happens when we change X. Most healthcare decisions are ultimately causal questions of this kind:
- Should I choose drug A or drug B for this patient's condition?
- If I lowered premiums, how would that change our risk pool?
These questions are ultimately causal ones, and associational evidence without a causal structure is not an adequate substitute. Even if we have data that shows that people who take drug A have better outcomes than people who take drug B, or that insurance offerings that are less expensive have different kinds of participants, these associations won't answer the question in which we are interested.
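The drug example can be sketched with a toy simulation. All of the quantities here (`severity`, the assignment probabilities, and the effect sizes) are invented for illustration: when sicker patients preferentially receive drug B, the naive associational comparison points the wrong way even though drug B truly helps.

```python
import numpy as np

# Hypothetical simulation (not real clinical data): sicker patients
# are more likely to receive drug B, so a naive comparison of
# outcomes by drug received is confounded by illness severity.
rng = np.random.default_rng(0)
n = 100_000

severity = rng.normal(0, 1, n)              # unmeasured illness severity
p_drug_b = 1 / (1 + np.exp(-2 * severity))  # sicker -> more likely to get B
drug_b = rng.random(n) < p_drug_b

# True causal effect: drug B IMPROVES outcomes by 0.5 units.
outcome = -1.0 * severity + 0.5 * drug_b + rng.normal(0, 1, n)

naive_diff = outcome[drug_b].mean() - outcome[~drug_b].mean()
print("True causal effect of drug B: +0.50")
print(f"Naive associational difference: {naive_diff:.2f}")
# The naive difference is negative: drug B "looks" harmful even
# though it helps, because it is given to sicker patients.
```

The sign flip is the point: the association answers “how do patients who got B compare to patients who got A?” while the decision needs “what would happen if we gave this patient B instead of A?”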
There are many questions for which association without causal inference is the right approach. For example, when we want to know which population groups we should target for interventions, we might see what geographic areas are associated with our outcome of interest. Alternatively, we may want to predict clinical outcomes for some procedure given certain patient characteristics, so long as the goal truly is prediction rather than manipulation of underlying factors.
When we use “associational” language as a euphemism for poor-quality causal inference, we conflate two distinct concepts: the question we are answering and how well we are answering it.
Association is not “bad” causation, just as causation is not “good” association; they are different frameworks for answering different kinds of questions. Which framework to choose depends on the question we are asking.
Getting to the question we need to ask in a project
Genuinely understanding the underlying question can be complicated. In the right circumstances, association, or lack thereof, can be sufficiently indicative of a (lack of) causal effect that it can influence our decision making. In other cases, what looks like a simple question can contain multiple questions with different requirements embedded within it.
For example, if we are trying to improve diversity in our clinical trials, we might want both to target underserved populations (a non-causal question) and to see how our intervention affects different populations differently (a causal one). A good understanding of the question of interest is often the crux of an entire project.
Avoiding common errors in health research
Unfortunately, errors stemming from failure to recognize when a question requires causality are fairly common in all areas of health research, including health economics. One common form is treating “risk factors” as causal factors: assuming that changing the risk factor will change the outcome by the amount the risk prediction model implies.
For example, we have cardiovascular risk prediction models that have risk factor inputs such as smoking, age, BMI, atrial fibrillation, etc. A common mistake is assuming that if an intervention causes a change in BMI, we can use a risk prediction engine to estimate the causal change in heart attacks, diabetes, etc., that would result from the causal change in BMI. In reality, any results we get from this strategy are unreliable.
Most models estimating causal effects need to be based in causal (not predictive) structures from beginning to end. Prediction and causation are fundamentally different problems.
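A toy simulation can show why this fails. The numbers here are entirely made up: an unmeasured lifestyle factor raises both BMI and cardiovascular risk, so the BMI coefficient in a risk *prediction* model mixes the causal effect of BMI with the lifestyle signal BMI proxies for, overstating what a BMI-lowering intervention would achieve.

```python
import numpy as np

# Hypothetical simulation (illustrative numbers only): an unmeasured
# lifestyle factor raises both BMI and cardiovascular risk.
rng = np.random.default_rng(1)
n = 200_000

lifestyle = rng.normal(0, 1, n)                   # unmeasured confounder
bmi = 27 + 2.0 * lifestyle + rng.normal(0, 2, n)
# True causal effect of BMI on the risk score: 0.10 per unit of BMI.
risk = 0.10 * bmi + 0.50 * lifestyle + rng.normal(0, 1, n)

# Fit the prediction model risk ~ bmi (simple least-squares slope).
slope = np.cov(bmi, risk)[0, 1] / np.var(bmi)
print("True causal slope:  0.10")
print(f"Predictive slope:   {slope:.2f}")
# The predictive slope is roughly double the causal one, so plugging
# an intervention's BMI change into the prediction model overstates
# the causal change in risk.
```

The prediction model is not wrong at its own job (predicting risk from observed BMI); it is wrong only when repurposed to answer the causal question of what an intervention on BMI would do.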
Much of the difficulty in this area comes from matching the question of interest (which is often causal in nature) with what is feasible to answer. Causal inference is fraught with difficulty and common misunderstandings. In some scenarios, the right combination of data, analytic methods, and circumstances lets us assemble a sufficiently credible causal estimate.
Randomized studies can be fantastic opportunities for causal inference, provided that what is feasible to do in a randomized study is well matched to what we want to know. In other cases, we may be able to control or adjust for non-causal explanations, or take advantage of a specific scenario that lets us avoid having to control for much of anything at all.
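A toy simulation (again with invented numbers) shows why randomization helps: once assignment no longer depends on patient characteristics, the simple difference in means recovers the causal effect that confounded observational data would hide.

```python
import numpy as np

# Hypothetical simulation: illness severity drives outcomes, but drug
# assignment is randomized, so it is independent of severity and the
# difference in means is an unbiased causal estimate.
rng = np.random.default_rng(2)
n = 200_000

severity = rng.normal(0, 1, n)
drug_b = rng.random(n) < 0.5    # randomized 50/50 assignment
# True causal effect of drug B: +0.5 units.
outcome = -1.0 * severity + 0.5 * drug_b + rng.normal(0, 1, n)

diff = outcome[drug_b].mean() - outcome[~drug_b].mean()
print(f"Randomized difference in means: {diff:.2f}")  # close to +0.50
```

This is the sense in which randomization is a design feature, not a statistical adjustment: it removes the link between who gets treated and how they would have fared anyway.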
Risks of failing to use causal language
Avoiding causal language also enables weak and misleading research through deniability. If causal language is never technically used, the work cannot be critiqued against the methodological and conceptual standards that causality requires. Associational language for inherently causal questions serves as a rhetorical shield against methodological rigor, while still allowing the authors to make action recommendations that, explicitly or implicitly, presume a rigorous causal design their research does not have.
Unfortunately, identifying a causal effect is often infeasible. It may be tempting to control for a bucket of variables, say “it's just an association,” use the result in place of a causal estimate, and call it the best we can do. But this misleading practice rarely yields reliable estimates and may actively undermine evidence-based decision making. Just as it can be harmful to use the wrong treatment for a condition, it can be actively harmful to base “evidence-based” decisions on unreliable evidence. It is almost always better to acknowledge the limits of what is knowable than to use inappropriate substitutes.
Arriving at the right decision
In general, best practice is to be explicit about what decisions you intend to inform and how strongly your methods answer that need. Keeping these separate and clear can greatly improve the strength of evidence and its usefulness in application.
We must have a full understanding of the decision to be made, the evidence available to us, a large toolbox of methods and expertise from which to draw, and the right circumstances to generate highly credible, decision-oriented evidence. Reach out to RTI Health Advance today to continue the conversation and put our team to work on your next project.