Aussie Med Ed - Australian Medical Education

Mastering Medical Statistics: Elevate Your Clinical Decision Making

Dr Gavin Nimon Season 5 Episode 67


Join host Dr Gavin Nimon (Orthopaedic Surgeon) as he unlocks the mysteries of medical statistics and takes your clinical decision-making skills to new heights with insights from Dr Adam Badenoch, an anaesthetist with a Master's in Biostatistics. Discover how essential concepts like central tendency, distribution, and variance can transform your understanding of medical research. Dr Badenoch explains the significance of numerical and categorical data, and sheds light on how outliers can alter the mean and median, equipping you with the tools needed to critically assess statistical evidence in healthcare.

Venture into the complex world of hypothesis testing, where we explore the importance of the null hypothesis and the scrutiny needed before changing clinical practices. Dr. Badenoch demystifies the role of p-values and addresses common criticisms such as the arbitrary 0.05 significance threshold and publication bias. By emphasizing the necessity of defining clinical importance and analysis methods at the outset of studies, this discussion urges a thoughtful balance between scientific integrity and interpretation.

Our episode culminates with an insightful look into research study design and the indispensable role of statistical tools in evaluating studies. Learn about confidence intervals and their power to reveal the range of plausible values for true population parameters, standing in contrast to p-values. We also touch on the challenges of implementing evidence-based medicine in practice, with a nod to the potential and pitfalls of artificial intelligence in data analysis. This episode is a must for healthcare professionals aiming to refine their statistical acumen and apply evidence-based insights effectively.

Aussie Med Ed is sponsored by OPC Health, an Australian supplier of prosthetics, orthotics, clinic equipment, compression garments, and rehabilitation devices for doctors, physiotherapists, orthotists, podiatrists, and hand therapists. If you'd like to know what OPC Health offers, visit opchealth.com.au and view their range online.

Aussie Med Ed is sponsored by HealthShare, a digital health company that provides solutions for patients, general practitioners, and specialists across Australia.


Aussie Med Ed is sponsored by Avant Medical Indemnity. They state that they offer holistic support to help doctors practise safely, and believe they have extensive cover that's continually evolving to meet your needs in the ever-changing regulatory environment.


Dr Gavin Nimon:

I'd like to let you know that Aussie Med Ed is sponsored by OPC Health, an Australian supplier of prosthetics, orthotics, clinic equipment, compression garments, and rehabilitation devices for doctors, physiotherapists, orthotists, podiatrists and hand therapists. If you'd like to know what OPC Health offers, visit opchealth.com.au and view their range online.

Medical statistics plays a fundamental role in shaping modern healthcare. It's the foundation of evidence-based medicine, allowing us to make informed decisions, whether it's selecting the best treatment, evaluating the efficacy of interventions, or even understanding the risks and benefits for our patients. In today's episode of Aussie Med Ed, we're looking at the world of medical statistics, exploring how it influences clinical guidelines, helps us critically evaluate research papers, and informs everyday clinical practice. We'll cover the essential statistical concepts every healthcare professional should be familiar with when reviewing research, from types of data and statistical concepts to more complex topics like hypothesis testing, p values, confidence intervals, and correlation versus causation. We'll also talk about study design, the importance of sample size, and how to spot potential bias in research papers. Joining us is Dr Adam Badenoch, an anaesthetist who has a Masters in Biostatistics. He's going to help us break it all down for you. Whether you're new to research or looking to refresh your knowledge, this episode will give you tools to better understand and apply the evidence in your clinical practice. Welcome to Aussie Med Ed.

G'day and welcome to Aussie Med Ed, the Australian Medical Education Podcast, designed with a pragmatic approach to medical conditions by interviewing specialists in the medical field. I'm Gavin Nimon, an orthopaedic surgeon based in Adelaide, and I'm broadcasting from Kaurna Land. I'd like to remind you that this podcast is available on all podcast players and is also available as a video version on YouTube. I'd also like to remind you that if you enjoy this podcast, please subscribe or leave a review or give us a thumbs up, as I really appreciate the support and it helps the channel grow. I'd like to start the podcast by acknowledging the traditional owners of the land on which this podcast is produced, the Kaurna people, and pay my respects to the Elders past, present and emerging.

Well, it's my pleasure now to introduce Dr Adam Badenoch, an anaesthetist trained in South Australia who has specialised fellowships in difficult airway management, medical education and simulation, as well as hepatobiliary and liver transplant anaesthesia. In 2023, Adam earned a Masters of Biostatistics from the University of Adelaide, combining his clinical expertise with a deep interest in research, statistics, and anaesthesia subspecialties such as ENT, neuroanaesthesia, and liver transplant care. Thanks Adam, thanks very much for coming on Aussie Med Ed. Statistics has always been a very difficult and confusing concept for me, probably because it combines both mathematics and some unusual concepts. Can you please start off by explaining some basic key statistical concepts for everyone? Basic principles that people should be aware of, and what they should know if they're trying to analyse medical research.

Dr Adam Badenoch:

Sure, Gavin. First of all, thanks for having me on. I, at times, find statistics a bit confusing and complicated too, so don't worry if that's how you feel. Lots of people are in the same boat, and it definitely does cover some relatively unintuitive concepts or logic at times. But I think you're right, covering some basics often helps. I think a few key concepts to understand are: what sorts of data are there and how do we categorise them? How can we describe different types of data? And some basic concepts related to hypothesis testing. So types of data can generally be classified into numerical or categorical. Numerical data can be further categorised into discrete or continuous data, and categorical data is often further delineated into nominal or ordinal categories. A nominal categorical variable is simply one which has no logical order to it; a good example might be hair colour. That's as opposed to an ordinal variable, which is categorical in nature, it categorises things, but the categories have a natural order to them, such as small, medium and large. In terms of the numerical data, discrete data is essentially integer-valued: it doesn't take on a continuous range of values, but falls into discrete numbers. Continuous data, by contrast, is a numerical representation of something which can theoretically take on any specific value along an entirely continuous range. So if we think of continuous data, we often describe it in terms of its central tendency, which is where the largest amount of the data sits, and how the data is spread around that area of central tendency. The most common ways to describe central tendency would be the mean, median, or mode: the mode being the most common value, the median being the 50th centile value, and the mean having a number of different definitions, but the usual arithmetic mean is simply the sum of all of the values divided by the number of values. Spread can be described as a range from the lowest to the highest value, or by subsets of that range, such as the interquartile range, which describes the middle 50 percent of the values, so it's less affected by occasional extreme outliers at either end of the range. And variance is a term which describes how far each individual point sits from whichever measure of central tendency you use, usually the mean.
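
To make those summary measures concrete, here's a minimal Python sketch using only the standard library; the values are invented purely for illustration:

```python
# Descriptive statistics for a small, made-up set of continuous data.
import statistics

heights_cm = [158, 162, 165, 165, 170, 171, 174, 178, 181, 199]  # hypothetical values

mean = statistics.mean(heights_cm)          # arithmetic mean: sum of values / count
median = statistics.median(heights_cm)      # the 50th centile value
mode = statistics.mode(heights_cm)          # the most common value
variance = statistics.variance(heights_cm)  # spread: squared deviations from the mean

q1, _, q3 = statistics.quantiles(heights_cm, n=4)  # quartile cut points
iqr = q3 - q1                               # interquartile range: middle 50% of values
full_range = (min(heights_cm), max(heights_cm))    # lowest to highest value

print(mean, median, mode, variance, iqr, full_range)
```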

Dr Gavin Nimon:

Right, so when we're talking about numbers in general, we like to average things, and that's the mean. But I understand the mean's not as good, as it's affected by outliers, and that's why the median is more useful. Is that correct?

Dr Adam Badenoch:

Yeah, it depends on the distribution of your data. So the mean is a good value to use often because it takes information from every individual data point in the data set and uses that information in its calculation. So you're not throwing away any information, but because of the way it's calculated it can be quite affected by a small number of particularly high or particularly low values that don't really represent the typical value of the data, if there is such a thing. The median takes the middle value, so it is effectively discarding information from the top tail and the bottom tail of the data set, and for that reason is, I guess, less desirable to use than the mean if those values at either end of the range are actually considered typical and representative of the true data set. But if they are unusually high or low outliers and we don't think that they genuinely represent the true population, then it's a good thing to discard that information and just use the middle of the data. The range takes that concept even further and simply uses two values from the data set. So you might have a million values in your data set, and to describe it simply using a range, all that does is take the lowest value and the highest value, with a little dash between them, and say the numbers range from this to this. It doesn't tell you anything about what else is happening in the middle of the data set. So that can obviously be hugely influenced by outlying values. But it's a useful concept, particularly when combined with a median: the median looks at the middle of the values and the range looks at the extreme ends, and taken together they can give you a nice little picture and summary of what the distribution of the data looks like. So it's horses for courses a little bit, I'd say.
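
A quick hypothetical illustration of that point: a single extreme outlier shifts the mean noticeably but leaves the median untouched.

```python
# One unusually long hospital admission drags the mean upward,
# while the median is unaffected. Values are made up for illustration.
import statistics

lengths_of_stay = [2, 3, 3, 4, 4, 5, 5, 6]   # days
with_outlier = lengths_of_stay + [90]         # add one extreme outlier

print(statistics.mean(lengths_of_stay), statistics.median(lengths_of_stay))  # 4.0, 4.0
print(statistics.mean(with_outlier), statistics.median(with_outlier))        # ~13.6, 4
```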

Dr Gavin Nimon:

And all these things are really used as a way of describing numbers in order to interpret results. It's a way of assessing how useful a treatment can be for certain individuals or for a population in general.

Dr Adam Badenoch:

I mean, statistics is a mathematical discipline which is useful not just for medicine; it's used in finance and engineering and agriculture and all walks of life, really. Anywhere numbers can be used to represent phenomena that exist in the real world, statistics can be useful, so the applications are virtually endless. I guess one of the key concepts in statistics, and one which often gets a little bit forgotten when people are interpreting statistics in the medical literature, is that any time we do a study or analyse a data set, typically that is a sample which has been taken from a true population. So it may be, for example, that we have recruited 100 patients who have had knee operations from all of the patients that you've operated on over the last 12 months. Now, you might have operated on well more than a hundred patients in 12 months, and what we're looking at in the study is a sample of all of the patients that you operate on; or we might be trying to extrapolate our thinking to all of the patients who have knee operations, not just by you and not just in this year. So that concept, that when we analyse a study we're analysing a sample taken from a larger true population, is a really important one. When we come back to interpreting analysis parameters and testing hypotheses, a lot of that framework is based around estimating what we think these values would truly be if we had collected data on every person that had a knee operation.

Dr Gavin Nimon:

We've introduced the concept of hypotheses. Perhaps you can explain that in more detail, please.

Dr Adam Badenoch:

Sure. So hypothesis testing, I guess, is a way of using statistical methods to refute a null hypothesis. And that framework of thinking is generally derived from the concept that unless we know we're going to improve life or medicine somehow, particularly for our patients, we would usually defer to the status quo. We only change what we're doing when we know it can make things better.

Dr Gavin Nimon:

On the basis of 'above all, do no harm'.

Dr Adam Badenoch:

Exactly, exactly. And also based on the fact that you can have random variation in data sets, and we don't want to infer too much from that. We only want to make changes, which sometimes take a lot of effort, if we know that there's a true effect there and it's not just some random variation in the data set. So typically in a study that involves a hypothesis test there will be a null hypothesis. The simplest scenario would be a study that involves two groups and a single intervention, and the null hypothesis would be that there is no difference between the two groups, which means that the treatment doesn't have any effect. So if we conduct a hypothesis test, we're really looking at our data set from our single sample and trying to work out whether that data is consistent with the null hypothesis. If it's inconsistent with the null hypothesis by a small amount, we may say, well, there may be a small true effect here, but this might also be due to chance. Whereas if the data in our sample is very, very inconsistent with the null hypothesis, that's much more convincing for us to say, well, actually, we think we've collected enough evidence here that we can refute this null hypothesis with confidence. And in rejecting that null hypothesis, we obviously then come up with an alternative hypothesis. That might be that the treatment improves the outcome that we're looking at, or maybe it makes it worse, and we can make an estimate as to by how much it increases or decreases. So that's the general framework of thinking and ideas behind hypothesis testing.

Dr Gavin Nimon:

And I believe that works by calculating the chance of that happening, using what we call a p value.

Dr Adam Badenoch:

Absolutely right. Yeah, so a p value is the chance of observing data as extreme as yours if the null hypothesis is true. And obviously, the more your data deviates from what you would expect under the null hypothesis, the smaller the chance becomes of those results arising due to chance alone, without your treatment having any true effect. The important thing to remember about p values is that they have come under quite a lot of criticism in recent times, largely because that is the only information they convey: the probability that your results arose due to chance alone. So the lower that chance is, the more confidence you can have in rejecting the null hypothesis, but it doesn't tell you anything about the magnitude of the change that you're actually observing.
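
As a concrete sketch of that idea, here's a minimal two-group test in Python, assuming SciPy is available; the blood pressure values are invented for illustration:

```python
# Welch's t-test comparing two hypothetical groups. The p value is the
# probability of seeing a difference at least this extreme if the null
# hypothesis (no true difference between groups) were true.
from scipy import stats

control   = [138, 142, 135, 150, 147, 139, 144, 141]  # systolic BP, mmHg (made up)
treatment = [130, 128, 136, 125, 133, 129, 131, 127]

t_stat, p_value = stats.ttest_ind(treatment, control, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```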

Dr Gavin Nimon:

I always thought that increasing the sample size increases the chance of getting a significant p value, and that might be one of the reasons it was criticised in that sense. I also realised that the actual value of 0.05 was just decided arbitrarily by Fisher, an early statistician, who thought that 1 in 20 was a reasonable number to choose.

Dr Adam Badenoch:

That's absolutely right. So the value that you choose as the threshold for what you consider a significant p value, versus one which you're going to ascribe to chance alone, is completely arbitrary. There's been a convention for a long time now to set that value at five percent, or 0.05. But as medical literature becomes more and more common, with more and more studies conducted and papers published every day, year after year, we start to see a phenomenon whereby there can be other issues at play, such as publication bias and a few other bits and pieces. These undermine the conventional thinking that a one-in-20 chance is something which is never going to happen unless there's a true effect. If you've got thousands of papers being published every day, and they're all testing hypotheses, that's way more than 20 papers; you're going to get lots of them that have statistically significant results due to chance alone. And so for that reason there's probably a bit of a push to start using lower p values to define significance, and/or just encouraging readers to interpret p values without necessarily feeling forced to apply a single arbitrary threshold as to whether they're significant or not. You can certainly look at a p value as a probability of how likely this data set arose due to chance alone, and make up your own mind about whether you think there's a true effect there or not.
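
That multiple-testing phenomenon is easy to simulate. Here's a small, illustrative Python sketch (assuming NumPy and SciPy): when the null hypothesis is true for every one of many studies, roughly 5 percent still cross the conventional 0.05 threshold by chance.

```python
# Simulate many "studies" where no true effect exists and count how
# many come out "significant" at p < 0.05 purely by chance (~5%).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_studies, false_positives = 1000, 0

for _ in range(n_studies):
    a = rng.normal(loc=0.0, scale=1.0, size=30)  # both groups drawn from
    b = rng.normal(loc=0.0, scale=1.0, size=30)  # the SAME distribution
    _, p = stats.ttest_ind(a, b)
    if p < 0.05:
        false_positives += 1

print(f"{false_positives} of {n_studies} null studies were 'significant'")  # ~50
```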

Dr Gavin Nimon:

I'd like to let you know that Aussie Med Ed is supported by HealthShare. HealthShare is a digital health company that provides solutions for patients, GPs and specialists across Australia. Two of HealthShare's products are BetterConsult, a pre-consultation questionnaire that allows GPs to know a patient's agenda before the consult begins, and HealthShare's specialist referrals directory, which helps GPs find the right specialist for their patients. What about the opposite end of the spectrum? So what you're saying at the moment is that 0.05 might be a bit high. What about the people who talk about, oh, things 'approaching significance'?

Dr Adam Badenoch:

There are probably two sides to that. On one hand, I think to maintain integrity in the research process you do need to be faithful to the traditional scientific method. To claim that you have found a causal link between things requires a whole lot of conditions to line up, one of which is to have observed a true difference due to your intervention. And the way to do that, really, is to define what you think is a clinically important difference before you start the study, and to also define exactly how you're going to analyse the outcome. As part of that, I think you do need to define a threshold level of significance and stick to that in your analysis and in your write-up. Certainly that can be frustrating for authors who might, in their data set, observe probably the effect that they were looking for, but the effect is slightly smaller than they were expecting, or the variance in the data set is slightly higher than they were expecting, and as a result their p value is not quite as small as the threshold value they had picked prior to starting the experiment. In those scenarios, I think as the author of the published paper you just need to stick to your a priori decision-making framework. But that's not to say that as readers we can't also consider the fact that p values are simply probabilities that results arose due to chance. Yes, the arbitrary thresholds are important, but they can be interpreted in another framework, I guess, as the reader, if you're not the person who set the level when registering the study.

Dr Gavin Nimon:

So what about my other thought: the larger the study, the greater the power of the study, the more chance it reaches significance. And so it almost seems like if you just kept increasing the sample size, everything would end up with a p value below 0.05?

Dr Adam Badenoch:

Yeah, that's right. In the calculation of a p value, usually the things that will influence it are the size of the effect, the variation that exists in the data set, and the sample size. So, for any given combination of effect size and variance in a data set, the larger your data set, the smaller your p value is going to be. And as things like electronic medical records, data linkage and data sharing become more and more common, the possibility of mega data sets emerging becomes more and more realistic, and certainly that's a phenomenon worth bearing in mind too. Any time a p value is very small, it means that there probably is an effect there, but the p value may be small for any one of those three reasons: low variance in the data set, a large magnitude of effect, or a very large sample size. If it's a large magnitude of effect, then obviously that's clinically very important to us as clinicians. The other two are more statistical phenomena, which are not so important for how effective the treatment is. So it is important to look at the sample size when considering a p value.
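
To illustrate that point, here's a short simulated sketch (NumPy and SciPy assumed): with the effect size and variance held fixed, a larger sample alone drives the p value down.

```python
# Same small true effect and same spread; only the sample size changes.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
true_effect, sd = 2.0, 10.0   # fixed effect size and standard deviation

for n in (20, 200, 2000):
    control = rng.normal(0.0, sd, n)
    treated = rng.normal(true_effect, sd, n)
    _, p = stats.ttest_ind(treated, control)
    print(f"n per group = {n:4d}, p = {p:.4f}")   # p shrinks as n grows
```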

Dr Gavin Nimon:

So if you see a significant p value, but the actual overall result shows a very small effect, you might be thinking, well, okay, that's useful to know, but it's not going to really change my clinical practice. That's quite different to a huge effect with a significant p value and a small sample size, where you go, that's really important.

Dr Adam Badenoch:

Exactly.

Dr Gavin Nimon:

What about this idea of confidence intervals that you also see talked about as well? I get a little bit confused on that because it seems to have a range and it has a number in the middle. Can you explain that to me? And does that have anything to do with what we're talking about with p values as well?

Dr Adam Badenoch:

Sure. So confidence intervals are a range of plausible values within which the true population value will lie with a particular degree of confidence. Typically they're presented as 95 percent confidence intervals, so it's a range of values which will include the true population value with 95 percent certainty. In many senses they're analogous to p values, but they have the added advantage of providing a range of values, not just a probability of whether something arose due to chance or not. So the range of values provided by the confidence interval can give you an idea of the magnitude of effect. Often they're calculated around an estimated value. You might estimate, say, the average difference in blood pressure between two groups: that average difference has a single particular value, and that's the value you'll see in the middle of the confidence interval range. Either side of that, you have the edges of the confidence interval range, which is the range of plausible values that the difference in blood pressure could take.
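
As a minimal sketch of that idea (SciPy assumed, values invented), here's a 95 percent confidence interval calculated around a single estimated mean:

```python
# 95% confidence interval around a sample mean, using the t distribution.
import numpy as np
from scipy import stats

bp = np.array([128, 135, 141, 132, 138, 129, 144, 136, 131, 140])  # mmHg, made up

mean = bp.mean()
sem = stats.sem(bp)  # standard error of the mean
ci_low, ci_high = stats.t.interval(0.95, len(bp) - 1, loc=mean, scale=sem)
print(f"mean = {mean:.1f} mmHg, 95% CI = ({ci_low:.1f}, {ci_high:.1f})")
```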

Dr Gavin Nimon:

So in that scenario, if you then gave a medication that adjusted the blood pressure of the group of patients, they would then have a range and a 95 percent confidence interval. Do you then compare the two confidence intervals that way, and how do you do it?

Dr Adam Badenoch:

Not ideally like that. So you could calculate a blood pressure value in the first group and a confidence interval around that, and you could calculate a blood pressure in the second group and a confidence interval around that. What would be preferable, if you knew that the aim of your study was to compare the difference in blood pressure, or how it changes when you administer your treatment, is to make the outcome of your trial deliberately the difference in blood pressure. So that's blood pressure one minus blood pressure two, and that then becomes a single value. You can then calculate a confidence interval around that single difference value. That's much more useful, and a more valid way of telling what the difference is between two groups than simply comparing the overlap of two separately created confidence intervals.
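
Here's a small sketch of that preferred approach (NumPy and SciPy assumed, data invented): compute the single difference in means and put a 95 percent confidence interval around it, using the Welch approximation:

```python
# 95% CI around the difference in means between two groups.
import numpy as np
from scipy import stats

control = np.array([138, 142, 135, 150, 147, 139, 144, 141], dtype=float)
treated = np.array([130, 128, 136, 125, 133, 129, 131, 127], dtype=float)

diff = treated.mean() - control.mean()   # the single outcome value
v1 = treated.var(ddof=1) / len(treated)
v2 = control.var(ddof=1) / len(control)
se = np.sqrt(v1 + v2)                    # standard error of the difference

# Welch-Satterthwaite approximation for the degrees of freedom.
df = (v1 + v2) ** 2 / (v1**2 / (len(treated) - 1) + v2**2 / (len(control) - 1))
t_crit = stats.t.ppf(0.975, df)

print(f"difference = {diff:.1f} mmHg, "
      f"95% CI = ({diff - t_crit * se:.1f}, {diff + t_crit * se:.1f})")
```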

Dr Gavin Nimon:

We might come back to that example in a second when we start talking about the different tests we use, and whether you'd use a parametric test and which test you'd choose in that example. If we move on a bit further though, what are the common pitfalls in interpreting statistical data in medical studies, and how can they be avoided?

Dr Adam Badenoch:

I would say that the most common pitfall is probably to assume that a study is well conducted and that the conclusions are valid. I find it's much better to assume the opposite and ask the authors to prove you wrong. If they can't do that, I'll just remain dubious.

Dr Gavin Nimon:

Okay, almost like a null hypothesis on the study you're reading.

Dr Adam Badenoch:

Exactly.

Dr Gavin Nimon:

So, what are the key characteristics of a well-designed and robust study that you need to look for?

Dr Adam Badenoch:

So there are a number of factors, and really there's a huge long list of things to look for, because good research is really just a process of doing lots of little things right. If you do all of those little things right, then you'll have done good research. If you do none of them, it's bad research, and there's a huge amount that sits in the middle somewhere, where they've done some things right, or many things right, and some things not so well. But some of the important concepts: the study design is probably the single most important factor. Whilst it's possible to have a well conducted case series or a well conducted observational study, if we assume that studies have all been conducted with a similar level of rigour, then a blinded randomised controlled trial is a much better design than an observational study of any sort; an observational study that makes some attempt to adjust for confounding is better than a case series; and a case series is better than a case report. So that hierarchy of evidence that people are probably familiar with, the evidence-based pyramid, still holds true. Probably the caveat to that is that publication bias is a real phenomenon, and that can certainly influence meta-analysis findings. Often meta-analyses and systematic reviews sit at the top of that pyramid, but sometimes, if a meta-analysis shows a difference between two groups, or that a treatment is effective, it's probably actually better to go and conduct a single, really well designed, robust, large, pragmatic trial to confirm those results, to ensure that they're not due to publication bias. If you've got a systematic review or a meta-analysis which demonstrates no difference between two groups, then we can be pretty confident that hasn't arisen due to publication bias, and you can probably take that result as being a true one.

Dr Gavin Nimon:

I was going to quickly ask, just for the listener: what are a systematic review and a meta-analysis? Can you just explain what they involve?

Dr Adam Badenoch:

Yeah, so a systematic review is simply a systematic search through the published literature, and a meta-analysis is a process which is used to pool results from multiple studies and come up with an average effect. Meta-analyses and systematic reviews often go together; you would need to conduct a systematic review before being able to conduct a meta-analysis. The idea of a meta-analysis is that it's a way of generating a large amount of data to answer a question without necessarily needing to do that within a single new trial. It uses existing results in the medical literature to come up with a pooled estimate. It's particularly helpful if you have multiple small trials, particularly if they have some difference in their results.
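
To make the pooling step concrete, here's a toy sketch of a fixed-effect, inverse-variance meta-analysis in Python; the study effects and standard errors are invented purely for illustration:

```python
# Fixed-effect meta-analysis: weight each study's effect estimate by the
# inverse of its variance, so larger, more precise studies count for more.
import numpy as np

effects  = np.array([-4.0, -6.5, -2.0, -5.0])  # per-study mean BP change, mmHg
std_errs = np.array([2.0, 3.0, 1.5, 2.5])      # each study's standard error

weights = 1.0 / std_errs**2                     # inverse-variance weights
pooled = np.sum(weights * effects) / np.sum(weights)
pooled_se = np.sqrt(1.0 / np.sum(weights))

print(f"pooled effect = {pooled:.2f} mmHg (SE {pooled_se:.2f})")
```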

Dr Gavin Nimon:

And a publication bias would be where lots of studies have been pooled, but the individual studies aren't of great quality and therefore they influence the results?

Dr Adam Badenoch:

Yeah, publication bias is typically the phenomenon whereby studies which show a difference between groups, or show treatment effectiveness, are more likely to be published than those that don't. And when people conduct their systematic review, generally they can only find studies that have been published. So there might be a whole range of studies that people have conducted which represent the true effect of your treatment or intervention, but which have never made it to print and therefore never make it into a systematic review and meta-analysis.

Dr Gavin Nimon:

I'm learning all the time. So what other types of bias do we need to be aware of when trying to design a robust, ideal study?

Dr Adam Badenoch:

Essentially anything you can think of that can go wrong in a study can be a potential source of bias. It depends greatly on the study design and what you're doing. You might be conducting a survey, and there might be ways that you are asking the questions which are a little bit leading; that can introduce bias. If you're measuring an outcome, it might be that the person who's measuring it isn't doing so in a particularly objective way, and if they know which group a participant has been assigned to, we all have inherent biases within us as humans that can come into play whether we mean them to or not. There are other phenomena in trial conduct whereby patients might be excluded from the trial for particular reasons, and that might bias the results. Or it may be that if we're looking at how much a blood pressure pill drops blood pressure by, and the pill which drops the blood pressure the most also kills patients, then the fact that patients drop out of our data set because they die is going to dilute the observed treatment effect from that pill. So there's really no end to the potential sources of bias. They can typically be categorised into some of the more common forms, but it can be anything, really: anything which generates a systematic deviation from the truth.

Dr Gavin Nimon:

Yeah, but in that pill discussion, if you had a red pill versus a blue pill, the blue pill might be more calming and get a slightly better drop in blood pressure because of the placebo effect as well. And that can lead to a treatment bias as well. Is that correct?

Dr Adam Badenoch:

Yes, absolutely.

Dr Gavin Nimon:

Are there any other particular things you'd like to do in a study, apart from excluding bias and making sure it's been assessed and conducted appropriately? Is there anything else that could make a better study overall?

Dr Adam Badenoch:

Yeah, so some of the other more important points are probably to register your study prospectively: publicly announcing what it is you're going to do, who you're going to study, what parameters you're going to collect, what your primary outcome of interest actually is, and how you're going to analyse it. If you can describe all of those things before you start your study, that gives much more faith that it's a valid result and conclusion. The alternative is to not publicly disclose any of those things and simply publish your results and describe your conclusions after the study's conduct. The problem with doing that is the phenomenon of p value hunting, which is a symptom of the publication bias we referred to earlier. Authors know that their studies are more likely to be published if there's a difference between their groups, and often scientists and clinician researchers will conduct studies because they think there is an effect there to be observed. So if people are left to their own devices, some of those inherent human biases can come out, and people can change the outcome they were initially intending to look at, because one of the other outcomes they collected in their study now seems a much more interesting result: there's a difference between the two groups and it now seems quite clinically important, so maybe we'll report this one instead. Or they'll slightly change the definition of their primary outcome and report the newly defined version, because that provides a more statistically significant and interesting result. So registering study protocols and statistical analysis plans prospectively helps to guard against those things.

Dr Gavin Nimon:

And obviously, for the listener, these are just things that can creep into studies that we need to watch out for.

Dr Adam Badenoch:

Yeah, I mean, people do study this from time to time. They'll take random samples of papers from even quite respectable journals and examine the incidence with which some of these things happen. And whilst it's not to say that this happens all the time and everyone does it, it certainly has a significant prevalence in the medical literature across all disciplines.

Dr Gavin Nimon:

And what are the red flags that we need to look for in that scenario? Is it just the things you've already talked about, or is there anything particular that comes to mind?

Dr Adam Badenoch:

By and large, it's the opposite of all the things we've just talked about that are the markers of good research; any time you don't see those, that's a potential marker of bad research. Another thing that can help sometimes, to set your mind at ease that a finding is robust and true, is a sensitivity analysis. Sometimes, despite our best intentions, there are actually multiple valid ways of defining an outcome or analysing a particular variable, and those different definitions and analysis methods can each have their own assumptions and drawbacks. Not that they're necessarily wrong, but there are just some assumptions made as part of the analysis process. So sensitivity analyses are designed to ask: what if we didn't make those assumptions? What if we used the alternative but also valid definition of the primary outcome? What if we analysed it in a different way which is also considered appropriate in our field? Going through that process then verifies whether, if we didn't make these assumptions and did it this alternative way, we end up with the same study conclusion or a different one. If you do a range of sensitivity analyses and you get a different answer each time, that's quite unsettling in terms of being able to have confidence in the result. Whereas if a range of sensitivity analyses all support the same study conclusion, then you can have a lot more confidence that the little assumptions made at each step along the way, in the analysis, the definition of outcomes, and perhaps the choice of the primary outcome as well, are not making or breaking the study, I guess.
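
In that spirit, here's a small illustrative sketch (SciPy assumed, data invented): the same outcome analysed with two defensible tests, checking that the conclusion doesn't hinge on the choice of method.

```python
# A toy sensitivity analysis: parametric and non-parametric tests of the
# same two (made-up) groups. Agreement between them adds confidence.
from scipy import stats

control = [4, 6, 5, 7, 8, 5, 6, 9]   # hypothetical pain scores
treated = [3, 4, 2, 5, 4, 3, 5, 4]

_, p_t = stats.ttest_ind(treated, control)     # parametric analysis
_, p_u = stats.mannwhitneyu(treated, control)  # non-parametric alternative

print(f"t-test p = {p_t:.3f}, Mann-Whitney p = {p_u:.3f}")
# If both approaches support the same conclusion, the finding is more robust.
```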

Dr Gavin Nimon:

A question that's come to mind while we're speaking: are there any tools to help you analyse the research you're reading? Are there any guides at all, particular checklists you need to look at?

Dr Adam Badenoch:

Yeah, there are. There are some really useful guides, checklists and help kits. Probably the best ones to know about, which are the most widely applicable across all study designs and all research, are the CONSORT tools and checklists. CONSORT was a group formed to promote high-quality research, and they have produced a range of guidelines and checklists which are really good to use if you're thinking of designing a study, but also as a reader. Any time you pick up a new study and read it, if you're running a journal club or anything like that, they can be a fantastic resource, and there's a different checklist for different scenarios: there's a guide and a checklist for randomised controlled trials, there's one for meta-analyses, and there's one for observational studies. They're very useful.

Dr Gavin Nimon:

So what sort of are the common tests we need to know or should be aware of?

Dr Adam Badenoch:

I think the most common tests being used at the moment would be Student's t-test for continuous data, and a chi-squared test or Fisher's exact test for categorical data. Those tests are relatively easy to calculate and are very robust in a range of different scenarios, so they've become commonly used tests for good reason, and they can be applied to a large range of scenarios. They're good ones to know about. In terms of how many other tests there are, there's a huge gamut of different tests out there. As for whether or not to consult a statistician: there are some scenarios where, if you feel confident to analyse a data set yourself, that can be a great thing, and there are other times when, if you feel not so confident, involving a statistician is a really good thing to do. There's a range of complexity in data sets and analyses, and where you sit as an individual, and where the study data set and analysis sit on that spectrum, is always going to be a different beast for different projects. I guess the one thing I've learned in my statistical studies over the years is how easy it is to get it wrong sometimes. So I've certainly learned to have a very low threshold for involving a statistician; even being an accredited statistician myself now, I have a very low threshold for involving someone else. Not necessarily for them to do the whole analysis: in some cases where it's outside my scope I'll ask someone to do the whole thing for me, and in other scenarios I might just ask for some supervision of what I'm doing. So I think it's whatever you feel comfortable with, and even for a given individual that will vary from project to project, because different projects involve different levels of complexity. But I'd always err on the side of involving a statistician if you're not sure, and I'd always err on the side of involving them earlier rather than later in the project; ideally, that involvement should come in the initial design phase of any study.
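
For reference, here's a minimal sketch of those two workhorse tests in Python (SciPy assumed; all numbers invented for illustration):

```python
# Student's t-test for a continuous outcome, and chi-squared / Fisher's
# exact tests for a categorical outcome arranged as a 2x2 table.
import numpy as np
from scipy import stats

# Continuous outcome, two groups.
group_a = [5.1, 4.8, 5.6, 5.0, 4.9, 5.3]
group_b = [5.8, 6.1, 5.7, 6.0, 5.9, 6.2]
print(stats.ttest_ind(group_a, group_b))

# Categorical outcome: rows are treated/control, columns are event/no event.
table = np.array([[12, 38],
                  [25, 25]])
print(stats.chi2_contingency(table))  # chi-squared test
print(stats.fisher_exact(table))      # Fisher's exact test (useful for small counts)
```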

Dr Gavin Nimon:

How can healthcare professionals integrate this sort of assessment of studies, and better evidence-based medicine, into their clinical practice effectively? What are your thoughts on this?

Dr Adam Badenoch:

That's a tricky one. Changing clinical practice is difficult and definitely not my area of expertise. I think understanding the evidence is the first step, and that's certainly where I am more comfortable. From there, there are many other organisational and interpersonal elements that start to become increasingly important. How do you make that transition? I guess starting to talk to your colleagues about this new evidence that you've read is probably the first good step. You'll get a little bit of feedback from them, and if you both agree, hey, this is a promising new revelation, maybe we should look at changing our practice, at least then you've got a friend alongside you. That's probably as far as I would take it in terms of my level of expertise. After that, there are lots of other people out there, lots of my colleagues, who are much better versed in the ideas of quality improvement programs and organisational change. The role of managers comes into that; it starts to become a managerial domain as well. Depending on what the change is that you're trying to implement, it can be quite complicated and difficult, and whether or not there's evidence to support efficacy in terms of patient outcomes, or cost effectiveness in terms of healthcare dollar savings, is almost one small part of a much more complex piece of machinery to actually make that change in clinical practice, I'd say. I'm also aware that there's a science behind changing clinical practice as well, but it's not really my area of expertise.

Dr Gavin Nimon:

Well, what we've really summarised is that statistics are important for analysing results. But unless you do a properly conducted study, assessing what biases there might be, outlining what you're planning to assess along the way, and using appropriate statistics to analyse it, then you can't really be confident that the results you're showing are definitely what is actually happening in real life. You've also got to assess studies that you're reading in a similar fashion, and use tools such as the CONSORT checklists you've mentioned to assess them. And also be aware that, when you do assess a result, it can be useful to involve a team to help critically assess it before applying it in clinical practice. Would that be a good summary of what we've been discussing today?

Dr Adam Badenoch:

Yeah, I think that's a fantastic summary. Well done.

Dr Gavin Nimon:

I always like to finish off with the use of artificial intelligence. What about the role of artificial intelligence in helping analyse data? Or is that going to have a greater risk of introducing biases along the way?

Dr Adam Badenoch:

Oh look, I think you've hit the nail on the head. It has huge potential, but there's also the potential for major biases and problems. I think artificial intelligence covers a range of different concepts; the one which is probably most relevant to statistics is machine learning. Machine learning is a way of analysing data based on some automated, pre-specified rules, but usually it's a multi-layered process, or a process which feeds back on itself numerous times. Each layer, or each time it conducts this process, it's usually doing some sort of relatively simple or straightforward analysis technique which is not new, something that already exists in the world, for example something like a logistic regression model. If you have a binary outcome, a machine learning model might use something like a random forest, which combines a series of decision trees to decide a final binary outcome. And it does that without necessarily disclosing a lot of the information. Maybe one of the things we could have talked about earlier that now becomes relevant to assessing AI is that it's always good to see the diagnostics of any of the analyses that have been done in studies. Ideally, people can present their study data set in a de-identified form and present the code that they used to analyse it. Some techniques, such as regression models, can often be tested to see how well the model actually fits your observed data, so those model diagnostics can be really good to present as well. So there are some things that are good to look for in good research, and the absence of those things can sometimes be a flag for bad research. With AI it just takes that to the next level, because it's doing so much computational work. It requires a lot of code, and there's a lot of output from the code, because it's running multiple models again and again in this iterative process. But really, the only way to be sure that they haven't done something silly is to look at all of that code and all of that output. We don't necessarily expect that everyone will be able to understand that code and that output, but some people can. And I think the one thing we would ask is that those things are published each and every time someone uses machine learning as part of an analysis method for a study: simply publish the code, ideally the data set, and the output, so that it can be verified.
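
As a hedged sketch of the kind of analysis being described (scikit-learn assumed; the data are synthetic), here's a logistic regression and a random forest fitted to the same binary outcome, with a simple diagnostic printed so the models can be scrutinised:

```python
# Fit two common binary-outcome models and report a held-out diagnostic.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for model in (LogisticRegression(max_iter=1000),
              RandomForestClassifier(random_state=0)):
    model.fit(X_train, y_train)
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"{type(model).__name__}: test AUC = {auc:.3f}")
```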

Dr Gavin Nimon:

Basically, without knowing all the intricacies of what it is actually computing, you really can't be sure that the numbers that come out are correct.

Dr Adam Badenoch:

No, that's right. There are some scenarios where it can be really useful, too. I think in something like image recognition, the pattern recognition aspect of it, refined through an iterative process, can be extremely useful. But contrast that against something that has a numerical outcome. For example, if we come back to our blood pressure scenario: usually in a blood pressure study, the outcome is a number which can be calculated with a relatively simple formula. You'll see some people occasionally applying something like a machine learning model to calculate a blood pressure study outcome, and that's a huge waste of computational power. It uses an iterative process which starts with essentially a random guess as to what the number might be and gradually narrows it down based on your observed data set, when you could actually just calculate what that value should be through a relatively simple formula. So sometimes it gets a little bit misused as well. It's become a bit of a sexy term that people like to apply to whatever it is they're doing, irrespective of whether it's actually the most elegant or accurate way of deriving the answer.

Dr Gavin Nimon:

Well, finishing off with that blood pressure model, would you use a chi-squared test or Student's t-test to analyse the differences in blood pressure after treatment? Or what would you tend to use in that scenario?

Dr Adam Badenoch:

Yeah, so blood pressure is a numerical outcome which theoretically could take on any value in a continuous range, not just integers. So it's a good example of a continuous variable, and we know that it's relatively normally distributed in most populations. A Student's t-test or a linear regression model are two good ways to analyse it, and they should give you essentially the same answer.
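
That equivalence is easy to check in code. Here's a minimal sketch (SciPy and statsmodels assumed, data invented): a two-group t-test and a simple linear regression with a group indicator return the same p value.

```python
# Student's t-test vs linear regression with a 0/1 group indicator:
# the p value for the group effect is identical.
import numpy as np
import statsmodels.api as sm
from scipy import stats

control = np.array([138., 142., 135., 150., 147., 139., 144., 141.])
treated = np.array([130., 128., 136., 125., 133., 129., 131., 127.])

print(stats.ttest_ind(treated, control).pvalue)   # classic (equal-variance) t-test

y = np.concatenate([control, treated])
group = np.concatenate([np.zeros(len(control)), np.ones(len(treated))])
model = sm.OLS(y, sm.add_constant(group)).fit()
print(model.pvalues[1])                            # same p value for the group term
```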

Dr Gavin Nimon:

Well, look, Adam, thank you very much for coming on Aussie Med Ed and enlightening us on statistics and their use in medicine. Thank you very much.

Dr Adam Badenoch:

No worries Gavin, thanks for having me on. It's been great to be a part of your show.

Dr Gavin Nimon:

That's brilliant, thank you once again. I'd like to remind you that all the information presented today is just one opinion and that there are numerous ways of treating all medical conditions. It's just general advice and may vary depending upon the region in which you are practising or being treated. The information may not be appropriate for your situation or health condition, and you should always seek the advice of your health professionals in the area in which you live. Also, if you have any concerns about the information raised today, please speak to your GP or seek assistance from health organisations such as Lifeline in Australia. Thanks again for listening to the podcast, and please subscribe for the next episode. Until then, please stay safe. I'd like to let you know that Aussie Med Ed is sponsored by Avant Medical Indemnity Insurance. They tell me they offer holistic support to help the doctor practise safely, and believe they have extensive cover that's continually evolving to meet your needs in the ever-changing regulatory environment. They have a specialist medical indemnity team located here in Australia and access to medico-legal experts 24/7 in emergencies.
