Drawing the wrong conclusion about subgroups: a comparison of Bayes and frequentist methods

In the previous post, I simulated data from a hypothetical RCT that had heterogeneous treatment effects across subgroups defined by three covariates. I presented two Bayesian models, a strongly pooled model and an unpooled version, that could be used to estimate all the subgroup effects in a single model. I compared the estimates to a set of linear regression models that were estimated for each subgroup separately. My goal in doing these comparisons is to see how often we might draw the wrong conclusion about subgroup effects when we conduct these types of analyses. [Read More]

Subgroup analysis using a Bayesian hierarchical model

I’m part of a team that recently submitted the results of a randomized clinical trial for publication in a journal. The overall findings of the study were inconclusive, and we certainly didn’t try to hide that fact in our paper. Of course, the story was a bit more complicated, as the RCT was conducted during various phases of the COVID-19 pandemic; the context in which the therapeutic treatment was provided changed over time. [Read More]

Posterior probability checking with rvars: a quick follow-up

This is a relatively brief addendum to last week’s post, where I described how the rvar datatype implemented in the R package posterior makes it quite easy to perform posterior probability checks to assess goodness of fit. In the initial post, I generated data from a linear model and estimated parameters for a linear regression model, and, unsurprisingly, the model fit the data quite well. When I introduced a quadratic term into the data generating process and fit the same linear model (without a quadratic term), equally unsurprising, the model wasn’t a great fit. [Read More]

Fitting your model is only the beginning: Bayesian posterior probability checks with rvars

Say we’ve collected data and estimated parameters of a model that give structure to the data. An important question to ask is whether the model is a reasonable approximation of the true underlying data generating process. If we did a good job, we should be able to turn around and generate data from the model itself that looks similar to the data we started with. And if we didn’t do such a great job, the newly generated data will diverge from the original. [Read More]

Estimating a risk difference (and confidence intervals) using logistic regression

The odds ratio (OR) – the effect size parameter estimated in logistic regression – is notoriously difficult to interpret. It is a ratio of two quantities (odds, under different conditions) that are themselves ratios of probabilities. I think it is pretty clear that a very large or small OR implies a strong treatment effect, but translating that effect into a clinical context can be challenging, particularly since ORs cannot be mapped to unique probabilities. [Read More]

Sample size determination in the context of Bayesian analysis

Given my recent involvement with the design of a somewhat complex trial centered around a Bayesian data analysis, I am appreciating more and more that Bayesian approaches are a very real option for clinical trial design. A key element of any study design is sample size. While some would argue that sample size considerations are not critical to the Bayesian design (since Bayesian inference is agnostic to any pre-specified sample size and is not really affected by how frequently you look at the data along the way), it might be a bit of a challenge to submit a grant without telling the potential funders how many subjects you plan on recruiting (since that could have a rather big effect on the level of resources - financial and time - required. [Read More]

Generating random lists of names with errors to explore fuzzy word matching

Health data systems are not always perfect, a point that was made quite obvious when a study I am involved with required a matched list of nursing home residents taken from one system with set results from PCR tests for COVID-19 drawn from another. Name spellings for the same person from the second list were not always consistent across different PCR tests, nor were they always consistent with the cohort we were interested in studying defined by the first list. [Read More]

The case of three MAR mechanisms: when is multiple imputation mandatory?

I thought I’d written about this before, but I searched through my posts and I couldn’t find what I was looking for. If I am repeating myself, my apologies. I explored missing data two years ago, using directed acyclic graphs (DAGs) to help understand the various missing data mechanisms (MAR, MCAR, and MNAR). The DAGs provide insight into when it is appropriate to use observed data to get unbiased estimates of population quantities even though some of the observations are missing information. [Read More]

Framework for power analysis using simulation

The simstudy package started as a collection of functions I developed as I found myself repeating many of the same types of simulations for different projects. It was a way of organizing my work that I decided to share with others in case they wanted a routine way to generate data as well. simstudy has expanded a bit from that, but replicability is still a key motivation. What I have here is another attempt to document and organize a process that I find myself doing quite often - repeated data generation and model fitting. [Read More]

Randomization tests make fewer assumptions and seem pretty intuitive

I’m preparing a lecture on simulation for a statistical modeling class, and I plan on describing a couple of cases where simulation is intrinsic to the analytic method rather than as a tool for exploration and planning. MCMC methods used for Bayesian estimation, bootstrapping, and randomization tests all come to mind. Randomization tests are particularly interesting as an approach to conducting hypothesis tests, because they allow us to avoid making unrealistic assumptions. [Read More]