Department of Biostatistics
Quantitative Issues in Cancer Research Working Seminar
2023 - 2024
ABSTRACT: Semi-competing risks refers to the setting where interest lies in some non-terminal event, the occurrence of which is subject to some terminal event (usually death). While existing analysis methods generally assume complete data on all relevant covariates, it is often the case, particularly with electronic health record databases and/or disease registries, that some information is not readily available. To mitigate this, outcome-dependent sampling is a common strategy, especially in resource-limited settings, although researchers currently have only limited options in the semi-competing risks setting. We present a novel class of case-cohort designs for semi-competing risks within which researchers have the flexibility to tailor the allocation of resources in a variety of ways that best suit the disease context and study at hand. For estimation and inference, we propose to use inverse-probability weighting for a parametric hazard regression-based frailty illness-death model. We present asymptotic results, along with a practical estimator of the asymptotic variance. Simulation results are presented that verify the performance of the proposed analysis methods in finite samples and illustrate potential efficiency gains associated with the design. The work is motivated by and illustrated with data from the Center for International Blood & Marrow Transplant Research.
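As a rough illustration of the inverse-probability weighting idea mentioned in the abstract above, the following sketch simulates a much simpler setting than the one in the talk: a single binary event, a single binary covariate, and a sample that keeps every case but only a fraction of non-cases. It is not the proposed semi-competing risks design or the frailty illness-death model. All names (`simulate_cohort`, `ipw_mean`, `naive_mean`) and all numeric settings are assumptions made for this example only.

```python
import random


def simulate_cohort(n, seed, noncase_sampling_prob=0.2):
    """Simulate a toy cohort and an outcome-dependent subsample.

    Hypothetical settings: covariate x is Bernoulli(0.3); the event
    occurs with probability 0.4 when x == 1 and 0.1 otherwise. Every
    case is sampled; non-cases are sampled with a known probability.
    """
    rng = random.Random(seed)
    cohort = []
    for _ in range(n):
        x = 1 if rng.random() < 0.3 else 0
        event = rng.random() < (0.4 if x == 1 else 0.1)
        prob = 1.0 if event else noncase_sampling_prob
        sampled = rng.random() < prob
        cohort.append({"x": x, "event": event, "prob": prob, "sampled": sampled})
    return cohort


def naive_mean(cohort):
    """Unweighted mean of x among sampled subjects (ignores the design)."""
    xs = [row["x"] for row in cohort if row["sampled"]]
    return sum(xs) / len(xs)


def ipw_mean(cohort):
    """Weighted mean of x, weighting each sampled subject by 1 / prob."""
    num = 0.0
    den = 0.0
    for row in cohort:
        if row["sampled"]:
            weight = 1.0 / row["prob"]
            num += weight * row["x"]
            den += weight
    return num / den


cohort = simulate_cohort(n=20000, seed=0)
full_mean = sum(row["x"] for row in cohort) / len(cohort)
print("full-cohort mean of x:", round(full_mean, 3))
print("naive subsample mean: ", round(naive_mean(cohort), 3))
print("IPW subsample mean:   ", round(ipw_mean(cohort), 3))
```

Because cases are over-represented in the subsample and the covariate is associated with the event, the unweighted mean is pulled away from the full-cohort value, while weighting each sampled subject by the inverse of its known sampling probability approximately recovers it.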
ABSTRACT: In recent years, target trial emulation has emerged as a framework for helping researchers mitigate or avoid potential biases in observational studies. In simplest terms, target trial emulation requires researchers to specify the protocol for an ideal clinical trial they would run if possible, and subsequently establish an analogous version of the protocol for the observational study that adheres as closely as possible to that of the target trial.
A critical component of the target trial emulation framework is specifying the eligibility criteria for inclusion in the study. Electronic health record (EHR) databases serve as a useful data source for observational analyses. However, because EHR data are collected for billing purposes rather than to answer any particular clinical question, information that is useful for statistical analyses may be unavailable. In particular, when using EHR databases to emulate target trials, it is frequently the case that subjects' eligibility status cannot be ascertained due to missing data in the covariates that comprise the inclusion criteria for the study. Nearly every observational analysis under the target trial emulation framework excludes all subjects with missing eligibility data, yet this could plausibly introduce selection bias, particularly when a sequence of emulated trials is pooled to increase power.
In this talk, I will outline ongoing work on building infrastructure for several simulation studies to better understand the settings in which excluding subjects with missing eligibility data is problematic, and to explore potential solutions. These simulation studies are motivated by the study of the long-term effects of bariatric surgery, and the simulation settings are informed by prior analyses conducted in EHR-based studies of the DURABLE cohort at Kaiser Permanente.
ABSTRACT: Adverse health effects of coal-fired power plant emissions are often studied under bipartite network interference (BNI) settings, in which the treated units are different from the units on which outcomes are observed, and treated units can affect multiple outcome units. There is a growing literature on causal effect estimation under BNI, but to our knowledge, no prior work has considered the problem of optimal treatment regimes under BNI. We introduce Q-learning and A-learning approaches for determining cost-constrained treatment under arbitrary BNI, and derive the asymptotic properties of our proposed estimators. We demonstrate the efficacy of our methods in a simulation study.
ABSTRACT: Recently developed technologies can measure gene expression at single-cell resolution while simultaneously preserving the spatial location of samples. Standard dimension reduction techniques such as principal component analysis (PCA) can be applied to find a small set of genes that contribute biologically relevant variation. However, standard approaches do not model the count nature of the data, which can lead to spurious results. Moreover, the resulting PCA factors may not be spatially coherent, in the sense that nearby cells could have very different factor scores. In this talk, I will discuss preliminary work on adding spatial penalties to a Poisson-based model for dimension reduction of single-cell gene expression data (scGBM). We will demonstrate the ability of our method to produce spatially coherent factors on both real and simulated data.
ABSTRACT: SARS-CoV-2 has now become a constant in our daily lives. To mitigate the severe outcomes of infection, the scientific community updates vaccines targeting both ancestral and newer, more prevalent strains. On October 12, 2022, the FDA approved the administration of a bivalent COVID-19 vaccine targeting Omicron strain infections. We intend to evaluate the bivalent vaccine's effectiveness by analyzing cases, hospitalizations, and deaths accumulated in Puerto Rico. In particular, we will compare improvements in effectiveness with respect to different groups: the unvaccinated, those who received the primary series, and those who received the primary series followed by a booster shot. Preliminary results suggest that bivalent vaccine effectiveness significantly declines within 6 months after administration, consistent with the decline in vaccine effectiveness associated with the primary series.
Last Update: October 4, 2023