Biostat 629
Preface
1 Mixed Models for Longitudinal Data Analysis
1.1 Methods for Analyzing Longitudinal Data
1.2 Mixed Models for Continuous Outcomes
1.3 Advantages of using random effects
1.3.1 Within-subject correlation
1.3.2 Inference about Heterogeneity - Variance of Random Effects
1.3.3 Best Linear Unbiased Prediction
1.4 Generalized linear mixed models (GLMMs)
1.4.1 GLMMs with Binary Outcomes
1.4.2 GLMMs with Count Outcomes
1.5 Fitting Linear Mixed Models (LMMs) and Generalized Linear Mixed Models (GLMMs) in R
1.5.1 Fitting LMMs with the sleepstudy data
1.5.2 Model Comparison of LMMs using anova
1.5.3 Extracting BLUPs in lme4
1.5.4 Fitting Binary GLMMs using the Ohio data
1.6 Exercises
1.6.1 Questions
2 Missing Data and Multiple Imputation
2.1 Missing Data in R and “Direct Approaches” for Handling Missing Data
2.1.1 Complete Case Analysis (Listwise Deletion)
2.1.2 Other “Direct” Methods
2.2 Multiple Imputation
2.2.1 Short Overview of Multiple Imputation
2.2.2 Multiple imputation with mice
2.2.3 Categorical Variables in MICE
2.3 What is MICE doing?
2.4 Longitudinal Data
2.5 Different Missing Data Mechanisms
2.5.1 Missing Completely at Random (MCAR)
2.5.2 Missing at Random (MAR)
2.5.3 Missing not at Random (MNAR)
3 Nonparametric Regression with Longitudinal Data
3.1 Notation
3.2 Kernel Smoothing
3.2.1 Description of Kernel Regression
3.2.2 Kernel Regression in the sleepstudy data
3.2.3 Bandwidth Selection
3.2.4 Another Example: The Bone Data
3.3 Regression Splines
3.3.1 Overview
3.3.2 Regression Splines with Longitudinal Data in R
3.3.3 Looking at a Continuous and a Binary Covariate
3.3.4 Model Comparison
3.3.5 ACTG trial example
3.3.6 Treatment Comparisons for the ACTG trial
4 Sparse Regression for Longitudinal Data
4.1 Sparse regression methods
4.2 The Lasso with longitudinal data
4.3 Lasso for LMMs and GLMMs in R
4.3.1 Soccer Data
4.3.2 Choosing the tuning parameter for the soccer data
4.4 Cross-Validation for Longitudinal Data
4.5 Penalized Generalized Estimating Equations
4.5.1 The PGEE package
4.6 GLMM-Lasso with Binary Outcomes
5 Risk Prediction and Validation (Part I)
5.1 Risk Prediction/Stratification
5.2 Area under the ROC curve and the C-statistic
5.2.1 Sensitivity and Specificity
5.2.2 The ROC curve
5.2.3 Computing the ROC curve
5.3 Area under the ROC curve
5.3.1 Rewriting the formula for the AUC
5.3.2 Interpreting the AUC
5.3.3 Computing the AUC in R
5.4 Calibration
5.5 Longitudinal Data and Risk Score Validation
6 Risk Prediction and Validation (Part II)
6.1 The Brier Score
6.1.1 Brier scores for biopsy data
6.1.2 Out-of-sample comparisons
6.2 Brier Scores with Longitudinal Data
6.2.1 Option 1
6.2.2 Option 2
7 Ordinal Regression
7.1 Ordinal Logistic Regression
7.2 Ordinal Regression Details
7.2.1 Ordinal Logistic Regression in R
7.2.2 The respdis data
7.3 Generalized Estimating Equations
7.3.1 Using geepack and ordgee
7.4 Penalized Regression with Ordinal Outcomes
8 Variable Importance Measures
8.1 Partial Dependence Plots
8.2 Uncertainty in Variable Importance Measures
8.2.1 Subsampling for Random Forest VIMP scores
8.2.2 Stability Selection for Penalized Regression
9 Conformal Prediction
9.1 Confidence Intervals and Prediction Intervals
9.2 Conformal Inference Procedure for Prediction Intervals
9.3 Why does this work?
9.4 An example with Boosting.
Notes for Case Studies in Health Big Data
Nicholas Henderson
2024-04-18
Preface
This document contains some of the course notes for Biostatistics 629.