1 Dentists, Adult Smoking, and Excessive Drinking

A Cross-sectional Study from CHR 2018 - 2024

Author

Sarah Albalawi

Published

2025-10-28

R Setup

Code

knitr::opts_chunk$set(comment = NA) 

library(janitor)
library(dplyr)
library(readr)
library(magrittr)
library(Hmisc)
library(knitr)
library(naniar)
library(easystats)
library(tableone)
library(mosaic)
library(gt)
library(Epi)
library(Matching)
library(arm)
library(cobalt)
library(broom)
library(survival)
library(ggplot2)
library(mice)
library(patchwork)
library(twang)
library(car)
library(rbounds)
library(tidyverse)

theme_set(theme_bw())

Loading CHR Data

We will load two data sets from the County Health Ranking source: CHR from year 2024 and CHR from year 2018.

Code

chr24_raw <- read_csv("analytic_data2018_0.csv", skip = 1, show_col_types = FALSE)

Warning: One or more parsing issues, call `problems()` on your data frame for details,
e.g.:
  dat <- vroom(...)
  problems(dat)

Code

dim(chr24_raw)

[1] 3194  508

Code

chr18_raw <- read_csv("analytic_data2024.csv", skip = 1, show_col_types = FALSE)

dim(chr18_raw)

[1] 3195  770

2 Data Management

2.1 Joining data sets

We will only use the outcomes smoking and excessive drinking from the 2024 CHR data set.

Code

chr24_var <- chr24_raw|> dplyr::select( 
    fipscode,
    v009_rawvalue,
    v049_rawvalue)

The exposure dentist proportion and the covariates will be pulled from the 2018 CHR data set.

Code

chr18_var <- chr18_raw|> dplyr::select( 
    fipscode,
    v002_rawvalue,
    v036_rawvalue,
    v042_rawvalue,
    v133_rawvalue,
    v085_rawvalue,
    v004_rawvalue,
    v088_rawvalue,
    v062_other_data_1,
    v069_rawvalue,
    v023_rawvalue,
    v044_rawvalue, 
    v140_rawvalue,
    v147_rawvalue,
    v060_rawvalue,
    v143_rawvalue,
    v021_rawvalue,
    v141_rawvalue,
    v153_rawvalue,
    v154_rawvalue,
    v052_rawvalue,
    v053_rawvalue, 
    v054_rawvalue, 
    v055_rawvalue,
    v081_rawvalue,
    v056_rawvalue,
    v126_rawvalue,
    v057_rawvalue,
    v058_rawvalue,
    v070_rawvalue,
    v011_rawvalue,
    v063_rawvalue)

Now we will joing the two table to produce our initial working data set chr_1.

Code

chr_1 <- chr18_var |>
  left_join(chr24_var, by= "fipscode")

dim(chr_1)

[1] 3195   34

Our joint data set has 3,195 observations (counties) and 34 variables (health measures).

The Codebook

The data file includes 3,195 observations (counties), on 34 variables (health measures).

Variable	Description	Notes	Year
`fipscode`	5-digit FIPS Code
`v088_rawvalue`	Number of dentists per 1000 population	Binary Exposure: High proportion vs Low proportion	2018
`v009_rawvalue`	Adult Smoking prevalence %	Quantitative Outcome 1	2024
`v049_rawvalue`	Excessive Drinking prevalence %	Quantitative outcome 2	2024
`v052_rawvalue`	% Below 18 Years of Age proportion	Covariate	2018
`v053_rawvalue`	% 65 and Older proportion	Covariate	2018
`v054_rawvalue`	% Non-Hispanic Black proportion	Covariate	2018
`v055_rawvalue`	% American Indian or Alaska Native proportion	Covariate	2018
`v081_rawvalue`	% Asian proportion	Covariate	2018
`v056_rawvalue`	% Hispanic proportion	Covariate	2018
`v126_rawvalue`	% Non-Hispanic White proportion	Covariate	2018
`v057_rawvalue`	% Female proportion	Covariate	2018
`v058_rawvalue`	% Rural	Covariate	2018
`v002_rawvalue`	% Poor or Fair Health	Covariate	2018
`v036_rawvalue`	Poor Physical Health Days	Covariate	2018
`v042_rawvalue`	Poor Mental Health Days	Covariate	2018
`v133_rawvalue`	Food Environment Index	Covariate	2018
`v085_rawvalue`	Uninsured population %	Covariate	2018
`v004_rawvalue`	Primary Care Physicians per population	Covariate	2018
`v062_other_data_1`	Ratio of population to mental health providers	Covariate	2018
`v021_rawvalue`	High School Completion %	Covariate	2018
`v069_rawvalue`	Some College %	Covariate	2018
`v023_rawvalue`	Unemployment %	Covariate	2018
`v044_rawvalue`	Income Inequality	Covariate	2018
`v140_rawvalue`	Social Associations	Covariate	2018
`v070_rawvalue`	Physical Inactivity days	Covariate	2018
`v011_rawvalue`	Adult Obesity prevalence %	Covariate	2018
`v060_rawvalue`	Diabetes Prevalence %	Covariate	2018
`v143_rawvalue`	Insufficient Sleep proportion %	Covariate	2018
`v141_rawvalue`	Residential Segregation - Black/White	Covariate	2018
`v147_rawvalue`	Life Expectancy (years)	Covariate	2018
`v063_rawvalue`	Median Household Income	Covariate	2018
`v153_rawvalue`	Home ownership	Covariate	2018
`v154_rawvalue`	Severe Housing Cost Burden	Covariate	2018

Note: The analysis will include one exposure and two outcome variables.

2.2 Renaming the variables

Code

chr_1 <- chr_1 |>
  mutate(dentist_grp = v088_rawvalue*100,
         smoking_prop = v009_rawvalue*100,
         drinking_prop = v049_rawvalue*100)|>
  rename(below_18 = v052_rawvalue,
         above_65 = v053_rawvalue,
         black = v054_rawvalue,
         native = v055_rawvalue,
         asian = v081_rawvalue,
         hispanic = v056_rawvalue,
         white = v126_rawvalue,
         female = v057_rawvalue,
         rural = v058_rawvalue,
         poor_health = v002_rawvalue,
         poor_physical = v036_rawvalue,
         poor_mental = v042_rawvalue,
         food_index = v133_rawvalue,
         uninsured = v085_rawvalue,
         primary_care = v004_rawvalue,
         mental_care = v062_other_data_1,
         highschool_grad  = v021_rawvalue,
         some_college = v069_rawvalue,
         unemployment = v023_rawvalue,
         income_inequality = v044_rawvalue,
         social_score = v140_rawvalue,
         inactivity = v070_rawvalue,
         obesity = v011_rawvalue,
         diabetes = v060_rawvalue, 
         insuff_sleep = v143_rawvalue,
         segregation = v141_rawvalue, 
         life_expectancy = v147_rawvalue,
         household_income = v063_rawvalue, 
         home_ownership = v153_rawvalue,
         housing_burden = v154_rawvalue )|>
  dplyr::select(-v088_rawvalue, -v009_rawvalue , -v049_rawvalue)

Checking the new names:

Code

names(chr_1)

 [1] "fipscode"          "poor_health"       "poor_physical"    
 [4] "poor_mental"       "food_index"        "uninsured"        
 [7] "primary_care"      "mental_care"       "some_college"     
[10] "unemployment"      "income_inequality" "social_score"     
[13] "life_expectancy"   "diabetes"          "insuff_sleep"     
[16] "highschool_grad"   "segregation"       "home_ownership"   
[19] "housing_burden"    "below_18"          "above_65"         
[22] "black"             "native"            "asian"            
[25] "hispanic"          "white"             "female"           
[28] "rural"             "inactivity"        "obesity"          
[31] "household_income"  "dentist_grp"       "smoking_prop"     
[34] "drinking_prop"

The new names look reasonable and are free from spelling errors.

2.3 Categorizing Exposure `dentist_grp`

Code

summary(chr_1$dentist_grp)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
0.00000 0.02760 0.04391 0.04867 0.06495 0.81642      87

We will use the first quartile (0.028) and third quartile (0.065) as cut points to dichotomize the dentist_grp variable.

Code

chr_1 <- chr_1 |>
  mutate(dentist_grp = case_when(
    dentist_grp <= 0.02760 ~ "Low",
    dentist_grp >= 0.06495 ~ "High")) |>
  mutate(dentist_grp = factor(dentist_grp))

chr_1 |> count(dentist_grp)

# A tibble: 3 × 2
  dentist_grp     n
  <fct>       <int>
1 High          777
2 Low           777
3 <NA>         1641

Code

chr_1 <- chr_1 |>
  mutate(dentist_n = as.numeric(dentist_grp),
         dentist_n = dentist_n - 1)

chr_1 |> count(dentist_n)

# A tibble: 3 × 2
  dentist_n     n
      <dbl> <int>
1         0   777
2         1   777
3        NA  1641

We have 777 counties in each group.

2.4 Numeric Summaries Before Imputation

Code

describe(chr_1)

chr_1 

 35  Variables      3195  Observations
--------------------------------------------------------------------------------
fipscode 
       n  missing distinct 
    3195        0     3195 

lowest : 00000 01000 01001 01003 01005, highest: 56037 56039 56041 56043 56045
--------------------------------------------------------------------------------
poor_health 
       n  missing distinct     Info     Mean  pMedian      Gmd      .05 
    3195        0      237        1   0.1768    0.174  0.05052   0.1160 
     .10      .25      .50      .75      .90      .95 
  0.1254   0.1420   0.1680   0.2070   0.2380   0.2580 

lowest : 0.084 0.087 0.089 0.091 0.093, highest: 0.351 0.359 0.364 0.375 0.38 
--------------------------------------------------------------------------------
poor_physical 
       n  missing distinct     Info     Mean  pMedian      Gmd      .05 
    3195        0     3194        1    3.888    3.871    0.734    2.915 
     .10      .25      .50      .75      .90      .95 
   3.065    3.385    3.866    4.333    4.739    4.977 

lowest : 2.26179 2.33164 2.39068 2.3946  2.41979
highest: 6.08336 6.0868  6.1734  6.44954 6.83683
--------------------------------------------------------------------------------
poor_mental 
       n  missing distinct     Info     Mean  pMedian      Gmd      .05 
    3195        0     3194        1    5.212    5.219   0.6926    4.122 
     .10      .25      .50      .75      .90      .95 
   4.410    4.814    5.229    5.626    5.990    6.194 

lowest : 3.18554 3.21723 3.53009 3.54504 3.54712
highest: 7.07808 7.15042 7.17393 7.30487 7.38881
--------------------------------------------------------------------------------
food_index 
       n  missing distinct     Info     Mean  pMedian      Gmd      .05 
    3160       35       82    0.999    7.543     7.65    1.291      5.3 
     .10      .25      .50      .75      .90      .95 
     6.1      6.9      7.7      8.4      8.9      9.2 

lowest : 0   0.9 1.3 1.4 1.6, highest: 9.6 9.7 9.8 9.9 10 
--------------------------------------------------------------------------------
uninsured 
       n  missing distinct     Info     Mean  pMedian      Gmd      .05 
    3194        1     3190        1   0.1149   0.1099  0.05717  0.05144 
     .10      .25      .50      .75      .90      .95 
 0.05934  0.07462  0.10340  0.14548  0.18739  0.21667 

lowest : 0.0238974 0.0241492 0.024785  0.0268813 0.0279943
highest: 0.323534  0.324548  0.327475  0.387124  0.463235 
--------------------------------------------------------------------------------
primary_care 
        n   missing  distinct      Info      Mean   pMedian       Gmd       .05 
     3038       157      2942         1 0.0005389 0.0005031   0.00037 0.0001039 
      .10       .25       .50       .75       .90       .95 
0.0001689 0.0002974 0.0004740 0.0007176 0.0009665 0.0011567 

lowest : 0           3.43867e-05 3.93043e-05 3.98311e-05 4.38462e-05
highest: 0.00284638  0.00289715  0.00296695  0.00507614  0.00580527 
--------------------------------------------------------------------------------
mental_care 
       n  missing distinct     Info     Mean  pMedian      Gmd      .05 
    3010      185     3001        1     1344    907.7     1536    162.8 
     .10      .25      .50      .75      .90      .95 
   220.2    368.0    692.9   1547.6   2999.5   4792.8 

lowest :  -931  -919  -884  -865  -849, highest: 15654 17550 19935 22593 23044
--------------------------------------------------------------------------------
some_college 
       n  missing distinct     Info     Mean  pMedian      Gmd      .05 
    3195        0     3185        1   0.5902   0.5918   0.1336   0.3936 
     .10      .25      .50      .75      .90      .95 
  0.4370   0.5110   0.5921   0.6753   0.7368   0.7746 

lowest : 0.117647 0.122152 0.208128 0.228007 0.236698
highest: 0.925572 0.926316 0.930521 0.992308 1       
--------------------------------------------------------------------------------
unemployment 
       n  missing distinct     Info     Mean  pMedian      Gmd      .05 
    3194        1     3181        1  0.03592  0.03482  0.01286  0.02035 
     .10      .25      .50      .75      .90      .95 
 0.02277  0.02730  0.03415  0.04201  0.05064  0.05705 

lowest : 0.00572082 0.00859107 0.0131761  0.0133748  0.0134731 
highest: 0.116036   0.125495   0.129104   0.130016   0.146677  
--------------------------------------------------------------------------------
income_inequality 
       n  missing distinct     Info     Mean  pMedian      Gmd      .05 
    3180       15     3179        1    4.553    4.481   0.8517    3.527 
     .10      .25      .50      .75      .90      .95 
   3.706    4.019    4.436    4.945    5.544    6.009 

lowest : 1.65164 2.26575 2.41043 2.53293 2.57027
highest: 8.82769 9.19356 9.30925 9.51399 10.5323
--------------------------------------------------------------------------------
social_score 
       n  missing distinct     Info     Mean  pMedian      Gmd      .05 
    3195        0     3013        1    11.28    10.94    6.154    2.174 
     .10      .25      .50      .75      .90      .95 
   5.182    7.892   10.767   14.073   17.901   21.240 

lowest : 0        0.556783 0.931619 1.75186  1.84139 
highest: 41.0034  41.9162  43.4783  48.2245  59.4354 
--------------------------------------------------------------------------------
life_expectancy 
       n  missing distinct     Info     Mean  pMedian      Gmd      .05 
    3122       73     3121        1    75.78    75.82    3.768    70.54 
     .10      .25      .50      .75      .90      .95 
   71.63    73.61    75.88    78.02    79.81    81.00 

lowest : 55.919  60.1049 61.1734 61.3558 61.3605
highest: 89.3706 91.1526 92.7249 93.0018 98.9029
--------------------------------------------------------------------------------
diabetes 
       n  missing distinct     Info     Mean  pMedian      Gmd      .05 
    3195        0      138        1   0.1056   0.1035  0.02444    0.076 
     .10      .25      .50      .75      .90      .95 
   0.081    0.089    0.102    0.118    0.134    0.147 

lowest : 0.056 0.058 0.06  0.062 0.063, highest: 0.203 0.204 0.205 0.216 0.222
--------------------------------------------------------------------------------
insuff_sleep 
       n  missing distinct     Info     Mean  pMedian      Gmd      .05 
    3192        3      256        1   0.3445    0.344  0.04103   0.2860 
     .10      .25      .50      .75      .90      .95 
  0.2980   0.3190   0.3440   0.3680   0.3929   0.4070 

lowest : 0.238 0.24  0.247 0.248 0.249, highest: 0.46  0.463 0.465 0.469 0.484
--------------------------------------------------------------------------------
highschool_grad 
       n  missing distinct     Info     Mean  pMedian      Gmd      .05 
    2517      678     1637        1   0.8834   0.8905  0.06992   0.7637 
     .10      .25      .50      .75      .90      .95 
  0.8003   0.8538   0.8942   0.9258   0.9550   0.9706 

lowest : 0.175    0.221452 0.336957 0.425    0.475   
highest: 0.979681 0.98     0.981393 0.989176 0.995   
--------------------------------------------------------------------------------
segregation 
       n  missing distinct     Info     Mean  pMedian      Gmd      .05 
    2128     1067     2126        1    49.61    49.95    19.09    20.70 
     .10      .25      .50      .75      .90      .95 
   26.69    38.31    50.88    61.29    71.35    76.25 

lowest : 1.05295 1.06989 1.59135 1.67271 1.96799
highest: 88.7365 88.9926 89.8186 89.9605 93.7638
--------------------------------------------------------------------------------
home_ownership 
       n  missing distinct     Info     Mean  pMedian      Gmd      .05 
    3195        0     3191        1   0.7261   0.7333  0.09183   0.5686 
     .10      .25      .50      .75      .90      .95 
  0.6194   0.6826   0.7388   0.7839   0.8194   0.8371 

lowest : 0        0.157895 0.199638 0.243232 0.276574
highest: 0.912844 0.923458 0.933333 0.933466 0.969858
--------------------------------------------------------------------------------
housing_burden 
       n  missing distinct     Info     Mean  pMedian      Gmd      .05 
    3190        5     3179        1   0.1095   0.1072   0.0387  0.06053 
     .10      .25      .50      .75      .90      .95 
 0.07002  0.08552  0.10478  0.12855  0.15675  0.17516 

lowest : 0.00595238 0.00878477 0.00996678 0.0159657  0.0185874 
highest: 0.25323    0.255763   0.2584     0.281928   0.311692  
--------------------------------------------------------------------------------
below_18 
       n  missing distinct     Info     Mean  pMedian      Gmd      .05 
    3195        0     3193        1   0.2166   0.2164  0.03757   0.1615 
     .10      .25      .50      .75      .90      .95 
  0.1746   0.1966   0.2173   0.2353   0.2537   0.2696 

lowest : 0.0243902 0.0354663 0.0493957 0.0570924 0.0711527
highest: 0.369018  0.375724  0.381907  0.414208  0.420995 
--------------------------------------------------------------------------------
above_65 
       n  missing distinct     Info     Mean  pMedian      Gmd      .05 
    3195        0     3189        1   0.2046    0.202  0.05296   0.1324 
     .10      .25      .50      .75      .90      .95 
  0.1483   0.1734   0.2008   0.2304   0.2665   0.2920 

lowest : 0.0544279 0.0629911 0.0684948 0.0737527 0.0755564
highest: 0.414013  0.417121  0.424593  0.435067  0.574546 
--------------------------------------------------------------------------------
black 
       n  missing distinct     Info     Mean  pMedian      Gmd      .05 
    3195        0     3187        1  0.09037  0.04868   0.1235 0.004405 
     .10      .25      .50      .75      .90      .95 
0.005453 0.008516 0.023844 0.103048 0.292486 0.410786 

lowest : 0           0.000646134 0.000772201 0.000950119 0.000986193
highest: 0.787643    0.790084    0.821227    0.845068    0.851448   
--------------------------------------------------------------------------------
native 
       n  missing distinct     Info     Mean  pMedian      Gmd      .05 
    3195        0     3190        1  0.02449 0.009342  0.03517 0.002600 
     .10      .25      .50      .75      .90      .95 
0.003159 0.004443 0.007416 0.014790 0.036667 0.084503 

lowest : 0           0.000555247 0.000785376 0.00115182  0.0011554  
highest: 0.810833    0.844577    0.848004    0.895259    0.921116   
--------------------------------------------------------------------------------
asian 
       n  missing distinct     Info     Mean  pMedian      Gmd      .05 
    3195        0     3184        1  0.01755  0.01038     0.02 0.002817 
     .10      .25      .50      .75      .90      .95 
0.003649 0.005451 0.008366 0.016349 0.036610 0.061040 

lowest : 0           0.000563063 0.00061757  0.000723589 0.000817184
highest: 0.372693    0.385787    0.414258    0.425768    0.463802   
--------------------------------------------------------------------------------
hispanic 
       n  missing distinct     Info     Mean  pMedian      Gmd      .05 
    3195        0     3194        1   0.1045  0.06686   0.1171  0.01596 
     .10      .25      .50      .75      .90      .95 
 0.01953  0.02863  0.05158  0.11195  0.25909  0.40959 

lowest : 0.00662915 0.00732961 0.00762389 0.00768806 0.00772813
highest: 0.936973   0.948533   0.950321   0.952917   0.960641  
--------------------------------------------------------------------------------
white 
       n  missing distinct     Info     Mean  pMedian      Gmd      .05 
    3195        0     3193        1   0.7468   0.7698   0.2173   0.3256 
     .10      .25      .50      .75      .90      .95 
  0.4383   0.6286   0.8172   0.9121   0.9414   0.9520 

lowest : 0.0269389 0.0284045 0.0345667 0.03594   0.0400751
highest: 0.972168  0.972469  0.97272   0.9754    0.975858 
--------------------------------------------------------------------------------
female 
       n  missing distinct     Info     Mean  pMedian      Gmd      .05 
    3195        0     3192        1   0.4952   0.4982  0.02076   0.4596 
     .10      .25      .50      .75      .90      .95 
  0.4747   0.4899   0.4988   0.5065   0.5140   0.5201 

lowest : 0.215686 0.270395 0.287504 0.291292 0.301648
highest: 0.544438 0.546502 0.547111 0.548661 0.577624
--------------------------------------------------------------------------------
rural 
       n  missing distinct     Info     Mean  pMedian      Gmd      .05 
    3195        0     2066    0.959   0.6347    0.639   0.3791  0.05155 
     .10      .25      .50      .75      .90      .95 
 0.13787  0.34769  0.65675  1.00000  1.00000  1.00000 

lowest : 0           6.79046e-07 3.07633e-05 3.99956e-05 4.51164e-05
highest: 0.998927    0.998929    0.999886    0.99994     1          
--------------------------------------------------------------------------------
inactivity 
       n  missing distinct     Info     Mean  pMedian      Gmd      .05 
    3195        0      280        1    0.266    0.265  0.05883    0.184 
     .10      .25      .50      .75      .90      .95 
   0.200    0.230    0.262    0.300    0.333    0.357 

lowest : 0.12  0.121 0.122 0.123 0.13 , highest: 0.443 0.45  0.456 0.457 0.47 
--------------------------------------------------------------------------------
obesity 
       n  missing distinct     Info     Mean  pMedian      Gmd      .05 
    3195        0      271        1   0.3729   0.3755  0.05027    0.287 
     .10      .25      .50      .75      .90      .95 
   0.314    0.350    0.376    0.402    0.427    0.439 

lowest : 0.174 0.185 0.187 0.197 0.201, highest: 0.509 0.51  0.512 0.52  0.525
--------------------------------------------------------------------------------
household_income 
       n  missing distinct     Info     Mean  pMedian      Gmd      .05 
    3194        1     3101        1    63449    61837    17240    42604 
     .10      .25      .50      .75      .90      .95 
   45708    52610    60986    70897    83207    94785 

lowest :  28972  30659  30688  31972  32783, highest: 142513 143795 144632 150502 167605
--------------------------------------------------------------------------------
dentist_grp 
       n  missing distinct 
    1554     1641        2 
                    
Value      High  Low
Frequency   777  777
Proportion  0.5  0.5
--------------------------------------------------------------------------------
smoking_prop 
       n  missing distinct     Info     Mean  pMedian      Gmd      .05 
    3192        3     3191        1    17.87    17.71    4.014    12.96 
     .10      .25      .50      .75      .90      .95 
   13.84    15.23    17.31    20.28    22.44    23.85 

lowest : 6.73543 6.82175 7.16927 7.68742 7.73646
highest: 37.029  38.3231 39.0797 41.2018 42.7541
--------------------------------------------------------------------------------
drinking_prop 
       n  missing distinct     Info     Mean  pMedian      Gmd      .05 
    3192        3     3191        1    17.43    17.39    3.677    12.29 
     .10      .25      .50      .75      .90      .95 
   13.14    15.10    17.40    19.68    21.47    22.92 

lowest : 9.26516 9.28781 9.40767 9.48465 9.52642
highest: 27.8133 28.0047 28.0822 28.1579 29.4401
--------------------------------------------------------------------------------
dentist_n 
       n  missing distinct     Info      Sum     Mean 
    1554     1641        2     0.75      777      0.5 

--------------------------------------------------------------------------------

2.5 Dealing with Missingness

Code

miss_var_summary(chr_1)

# A tibble: 35 × 3
   variable          n_miss pct_miss
   <chr>              <int>    <num>
 1 dentist_grp         1641   51.4  
 2 dentist_n           1641   51.4  
 3 segregation         1067   33.4  
 4 highschool_grad      678   21.2  
 5 mental_care          185    5.79 
 6 primary_care         157    4.91 
 7 life_expectancy       73    2.28 
 8 food_index            35    1.10 
 9 income_inequality     15    0.469
10 housing_burden         5    0.156
# ℹ 25 more rows

The segregation variable constitutes 33.4% of the missing data, followed by highschool_grd (21.2%), mental_care (5.79%), primary_care (4.91%), life_expectancy (2.28%), food_index (1.10%). Other variables with minimal missingness (<1%) include income_inequality, housing_burden, insuff_sleep, uninsured, unemployment, household_income.

2.5.1 Filter Complete Cases For Exposure `dentist_prop`

We will remove missing cases for the exposure dentist_prop and name the set chr24_si.

Code

chr_simp <- chr_1 |>
  filter(!is.na(dentist_grp))|>
  filter(!is.na(dentist_n))

Code

n_miss(chr_simp$dentist_n)

[1] 0

The outcome variable has no missing values. Now we will deal with the missing values in the covariates.

2.5.2 Single Imputation for Covariates

We will use the mice function from the mice package to perform single imputation for all variables in the chr24_si data set.

Code

chr_simp <- 
  mice(chr_simp, m = 1, seed = 500, print = FALSE) |>
  complete() |>
  tibble()

Warning: Number of logged events: 67

Code

miss_var_summary(chr_simp)

# A tibble: 35 × 3
   variable      n_miss pct_miss
   <chr>          <int>    <num>
 1 fipscode           0        0
 2 poor_health        0        0
 3 poor_physical      0        0
 4 poor_mental        0        0
 5 food_index         0        0
 6 uninsured          0        0
 7 primary_care       0        0
 8 mental_care        0        0
 9 some_college       0        0
10 unemployment       0        0
# ℹ 25 more rows

There are no missing observation in the semi-final working data set chr_simp.

3 Final Variable Selection

3.1 Check Correlation

Code

res <- correlation(chr_simp, method = "spearman")

res

# Correlation Matrix (spearman-method)

Parameter1        |        Parameter2 |       rho |         95% CI |        S |         p
-----------------------------------------------------------------------------------------
poor_health       |     poor_physical |      0.94 | [ 0.93,  0.95] | 3.77e+07 | < .001***
poor_health       |       poor_mental |      0.75 | [ 0.73,  0.77] | 1.54e+08 | < .001***
poor_health       |        food_index |     -0.68 | [-0.70, -0.65] | 1.05e+09 | < .001***
poor_health       |         uninsured |      0.54 | [ 0.51,  0.58] | 2.85e+08 | < .001***
poor_health       |      primary_care |     -0.40 | [-0.44, -0.36] | 8.75e+08 | < .001***
poor_health       |       mental_care |      0.37 | [ 0.33,  0.42] | 3.93e+08 | < .001***
poor_health       |      some_college |     -0.79 | [-0.81, -0.77] | 1.12e+09 | < .001***
poor_health       |      unemployment |      0.47 | [ 0.43,  0.51] | 3.29e+08 | < .001***
poor_health       | income_inequality |      0.45 | [ 0.41,  0.49] | 3.41e+08 | < .001***
poor_health       |      social_score |     -0.13 | [-0.18, -0.08] | 7.05e+08 | < .001***
poor_health       |   life_expectancy |     -0.81 | [-0.83, -0.79] | 1.13e+09 | < .001***
poor_health       |          diabetes |      0.94 | [ 0.93,  0.94] | 3.96e+07 | < .001***
poor_health       |      insuff_sleep |      0.79 | [ 0.77,  0.80] | 1.34e+08 | < .001***
poor_health       |   highschool_grad |      0.01 | [-0.04,  0.06] | 6.18e+08 | > .999   
poor_health       |       segregation |     -0.30 | [-0.35, -0.25] | 8.14e+08 | < .001***
poor_health       |    home_ownership |     -0.06 | [-0.11, -0.01] | 6.62e+08 | > .999   
poor_health       |    housing_burden |  1.45e-03 | [-0.05,  0.05] | 6.25e+08 | > .999   
poor_health       |          below_18 |      0.16 | [ 0.11,  0.21] | 5.25e+08 | < .001***
poor_health       |          above_65 |     -0.10 | [-0.15, -0.04] | 6.85e+08 | 0.021*   
poor_health       |             black |      0.30 | [ 0.25,  0.35] | 4.38e+08 | < .001***
poor_health       |            native | -7.28e-03 | [-0.06,  0.04] | 6.30e+08 | > .999   
poor_health       |             asian |     -0.36 | [-0.41, -0.32] | 8.53e+08 | < .001***
poor_health       |          hispanic |     -0.06 | [-0.11, -0.01] | 6.62e+08 | > .999   
poor_health       |             white |     -0.29 | [-0.34, -0.25] | 8.09e+08 | < .001***
poor_health       |            female |      0.05 | [ 0.00,  0.10] | 5.94e+08 | > .999   
poor_health       |             rural |      0.33 | [ 0.28,  0.37] | 4.21e+08 | < .001***
poor_health       |        inactivity |      0.88 | [ 0.87,  0.89] | 7.60e+07 | < .001***
poor_health       |           obesity |      0.69 | [ 0.66,  0.71] | 1.96e+08 | < .001***
poor_health       |  household_income |     -0.78 | [-0.80, -0.76] | 1.11e+09 | < .001***
poor_health       |      smoking_prop |      0.71 | [ 0.69,  0.74] | 1.80e+08 | < .001***
poor_health       |     drinking_prop |     -0.69 | [-0.72, -0.66] | 1.06e+09 | < .001***
poor_health       |         dentist_n |      0.44 | [ 0.39,  0.48] | 3.53e+08 | < .001***
poor_physical     |       poor_mental |      0.85 | [ 0.84,  0.87] | 9.12e+07 | < .001***
poor_physical     |        food_index |     -0.64 | [-0.67, -0.61] | 1.03e+09 | < .001***
poor_physical     |         uninsured |      0.46 | [ 0.42,  0.50] | 3.38e+08 | < .001***
poor_physical     |      primary_care |     -0.41 | [-0.45, -0.36] | 8.80e+08 | < .001***
poor_physical     |       mental_care |      0.35 | [ 0.31,  0.40] | 4.04e+08 | < .001***
poor_physical     |      some_college |     -0.78 | [-0.80, -0.76] | 1.11e+09 | < .001***
poor_physical     |      unemployment |      0.47 | [ 0.43,  0.51] | 3.32e+08 | < .001***
poor_physical     | income_inequality |      0.40 | [ 0.36,  0.44] | 3.75e+08 | < .001***
poor_physical     |      social_score |     -0.17 | [-0.22, -0.12] | 7.32e+08 | < .001***
poor_physical     |   life_expectancy |     -0.80 | [-0.82, -0.78] | 1.12e+09 | < .001***
poor_physical     |          diabetes |      0.82 | [ 0.81,  0.84] | 1.10e+08 | < .001***
poor_physical     |      insuff_sleep |      0.76 | [ 0.74,  0.78] | 1.50e+08 | < .001***
poor_physical     |   highschool_grad |      0.04 | [-0.01,  0.09] | 5.98e+08 | > .999   
poor_physical     |       segregation |     -0.25 | [-0.29, -0.20] | 7.79e+08 | < .001***
poor_physical     |    home_ownership |  9.49e-03 | [-0.04,  0.06] | 6.20e+08 | > .999   
poor_physical     |    housing_burden |     -0.03 | [-0.08,  0.02] | 6.45e+08 | > .999   
poor_physical     |          below_18 |      0.10 | [ 0.05,  0.15] | 5.63e+08 | 0.010*   
poor_physical     |          above_65 |     -0.04 | [-0.09,  0.01] | 6.50e+08 | > .999   
poor_physical     |             black |      0.18 | [ 0.13,  0.23] | 5.10e+08 | < .001***
poor_physical     |            native |     -0.04 | [-0.09,  0.02] | 6.48e+08 | > .999   
poor_physical     |             asian |     -0.40 | [-0.44, -0.35] | 8.74e+08 | < .001***
poor_physical     |          hispanic |     -0.16 | [-0.21, -0.11] | 7.24e+08 | < .001***
poor_physical     |             white |     -0.12 | [-0.17, -0.07] | 7.00e+08 | < .001***
poor_physical     |            female |      0.02 | [-0.03,  0.08] | 6.10e+08 | > .999   
poor_physical     |             rural |      0.34 | [ 0.30,  0.39] | 4.11e+08 | < .001***
poor_physical     |        inactivity |      0.80 | [ 0.78,  0.82] | 1.26e+08 | < .001***
poor_physical     |           obesity |      0.61 | [ 0.58,  0.65] | 2.41e+08 | < .001***
poor_physical     |  household_income |     -0.75 | [-0.77, -0.73] | 1.10e+09 | < .001***
poor_physical     |      smoking_prop |      0.74 | [ 0.72,  0.76] | 1.61e+08 | < .001***
poor_physical     |     drinking_prop |     -0.68 | [-0.70, -0.65] | 1.05e+09 | < .001***
poor_physical     |         dentist_n |      0.43 | [ 0.39,  0.47] | 3.55e+08 | < .001***
poor_mental       |        food_index |     -0.52 | [-0.56, -0.48] | 9.52e+08 | < .001***
poor_mental       |         uninsured |      0.26 | [ 0.21,  0.31] | 4.61e+08 | < .001***
poor_mental       |      primary_care |     -0.25 | [-0.29, -0.20] | 7.79e+08 | < .001***
poor_mental       |       mental_care |      0.18 | [ 0.13,  0.23] | 5.11e+08 | < .001***
poor_mental       |      some_college |     -0.59 | [-0.62, -0.55] | 9.94e+08 | < .001***
poor_mental       |      unemployment |      0.46 | [ 0.42,  0.50] | 3.36e+08 | < .001***
poor_mental       | income_inequality |      0.39 | [ 0.35,  0.44] | 3.79e+08 | < .001***
poor_mental       |      social_score |     -0.16 | [-0.21, -0.11] | 7.26e+08 | < .001***
poor_mental       |   life_expectancy |     -0.68 | [-0.71, -0.65] | 1.05e+09 | < .001***
poor_mental       |          diabetes |      0.65 | [ 0.61,  0.67] | 2.22e+08 | < .001***
poor_mental       |      insuff_sleep |      0.68 | [ 0.66,  0.71] | 1.98e+08 | < .001***
poor_mental       |   highschool_grad |  4.24e-03 | [-0.05,  0.06] | 6.23e+08 | > .999   
poor_mental       |       segregation |     -0.16 | [-0.21, -0.11] | 7.23e+08 | < .001***
poor_mental       |    home_ownership |     -0.03 | [-0.08,  0.02] | 6.44e+08 | > .999   
poor_mental       |    housing_burden |      0.08 | [ 0.03,  0.13] | 5.77e+08 | 0.251    
poor_mental       |          below_18 |     -0.03 | [-0.08,  0.02] | 6.44e+08 | > .999   
poor_mental       |          above_65 |     -0.03 | [-0.08,  0.02] | 6.46e+08 | > .999   
poor_mental       |             black |      0.19 | [ 0.14,  0.24] | 5.08e+08 | < .001***
poor_mental       |            native |     -0.09 | [-0.14, -0.04] | 6.81e+08 | 0.051    
poor_mental       |             asian |     -0.25 | [-0.29, -0.20] | 7.79e+08 | < .001***
poor_mental       |          hispanic |     -0.16 | [-0.21, -0.11] | 7.27e+08 | < .001***
poor_mental       |             white |     -0.05 | [-0.10,  0.00] | 6.57e+08 | > .999   
poor_mental       |            female |      0.18 | [ 0.13,  0.23] | 5.14e+08 | < .001***
poor_mental       |             rural |      0.17 | [ 0.12,  0.22] | 5.17e+08 | < .001***
poor_mental       |        inactivity |      0.56 | [ 0.53,  0.60] | 2.72e+08 | < .001***
poor_mental       |           obesity |      0.41 | [ 0.37,  0.45] | 3.68e+08 | < .001***
poor_mental       |  household_income |     -0.59 | [-0.62, -0.56] | 9.94e+08 | < .001***
poor_mental       |      smoking_prop |      0.63 | [ 0.60,  0.66] | 2.32e+08 | < .001***
poor_mental       |     drinking_prop |     -0.56 | [-0.59, -0.52] | 9.75e+08 | < .001***
poor_mental       |         dentist_n |      0.28 | [ 0.23,  0.32] | 4.52e+08 | < .001***
food_index        |         uninsured |     -0.49 | [-0.52, -0.45] | 9.29e+08 | < .001***
food_index        |      primary_care |      0.31 | [ 0.26,  0.35] | 4.34e+08 | < .001***
food_index        |       mental_care |     -0.12 | [-0.17, -0.07] | 7.02e+08 | < .001***
food_index        |      some_college |      0.49 | [ 0.45,  0.53] | 3.18e+08 | < .001***
food_index        |      unemployment |     -0.36 | [-0.40, -0.31] | 8.49e+08 | < .001***
food_index        | income_inequality |     -0.36 | [-0.40, -0.31] | 8.50e+08 | < .001***
food_index        |      social_score |      0.15 | [ 0.10,  0.20] | 5.31e+08 | < .001***
food_index        |   life_expectancy |      0.59 | [ 0.56,  0.62] | 2.57e+08 | < .001***
food_index        |          diabetes |     -0.63 | [-0.66, -0.60] | 1.02e+09 | < .001***
food_index        |      insuff_sleep |     -0.45 | [-0.49, -0.41] | 9.06e+08 | < .001***
food_index        |   highschool_grad |      0.07 | [ 0.02,  0.12] | 5.81e+08 | 0.490    
food_index        |       segregation |      0.12 | [ 0.07,  0.17] | 5.52e+08 | < .001***
food_index        |    home_ownership |      0.10 | [ 0.05,  0.15] | 5.65e+08 | 0.020*   
food_index        |    housing_burden |  2.06e-03 | [-0.05,  0.05] | 6.24e+08 | > .999   
food_index        |          below_18 |     -0.09 | [-0.15, -0.04] | 6.85e+08 | 0.024*   
food_index        |          above_65 |     -0.07 | [-0.12, -0.02] | 6.72e+08 | 0.359    
food_index        |             black |     -0.07 | [-0.12, -0.02] | 6.71e+08 | 0.391    
food_index        |            native |     -0.14 | [-0.19, -0.09] | 7.12e+08 | < .001***
food_index        |             asian |      0.34 | [ 0.29,  0.38] | 4.14e+08 | < .001***
food_index        |          hispanic |      0.03 | [-0.02,  0.08] | 6.08e+08 | > .999   
food_index        |             white |      0.17 | [ 0.11,  0.21] | 5.22e+08 | < .001***
food_index        |            female |      0.02 | [-0.03,  0.07] | 6.14e+08 | > .999   
food_index        |             rural |     -0.31 | [-0.36, -0.27] | 8.22e+08 | < .001***
food_index        |        inactivity |     -0.59 | [-0.62, -0.55] | 9.93e+08 | < .001***
food_index        |           obesity |     -0.49 | [-0.53, -0.45] | 9.32e+08 | < .001***
food_index        |  household_income |      0.70 | [ 0.68,  0.73] | 1.85e+08 | < .001***
food_index        |      smoking_prop |     -0.51 | [-0.54, -0.47] | 9.42e+08 | < .001***
food_index        |     drinking_prop |      0.49 | [ 0.45,  0.53] | 3.18e+08 | < .001***
food_index        |         dentist_n |     -0.30 | [-0.35, -0.25] | 8.14e+08 | < .001***
uninsured         |      primary_care |     -0.32 | [-0.36, -0.27] | 8.23e+08 | < .001***
uninsured         |       mental_care |      0.37 | [ 0.32,  0.41] | 3.97e+08 | < .001***
uninsured         |      some_college |     -0.51 | [-0.54, -0.47] | 9.42e+08 | < .001***
uninsured         |      unemployment |      0.01 | [-0.04,  0.06] | 6.19e+08 | > .999   
uninsured         | income_inequality |      0.14 | [ 0.09,  0.19] | 5.40e+08 | < .001***
uninsured         |      social_score |     -0.05 | [-0.10,  0.00] | 6.57e+08 | > .999   
uninsured         |   life_expectancy |     -0.40 | [-0.44, -0.36] | 8.77e+08 | < .001***
uninsured         |          diabetes |      0.54 | [ 0.50,  0.57] | 2.89e+08 | < .001***
uninsured         |      insuff_sleep |      0.27 | [ 0.22,  0.32] | 4.56e+08 | < .001***
uninsured         |   highschool_grad |      0.03 | [-0.02,  0.08] | 6.05e+08 | > .999   
uninsured         |       segregation |     -0.27 | [-0.31, -0.22] | 7.92e+08 | < .001***
uninsured         |    home_ownership |      0.02 | [-0.04,  0.07] | 6.16e+08 | > .999   
uninsured         |    housing_burden |     -0.13 | [-0.18, -0.08] | 7.05e+08 | < .001***
uninsured         |          below_18 |      0.25 | [ 0.20,  0.29] | 4.71e+08 | < .001***
uninsured         |          above_65 |     -0.03 | [-0.08,  0.02] | 6.42e+08 | > .999   
uninsured         |             black |      0.08 | [ 0.02,  0.13] | 5.78e+08 | 0.316    
uninsured         |            native |      0.37 | [ 0.33,  0.41] | 3.94e+08 | < .001***
uninsured         |             asian |     -0.24 | [-0.29, -0.19] | 7.75e+08 | < .001***
uninsured         |          hispanic |      0.26 | [ 0.22,  0.31] | 4.60e+08 | < .001***
uninsured         |             white |     -0.36 | [-0.40, -0.31] | 8.49e+08 | < .001***
uninsured         |            female |     -0.10 | [-0.15, -0.05] | 6.88e+08 | 0.010**  
uninsured         |             rural |      0.36 | [ 0.31,  0.40] | 4.00e+08 | < .001***
uninsured         |        inactivity |      0.46 | [ 0.42,  0.50] | 3.35e+08 | < .001***
uninsured         |           obesity |      0.31 | [ 0.27,  0.36] | 4.29e+08 | < .001***
uninsured         |  household_income |     -0.41 | [-0.45, -0.37] | 8.83e+08 | < .001***
uninsured         |      smoking_prop |      0.25 | [ 0.20,  0.29] | 4.71e+08 | < .001***
uninsured         |     drinking_prop |     -0.45 | [-0.49, -0.41] | 9.06e+08 | < .001***
uninsured         |         dentist_n |      0.39 | [ 0.34,  0.43] | 3.84e+08 | < .001***
primary_care      |       mental_care |     -0.50 | [-0.54, -0.46] | 9.38e+08 | < .001***
primary_care      |      some_college |      0.49 | [ 0.45,  0.52] | 3.22e+08 | < .001***
primary_care      |      unemployment |     -0.09 | [-0.14, -0.04] | 6.80e+08 | 0.065    
primary_care      | income_inequality |      0.05 | [ 0.00,  0.10] | 5.95e+08 | > .999   
primary_care      |      social_score |      0.14 | [ 0.09,  0.19] | 5.37e+08 | < .001***
primary_care      |   life_expectancy |      0.34 | [ 0.29,  0.38] | 4.13e+08 | < .001***
primary_care      |          diabetes |     -0.34 | [-0.38, -0.29] | 8.38e+08 | < .001***
primary_care      |      insuff_sleep |     -0.31 | [-0.35, -0.26] | 8.19e+08 | < .001***
primary_care      |   highschool_grad |     -0.17 | [-0.22, -0.12] | 7.30e+08 | < .001***
primary_care      |       segregation |      0.25 | [ 0.20,  0.29] | 4.71e+08 | < .001***
primary_care      |    home_ownership |     -0.39 | [-0.43, -0.35] | 8.70e+08 | < .001***
primary_care      |    housing_burden |      0.32 | [ 0.28,  0.37] | 4.24e+08 | < .001***
primary_care      |          below_18 |     -0.10 | [-0.15, -0.05] | 6.88e+08 | 0.011*   
primary_care      |          above_65 |     -0.17 | [-0.21, -0.11] | 7.29e+08 | < .001***
primary_care      |             black |      0.03 | [-0.02,  0.08] | 6.07e+08 | > .999   
primary_care      |            native |  6.63e-03 | [-0.04,  0.06] | 6.21e+08 | > .999   
primary_care      |             asian |      0.54 | [ 0.50,  0.57] | 2.88e+08 | < .001***
primary_care      |          hispanic |      0.17 | [ 0.12,  0.22] | 5.21e+08 | < .001***
primary_care      |             white |     -0.12 | [-0.17, -0.07] | 7.00e+08 | < .001***
primary_care      |            female |      0.28 | [ 0.23,  0.32] | 4.53e+08 | < .001***
primary_care      |             rural |     -0.55 | [-0.59, -0.51] | 9.70e+08 | < .001***
primary_care      |        inactivity |     -0.44 | [-0.48, -0.40] | 8.99e+08 | < .001***
primary_care      |           obesity |     -0.42 | [-0.46, -0.37] | 8.86e+08 | < .001***
primary_care      |  household_income |      0.37 | [ 0.33,  0.42] | 3.92e+08 | < .001***
primary_care      |      smoking_prop |     -0.31 | [-0.35, -0.26] | 8.17e+08 | < .001***
primary_care      |     drinking_prop |      0.32 | [ 0.28,  0.37] | 4.23e+08 | < .001***
primary_care      |         dentist_n |     -0.67 | [-0.69, -0.64] | 1.04e+09 | < .001***
mental_care       |      some_college |     -0.47 | [-0.50, -0.42] | 9.16e+08 | < .001***
mental_care       |      unemployment |     -0.02 | [-0.07,  0.03] | 6.40e+08 | > .999   
mental_care       | income_inequality |     -0.06 | [-0.11,  0.00] | 6.60e+08 | > .999   
mental_care       |      social_score |      0.10 | [ 0.05,  0.15] | 5.61e+08 | 0.008**  
mental_care       |   life_expectancy |     -0.31 | [-0.36, -0.27] | 8.21e+08 | < .001***
mental_care       |          diabetes |      0.34 | [ 0.30,  0.39] | 4.12e+08 | < .001***
mental_care       |      insuff_sleep |      0.32 | [ 0.28,  0.37] | 4.23e+08 | < .001***
mental_care       |   highschool_grad |      0.32 | [ 0.27,  0.36] | 4.27e+08 | < .001***
mental_care       |       segregation |     -0.34 | [-0.39, -0.30] | 8.40e+08 | < .001***
mental_care       |    home_ownership |      0.48 | [ 0.44,  0.52] | 3.27e+08 | < .001***
mental_care       |    housing_burden |     -0.42 | [-0.46, -0.37] | 8.86e+08 | < .001***
mental_care       |          below_18 |      0.12 | [ 0.07,  0.17] | 5.48e+08 | < .001***
mental_care       |          above_65 |      0.16 | [ 0.11,  0.21] | 5.24e+08 | < .001***
mental_care       |             black |  7.22e-03 | [-0.04,  0.06] | 6.21e+08 | > .999   
mental_care       |            native |     -0.11 | [-0.16, -0.06] | 6.95e+08 | 0.002**  
mental_care       |             asian |     -0.48 | [-0.52, -0.44] | 9.26e+08 | < .001***
mental_care       |          hispanic |     -0.21 | [-0.25, -0.16] | 7.54e+08 | < .001***
mental_care       |             white |      0.14 | [ 0.09,  0.19] | 5.37e+08 | < .001***
mental_care       |            female |     -0.17 | [-0.22, -0.12] | 7.29e+08 | < .001***
mental_care       |             rural |      0.55 | [ 0.51,  0.58] | 2.83e+08 | < .001***
mental_care       |        inactivity |      0.45 | [ 0.41,  0.49] | 3.43e+08 | < .001***
mental_care       |           obesity |      0.43 | [ 0.38,  0.47] | 3.59e+08 | < .001***
mental_care       |  household_income |     -0.32 | [-0.37, -0.28] | 8.28e+08 | < .001***
mental_care       |      smoking_prop |      0.25 | [ 0.20,  0.30] | 4.70e+08 | < .001***
mental_care       |     drinking_prop |     -0.34 | [-0.39, -0.30] | 8.39e+08 | < .001***
mental_care       |         dentist_n |      0.63 | [ 0.59,  0.66] | 2.34e+08 | < .001***
some_college      |      unemployment |     -0.35 | [-0.40, -0.31] | 8.47e+08 | < .001***
some_college      | income_inequality |     -0.22 | [-0.27, -0.18] | 7.66e+08 | < .001***
some_college      |      social_score |      0.09 | [ 0.04,  0.14] | 5.68e+08 | 0.038*   
some_college      |   life_expectancy |      0.66 | [ 0.63,  0.68] | 2.15e+08 | < .001***
some_college      |          diabetes |     -0.70 | [-0.72, -0.67] | 1.06e+09 | < .001***
some_college      |      insuff_sleep |     -0.62 | [-0.65, -0.58] | 1.01e+09 | < .001***
some_college      |   highschool_grad |     -0.05 | [-0.10,  0.00] | 6.58e+08 | > .999   
some_college      |       segregation |      0.24 | [ 0.19,  0.29] | 4.74e+08 | < .001***
some_college      |    home_ownership |     -0.18 | [-0.22, -0.13] | 7.35e+08 | < .001***
some_college      |    housing_burden |      0.17 | [ 0.12,  0.22] | 5.20e+08 | < .001***
some_college      |          below_18 |     -0.08 | [-0.13, -0.03] | 6.78e+08 | 0.119    
some_college      |          above_65 |     -0.08 | [-0.13, -0.03] | 6.73e+08 | 0.284    
some_college      |             black |     -0.11 | [-0.16, -0.06] | 6.94e+08 | 0.002**  
some_college      |            native |     -0.05 | [-0.10,  0.00] | 6.56e+08 | > .999   
some_college      |             asian |      0.46 | [ 0.42,  0.50] | 3.39e+08 | < .001***
some_college      |          hispanic |      0.10 | [ 0.05,  0.15] | 5.64e+08 | 0.015*   
some_college      |             white |      0.10 | [ 0.05,  0.15] | 5.65e+08 | 0.018*   
some_college      |            female |      0.16 | [ 0.11,  0.21] | 5.27e+08 | < .001***
some_college      |             rural |     -0.47 | [-0.51, -0.43] | 9.20e+08 | < .001***
some_college      |        inactivity |     -0.74 | [-0.76, -0.72] | 1.09e+09 | < .001***
some_college      |           obesity |     -0.58 | [-0.61, -0.54] | 9.88e+08 | < .001***
some_college      |  household_income |      0.66 | [ 0.63,  0.69] | 2.12e+08 | < .001***
some_college      |      smoking_prop |     -0.58 | [-0.62, -0.55] | 9.90e+08 | < .001***
some_college      |     drinking_prop |      0.56 | [ 0.52,  0.59] | 2.77e+08 | < .001***
some_college      |         dentist_n |     -0.52 | [-0.56, -0.48] | 9.52e+08 | < .001***
unemployment      | income_inequality |      0.36 | [ 0.32,  0.41] | 3.98e+08 | < .001***
unemployment      |      social_score |     -0.15 | [-0.20, -0.10] | 7.22e+08 | < .001***
unemployment      |   life_expectancy |     -0.39 | [-0.43, -0.34] | 8.68e+08 | < .001***
unemployment      |          diabetes |      0.41 | [ 0.37,  0.46] | 3.67e+08 | < .001***
unemployment      |      insuff_sleep |      0.45 | [ 0.41,  0.49] | 3.42e+08 | < .001***
unemployment      |   highschool_grad |     -0.18 | [-0.23, -0.13] | 7.40e+08 | < .001***
unemployment      |       segregation |     -0.02 | [-0.07,  0.03] | 6.39e+08 | > .999   
unemployment      |    home_ownership |     -0.08 | [-0.13, -0.03] | 6.74e+08 | 0.251    
unemployment      |    housing_burden |      0.23 | [ 0.18,  0.27] | 4.84e+08 | < .001***
unemployment      |          below_18 |     -0.19 | [-0.24, -0.14] | 7.43e+08 | < .001***
unemployment      |          above_65 |  1.51e-03 | [-0.05,  0.05] | 6.25e+08 | > .999   
unemployment      |             black |      0.23 | [ 0.18,  0.28] | 4.82e+08 | < .001***
unemployment      |            native |     -0.04 | [-0.09,  0.01] | 6.49e+08 | > .999   
unemployment      |             asian |     -0.02 | [-0.07,  0.03] | 6.37e+08 | > .999   
unemployment      |          hispanic |     -0.01 | [-0.06,  0.04] | 6.32e+08 | > .999   
unemployment      |             white |     -0.24 | [-0.29, -0.19] | 7.76e+08 | < .001***
unemployment      |            female |      0.12 | [ 0.07,  0.17] | 5.49e+08 | < .001***
unemployment      |             rural | -9.86e-03 | [-0.06,  0.04] | 6.32e+08 | > .999   
unemployment      |        inactivity |      0.29 | [ 0.24,  0.34] | 4.44e+08 | < .001***
unemployment      |           obesity |      0.18 | [ 0.13,  0.23] | 5.14e+08 | < .001***
unemployment      |  household_income |     -0.34 | [-0.39, -0.30] | 8.40e+08 | < .001***
unemployment      |      smoking_prop |      0.34 | [ 0.29,  0.38] | 4.14e+08 | < .001***
unemployment      |     drinking_prop |     -0.24 | [-0.29, -0.19] | 7.78e+08 | < .001***
unemployment      |         dentist_n |      0.04 | [-0.01,  0.09] | 6.00e+08 | > .999   
income_inequality |      social_score |     -0.06 | [-0.11, -0.01] | 6.63e+08 | > .999   
income_inequality |   life_expectancy |     -0.36 | [-0.40, -0.31] | 8.50e+08 | < .001***
income_inequality |          diabetes |      0.47 | [ 0.43,  0.51] | 3.31e+08 | < .001***
income_inequality |      insuff_sleep |      0.40 | [ 0.35,  0.44] | 3.76e+08 | < .001***
income_inequality |   highschool_grad |     -0.13 | [-0.18, -0.08] | 7.04e+08 | < .001***
income_inequality |       segregation |     -0.04 | [-0.09,  0.01] | 6.50e+08 | > .999   
income_inequality |    home_ownership |     -0.27 | [-0.31, -0.22] | 7.93e+08 | < .001***
income_inequality |    housing_burden |      0.42 | [ 0.37,  0.46] | 3.66e+08 | < .001***
income_inequality |          below_18 |     -0.15 | [-0.20, -0.10] | 7.18e+08 | < .001***
income_inequality |          above_65 |     -0.03 | [-0.08,  0.03] | 6.42e+08 | > .999   
income_inequality |             black |      0.31 | [ 0.27,  0.36] | 4.29e+08 | < .001***
income_inequality |            native |     -0.08 | [-0.13, -0.03] | 6.78e+08 | 0.113    
income_inequality |             asian |     -0.03 | [-0.09,  0.02] | 6.47e+08 | > .999   
income_inequality |          hispanic |     -0.05 | [-0.10,  0.00] | 6.57e+08 | > .999   
income_inequality |             white |     -0.26 | [-0.31, -0.22] | 7.90e+08 | < .001***
income_inequality |            female |      0.26 | [ 0.21,  0.30] | 4.64e+08 | < .001***
income_inequality |             rural |     -0.03 | [-0.08,  0.03] | 6.41e+08 | > .999   
income_inequality |        inactivity |      0.38 | [ 0.34,  0.42] | 3.87e+08 | < .001***
income_inequality |           obesity |      0.23 | [ 0.18,  0.28] | 4.81e+08 | < .001***
income_inequality |  household_income |     -0.41 | [-0.45, -0.37] | 8.83e+08 | < .001***
income_inequality |      smoking_prop |      0.34 | [ 0.29,  0.38] | 4.15e+08 | < .001***
income_inequality |     drinking_prop |     -0.26 | [-0.31, -0.21] | 7.88e+08 | < .001***
income_inequality |         dentist_n |      0.02 | [-0.03,  0.07] | 6.11e+08 | > .999   
social_score      |   life_expectancy |      0.02 | [-0.03,  0.07] | 6.11e+08 | > .999   
social_score      |          diabetes |     -0.11 | [-0.16, -0.06] | 6.94e+08 | 0.002**  
social_score      |      insuff_sleep |     -0.12 | [-0.17, -0.07] | 7.00e+08 | < .001***
social_score      |   highschool_grad |      0.04 | [-0.01,  0.09] | 5.98e+08 | > .999   
social_score      |       segregation |     -0.02 | [-0.07,  0.04] | 6.35e+08 | > .999   
social_score      |    home_ownership |      0.08 | [ 0.03,  0.13] | 5.73e+08 | 0.121    
social_score      |    housing_burden |     -0.19 | [-0.24, -0.14] | 7.43e+08 | < .001***
social_score      |          below_18 |     -0.01 | [-0.06,  0.04] | 6.32e+08 | > .999   
social_score      |          above_65 |      0.22 | [ 0.17,  0.27] | 4.88e+08 | < .001***
social_score      |             black |     -0.02 | [-0.07,  0.03] | 6.37e+08 | > .999   
social_score      |            native |     -0.08 | [-0.13, -0.03] | 6.74e+08 | 0.251    
social_score      |             asian |     -0.11 | [-0.16, -0.06] | 6.92e+08 | 0.004**  
social_score      |          hispanic |     -0.19 | [-0.24, -0.14] | 7.47e+08 | < .001***
social_score      |             white |      0.19 | [ 0.14,  0.24] | 5.05e+08 | < .001***
social_score      |            female |      0.09 | [ 0.04,  0.14] | 5.69e+08 | 0.046*   
social_score      |             rural |      0.05 | [ 0.00,  0.10] | 5.93e+08 | > .999   
social_score      |        inactivity |      0.01 | [-0.04,  0.06] | 6.18e+08 | > .999   
social_score      |           obesity |      0.14 | [ 0.09,  0.19] | 5.40e+08 | < .001***
social_score      |  household_income |     -0.03 | [-0.08,  0.02] | 6.44e+08 | > .999   
social_score      |      smoking_prop |     -0.04 | [-0.09,  0.01] | 6.51e+08 | > .999   
social_score      |     drinking_prop |      0.11 | [ 0.06,  0.16] | 5.54e+08 | < .001***
social_score      |         dentist_n |     -0.03 | [-0.09,  0.02] | 6.47e+08 | > .999   
life_expectancy   |          diabetes |     -0.78 | [-0.79, -0.75] | 1.11e+09 | < .001***
life_expectancy   |      insuff_sleep |     -0.70 | [-0.73, -0.68] | 1.06e+09 | < .001***
life_expectancy   |   highschool_grad |      0.03 | [-0.02,  0.08] | 6.08e+08 | > .999   
life_expectancy   |       segregation |      0.28 | [ 0.23,  0.33] | 4.51e+08 | < .001***
life_expectancy   |    home_ownership |      0.03 | [-0.02,  0.08] | 6.09e+08 | > .999   
life_expectancy   |    housing_burden |      0.07 | [ 0.02,  0.12] | 5.84e+08 | 0.834    
life_expectancy   |          below_18 |     -0.20 | [-0.24, -0.15] | 7.48e+08 | < .001***
life_expectancy   |          above_65 |      0.08 | [ 0.03,  0.13] | 5.77e+08 | 0.229    
life_expectancy   |             black |     -0.27 | [-0.31, -0.22] | 7.93e+08 | < .001***
life_expectancy   |            native |      0.08 | [ 0.03,  0.13] | 5.76e+08 | 0.205    
life_expectancy   |             asian |      0.38 | [ 0.34,  0.43] | 3.86e+08 | < .001***
life_expectancy   |          hispanic |      0.21 | [ 0.16,  0.26] | 4.95e+08 | < .001***
life_expectancy   |             white |      0.14 | [ 0.09,  0.19] | 5.39e+08 | < .001***
life_expectancy   |            female |     -0.14 | [-0.19, -0.09] | 7.14e+08 | < .001***
life_expectancy   |             rural |     -0.27 | [-0.32, -0.22] | 7.96e+08 | < .001***
life_expectancy   |        inactivity |     -0.75 | [-0.78, -0.73] | 1.10e+09 | < .001***
life_expectancy   |           obesity |     -0.65 | [-0.68, -0.62] | 1.03e+09 | < .001***
life_expectancy   |  household_income |      0.74 | [ 0.72,  0.76] | 1.62e+08 | < .001***
life_expectancy   |      smoking_prop |     -0.74 | [-0.76, -0.72] | 1.09e+09 | < .001***
life_expectancy   |     drinking_prop |      0.62 | [ 0.59,  0.65] | 2.37e+08 | < .001***
life_expectancy   |         dentist_n |     -0.36 | [-0.40, -0.31] | 8.50e+08 | < .001***
diabetes          |      insuff_sleep |      0.79 | [ 0.77,  0.81] | 1.30e+08 | < .001***
diabetes          |   highschool_grad |     -0.03 | [-0.08,  0.03] | 6.41e+08 | > .999   
diabetes          |       segregation |     -0.36 | [-0.40, -0.31] | 8.50e+08 | < .001***
diabetes          |    home_ownership |     -0.15 | [-0.20, -0.09] | 7.16e+08 | < .001***
diabetes          |    housing_burden |      0.07 | [ 0.02,  0.12] | 5.83e+08 | 0.678    
diabetes          |          below_18 |      0.19 | [ 0.14,  0.24] | 5.07e+08 | < .001***
diabetes          |          above_65 |     -0.19 | [-0.24, -0.14] | 7.43e+08 | < .001***
diabetes          |             black |      0.44 | [ 0.40,  0.48] | 3.51e+08 | < .001***
diabetes          |            native |     -0.04 | [-0.09,  0.01] | 6.50e+08 | > .999   
diabetes          |             asian |     -0.26 | [-0.31, -0.21] | 7.87e+08 | < .001***
diabetes          |          hispanic |      0.01 | [-0.04,  0.06] | 6.17e+08 | > .999   
diabetes          |             white |     -0.43 | [-0.47, -0.39] | 8.95e+08 | < .001***
diabetes          |            female |      0.14 | [ 0.09,  0.19] | 5.39e+08 | < .001***
diabetes          |             rural |      0.22 | [ 0.17,  0.27] | 4.86e+08 | < .001***
diabetes          |        inactivity |      0.86 | [ 0.84,  0.87] | 8.88e+07 | < .001***
diabetes          |           obesity |      0.70 | [ 0.67,  0.73] | 1.87e+08 | < .001***
diabetes          |  household_income |     -0.72 | [-0.74, -0.69] | 1.07e+09 | < .001***
diabetes          |      smoking_prop |      0.66 | [ 0.63,  0.69] | 2.14e+08 | < .001***
diabetes          |     drinking_prop |     -0.67 | [-0.70, -0.64] | 1.05e+09 | < .001***
diabetes          |         dentist_n |      0.38 | [ 0.34,  0.42] | 3.86e+08 | < .001***
insuff_sleep      |   highschool_grad |      0.08 | [ 0.03,  0.13] | 5.78e+08 | 0.267    
insuff_sleep      |       segregation |     -0.34 | [-0.39, -0.30] | 8.39e+08 | < .001***
insuff_sleep      |    home_ownership |     -0.04 | [-0.09,  0.01] | 6.52e+08 | > .999   
insuff_sleep      |    housing_burden |      0.07 | [ 0.02,  0.12] | 5.83e+08 | 0.678    
insuff_sleep      |          below_18 |      0.03 | [-0.02,  0.08] | 6.05e+08 | > .999   
insuff_sleep      |          above_65 |     -0.13 | [-0.18, -0.08] | 7.08e+08 | < .001***
insuff_sleep      |             black |      0.50 | [ 0.46,  0.54] | 3.13e+08 | < .001***
insuff_sleep      |            native |     -0.30 | [-0.34, -0.25] | 8.10e+08 | < .001***
insuff_sleep      |             asian |     -0.22 | [-0.27, -0.17] | 7.62e+08 | < .001***
insuff_sleep      |          hispanic |     -0.16 | [-0.21, -0.11] | 7.25e+08 | < .001***
insuff_sleep      |             white |     -0.24 | [-0.29, -0.20] | 7.79e+08 | < .001***
insuff_sleep      |            female |      0.19 | [ 0.14,  0.24] | 5.06e+08 | < .001***
insuff_sleep      |             rural |      0.13 | [ 0.08,  0.18] | 5.44e+08 | < .001***
insuff_sleep      |        inactivity |      0.71 | [ 0.68,  0.73] | 1.82e+08 | < .001***
insuff_sleep      |           obesity |      0.57 | [ 0.54,  0.61] | 2.67e+08 | < .001***
insuff_sleep      |  household_income |     -0.56 | [-0.60, -0.53] | 9.77e+08 | < .001***
insuff_sleep      |      smoking_prop |      0.66 | [ 0.63,  0.68] | 2.15e+08 | < .001***
insuff_sleep      |     drinking_prop |     -0.58 | [-0.61, -0.54] | 9.86e+08 | < .001***
insuff_sleep      |         dentist_n |      0.34 | [ 0.29,  0.39] | 4.12e+08 | < .001***
highschool_grad   |       segregation |     -0.18 | [-0.23, -0.13] | 7.38e+08 | < .001***
highschool_grad   |    home_ownership |      0.31 | [ 0.26,  0.35] | 4.33e+08 | < .001***
highschool_grad   |    housing_burden |     -0.28 | [-0.33, -0.24] | 8.03e+08 | < .001***
highschool_grad   |          below_18 |      0.06 | [ 0.01,  0.11] | 5.87e+08 | > .999   
highschool_grad   |          above_65 |      0.07 | [ 0.02,  0.12] | 5.81e+08 | 0.484    
highschool_grad   |             black |     -0.08 | [-0.13, -0.03] | 6.75e+08 | 0.184    
highschool_grad   |            native |     -0.21 | [-0.26, -0.16] | 7.59e+08 | < .001***
highschool_grad   |             asian |     -0.20 | [-0.25, -0.15] | 7.48e+08 | < .001***
highschool_grad   |          hispanic |     -0.12 | [-0.17, -0.07] | 6.99e+08 | < .001***
highschool_grad   |             white |      0.27 | [ 0.22,  0.32] | 4.57e+08 | < .001***
highschool_grad   |            female |     -0.08 | [-0.13, -0.03] | 6.74e+08 | 0.259    
highschool_grad   |             rural |      0.21 | [ 0.16,  0.26] | 4.92e+08 | < .001***
highschool_grad   |        inactivity |      0.11 | [ 0.06,  0.16] | 5.59e+08 | 0.004**  
highschool_grad   |           obesity |      0.09 | [ 0.04,  0.14] | 5.67e+08 | 0.032*   
highschool_grad   |  household_income |      0.02 | [-0.03,  0.07] | 6.11e+08 | > .999   
highschool_grad   |      smoking_prop |     -0.05 | [-0.10,  0.01] | 6.54e+08 | > .999   
highschool_grad   |     drinking_prop |     -0.07 | [-0.13, -0.02] | 6.72e+08 | 0.342    
highschool_grad   |         dentist_n |      0.26 | [ 0.21,  0.31] | 4.61e+08 | < .001***
segregation       |    home_ownership |     -0.01 | [-0.06,  0.04] | 6.34e+08 | > .999   
segregation       |    housing_burden |      0.12 | [ 0.07,  0.17] | 5.50e+08 | < .001***
segregation       |          below_18 |     -0.10 | [-0.15, -0.05] | 6.89e+08 | 0.008**  
segregation       |          above_65 |      0.12 | [ 0.07,  0.17] | 5.52e+08 | < .001***
segregation       |             black |     -0.37 | [-0.42, -0.33] | 8.60e+08 | < .001***
segregation       |            native |      0.12 | [ 0.06,  0.17] | 5.53e+08 | < .001***
segregation       |             asian |      0.09 | [ 0.04,  0.14] | 5.67e+08 | 0.029*   
segregation       |          hispanic |  2.81e-03 | [-0.05,  0.05] | 6.24e+08 | > .999   
segregation       |             white |      0.24 | [ 0.19,  0.28] | 4.77e+08 | < .001***
segregation       |            female |     -0.07 | [-0.12, -0.02] | 6.70e+08 | 0.510    
segregation       |             rural |     -0.18 | [-0.23, -0.13] | 7.36e+08 | < .001***
segregation       |        inactivity |     -0.30 | [-0.34, -0.25] | 8.12e+08 | < .001***
segregation       |           obesity |     -0.29 | [-0.34, -0.25] | 8.09e+08 | < .001***
segregation       |  household_income |      0.19 | [ 0.14,  0.24] | 5.07e+08 | < .001***
segregation       |      smoking_prop |     -0.18 | [-0.23, -0.13] | 7.38e+08 | < .001***
segregation       |     drinking_prop |      0.29 | [ 0.24,  0.34] | 4.44e+08 | < .001***
segregation       |         dentist_n |     -0.32 | [-0.37, -0.28] | 8.27e+08 | < .001***
home_ownership    |    housing_burden |     -0.55 | [-0.58, -0.51] | 9.68e+08 | < .001***
home_ownership    |          below_18 |     -0.10 | [-0.15, -0.05] | 6.88e+08 | 0.011*   
home_ownership    |          above_65 |      0.48 | [ 0.44,  0.52] | 3.22e+08 | < .001***
home_ownership    |             black |     -0.33 | [-0.38, -0.29] | 8.34e+08 | < .001***
home_ownership    |            native |     -0.10 | [-0.15, -0.05] | 6.87e+08 | 0.013*   
home_ownership    |             asian |     -0.53 | [-0.56, -0.49] | 9.55e+08 | < .001***
home_ownership    |          hispanic |     -0.37 | [-0.41, -0.32] | 8.55e+08 | < .001***
home_ownership    |             white |      0.50 | [ 0.47,  0.54] | 3.10e+08 | < .001***
home_ownership    |            female |     -0.24 | [-0.28, -0.19] | 7.73e+08 | < .001***
home_ownership    |             rural |      0.56 | [ 0.52,  0.59] | 2.78e+08 | < .001***
home_ownership    |        inactivity |      0.03 | [-0.03,  0.08] | 6.10e+08 | > .999   
home_ownership    |           obesity |      0.10 | [ 0.05,  0.15] | 5.64e+08 | 0.015*   
home_ownership    |  household_income |     -0.03 | [-0.08,  0.02] | 6.46e+08 | > .999   
home_ownership    |      smoking_prop |     -0.05 | [-0.10,  0.00] | 6.56e+08 | > .999   
home_ownership    |     drinking_prop |     -0.09 | [-0.14, -0.04] | 6.83e+08 | 0.033*   
home_ownership    |         dentist_n |      0.46 | [ 0.42,  0.50] | 3.37e+08 | < .001***
housing_burden    |          below_18 |     -0.20 | [-0.25, -0.15] | 7.53e+08 | < .001***
housing_burden    |          above_65 |     -0.21 | [-0.26, -0.17] | 7.60e+08 | < .001***
housing_burden    |             black |      0.35 | [ 0.30,  0.39] | 4.08e+08 | < .001***
housing_burden    |            native |     -0.03 | [-0.08,  0.02] | 6.46e+08 | > .999   
housing_burden    |             asian |      0.46 | [ 0.42,  0.50] | 3.36e+08 | < .001***
housing_burden    |          hispanic |      0.25 | [ 0.20,  0.30] | 4.67e+08 | < .001***
housing_burden    |             white |     -0.40 | [-0.44, -0.36] | 8.77e+08 | < .001***
housing_burden    |            female |      0.30 | [ 0.26,  0.35] | 4.35e+08 | < .001***
housing_burden    |             rural |     -0.52 | [-0.56, -0.48] | 9.52e+08 | < .001***
housing_burden    |        inactivity |     -0.14 | [-0.19, -0.09] | 7.11e+08 | < .001***
housing_burden    |           obesity |     -0.22 | [-0.27, -0.17] | 7.63e+08 | < .001***
housing_burden    |  household_income |      0.09 | [ 0.04,  0.14] | 5.70e+08 | 0.059    
housing_burden    |      smoking_prop |     -0.05 | [-0.10,  0.00] | 6.56e+08 | > .999   
housing_burden    |     drinking_prop |      0.07 | [ 0.01,  0.12] | 5.84e+08 | 0.874    
housing_burden    |         dentist_n |     -0.40 | [-0.44, -0.35] | 8.73e+08 | < .001***
below_18          |          above_65 |     -0.50 | [-0.54, -0.46] | 9.40e+08 | < .001***
below_18          |             black |     -0.06 | [-0.11, -0.01] | 6.65e+08 | > .999   
below_18          |            native |      0.18 | [ 0.13,  0.23] | 5.15e+08 | < .001***
below_18          |             asian |     -0.05 | [-0.10,  0.00] | 6.56e+08 | > .999   
below_18          |          hispanic |      0.21 | [ 0.16,  0.26] | 4.93e+08 | < .001***
below_18          |             white |     -0.17 | [-0.22, -0.12] | 7.30e+08 | < .001***
below_18          |            female |      0.01 | [-0.04,  0.06] | 6.18e+08 | > .999   
below_18          |             rural |     -0.02 | [-0.07,  0.03] | 6.38e+08 | > .999   
below_18          |        inactivity |      0.23 | [ 0.18,  0.27] | 4.85e+08 | < .001***
below_18          |           obesity |      0.27 | [ 0.22,  0.31] | 4.59e+08 | < .001***
below_18          |  household_income |      0.02 | [-0.03,  0.07] | 6.13e+08 | > .999   
below_18          |      smoking_prop |      0.06 | [ 0.01,  0.11] | 5.88e+08 | > .999   
below_18          |     drinking_prop |     -0.10 | [-0.15, -0.05] | 6.86e+08 | 0.016*   
below_18          |         dentist_n |      0.04 | [-0.02,  0.09] | 6.03e+08 | > .999   
above_65          |             black |     -0.39 | [-0.43, -0.34] | 8.66e+08 | < .001***
above_65          |            native |     -0.01 | [-0.06,  0.04] | 6.33e+08 | > .999   
above_65          |             asian |     -0.43 | [-0.47, -0.38] | 8.93e+08 | < .001***
above_65          |          hispanic |     -0.35 | [-0.39, -0.30] | 8.44e+08 | < .001***
above_65          |             white |      0.47 | [ 0.42,  0.50] | 3.34e+08 | < .001***
above_65          |            female |     -0.08 | [-0.13, -0.03] | 6.78e+08 | 0.113    
above_65          |             rural |      0.50 | [ 0.46,  0.53] | 3.16e+08 | < .001***
above_65          |        inactivity |     -0.04 | [-0.09,  0.01] | 6.50e+08 | > .999   
above_65          |           obesity |      0.02 | [-0.03,  0.07] | 6.12e+08 | > .999   
above_65          |  household_income |     -0.24 | [-0.29, -0.19] | 7.76e+08 | < .001***
above_65          |      smoking_prop |     -0.10 | [-0.15, -0.05] | 6.90e+08 | 0.007**  
above_65          |     drinking_prop |     -0.11 | [-0.16, -0.06] | 6.94e+08 | 0.002**  
above_65          |         dentist_n |      0.23 | [ 0.18,  0.28] | 4.83e+08 | < .001***
black             |            native |     -0.29 | [-0.34, -0.24] | 8.08e+08 | < .001***
black             |             asian |      0.33 | [ 0.28,  0.37] | 4.22e+08 | < .001***
black             |          hispanic |      0.15 | [ 0.10,  0.20] | 5.30e+08 | < .001***
black             |             white |     -0.60 | [-0.64, -0.57] | 1.00e+09 | < .001***
black             |            female |      0.38 | [ 0.34,  0.43] | 3.85e+08 | < .001***
black             |             rural |     -0.35 | [-0.39, -0.30] | 8.42e+08 | < .001***
black             |        inactivity |      0.25 | [ 0.20,  0.29] | 4.71e+08 | < .001***
black             |           obesity |      0.17 | [ 0.12,  0.22] | 5.21e+08 | < .001***
black             |  household_income |     -0.08 | [-0.13, -0.03] | 6.74e+08 | 0.251    
black             |      smoking_prop |      0.24 | [ 0.19,  0.29] | 4.74e+08 | < .001***
black             |     drinking_prop |     -0.18 | [-0.23, -0.13] | 7.37e+08 | < .001***
black             |         dentist_n |     -0.03 | [-0.08,  0.02] | 6.45e+08 | > .999   
native            |             asian |      0.08 | [ 0.02,  0.13] | 5.78e+08 | 0.296    
native            |          hispanic |      0.49 | [ 0.45,  0.53] | 3.18e+08 | < .001***
native            |             white |     -0.29 | [-0.34, -0.24] | 8.08e+08 | < .001***
native            |            female |     -0.32 | [-0.37, -0.28] | 8.29e+08 | < .001***
native            |             rural |      0.10 | [ 0.05,  0.15] | 5.64e+08 | 0.014*   
native            |        inactivity |     -0.12 | [-0.17, -0.07] | 7.04e+08 | < .001***
native            |           obesity |     -0.11 | [-0.16, -0.06] | 6.94e+08 | 0.002**  
native            |  household_income |      0.03 | [-0.02,  0.08] | 6.07e+08 | > .999   
native            |      smoking_prop |     -0.18 | [-0.23, -0.14] | 7.41e+08 | < .001***
native            |     drinking_prop |      0.10 | [ 0.05,  0.15] | 5.63e+08 | 0.010*   
native            |         dentist_n |     -0.08 | [-0.13, -0.03] | 6.78e+08 | 0.110    
asian             |          hispanic |      0.51 | [ 0.48,  0.55] | 3.04e+08 | < .001***
asian             |             white |     -0.46 | [-0.50, -0.42] | 9.14e+08 | < .001***
asian             |            female |      0.24 | [ 0.19,  0.28] | 4.78e+08 | < .001***
asian             |             rural |     -0.71 | [-0.73, -0.68] | 1.07e+09 | < .001***
asian             |        inactivity |     -0.47 | [-0.51, -0.43] | 9.19e+08 | < .001***
asian             |           obesity |     -0.50 | [-0.54, -0.46] | 9.41e+08 | < .001***
asian             |  household_income |      0.53 | [ 0.49,  0.57] | 2.92e+08 | < .001***
asian             |      smoking_prop |     -0.36 | [-0.40, -0.32] | 8.51e+08 | < .001***
asian             |     drinking_prop |      0.36 | [ 0.31,  0.40] | 4.00e+08 | < .001***
asian             |         dentist_n |     -0.59 | [-0.62, -0.55] | 9.92e+08 | < .001***
hispanic          |             white |     -0.65 | [-0.68, -0.62] | 1.03e+09 | < .001***
hispanic          |            female |     -0.03 | [-0.08,  0.02] | 6.44e+08 | > .999   
hispanic          |             rural |     -0.34 | [-0.39, -0.30] | 8.40e+08 | < .001***
hispanic          |        inactivity |     -0.20 | [-0.25, -0.15] | 7.51e+08 | < .001***
hispanic          |           obesity |     -0.30 | [-0.34, -0.25] | 8.10e+08 | < .001***
hispanic          |  household_income |      0.29 | [ 0.24,  0.34] | 4.44e+08 | < .001***
hispanic          |      smoking_prop |     -0.40 | [-0.45, -0.36] | 8.78e+08 | < .001***
hispanic          |     drinking_prop |      0.11 | [ 0.06,  0.16] | 5.59e+08 | 0.004**  
hispanic          |         dentist_n |     -0.25 | [-0.30, -0.20] | 7.82e+08 | < .001***
white             |            female |     -0.19 | [-0.24, -0.14] | 7.44e+08 | < .001***
white             |             rural |      0.32 | [ 0.27,  0.37] | 4.24e+08 | < .001***
white             |        inactivity |     -0.11 | [-0.16, -0.06] | 6.95e+08 | 0.002**  
white             |           obesity |     -0.02 | [-0.07,  0.03] | 6.40e+08 | > .999   
white             |  household_income | -5.93e-04 | [-0.05,  0.05] | 6.26e+08 | > .999   
white             |      smoking_prop |      0.03 | [-0.02,  0.08] | 6.08e+08 | > .999   
white             |     drinking_prop |      0.13 | [ 0.08,  0.18] | 5.42e+08 | < .001***
white             |         dentist_n |      0.17 | [ 0.12,  0.22] | 5.22e+08 | < .001***
female            |             rural |     -0.38 | [-0.43, -0.34] | 8.66e+08 | < .001***
female            |        inactivity |      0.03 | [-0.02,  0.08] | 6.09e+08 | > .999   
female            |           obesity | -9.56e-03 | [-0.06,  0.04] | 6.31e+08 | > .999   
female            |  household_income | -1.80e-03 | [-0.05,  0.05] | 6.27e+08 | > .999   
female            |      smoking_prop |      0.08 | [ 0.03,  0.13] | 5.75e+08 | 0.181    
female            |     drinking_prop |     -0.17 | [-0.21, -0.12] | 7.29e+08 | < .001***
female            |         dentist_n |     -0.23 | [-0.28, -0.18] | 7.69e+08 | < .001***
rural             |        inactivity |      0.38 | [ 0.34,  0.43] | 3.85e+08 | < .001***
rural             |           obesity |      0.39 | [ 0.35,  0.43] | 3.81e+08 | < .001***
rural             |  household_income |     -0.47 | [-0.51, -0.43] | 9.22e+08 | < .001***
rural             |      smoking_prop |      0.24 | [ 0.19,  0.28] | 4.78e+08 | < .001***
rural             |     drinking_prop |     -0.33 | [-0.38, -0.29] | 8.33e+08 | < .001***
rural             |         dentist_n |      0.66 | [ 0.63,  0.69] | 2.12e+08 | < .001***
inactivity        |           obesity |      0.79 | [ 0.77,  0.81] | 1.29e+08 | < .001***
inactivity        |  household_income |     -0.76 | [-0.78, -0.74] | 1.10e+09 | < .001***
inactivity        |      smoking_prop |      0.71 | [ 0.68,  0.73] | 1.83e+08 | < .001***
inactivity        |     drinking_prop |     -0.62 | [-0.65, -0.58] | 1.01e+09 | < .001***
inactivity        |         dentist_n |      0.50 | [ 0.46,  0.53] | 3.14e+08 | < .001***
obesity           |  household_income |     -0.66 | [-0.69, -0.63] | 1.04e+09 | < .001***
obesity           |      smoking_prop |      0.62 | [ 0.58,  0.65] | 2.40e+08 | < .001***
obesity           |     drinking_prop |     -0.49 | [-0.53, -0.45] | 9.35e+08 | < .001***
obesity           |         dentist_n |      0.45 | [ 0.40,  0.49] | 3.47e+08 | < .001***
household_income  |      smoking_prop |     -0.66 | [-0.69, -0.63] | 1.04e+09 | < .001***
household_income  |     drinking_prop |      0.61 | [ 0.58,  0.64] | 2.45e+08 | < .001***
household_income  |         dentist_n |     -0.42 | [-0.47, -0.38] | 8.91e+08 | < .001***
smoking_prop      |     drinking_prop |     -0.47 | [-0.51, -0.43] | 9.18e+08 | < .001***
smoking_prop      |         dentist_n |      0.32 | [ 0.27,  0.37] | 4.25e+08 | < .001***
drinking_prop     |         dentist_n |     -0.39 | [-0.43, -0.35] | 8.69e+08 | < .001***

p-value adjustment method: Holm (1979)
Observations: 1554

The variables with a correlation of 0.8 or more (or -0.8 or less) are:

poor_health with poor_physical, life_expectancy, diabetes, inactivity, and obesity
poor_physical with poor_mental, life_expectancy, diabetes, and inactivity
diabetes with inactivity

The variables that exhibited high correlation with the outcome smoking_prop will be removed from the analysis (poor_health, poor_physical, and diabetes).

Code

chr_fin <- chr_simp |>
  dplyr::select(-fipscode, -poor_health, -poor_physical, -inactivity)

dim(chr_fin)

[1] 1554   31

4 Table 1

setting up table 1:

Code

myVars  <- c("poor_mental", "food_index",
                   "uninsured", "primary_care", "mental_care",
                   "highschool_grad", "some_college", "unemployment", "income_inequality",
                   "social_score", "life_expectancy", "insuff_sleep",
                   "home_ownership", "housing_burden",
                   "below_18","above_65", "black", "native", "asian", "hispanic",
                   "white", "female", "rural", "obesity", "household_income", "smoking_prop", "drinking_prop")


t1 <- tableone::CreateTableOne(
          data = chr_fin, 
          vars = myVars,
          strata = c("dentist_grp"))

table_one_text <- print(
  t1,
  showAllLevels = TRUE,
  quote         = FALSE,
  noSpaces      = FALSE,
  printToggle   = FALSE  
)


knitr::kable(
  table_one_text,
  format   = "html",
  caption  = "Comparison by Dentist Proportion"
)

Comparison by Dentist Proportion
	High	Low	p
n	777	777
poor_mental (mean (SD))	5.02 (0.59)	5.35 (0.64)	<0.001
food_index (mean (SD))	7.90 (1.18)	7.19 (1.38)	<0.001
uninsured (mean (SD))	0.09 (0.04)	0.13 (0.06)	<0.001
primary_care (mean (SD))	0.00 (0.00)	0.00 (0.00)	<0.001
mental_care (mean (SD))	547.08 (975.87)	2390.97 (2778.69)	<0.001
highschool_grad (mean (SD))	0.86 (0.07)	0.89 (0.08)	<0.001
some_college (mean (SD))	0.67 (0.10)	0.54 (0.12)	<0.001
unemployment (mean (SD))	0.03 (0.01)	0.04 (0.01)	0.151
income_inequality (mean (SD))	4.54 (0.74)	4.61 (0.95)	0.085
social_score (mean (SD))	10.97 (5.14)	10.85 (7.36)	0.701
life_expectancy (mean (SD))	77.26 (3.57)	74.98 (3.37)	<0.001
insuff_sleep (mean (SD))	0.33 (0.04)	0.35 (0.04)	<0.001
home_ownership (mean (SD))	0.68 (0.10)	0.76 (0.08)	<0.001
housing_burden (mean (SD))	0.13 (0.04)	0.10 (0.03)	<0.001
below_18 (mean (SD))	0.21 (0.03)	0.22 (0.04)	0.632
above_65 (mean (SD))	0.20 (0.05)	0.22 (0.05)	<0.001
black (mean (SD))	0.08 (0.12)	0.10 (0.16)	0.004
native (mean (SD))	0.04 (0.10)	0.02 (0.06)	<0.001
asian (mean (SD))	0.04 (0.05)	0.01 (0.01)	<0.001
hispanic (mean (SD))	0.11 (0.11)	0.10 (0.15)	0.036
white (mean (SD))	0.72 (0.20)	0.76 (0.21)	<0.001
female (mean (SD))	0.50 (0.02)	0.49 (0.03)	<0.001
rural (mean (SD))	0.38 (0.33)	0.89 (0.19)	<0.001
obesity (mean (SD))	0.34 (0.05)	0.39 (0.04)	<0.001
household_income (mean (SD))	71992.92 (19356.32)	57529.52 (12919.78)	<0.001
smoking_prop (mean (SD))	16.41 (3.68)	18.78 (3.76)	<0.001
drinking_prop (mean (SD))	18.83 (3.24)	16.26 (2.91)	<0.001

The low dentist group counties have more proportions of uninsured populations (0.13 vs 0.09 ), lower college education (0.54 vs 0.67), lower life expectancy (74.9 vs 77.3), more rural areas (0.89 vs 0.38), more smoking prevalence (18.8 vs 16.4) and lower excessive drinking (16.3 vs 18.8).

5 Propensity Score Assessment

5.1 Calculating and Evaluating PS Balance

To calculate the PS, we will build a logistic model with the binary exposure as the outcome and the covariates as the predictors.

Code

psmodel0 <- glm(dentist_grp ~  poor_mental + food_index + uninsured + income_inequality +
                  primary_care + mental_care + highschool_grad + some_college + unemployment + 
                  social_score + life_expectancy + insuff_sleep + segregation + home_ownership +
                  housing_burden + below_18 + above_65 + black + native + asian + hispanic +
                  white + female + rural + obesity + household_income, 
                family = binomial(), data= chr_fin)

model_parameters(psmodel0) |> 
  gt() |> fmt_number(decimals = 3) |> 
  opt_stylize(style = 1, color = "pink")

Parameter	Coefficient	SE	CI	CI_low	CI_high	z	df_error	p
(Intercept)	−0.289	7.033	0.950	−14.370	13.621	−0.041	Inf	0.967
poor_mental	0.229	0.283	0.950	−0.325	0.785	0.811	Inf	0.417
food_index	−0.196	0.107	0.950	−0.409	0.012	−1.826	Inf	0.068
uninsured	12.491	3.018	0.950	6.653	18.494	4.139	Inf	0.000
income_inequality	0.143	0.161	0.950	−0.167	0.464	0.890	Inf	0.373
primary_care	−3,235.569	350.427	0.950	−3,940.793	−2,564.907	−9.233	Inf	0.000
mental_care	0.000	0.000	0.950	0.000	0.000	2.513	Inf	0.012
highschool_grad	−0.445	1.344	0.950	−3.212	2.094	−0.331	Inf	0.741
some_college	−0.572	1.270	0.950	−3.064	1.920	−0.451	Inf	0.652
unemployment	−18.456	10.540	0.950	−39.187	2.206	−1.751	Inf	0.080
social_score	−0.018	0.015	0.950	−0.048	0.011	−1.215	Inf	0.224
life_expectancy	−0.011	0.048	0.950	−0.104	0.085	−0.225	Inf	0.822
insuff_sleep	3.596	5.940	0.950	−8.048	15.264	0.605	Inf	0.545
segregation	−0.022	0.007	0.950	−0.035	−0.009	−3.336	Inf	0.001
home_ownership	6.351	2.013	0.950	2.421	10.306	3.155	Inf	0.002
housing_burden	−6.008	4.090	0.950	−14.047	2.009	−1.469	Inf	0.142
below_18	−14.298	4.826	0.950	−23.729	−4.787	−2.963	Inf	0.003
above_65	−14.052	3.834	0.950	−21.546	−6.495	−3.665	Inf	0.000
black	−1.113	4.465	0.950	−10.265	8.398	−0.249	Inf	0.803
native	−8.765	4.815	0.950	−18.599	1.319	−1.820	Inf	0.069
asian	−26.237	15.047	0.950	−57.152	−2.854	−1.744	Inf	0.081
hispanic	−2.042	4.303	0.950	−10.832	7.051	−0.474	Inf	0.635
white	−1.641	4.644	0.950	−11.106	8.138	−0.353	Inf	0.724
female	−1.140	5.360	0.950	−12.093	9.212	−0.213	Inf	0.832
rural	4.280	0.517	0.950	3.288	5.316	8.282	Inf	0.000
obesity	9.328	3.804	0.950	1.903	16.834	2.452	Inf	0.014
household_income	0.000	0.000	0.950	0.000	0.000	0.812	Inf	0.417

5.1.1 Saving Propesnsity Score

Code

chr_fin$ps0 <- psmodel0$fitted

chr_fin$linps0 <- psmodel0$linear.predictors

5.1.2 Comparing The PS Distribution of The Two Exposure Groups

Code

df_stats(ps0 + linps0 ~ dentist_grp, data = chr_fin) |>
  gt() |> fmt_number(columns = c(-n, -missing), decimals = 3) |> 
  opt_stylize(style = 2, color = "pink")

response	dentist_grp	min	Q1	median	Q3	max	mean	sd	n
ps0	High	0.000	0.004	0.023	0.124	0.991	0.139	0.251	777
ps0	Low	0.014	0.829	0.943	0.984	1.000	0.861	0.193	777
linps0	High	−19.766	−5.642	−3.767	−1.952	4.666	−3.895	3.356	777
linps0	Low	−4.267	1.579	2.807	4.149	9.743	2.815	1.937	777

The propensity scores here are concerning. We will need to observe the value of inflation to observe potential collinearity between the covariates.

The VIF:

Code

vif_model <- glm(dentist_grp ~  poor_mental + food_index + uninsured + income_inequality +
                  primary_care + mental_care + highschool_grad + some_college + unemployment + 
                  social_score + life_expectancy + insuff_sleep + segregation + home_ownership +
                  housing_burden + below_18 + above_65 + black + native + asian + hispanic +
                  white + female + rural + obesity + household_income, 
                family = binomial(), data= chr_fin)

vif(vif_model)

      poor_mental        food_index         uninsured income_inequality 
         3.775083          2.288975          2.291229          2.039530 
     primary_care       mental_care   highschool_grad      some_college 
         1.129137          1.219127          1.369741          2.204048 
     unemployment      social_score   life_expectancy      insuff_sleep 
         1.920851          1.500884          3.128986          4.694362 
      segregation    home_ownership    housing_burden          below_18 
         1.337250          3.200085          2.076655          3.673811 
         above_65             black            native             asian 
         4.596523         27.541605         21.671785          2.526745 
         hispanic             white            female             rural 
        37.921934         87.584373          1.763807          2.417255 
          obesity  household_income 
         2.243732          3.681487

While most of the variables have low variance inflation factors, there are a few with an extremely high values (>10), which presents serious multi-collinearity. These variables are hispanic, white, black, and native. We will remove these variables from our analysis.

5.1.3 Fixed Propensity Score

We will start by excluding the problematic variables from our final data set chr_fin to produce subset for outcome 1 analysis chr24_smoking.

Code

chr_fin1 <- chr_fin|>
  dplyr::select(-hispanic, -white, -black, -native, -ps0, -linps0)

Next, we will build the new model:

Code

psmodel00 <- glm(dentist_grp ~  poor_mental + food_index + uninsured + income_inequality +
                  primary_care + mental_care + highschool_grad + some_college + unemployment + 
                  social_score + life_expectancy + insuff_sleep + segregation + home_ownership +
                  housing_burden + below_18 + above_65 + asian +
                  female + rural + obesity + household_income,  
                family = binomial(), data= chr_fin1)

model_parameters(psmodel00) |> 
  gt() |> fmt_number(decimals = 3) |> 
  opt_stylize(style = 1, color = "pink")

Parameter	Coefficient	SE	CI	CI_low	CI_high	z	df_error	p
(Intercept)	−10.190	5.276	0.950	−20.584	0.116	−1.931	Inf	0.053
poor_mental	0.175	0.256	0.950	−0.327	0.679	0.683	Inf	0.495
food_index	−0.128	0.100	0.950	−0.325	0.066	−1.282	Inf	0.200
uninsured	10.767	2.457	0.950	6.056	15.695	4.382	Inf	0.000
income_inequality	−0.046	0.144	0.950	−0.327	0.240	−0.322	Inf	0.748
primary_care	−3,322.832	344.040	0.950	−4,015.515	−2,664.968	−9.658	Inf	0.000
mental_care	0.000	0.000	0.950	0.000	0.000	2.891	Inf	0.004
highschool_grad	1.728	1.117	0.950	−0.492	3.896	1.547	Inf	0.122
some_college	−0.886	1.213	0.950	−3.269	1.492	−0.730	Inf	0.465
unemployment	−27.947	9.386	0.950	−46.423	−9.585	−2.978	Inf	0.003
social_score	−0.010	0.015	0.950	−0.039	0.019	−0.700	Inf	0.484
life_expectancy	0.068	0.045	0.950	−0.019	0.158	1.516	Inf	0.129
insuff_sleep	10.463	5.100	0.950	0.488	20.507	2.052	Inf	0.040
segregation	−0.024	0.006	0.950	−0.036	−0.012	−3.914	Inf	0.000
home_ownership	7.180	1.794	0.950	3.689	10.737	4.002	Inf	0.000
housing_burden	−2.289	3.806	0.950	−9.742	5.197	−0.602	Inf	0.548
below_18	−17.897	4.356	0.950	−26.485	−9.390	−4.109	Inf	0.000
above_65	−13.396	3.645	0.950	−20.548	−6.244	−3.675	Inf	0.000
asian	−26.238	12.986	0.950	−52.511	−5.981	−2.020	Inf	0.043
female	−0.123	5.055	0.950	−10.218	9.945	−0.024	Inf	0.981
rural	3.693	0.456	0.950	2.814	4.602	8.104	Inf	0.000
obesity	8.389	3.681	0.950	1.204	15.652	2.279	Inf	0.023
household_income	0.000	0.000	0.950	0.000	0.000	−0.281	Inf	0.779

5.1.4 Saving Propesnsity Score

Now we will save the new propensity scores:

Code

chr_fin1$ps00 <- psmodel00$fitted

chr_fin1$linps00 <- psmodel00$linear.predictors

5.1.5 Comparing The PS Distribution of The Two Exposure Groups

Code

df_stats(ps00 + linps00 ~ dentist_grp, data = chr_fin1) |>
  gt() |> fmt_number(columns = c(-n, -missing), decimals = 3) |> 
  opt_stylize(style = 2, color = "pink")

response	dentist_grp	min	Q1	median	Q3	max	mean	sd	n
ps00	High	0.000	0.004	0.026	0.137	0.992	0.145	0.254	777
ps00	Low	0.014	0.828	0.936	0.982	1.000	0.855	0.195	777
linps00	High	−20.056	−5.575	−3.626	−1.842	4.880	−3.839	3.430	777
linps00	Low	−4.238	1.570	2.689	4.015	10.800	2.707	1.906	777

The propensity scores slightly improved after removing the variables with high collinearity values. However, the problem with their values being too close to 0 or 1 is visibly persistent.

5.1.5.1 Fixed Propensity Score - Part 2

We will attempt to observe the impact of each covariate on the outcome individually to rule out the variable(s) that impacted the PS values.

5.1.5.1.1 `poor_mental`

Code

psmodel_pm <- glm(dentist_grp ~  poor_mental, 
                family = binomial(), data= chr_fin1)

model_parameters(psmodel_pm) |> 
  gt() |> fmt_number(decimals = 3) |> 
  opt_stylize(style = 1, color = "pink")

Parameter	Coefficient	SE	CI	CI_low	CI_high	z	df_error	p
(Intercept)	−4.579	0.458	0.950	−5.487	−3.692	−10.004	Inf	0.000
poor_mental	0.883	0.088	0.950	0.713	1.057	10.077	Inf	0.000

5.1.5.1.2 Saving Propesnsity Score

Now we will save the new propensity scores:

Code

chr_fin1$ps_pm <- psmodel_pm$fitted

chr_fin1$linps_pm <- psmodel_pm$linear.predictors

5.1.5.1.3 Comparing The PS Distribution of The Two Exposure Groups

Code

df_stats(ps_pm + linps_pm ~ dentist_grp, data = chr_fin1) |>
  gt() |> fmt_number(columns = c(-n, -missing), decimals = 3) |> 
  opt_stylize(style = 2, color = "pink")

response	dentist_grp	min	Q1	median	Q3	max	mean	sd	n
ps_pm	High	0.146	0.376	0.463	0.552	0.785	0.465	0.123	777
ps_pm	Low	0.150	0.448	0.549	0.632	0.875	0.535	0.132	777
linps_pm	High	−1.766	−0.505	−0.147	0.209	1.294	−0.151	0.523	777
linps_pm	Low	−1.738	−0.207	0.196	0.542	1.945	0.148	0.568	777

The poor_mental variable looks has two linear PS less than 0.01 and above 1.

5.1.5.1.4 `food_index`

Code

psmodel_fi <- glm(dentist_grp ~  food_index, 
                family = binomial(), data= chr_fin1)

model_parameters(psmodel_fi) |> 
  gt() |> fmt_number(decimals = 3) |> 
  opt_stylize(style = 1, color = "pink")

Parameter	Coefficient	SE	CI	CI_low	CI_high	z	df_error	p
(Intercept)	3.505	0.360	0.950	2.812	4.225	9.727	Inf	0.000
food_index	−0.462	0.047	0.950	−0.555	−0.372	−9.911	Inf	0.000

5.1.5.1.5 Saving Propesnsity Score

Now we will save the new propensity scores:

Code

chr_fin1$ps_fi <- psmodel_fi$fitted

chr_fin1$linps_fi <- psmodel_fi$linear.predictors

5.1.5.1.6 Comparing The PS Distribution of The Two Exposure Groups

Code

df_stats(ps_fi + linps_fi ~ dentist_grp, data = chr_fin1) |>
  gt() |> fmt_number(columns = c(-n, -missing), decimals = 3) |> 
  opt_stylize(style = 2, color = "pink")

response	dentist_grp	min	Q1	median	Q3	max	mean	sd	n
ps_fi	High	0.256	0.374	0.441	0.522	0.948	0.462	0.120	777
ps_fi	Low	0.247	0.441	0.522	0.612	0.971	0.538	0.137	777
linps_fi	High	−1.068	−0.514	−0.237	0.086	2.904	−0.144	0.545	777
linps_fi	Low	−1.115	−0.237	0.086	0.456	3.505	0.182	0.638	777

The food_index variable has two linear PS less than 0.01 and above 1. All PS values are close to 0.99 or above, therefore, it will be removed.

5.1.5.1.7 `uninsured`

Code

psmodel_un <- glm(dentist_grp ~  uninsured, 
                family = binomial(), data= chr_fin1)

model_parameters(psmodel_un) |> 
  gt() |> fmt_number(decimals = 3) |> 
  opt_stylize(style = 1, color = "pink")

Parameter	Coefficient	SE	CI	CI_low	CI_high	z	df_error	p
(Intercept)	−1.945	0.148	0.950	−2.240	−1.659	−13.136	Inf	0.000
uninsured	17.458	1.276	0.950	15.005	20.009	13.682	Inf	0.000

5.1.5.1.8 Saving Propesnsity Score

Now we will save the new propensity scores:

Code

chr_fin1$ps_un <- psmodel_un$fitted

chr_fin1$linps_un <- psmodel_un$linear.predictors

5.1.5.1.9 Comparing The PS Distribution of The Two Exposure Groups

Code

df_stats(ps_un + linps_un ~ dentist_grp, data = chr_fin1) |>
  gt() |> fmt_number(columns = c(-n, -missing), decimals = 3) |> 
  opt_stylize(style = 2, color = "pink")

response	dentist_grp	min	Q1	median	Q3	max	mean	sd	n
ps_un	High	0.178	0.303	0.382	0.530	0.944	0.427	0.159	777
ps_un	Low	0.232	0.406	0.559	0.727	0.998	0.573	0.193	777
linps_un	High	−1.528	−0.834	−0.481	0.120	2.832	−0.303	0.716	777
linps_un	Low	−1.197	−0.382	0.238	0.978	6.142	0.399	0.997	777

The uninsured variable has two linear PS less than 0.01 and above 1. All PS values are close to 0.99 or above, therefore, it will be removed.

5.1.5.1.10 `income_inequality`

Code

psmodel_ii <- glm(dentist_grp ~  income_inequality, 
                family = binomial(), data= chr_fin1)

model_parameters(psmodel_ii) |> 
  gt() |> fmt_number(decimals = 3) |> 
  opt_stylize(style = 1, color = "pink")

Parameter	Coefficient	SE	CI	CI_low	CI_high	z	df_error	p
(Intercept)	−0.471	0.279	0.950	−1.021	0.074	−1.689	Inf	0.091
income_inequality	0.103	0.060	0.950	−0.014	0.221	1.717	Inf	0.086

5.1.5.1.11 Saving Propesnsity Score

Now we will save the new propensity scores:

Code

chr_fin1$ps_ii <- psmodel_ii$fitted

chr_fin1$linps_ii <- psmodel_ii$linear.predictors

5.1.5.1.12 Comparing The PS Distribution of The Two Exposure Groups

Code

df_stats(ps_ii + linps_ii ~ dentist_grp, data = chr_fin1) |>
  gt() |> fmt_number(columns = c(-n, -missing), decimals = 3) |> 
  opt_stylize(style = 2, color = "pink")

response	dentist_grp	min	Q1	median	Q3	max	mean	sd	n
ps_ii	High	0.448	0.487	0.496	0.508	0.599	0.499	0.019	777
ps_ii	Low	0.425	0.485	0.497	0.512	0.649	0.501	0.024	777
linps_ii	High	−0.210	−0.052	−0.016	0.034	0.401	−0.004	0.076	777
linps_ii	Low	−0.301	−0.059	−0.011	0.050	0.614	0.004	0.098	777

The income_inequality has two linear propensity scores less than 0.01 and above 1.

5.1.5.1.13 `primary_care`

Code

psmodel_pc <- glm(dentist_grp ~  primary_care, 
                family = binomial(), data= chr_fin1)

Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred

Code

model_parameters(psmodel_pc) |> 
  gt() |> fmt_number(decimals = 3) |> 
  opt_stylize(style = 1, color = "pink")

Parameter	Coefficient	SE	CI	CI_low	CI_high	z	df_error	p
(Intercept)	3.190	0.167	0.950	2.872	3.525	19.152	Inf	0.000
primary_care	−5,829.786	288.884	0.950	−6,412.009	−5,278.912	−20.180	Inf	0.000

This variable has an exploding coefficient.

5.1.5.1.14 Saving Propesnsity Score

Now we will save the new propensity scores:

Code

chr_fin1$ps_pc <- psmodel_pc$fitted

chr_fin1$linps_pc <- psmodel_pc$linear.predictors

5.1.5.1.15 Comparing The PS Distribution of The Two Exposure Groups

Code

df_stats(ps_pc + linps_pc ~ dentist_grp, data = chr_fin1) |>
  gt() |> fmt_number(columns = c(-n, -missing), decimals = 3) |> 
  opt_stylize(style = 2, color = "pink")

response	dentist_grp	min	Q1	median	Q3	max	mean	sd	n
ps_pc	High	0.000	0.064	0.195	0.445	0.960	0.276	0.252	777
ps_pc	Low	0.001	0.627	0.815	0.901	0.960	0.724	0.237	777
linps_pc	High	−30.653	−2.687	−1.417	−0.222	3.190	−1.684	2.569	777
linps_pc	Low	−6.747	0.520	1.480	2.212	3.190	1.218	1.415	777

The primary_care variable has all propensity scores as less than 0.01 and above 1. This variable will be removed from the analysis.

5.1.5.1.16 `mental_care`

Code

psmodel_mc <- glm(dentist_grp ~  mental_care, 
                family = binomial(), data= chr_fin1)

model_parameters(psmodel_mc) |> 
  gt() |> fmt_number(decimals = 3) |> 
  opt_stylize(style = 1, color = "pink")

Parameter	Coefficient	SE	CI	CI_low	CI_high	z	df_error	p
(Intercept)	−1.232	0.087	0.950	−1.404	−1.064	−14.208	Inf	0.000
mental_care	0.001	0.000	0.950	0.001	0.001	15.128	Inf	0.000

5.1.5.1.17 Saving Propesnsity Score

Now we will save the new propensity scores:

Code

chr_fin1$ps_mc <- psmodel_mc$fitted

chr_fin1$linps_mc <- psmodel_mc$linear.predictors

5.1.5.1.18 Comparing The PS Distribution of The Two Exposure Groups

Code

df_stats(ps_mc + linps_mc ~ dentist_grp, data = chr_fin1) |>
  gt() |> fmt_number(columns = c(-n, -missing), decimals = 3) |> 
  opt_stylize(style = 2, color = "pink")

response	dentist_grp	min	Q1	median	Q3	max	mean	sd	n
ps_mc	High	0.092	0.276	0.299	0.339	1.000	0.345	0.148	777
ps_mc	Low	0.087	0.437	0.666	0.897	1.000	0.655	0.256	777
linps_mc	High	−2.293	−0.963	−0.850	−0.668	14.836	−0.576	1.171	777
linps_mc	Low	−2.349	−0.253	0.689	2.165	25.878	1.637	3.334	777

The mental_care variable has all ps values smaller than 0.01 and above 1. This variable will be removed from the analysis.

5.1.5.1.19 `highschool_grad`

Code

psmodel_hs <- glm(dentist_grp ~  highschool_grad, 
                family = binomial(), data= chr_fin1)

model_parameters(psmodel_hs) |> 
  gt() |> fmt_number(decimals = 3) |> 
  opt_stylize(style = 1, color = "pink")

Parameter	Coefficient	SE	CI	CI_low	CI_high	z	df_error	p
(Intercept)	−4.943	0.719	0.950	−6.382	−3.564	−6.872	Inf	0.000
highschool_grad	5.612	0.812	0.950	4.053	7.235	6.910	Inf	0.000

5.1.5.1.20 Saving Propesnsity Score

Now we will save the new propensity scores:

Code

chr_fin1$ps_hs <- psmodel_hs$fitted

chr_fin1$linps_hs <- psmodel_hs$linear.predictors

5.1.5.1.21 Comparing The PS Distribution of The Two Exposure Groups

Code

df_stats(ps_hs + linps_hs ~ dentist_grp, data = chr_fin1) |>
  gt() |> fmt_number(columns = c(-n, -missing), decimals = 3) |> 
  opt_stylize(style = 2, color = "pink")

response	dentist_grp	min	Q1	median	Q3	max	mean	sd	n
ps_hs	High	0.019	0.429	0.491	0.545	0.637	0.479	0.092	777
ps_hs	Low	0.019	0.492	0.540	0.573	0.647	0.521	0.090	777
linps_hs	High	−3.961	−0.288	−0.035	0.182	0.564	−0.093	0.411	777
linps_hs	Low	−3.961	−0.033	0.160	0.295	0.608	0.068	0.464	777

The highschool_grad variable has two PS value less than 0.01, but overall looks acceptable.

5.1.5.1.22 `some_college`

Code

psmodel_sc <- glm(dentist_grp ~  some_college, 
                family = binomial(), data= chr_fin1)

model_parameters(psmodel_sc) |> 
  gt() |> fmt_number(decimals = 3) |> 
  opt_stylize(style = 1, color = "pink")

Parameter	Coefficient	SE	CI	CI_low	CI_high	z	df_error	p
(Intercept)	6.212	0.360	0.950	5.521	6.934	17.250	Inf	0.000
some_college	−10.244	0.582	0.950	−11.410	−9.126	−17.587	Inf	0.000

5.1.5.1.23 Saving Propesnsity Score

Now we will save the new propensity scores:

Code

chr_fin1$ps_sc <- psmodel_sc$fitted

chr_fin1$linps_sc <- psmodel_sc$linear.predictors

5.1.5.1.24 Comparing The PS Distribution of The Two Exposure Groups

Code

df_stats(ps_sc + linps_sc ~ dentist_grp, data = chr_fin1) |>
  gt() |> fmt_number(columns = c(-n, -missing), decimals = 3) |> 
  opt_stylize(style = 2, color = "pink")

response	dentist_grp	min	Q1	median	Q3	max	mean	sd	n
ps_sc	High	0.035	0.218	0.322	0.490	0.983	0.366	0.200	777
ps_sc	Low	0.017	0.476	0.682	0.823	0.993	0.634	0.234	777
linps_sc	High	−3.320	−1.276	−0.744	−0.040	4.080	−0.641	1.015	777
linps_sc	Low	−4.031	−0.095	0.762	1.535	5.007	0.697	1.269	777

The some_college variable has two linear PS value less than 0.01 and 3 above 0.99. This variable will be removed.

5.1.5.1.25 `unemployment`

Code

psmodel_ue <- glm(dentist_grp ~  unemployment, 
                family = binomial(), data= chr_fin1)

model_parameters(psmodel_ue) |> 
  gt() |> fmt_number(decimals = 3) |> 
  opt_stylize(style = 1, color = "pink")

Parameter	Coefficient	SE	CI	CI_low	CI_high	z	df_error	p
(Intercept)	−0.221	0.162	0.950	−0.541	0.096	−1.364	Inf	0.173
unemployment	6.279	4.373	0.950	−2.273	14.890	1.436	Inf	0.151

5.1.5.1.26 Saving Propesnsity Score

Now we will save the new propensity scores:

Code

chr_fin1$ps_ue <- psmodel_ue$fitted

chr_fin1$linps_ue <- psmodel_ue$linear.predictors

5.1.5.1.27 Comparing The PS Distribution of The Two Exposure Groups

Code

df_stats(ps_ue + linps_ue ~ dentist_grp, data = chr_fin1) |>
  gt() |> fmt_number(columns = c(-n, -missing), decimals = 3) |> 
  opt_stylize(style = 2, color = "pink")

response	dentist_grp	min	Q1	median	Q3	max	mean	sd	n
ps_ue	High	0.466	0.486	0.497	0.509	0.595	0.499	0.018	777
ps_ue	Low	0.454	0.487	0.498	0.510	0.614	0.501	0.019	777
linps_ue	High	−0.137	−0.055	−0.014	0.036	0.383	−0.003	0.072	777
linps_ue	Low	−0.186	−0.051	−0.008	0.041	0.466	0.003	0.075	777

The unemployment variable has two linear PS values less than 0.01, but overall looks acceptable.

5.1.5.1.28 `social_score`

Code

psmodel_ss <- glm(dentist_grp ~  social_score, 
                family = binomial(), data= chr_fin1)

model_parameters(psmodel_ss) |> 
  gt() |> fmt_number(decimals = 3) |> 
  opt_stylize(style = 1, color = "pink")

Parameter	Coefficient	SE	CI	CI_low	CI_high	z	df_error	p
(Intercept)	0.034	0.101	0.950	−0.164	0.232	0.332	Inf	0.740
social_score	−0.003	0.008	0.950	−0.019	0.013	−0.384	Inf	0.701

5.1.5.1.29 Saving Propesnsity Score

Now we will save the new propensity scores:

Code

chr_fin1$ps_ss <- psmodel_ss$fitted

chr_fin1$linps_ss <- psmodel_ss$linear.predictors

5.1.5.1.30 Comparing The PS Distribution of The Two Exposure Groups

Code

df_stats(ps_ss + linps_ss ~ dentist_grp, data = chr_fin1) |>
  gt() |> fmt_number(columns = c(-n, -missing), decimals = 3) |> 
  opt_stylize(style = 2, color = "pink")

response	dentist_grp	min	Q1	median	Q3	max	mean	sd	n
ps_ss	High	0.479	0.498	0.500	0.502	0.508	0.500	0.004	777
ps_ss	Low	0.463	0.497	0.501	0.503	0.508	0.500	0.006	777
linps_ss	High	−0.083	−0.007	0.002	0.009	0.034	0.000	0.016	777
linps_ss	Low	−0.149	−0.011	0.002	0.014	0.034	0.000	0.023	777

The social_score has two linear PS values less than 0.01, but overall looks acceptable.

5.1.5.1.31 `life_expectancy`

Code

psmodel_le <- glm(dentist_grp ~  life_expectancy, 
                family = binomial(), data= chr_fin1)

model_parameters(psmodel_le) |> 
  gt() |> fmt_number(decimals = 3) |> 
  opt_stylize(style = 1, color = "pink")

Parameter	Coefficient	SE	CI	CI_low	CI_high	z	df_error	p
(Intercept)	15.042	1.291	0.950	12.551	17.613	11.652	Inf	0.000
life_expectancy	−0.198	0.017	0.950	−0.231	−0.165	−11.665	Inf	0.000

5.1.5.1.32 Saving Propesnsity Score

Now we will save the new propensity scores:

Code

chr_fin1$ps_le <- psmodel_le$fitted

chr_fin1$linps_le <- psmodel_le$linear.predictors

5.1.5.1.33 Comparing The PS Distribution of The Two Exposure Groups

Code

df_stats(ps_le + linps_le ~ dentist_grp, data = chr_fin1) |>
  gt() |> fmt_number(columns = c(-n, -missing), decimals = 3) |> 
  opt_stylize(style = 2, color = "pink")

response	dentist_grp	min	Q1	median	Q3	max	mean	sd	n
ps_le	High	0.034	0.344	0.433	0.532	0.982	0.447	0.149	777
ps_le	Low	0.049	0.446	0.568	0.664	0.960	0.553	0.148	777
linps_le	High	−3.333	−0.644	−0.268	0.128	3.994	−0.222	0.705	777
linps_le	Low	−2.967	−0.215	0.273	0.680	3.167	0.227	0.665	777

The life_expectancy variable has two linear PS less than 0.01 and above 1. The other PS values are also problematic, therefore, it will be removed.

5.1.5.1.34 `insuff_sleep`

Code

psmodel_is <- glm(dentist_grp ~  insuff_sleep, 
                family = binomial(), data= chr_fin1)

model_parameters(psmodel_is) |> 
  gt() |> fmt_number(decimals = 3) |> 
  opt_stylize(style = 1, color = "pink")

Parameter	Coefficient	SE	CI	CI_low	CI_high	z	df_error	p
(Intercept)	−6.637	0.536	0.950	−7.704	−5.600	−12.371	Inf	0.000
insuff_sleep	19.431	1.565	0.950	16.407	22.546	12.412	Inf	0.000

5.1.5.1.35 Saving Propesnsity Score

Now we will save the new propensity scores:

Code

chr_fin1$ps_is <- psmodel_is$fitted

chr_fin1$linps_is <- psmodel_is$linear.predictors

5.1.5.1.36 Comparing The PS Distribution of The Two Exposure Groups

Code

df_stats(ps_is + linps_is ~ dentist_grp, data = chr_fin1) |>
  gt() |> fmt_number(columns = c(-n, -missing), decimals = 3) |> 
  opt_stylize(style = 2, color = "pink")

response	dentist_grp	min	Q1	median	Q3	max	mean	sd	n
ps_is	High	0.118	0.334	0.425	0.541	0.880	0.444	0.154	777
ps_is	Low	0.190	0.439	0.560	0.674	0.922	0.556	0.160	777
linps_is	High	−2.012	−0.691	−0.303	0.164	1.990	−0.239	0.685	777
linps_is	Low	−1.449	−0.244	0.242	0.727	2.476	0.254	0.719	777

The insuff_sleep variable has two linear PS less than 0.01 and above 1.

5.1.5.1.37 `home_ownership`

Code

psmodel_ho <- glm(dentist_grp ~  home_ownership, 
                family = binomial(), data= chr_fin1)

model_parameters(psmodel_ho) |> 
  gt() |> fmt_number(decimals = 3) |> 
  opt_stylize(style = 1, color = "pink")

Parameter	Coefficient	SE	CI	CI_low	CI_high	z	df_error	p
(Intercept)	−8.754	0.592	0.950	−9.940	−7.618	−14.787	Inf	0.000
home_ownership	12.103	0.809	0.950	10.551	13.723	14.966	Inf	0.000

5.1.5.1.38 Saving Propesnsity Score

Now we will save the new propensity scores:

Code

chr_fin1$ps_ho <- psmodel_ho$fitted

chr_fin1$linps_ho <- psmodel_ho$linear.predictors

5.1.5.1.39 Comparing The PS Distribution of The Two Exposure Groups

Code

df_stats(ps_ho + linps_ho ~ dentist_grp, data = chr_fin1) |>
  gt() |> fmt_number(columns = c(-n, -missing), decimals = 3) |> 
  opt_stylize(style = 2, color = "pink")

response	dentist_grp	min	Q1	median	Q3	max	mean	sd	n
ps_ho	High	0.003	0.240	0.397	0.556	0.897	0.401	0.207	777
ps_ho	Low	0.000	0.507	0.626	0.726	0.927	0.599	0.175	777
linps_ho	High	−5.810	−1.150	−0.418	0.227	2.169	−0.545	1.155	777
linps_ho	Low	−8.754	0.029	0.514	0.974	2.544	0.415	0.945	777

The home_ownership variable has all PS values in problematic levels. Therefore, it will be removed.

5.1.5.1.40 `housing_burden`

Code

psmodel_hb <- glm(dentist_grp ~  housing_burden, 
                family = binomial(), data= chr_fin1)

model_parameters(psmodel_hb) |> 
  gt() |> fmt_number(decimals = 3) |> 
  opt_stylize(style = 1, color = "pink")

Parameter	Coefficient	SE	CI	CI_low	CI_high	z	df_error	p
(Intercept)	2.601	0.196	0.950	2.223	2.991	13.274	Inf	0.000
housing_burden	−23.546	1.724	0.950	−26.990	−20.228	−13.659	Inf	0.000

5.1.5.1.41 Saving Propesnsity Score

Now we will save the new propensity scores:

Code

chr_fin1$ps_hb <- psmodel_hb$fitted

chr_fin1$linps_hb <- psmodel_hb$linear.predictors

5.1.5.1.42 Comparing The PS Distribution of The Two Exposure Groups

Code

df_stats(ps_hb + linps_hb ~ dentist_grp, data = chr_fin1) |>
  gt() |> fmt_number(columns = c(-n, -missing), decimals = 3) |> 
  opt_stylize(style = 2, color = "pink")

response	dentist_grp	min	Q1	median	Q3	max	mean	sd	n
ps_hb	High	0.030	0.296	0.428	0.562	0.890	0.427	0.181	777
ps_hb	Low	0.017	0.476	0.598	0.690	0.921	0.573	0.167	777
linps_hb	High	−3.483	−0.868	−0.289	0.250	2.090	−0.358	0.878	777
linps_hb	Low	−4.037	−0.097	0.397	0.801	2.461	0.312	0.785	777

The housing_burden variable has two linear PS less than 0.01 and above 1. All PS values are close to 0.99 or above, therefore, it will be removed.

5.1.5.1.43 `below_18`

Code

psmodel_b18 <- glm(dentist_grp ~  below_18, 
                family = binomial(), data= chr_fin1)

model_parameters(psmodel_b18) |> 
  gt() |> fmt_number(decimals = 3) |> 
  opt_stylize(style = 1, color = "pink")

Parameter	Coefficient	SE	CI	CI_low	CI_high	z	df_error	p
(Intercept)	−0.147	0.311	0.950	−0.758	0.463	−0.472	Inf	0.637
below_18	0.684	1.430	0.950	−2.118	3.492	0.479	Inf	0.632

5.1.5.1.44 Saving Propesnsity Score

Now we will save the new propensity scores:

Code

chr_fin1$ps_b18 <- psmodel_b18$fitted

chr_fin1$linps_b18 <- psmodel_b18$linear.predictors

5.1.5.1.45 Comparing The PS Distribution of The Two Exposure Groups

Code

df_stats(ps_b18 + linps_b18 ~ dentist_grp, data = chr_fin1) |>
  gt() |> fmt_number(columns = c(-n, -missing), decimals = 3) |> 
  opt_stylize(style = 2, color = "pink")

response	dentist_grp	min	Q1	median	Q3	max	mean	sd	n
ps_b18	High	0.480	0.496	0.500	0.503	0.526	0.500	0.006	777
ps_b18	Low	0.467	0.497	0.500	0.503	0.525	0.500	0.006	777
linps_b18	High	−0.081	−0.015	0.000	0.012	0.106	0.000	0.024	777
linps_b18	Low	−0.130	−0.013	0.001	0.013	0.101	0.000	0.025	777

The below_18 variable has two linear PS values less than 0.01 (0.00). Otherwise it looks acceptable.

5.1.5.1.46 `above_65`

Code

psmodel_a65 <- glm(dentist_grp ~  above_65, 
                family = binomial(), data= chr_fin1)

model_parameters(psmodel_a65) |> 
  gt() |> fmt_number(decimals = 3) |> 
  opt_stylize(style = 1, color = "pink")

Parameter	Coefficient	SE	CI	CI_low	CI_high	z	df_error	p
(Intercept)	−1.663	0.228	0.950	−2.115	−1.220	−7.286	Inf	0.000
above_65	8.110	1.088	0.950	6.001	10.268	7.454	Inf	0.000

5.1.5.1.47 Saving Propesnsity Score

Now we will save the new propensity scores:

Code

chr_fin1$ps_a65 <- psmodel_a65$fitted

chr_fin1$linps_a65 <- psmodel_a65$linear.predictors

5.1.5.1.48 Comparing The PS Distribution of The Two Exposure Groups

Code

df_stats(ps_a65 + linps_a65 ~ dentist_grp, data = chr_fin1) |>
  gt() |> fmt_number(columns = c(-n, -missing), decimals = 3) |> 
  opt_stylize(style = 2, color = "pink")

response	dentist_grp	min	Q1	median	Q3	max	mean	sd	n
ps_a65	High	0.228	0.416	0.464	0.534	0.805	0.481	0.097	777
ps_a65	Low	0.264	0.462	0.511	0.576	0.866	0.519	0.093	777
linps_a65	High	−1.222	−0.340	−0.146	0.136	1.418	−0.076	0.404	777
linps_a65	Low	−1.025	−0.153	0.046	0.307	1.865	0.082	0.395	777

The above_65 variable has two linear PS less than 0.01 and above 1.

5.1.5.1.49 `asian`

Code

psmodel_as <- glm(dentist_grp ~  asian, 
                family = binomial(), data= chr_fin1)

Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred

Code

model_parameters(psmodel_as) |> 
  gt() |> fmt_number(decimals = 3) |> 
  opt_stylize(style = 1, color = "pink")

Parameter	Coefficient	SE	CI	CI_low	CI_high	z	df_error	p
(Intercept)	1.692	0.108	0.950	1.484	1.908	15.654	Inf	0.000
asian	−132.404	9.115	0.950	−150.928	−115.192	−14.526	Inf	0.000

5.1.5.1.50 Saving Propesnsity Score

Now we will save the new propensity scores:

Code

chr_fin1$ps_as <- psmodel_as$fitted

chr_fin1$linps_as <- psmodel_as$linear.predictors

5.1.5.1.51 Comparing The PS Distribution of The Two Exposure Groups

Code

df_stats(ps_as + linps_as ~ dentist_grp, data = chr_fin1) |>
  gt() |> fmt_number(columns = c(-n, -missing), decimals = 3) |> 
  opt_stylize(style = 2, color = "pink")

response	dentist_grp	min	Q1	median	Q3	max	mean	sd	n
ps_as	High	0.000	0.023	0.337	0.628	0.844	0.333	0.287	777
ps_as	Low	0.000	0.639	0.711	0.764	0.844	0.667	0.150	777
linps_as	High	−54.682	−3.745	−0.678	0.524	1.692	−3.050	6.809	777
linps_as	Low	−10.225	0.571	0.903	1.175	1.692	0.691	0.919	777

The asian variable has multiple PS values in problematic levels, therefore, it will be removed.

5.1.5.1.52 `female`

Code

psmodel_fm <- glm(dentist_grp ~  female, 
                family = binomial(), data= chr_fin1)

model_parameters(psmodel_fm) |> 
  gt() |> fmt_number(decimals = 3) |> 
  opt_stylize(style = 1, color = "pink")

Parameter	Coefficient	SE	CI	CI_low	CI_high	z	df_error	p
(Intercept)	11.219	1.477	0.950	8.411	14.199	7.598	Inf	0.000
female	−22.627	2.969	0.950	−28.616	−16.976	−7.620	Inf	0.000

5.1.5.1.53 Saving Propesnsity Score

Now we will save the new propensity scores:

Code

chr_fin1$ps_fm <- psmodel_fm$fitted

chr_fin1$linps_fm <- psmodel_fm$linear.predictors

5.1.5.1.54 Comparing The PS Distribution of The Two Exposure Groups

Code

df_stats(ps_fm + linps_fm ~ dentist_grp, data = chr_fin1) |>
  gt() |> fmt_number(columns = c(-n, -missing), decimals = 3) |> 
  opt_stylize(style = 2, color = "pink")

response	dentist_grp	min	Q1	median	Q3	max	mean	sd	n
ps_fm	High	0.232	0.428	0.467	0.510	0.990	0.476	0.084	777
ps_fm	Low	0.136	0.453	0.502	0.558	0.998	0.524	0.120	777
linps_fm	High	−1.195	−0.288	−0.133	0.038	4.628	−0.090	0.400	777
linps_fm	Low	−1.851	−0.187	0.010	0.234	6.339	0.135	0.649	777

The female variable has two linear PS less than 0.01 all PS values at or above 0.99. Therefore, it will be removed.

5.1.5.1.55 `rural`

Code

psmodel_rl <- glm(dentist_grp ~  rural, 
                family = binomial(), data= chr_fin1)

model_parameters(psmodel_rl) |> 
  gt() |> fmt_number(decimals = 3) |> 
  opt_stylize(style = 1, color = "pink")

Parameter	Coefficient	SE	CI	CI_low	CI_high	z	df_error	p
(Intercept)	−3.521	0.185	0.950	−3.894	−3.170	−19.076	Inf	0.000
rural	5.291	0.240	0.950	4.834	5.773	22.093	Inf	0.000

5.1.5.1.56 Saving Propesnsity Score

Now we will save the new propensity scores:

Code

chr_fin1$ps_rl <- psmodel_rl$fitted

chr_fin1$linps_rl <- psmodel_rl$linear.predictors

5.1.5.1.57 Comparing The PS Distribution of The Two Exposure Groups

Code

df_stats(ps_rl + linps_rl ~ dentist_grp, data = chr_fin1) |>
  gt() |> fmt_number(columns = c(-n, -missing), decimals = 3) |> 
  opt_stylize(style = 2, color = "pink")

response	dentist_grp	min	Q1	median	Q3	max	mean	sd	n
ps_rl	High	0.029	0.049	0.118	0.366	0.854	0.262	0.292	777
ps_rl	Low	0.029	0.658	0.854	0.854	0.854	0.738	0.202	777
linps_rl	High	−3.521	−2.958	−2.011	−0.552	1.770	−1.525	1.755	777
linps_rl	Low	−3.493	0.653	1.770	1.770	1.770	1.176	1.018	777

The rural variable has two linear PS less than 0.01 and value above 1. We might remove this variable.

5.1.5.1.58 `obesity`

Code

psmodel_ob <- glm(dentist_grp ~  obesity, 
                family = binomial(), data= chr_fin1)

model_parameters(psmodel_ob) |> 
  gt() |> fmt_number(decimals = 3) |> 
  opt_stylize(style = 1, color = "pink")

Parameter	Coefficient	SE	CI	CI_low	CI_high	z	df_error	p
(Intercept)	−8.412	0.549	0.950	−9.512	−7.360	−15.327	Inf	0.000
obesity	22.823	1.472	0.950	20.000	25.773	15.504	Inf	0.000

5.1.5.1.59 Saving Propesnsity Score

Now we will save the new propensity scores:

Code

chr_fin1$ps_ob <- psmodel_ob$fitted

chr_fin1$linps_ob <- psmodel_ob$linear.predictors

5.1.5.1.60 Comparing The PS Distribution of The Two Exposure Groups

Code

df_stats(ps_ob + linps_ob ~ dentist_grp, data = chr_fin1) |>
  gt() |> fmt_number(columns = c(-n, -missing), decimals = 3) |> 
  opt_stylize(style = 2, color = "pink")

response	dentist_grp	min	Q1	median	Q3	max	mean	sd	n
ps_ob	High	0.012	0.220	0.390	0.565	0.944	0.399	0.221	777
ps_ob	Low	0.041	0.491	0.609	0.734	0.973	0.601	0.174	777
linps_ob	High	−4.440	−1.268	−0.446	0.261	2.817	−0.542	1.175	777
linps_ob	Low	−3.162	−0.036	0.444	1.014	3.570	0.458	0.850	777

The obesity variable has two linear PS less than 0.01 and all PS values around 0.99. We will remove this variable.

5.1.5.1.61 `segregation`

Code

psmodel_se <- glm(dentist_grp ~  segregation, 
                family = binomial(), data= chr_fin1)

model_parameters(psmodel_se) |> 
  gt() |> fmt_number(decimals = 3) |> 
  opt_stylize(style = 1, color = "pink")

Parameter	Coefficient	SE	CI	CI_low	CI_high	z	df_error	p
(Intercept)	2.099	0.185	0.950	1.742	2.468	11.339	Inf	0.000
segregation	−0.041	0.003	0.950	−0.048	−0.035	−11.916	Inf	0.000

5.1.5.1.62 Saving Propesnsity Score

Now we will save the new propensity scores:

Code

chr_fin1$ps_se <- psmodel_se$fitted

chr_fin1$linps_se <- psmodel_se$linear.predictors

5.1.5.1.63 Comparing The PS Distribution of The Two Exposure Groups

Code

df_stats(ps_se + linps_se ~ dentist_grp, data = chr_fin1) |>
  gt() |> fmt_number(columns = c(-n, -missing), decimals = 3) |> 
  opt_stylize(style = 2, color = "pink")

response	dentist_grp	min	Q1	median	Q3	max	mean	sd	n
ps_se	High	0.165	0.355	0.445	0.530	0.886	0.448	0.135	777
ps_se	Low	0.145	0.432	0.565	0.679	0.886	0.552	0.166	777
linps_se	High	−1.618	−0.595	−0.222	0.119	2.055	−0.217	0.594	777
linps_se	Low	−1.776	−0.275	0.263	0.749	2.055	0.233	0.741	777

The segregation variable has two linear PS less than 0.01 and all PS values around 0.99. We might remove this variable.

5.1.6 PS After Covariates Removal

We will start by building a subset for the first outcome in our analysis chr_fin2 after removing the problematic covariates from the original chr_fin set.

Code

chr_fin2 <- chr_fin|>
  dplyr::select( -hispanic, -white, -black, -native, -ps0, -linps0, -food_index, -uninsured, -primary_care, -mental_care, -some_college, -life_expectancy, -housing_burden, -home_ownership, -female, -rural, -asian, -obesity)

Now, we will build our psmodel1.

Code

psmodel1 <- glm(dentist_grp ~  poor_mental + unemployment  + income_inequality + social_score + 
                  diabetes  + insuff_sleep + highschool_grad + segregation +
                  below_18 + above_65 ,
                family = binomial(), data= chr_fin2)

model_parameters(psmodel1) |> 
  gt() |> gt::fmt_number(decimals = 3) |> 
  gt::opt_stylize(style = 1, color = "pink")

Parameter	Coefficient	SE	CI	CI_low	CI_high	z	df_error	p
(Intercept)	−15.234	1.450	0.950	−18.122	−12.435	−10.508	Inf	0.000
poor_mental	0.415	0.146	0.950	0.130	0.703	2.839	Inf	0.005
unemployment	−32.170	6.628	0.950	−45.231	−19.222	−4.854	Inf	0.000
income_inequality	−0.568	0.098	0.950	−0.764	−0.378	−5.779	Inf	0.000
social_score	−0.024	0.010	0.950	−0.045	−0.004	−2.364	Inf	0.018
diabetes	57.055	5.926	0.950	45.665	68.906	9.629	Inf	0.000
insuff_sleep	2.959	3.121	0.950	−3.167	9.078	0.948	Inf	0.343
highschool_grad	6.238	0.875	0.950	4.558	8.003	7.132	Inf	0.000
segregation	−0.027	0.004	0.950	−0.036	−0.019	−6.547	Inf	0.000
below_18	7.428	2.417	0.950	2.711	12.196	3.073	Inf	0.002
above_65	21.644	1.817	0.950	18.152	25.280	11.912	Inf	0.000

Code

chr_fin2$ps1 <- psmodel1$fitted

chr_fin2$linps1 <- psmodel1$linear.predictors

Code

df_stats(ps1 + linps1 ~ dentist_grp, data = chr_fin2) |>
  gt() |> fmt_number(columns = c(-n, -missing), decimals = 3) |> 
  opt_stylize(style = 2, color = "pink")

response	dentist_grp	min	Q1	median	Q3	max	mean	sd	n
ps1	High	0.002	0.129	0.257	0.463	0.993	0.320	0.240	777
ps1	Low	0.005	0.528	0.731	0.872	0.999	0.680	0.232	777
linps1	High	−6.437	−1.911	−1.063	−0.150	4.940	−0.987	1.448	777
linps1	Low	−5.230	0.112	1.000	1.923	6.709	1.032	1.435	777

The problematic values in the propensity score are persistent and we will need to investigate this using other methods. For now, we will proceed to the analysis phase.

6 Outcome 1 Analysis

6.1 The Blanace Before PS Adjustment

Code

ggplot(chr_fin2, aes(x = dentist_grp, y = ps1, color = dentist_grp)) + 
    geom_boxplot() +
    geom_jitter(width = 0.1) + 
    guides(color = "none")

Code

ggplot(chr_fin2, aes(x = linps1, fill = dentist_grp)) +
    geom_density(alpha = 0.3)

There is a good overlap between the two groups.

6.1.1 Rubin’s Rule 1

Code

rubin1.unadj <- with(chr_fin2,
     abs(100*(mean(linps1[dentist_n==1])-mean(linps1[dentist_n==0]))/sd(linps1)))
rubin1.unadj

[1] 114.7657

The absolute value of the standardized difference of the linear propensity score comparing the counties with high dentist proportions and low dentist proportions is 114.8%. This value is too far from the ideal value of 0% and above the maximum acceptable value of 50 %. We definitely do not pass Rubin’s rule 1, therefore, propensity score adjustment is needed.

6.1.2 Rubin’s Rule 2

Code

rubin2.unadj <-with(chr_fin2, var(linps1[dentist_n==1])/var(linps1[dentist_n==0]))
rubin2.unadj

[1] 0.9818352

The ratio of variances of the linear propensity score comparing the counties with high dentist proportions and low dentist proportions is 0.98. This value is very close to the desired value of 1.0. In this case, we pass Rubin’s Rule 2.

6.2 Unadjusted Outcome 1 `smoking_prop`

We will observe the the distribution of the outcome smoking_prop among the two exposure groups dentist_grp.

Code

df_stats(smoking_prop ~ dentist_grp, data = chr_fin2) |>
  gt() |> fmt_number(mean:sd, decimals = 2) |>
  opt_stylize(style = 5, color = "pink")

response	dentist_grp	min	Q1	median	Q3	max	mean	sd	n	missing
smoking_prop	High	6.821752	14.04568	15.86923	18.30110	37.02901	16.41	3.68	777	0
smoking_prop	Low	7.687424	15.68921	18.72868	21.37735	41.20176	18.78	3.76	777	0

The high dentist group has a lower mean of smoking proportions (16.4 vs 18.8).

6.2.1 Linear Regression Model for Outcome 1

We will build a linear regression model for outcome 1 smoking_prop and the exposure dentist_prop without including the covariates.

Code

unadj.out1 <- lm(smoking_prop ~ dentist_grp, data= chr_fin2)

model_parameters(unadj.out1, ci = 0.95) |> 
  gt() |> fmt_number(columns = -df_error, decimals = 3) |>
  opt_stylize(style = 6, color = "pink")

Parameter	Coefficient	SE	CI	CI_low	CI_high	t	df_error	p
(Intercept)	16.412	0.133	0.950	16.150	16.673	123.071	1552	0.000
dentist_grpLow	2.364	0.189	0.950	1.994	2.734	12.533	1552	0.000

The unadjusted model estimates that low dentist proportions increases smoking prevalence in US counties by 2.36 percentage points (95% CI: 1.99, 2.73).

6.3 1:1 Matching With Replacement

The binary exposure has an equal number of observations in each group (n=777). Therefore, we will use 1:1 matching with replacement.

Code

X <- chr_fin2$linps1 
Tr <- as.logical(chr_fin2$dentist_n)
match1 <- Match(Tr=Tr, X=X, M = 1, replace = TRUE, ties=FALSE)
summary(match1)


Estimate...  0 
SE.........  0 
T-stat.....  NaN 
p.val......  NA 

Original number of observations..............  1554 
Original number of treated obs...............  777 
Matched number of observations...............  777 
Matched number of observations  (unweighted).  777

checking if matching worked:

Code

n_distinct(match1$index.treated)

[1] 777

Code

n_distinct(match1$index.control)

[1] 278

After 1:1 matching with replacement, the effective size was reduced to 272 matched counties from the treatment and control group.

6.3.1 Balance Assessment

Code

set.seed(500)
mb1 <- MatchBalance(dentist_n ~ poor_mental + unemployment  + income_inequality + social_score + 
                  diabetes  + insuff_sleep + highschool_grad + segregation +
                  below_18 + above_65 + ps1 + linps1, 
                  data= chr_fin2, 
                    match.out = match1, nboots=500)


***** (V1) poor_mental *****
                       Before Matching       After Matching
mean treatment........      5.354             5.354 
mean control..........     5.0152            5.2984 
std mean diff.........     52.617            8.6282 

mean raw eQQ diff.....    0.34041          0.082562 
med  raw eQQ diff.....    0.36298          0.073533 
max  raw eQQ diff.....    0.73693           0.73693 

mean eCDF diff........    0.16003          0.034772 
med  eCDF diff........    0.18919          0.027027 
max  eCDF diff........    0.24324           0.10296 

var ratio (Tr/Co).....       1.18            1.0251 
T-test p-value........ < 2.22e-16          0.068521 
KS Bootstrap p-value.. < 2.22e-16             0.002 
KS Naive p-value...... < 2.22e-16        0.00052946 
KS Statistic..........    0.24324           0.10296 


***** (V2) unemployment *****
                       Before Matching       After Matching
mean treatment........   0.035706          0.035706 
mean control..........   0.034857          0.035613 
std mean diff.........     7.1471           0.77874 

mean raw eQQ diff.....  0.0010747         0.0010116 
med  raw eQQ diff.....   0.001005        0.00071892 
max  raw eQQ diff.....   0.013165          0.018658 

mean eCDF diff........   0.025066          0.020112 
med  eCDF diff........   0.020592          0.015444 
max  eCDF diff........   0.059202          0.065637 

var ratio (Tr/Co).....     1.0862            1.0782 
T-test p-value........    0.15077            0.8748 
KS Bootstrap p-value..      0.146             0.072 
KS Naive p-value......    0.13128          0.070342 
KS Statistic..........   0.059202          0.065637 


***** (V3) income_inequality *****
                       Before Matching       After Matching
mean treatment........     4.6117            4.6117 
mean control..........     4.5373            4.6189 
std mean diff.........     7.7974          -0.75458 

mean raw eQQ diff.....    0.15959           0.12362 
med  raw eQQ diff.....    0.10178          0.062316 
max  raw eQQ diff.....     2.0677            2.5917 

mean eCDF diff........   0.037661          0.024458 
med  eCDF diff........   0.041184          0.024453 
max  eCDF diff........   0.073359          0.056628 

var ratio (Tr/Co).....     1.6836            1.6035 
T-test p-value........   0.085366            0.8694 
KS Bootstrap p-value..      0.032             0.182 
KS Naive p-value......   0.030552           0.16546 
KS Statistic..........   0.073359          0.056628 


***** (V4) social_score *****
                       Before Matching       After Matching
mean treatment........     10.845            10.845 
mean control..........     10.969            11.278 
std mean diff.........    -1.6808           -5.8744 

mean raw eQQ diff.....     1.7301           0.88592 
med  raw eQQ diff.....     1.2191           0.90882 
max  raw eQQ diff.....     21.371            21.456 

mean eCDF diff........   0.059115          0.046787 
med  eCDF diff........   0.052767           0.05148 
max  eCDF diff........    0.12999          0.095238 

var ratio (Tr/Co).....     2.0485            1.2144 
T-test p-value........    0.70099            0.2186 
KS Bootstrap p-value.. < 2.22e-16             0.004 
KS Naive p-value...... 3.9747e-06          0.001739 
KS Statistic..........    0.12999          0.095238 


***** (V5) diabetes *****
                       Before Matching       After Matching
mean treatment........    0.11284           0.11284 
mean control..........   0.095932           0.11175 
std mean diff.........     71.142            4.5818 

mean raw eQQ diff.....   0.016906         0.0019331 
med  raw eQQ diff.....      0.018             0.001 
max  raw eQQ diff.....      0.033             0.045 

mean eCDF diff........    0.13194          0.014661 
med  eCDF diff........   0.099743          0.010296 
max  eCDF diff........    0.34234          0.061776 

var ratio (Tr/Co).....     1.5867            1.0317 
T-test p-value........ < 2.22e-16           0.27851 
KS Bootstrap p-value.. < 2.22e-16             0.074 
KS Naive p-value...... < 2.22e-16           0.10308 
KS Statistic..........    0.34234          0.061776 


***** (V6) insuff_sleep *****
                       Before Matching       After Matching
mean treatment........    0.35465           0.35465 
mean control..........    0.32925           0.35599 
std mean diff.........     68.689           -3.6185 

mean raw eQQ diff.....     0.0254         0.0078744 
med  raw eQQ diff.....      0.027             0.006 
max  raw eQQ diff.....      0.032             0.029 

mean eCDF diff........    0.13895          0.042921 
med  eCDF diff........    0.11583          0.043115 
max  eCDF diff........     0.3166           0.10682 

var ratio (Tr/Co).....     1.1012           0.73316 
T-test p-value........ < 2.22e-16           0.47547 
KS Bootstrap p-value.. < 2.22e-16        < 2.22e-16 
KS Naive p-value...... < 2.22e-16        0.00028217 
KS Statistic..........     0.3166           0.10682 


***** (V7) highschool_grad *****
                       Before Matching       After Matching
mean treatment........    0.89305           0.89305 
mean control..........    0.86432           0.88698 
std mean diff.........      34.78            7.3467 

mean raw eQQ diff.....   0.033037          0.012742 
med  raw eQQ diff.....   0.029953              0.01 
max  raw eQQ diff.....    0.37636           0.37636 

mean eCDF diff........    0.15082          0.047224 
med  eCDF diff........    0.16345          0.045045 
max  eCDF diff........    0.26641           0.13256 

var ratio (Tr/Co).....     1.2754            1.4269 
T-test p-value........ 6.2084e-13           0.11513 
KS Bootstrap p-value.. < 2.22e-16        < 2.22e-16 
KS Naive p-value...... < 2.22e-16         2.351e-06 
KS Statistic..........    0.26641           0.13256 


***** (V8) segregation *****
                       Before Matching       After Matching
mean treatment........     45.162            45.162 
mean control..........     56.059            48.835 
std mean diff.........    -60.758           -20.479 

mean raw eQQ diff.....     10.918            4.6257 
med  raw eQQ diff.....     10.675            4.5646 
max  raw eQQ diff.....     17.966            9.7518 

mean eCDF diff........    0.18644          0.078972 
med  eCDF diff........    0.20721          0.066924 
max  eCDF diff........    0.33076            0.1982 

var ratio (Tr/Co).....     1.5551            1.2339 
T-test p-value........ < 2.22e-16        2.1495e-06 
KS Bootstrap p-value.. < 2.22e-16        < 2.22e-16 
KS Naive p-value...... < 2.22e-16        1.1099e-13 
KS Statistic..........    0.33076            0.1982 


***** (V9) below_18 *****
                       Before Matching       After Matching
mean treatment........    0.21521           0.21521 
mean control..........    0.21435           0.21543 
std mean diff.........     2.3763           -0.6046 

mean raw eQQ diff.....  0.0033666         0.0038355 
med  raw eQQ diff.....  0.0025852         0.0029644 
max  raw eQQ diff.....   0.071666          0.099059 

mean eCDF diff........   0.023328          0.025996 
med  eCDF diff........   0.023166          0.027027 
max  eCDF diff........   0.050193          0.068211 

var ratio (Tr/Co).....     1.0897           0.96816 
T-test p-value........    0.63248           0.90575 
KS Bootstrap p-value..       0.29             0.046 
KS Naive p-value......    0.28165          0.053823 
KS Statistic..........   0.050193          0.068211 


***** (V10) above_65 *****
                       Before Matching       After Matching
mean treatment........    0.21521           0.21521 
mean control..........    0.19577            0.2255 
std mean diff.........     39.892           -21.115 

mean raw eQQ diff.....   0.019441          0.016389 
med  raw eQQ diff.....    0.02113          0.010991 
max  raw eQQ diff.....   0.055136          0.060015 

mean eCDF diff........    0.13214          0.066599 
med  eCDF diff........    0.13385          0.061776 
max  eCDF diff........    0.24453           0.16216 

var ratio (Tr/Co).....    0.95543           0.58912 
T-test p-value........ 1.3767e-14         0.0002428 
KS Bootstrap p-value.. < 2.22e-16        < 2.22e-16 
KS Naive p-value...... < 2.22e-16        2.6751e-09 
KS Statistic..........    0.24453           0.16216 


***** (V11) ps1 *****
                       Before Matching       After Matching
mean treatment........    0.68035           0.68035 
mean control..........    0.31965           0.68039 
std mean diff.........     155.79          -0.01788 

mean raw eQQ diff.....    0.36069         0.0014057 
med  raw eQQ diff.....    0.40833        0.00093887 
max  raw eQQ diff.....    0.48292         0.0093827 

mean eCDF diff........    0.34969         0.0025874 
med  eCDF diff........    0.38867          0.001287 
max  eCDF diff........    0.56113          0.019305 

var ratio (Tr/Co).....    0.93354           0.99996 
T-test p-value........ < 2.22e-16           0.56947 
KS Bootstrap p-value.. < 2.22e-16                 1 
KS Naive p-value...... < 2.22e-16           0.99869 
KS Statistic..........    0.56113          0.019305 


***** (V12) linps1 *****
                       Before Matching       After Matching
mean treatment........     1.0321            1.0321 
mean control..........   -0.98732            1.0249 
std mean diff.........     140.76           0.50261 

mean raw eQQ diff.....     2.0194          0.021934 
med  raw eQQ diff.....     2.0486         0.0054214 
max  raw eQQ diff.....     2.7235            1.7689 

mean eCDF diff........    0.34969         0.0025874 
med  eCDF diff........    0.38867          0.001287 
max  eCDF diff........    0.56113          0.019305 

var ratio (Tr/Co).....    0.98184            1.0325 
T-test p-value........ < 2.22e-16          0.035311 
KS Bootstrap p-value.. < 2.22e-16                 1 
KS Naive p-value...... < 2.22e-16           0.99869 
KS Statistic..........    0.56113          0.019305 


Before Matching Minimum p.value: < 2.22e-16 
Variable Name(s): poor_mental social_score diabetes insuff_sleep highschool_grad segregation above_65 ps1 linps1  Number(s): 1 4 5 6 7 8 10 11 12 

After Matching Minimum p.value: < 2.22e-16 
Variable Name(s): insuff_sleep highschool_grad segregation above_65  Number(s): 6 7 8 10

6.3.2 Tabulating Standardized Differences

We will name the covariates used in our model.

Code

covnames <- c( "poor_mental", "unemployment", "income_inequality", "social_score", 
                  "diabetes", "insuff_sleep", "highschool_grad", "segregation",
                  "below_18", "above_65", "raw PS", "linear PS")

Then we will extract the standardized differences.

Code

pre.szd <- NULL; post.szd <- NULL
for(i in 1:length(covnames)) {
  pre.szd[i] <- mb1$BeforeMatching[[i]]$sdiff.pooled
  post.szd[i] <- mb1$AfterMatching[[i]]$sdiff.pooled
}

Now, we can have our final table.

Code

match_szd <- tibble(covnames, pre.szd, post.szd, row.names=covnames)
match_szd

# A tibble: 12 × 4
   covnames          pre.szd post.szd row.names        
   <chr>               <dbl>    <dbl> <chr>            
 1 poor_mental         54.7    8.63   poor_mental      
 2 unemployment         7.29   0.779  unemployment     
 3 income_inequality    8.73  -0.755  income_inequality
 4 social_score        -1.95  -5.87   social_score     
 5 diabetes            78.8    4.58   diabetes         
 6 insuff_sleep        70.3   -3.62   insuff_sleep     
 7 highschool_grad     36.8    7.35   highschool_grad  
 8 segregation        -67.0  -20.5    segregation      
 9 below_18             2.43  -0.605  below_18         
10 above_65            39.4  -21.1    above_65         
11 raw PS             153.    -0.0179 raw PS           
12 linear PS          140.     0.503  linear PS

We will use these results to build the love plot to better visualize the standardized differences.

6.3.3 Visualizing Standardized Differences

We will use the cobalt method to build plots for the standardized differences and the variance ratios.

Code

b <- bal.tab(match1, 
             dentist_n ~ poor_mental + unemployment  + income_inequality + social_score + 
                  diabetes  + insuff_sleep + highschool_grad + segregation +
                  below_18 + above_65 + ps1 + linps1, 
             data= chr_fin2, stats = c("m", "v"), un = TRUE,
             quick = FALSE)
b

Balance Measures
                     Type Diff.Un V.Ratio.Un Diff.Adj V.Ratio.Adj
poor_mental       Contin.  0.5262     1.1800   0.0863      1.0184
unemployment      Contin.  0.0715     1.0862   0.0078      1.0711
income_inequality Contin.  0.0780     1.6836  -0.0075      1.5929
social_score      Contin. -0.0168     2.0485  -0.0587      1.2064
diabetes          Contin.  0.7114     1.5867   0.0458      1.0249
insuff_sleep      Contin.  0.6869     1.1012  -0.0362      0.7283
highschool_grad   Contin.  0.3478     1.2754   0.0735      1.4175
segregation       Contin. -0.6076     1.5551  -0.2048      1.2258
below_18          Contin.  0.0238     1.0897  -0.0060      0.9618
above_65          Contin.  0.3989     0.9554  -0.2111      0.5852
ps1               Contin.  1.5579     0.9335  -0.0002      0.9934
linps1            Contin.  1.4076     0.9818   0.0050      1.0257

Sample sizes
                     Control Treated
All                   777.       777
Matched (ESS)         127.23     777
Matched (Unweighted)  278.       777
Unmatched             499.         0

As previously mentioned, the effective size after 1:1 matching with replacement is 278 counties. We lost 499 counties from the sample because they were unmatched.

Now we will build the love plot for the standardized differences:

Code

lplot1 <- love.plot(b, threshold = .1, size = 3,
               var.order = "unadjusted", stars = "raw",
               title = "Standardized Differences and 1:1 Matching with Replacement")
lplot1

All covariates have improved standardized difference after 1:1 matching adjustment, with the exception for social_score.

Plot of Variance Ratios

Code

lplot2 <- love.plot(b, stat = "v",
               threshold = 1.25, size = 3,
               var.order = "unadjusted",
               title = "Variance Ratios and 1:1 Matching with Replacement")
lplot2 + theme_bw()

Most of the covariates displayed improved variance ratios with the exception of income_inequality, highschool_grad, insuff_sleep, and above_65.

6.3.4 Building New Data Frame with Matched Sample

Code

matches1 <- factor(rep(match1$index.treated, 2))
chr.matchedsample1 <- cbind(matches1, chr_fin2[c(match1$index.control, match1$index.treated),])

Code

chr.matchedsample1 |> count(dentist_grp)

  dentist_grp   n
1        High 777
2         Low 777

Because the 1:1 matching method we used includes replacement, it is only natural to retain the original number of observations in the exposure groups.

6.3.5 Rubin’s Rules After Matching

First, we will check Rubin’s Rule 1:

Code

rubin1.match1 <- with(chr.matchedsample1,
      abs(100*(mean(linps1[dentist_n==1])-mean(linps1[dentist_n==0]))/sd(linps1)))
rubin1.match1

[1] 0.5067778

The absolute value of the standardized difference of the linear propensity score before adjustment was 114.8%. The value after adjustment (52.76%) shows substantial improvement as it is close to the maximum value of 50%, but the value remains above 10% and is not close to 0%. In this case, we barely pass Rubin’s Rule 1.

Next, we will check Rubin’s Rule 2:

Code

rubin2.match1 <- with(chr.matchedsample1, var(linps1[dentist_n==1])/var(linps1[dentist_n==0]))
rubin2.match1

[1] 1.032521

The previous ratio of the variance of the linear propensity score was 0.98. This value also shows mild improvement where the value is closer to 1 and remains below 2. We also pass Rubin’s Rule 2 and can proceed to use the results of this PS adjustment to measure the estimates.

6.3.6 Estimating Effect of Exposure on Outcome After 1:1 Matching

We will use the ATT (Average Treatment Effect on the Treated) estimate to measure the dentist proportion effect on smoking prevalence in US Counties. First, we will need to create a data frame out of the matched sample:

Code

matches1 <- factor(rep(match1$index.treated, 2))
chr.matchedsamplef <- cbind(matches1, chr_fin2[c(match1$index.control, match1$index.treated),])

Now, we can produce an estimate using the matched sample from chr_fin2.

Code

chr.matchedsamplef$matches.f <- as.factor(chr.matchedsamplef$matches1)

matched_mix.mod1 <- lmer(smoking_prop ~ dentist_n + (1 | matches.f), REML = TRUE, data=chr.matchedsamplef)

tidy_matched_mix.mod1 <- broom.mixed::tidy(matched_mix.mod1, conf.int = TRUE, conf.level = 0.95) |> 
  filter(term == "dentist_n")

tidy_matched_mix.mod1 |> kable(digits = 3)

effect	group	term	estimate	std.error	statistic	conf.low	conf.high
fixed	NA	dentist_n	0.776	0.195	3.985	0.394	1.158

After 1:1 matching with replacement, our model estimates the effect of low dentists proportions on smoking prevalence in US counties to increase by 0.78 percentage points (95% CI: 0.39, 1.16). Our confidence interval does not include 0, therefore, following up with a sensitivity analysis is a reasonable choice.

6.4 Use ATT Weighting Adjustment

We will start by creating a data frame of our data set.

Code

chrfin_df <- base::data.frame(chr_fin2)

Now, we will calculate the propensity score using the ps function from the twang package.

Code

ps.chr <- twang::ps(dentist_n ~ poor_mental + unemployment  + income_inequality + social_score + 
                  diabetes  + insuff_sleep + highschool_grad + segregation +
                  below_18 + above_65 + ps1 + linps1,
             data = chrfin_df,
             n.trees = 5000,
             interaction.depth = 2,
             stop.method = c("es.mean"),
             estimand = "ATT",
             verbose = FALSE)

Checking the iteration:

Code

plot(ps.chr)

We can see that we have run enough iterations to reach a stable estimate.

Observing the effective sample size of the weighted estimates:

Code

summary(ps.chr)

            n.treat n.ctrl ess.treat ess.ctrl    max.es   mean.es    max.ks
unw             777    777       777 777.0000 1.5578591 0.5361848 0.5611326
es.mean.ATT     777    777       777 163.8941 0.1794596 0.0667433 0.1205385
            max.ks.p    mean.ks iter
unw               NA 0.26490776   NA
es.mean.ATT       NA 0.07668971 1111

The effective size after ATT weighting is 163 counties.

6.4.1 Checking the balance

Code

plot(ps.chr, plots = 2)

Code

plot(ps.chr, plots = 3)

We can see that the weighted estimates have an absolute standard differences are below 0.5 and are closer to 0 compared to the unadjusted estimates.

We can also assess the balance using the cobalt method.

Code

b2 <- cobalt::bal.tab(ps.chr, full.stop.method = "es.mean.att", 
        stats = c("m", "v"), un = TRUE)

b2

Balance Measures
                      Type Diff.Un V.Ratio.Un Diff.Adj V.Ratio.Adj
prop.score        Distance  2.2674     0.7698   0.5561      0.6932
poor_mental        Contin.  0.5262     1.1800   0.0079      0.9499
unemployment       Contin.  0.0715     1.0862   0.0512      1.1563
income_inequality  Contin.  0.0780     1.6836   0.0068      1.3792
social_score       Contin. -0.0168     2.0485   0.0007      1.0241
diabetes           Contin.  0.7114     1.5867   0.1795      1.4289
insuff_sleep       Contin.  0.6869     1.1012   0.0630      0.9451
highschool_grad    Contin.  0.3478     1.2754  -0.0281      1.3991
segregation        Contin. -0.6076     1.5551  -0.0901      0.9325
below_18           Contin.  0.0238     1.0897  -0.0110      1.0725
above_65           Contin.  0.3989     0.9554  -0.0601      0.8466
ps1                Contin.  1.5579     0.9335   0.1684      0.8879
linps1             Contin.  1.4076     0.9818   0.1342      0.8905

Effective sample sizes
           Control Treated
Unadjusted  777.       777
Adjusted    163.89     777

We will visualize the standardized differences using the Love plot.

Code

lplot3 <- cobalt::love.plot(b2, 
               threshold = .1, size = 3, stars = "raw",
               title = "Standardized Diffs and TWANG ATT weighting")
lplot3

All of the variables have observable improved standardized difference after TWANG ATT weighting adjustment.

Visualizing the variance ratio:

Code

lplot4 <- cobalt::love.plot(b2, stat = "v",
               threshold = 1.25, size = 3, 
               title = "Variance Ratios: TWANG ATT weighting")
lplot4 + theme_bw()

Most of the variables show some improvement in the variance ration after adjustment, though it is not very satisfying.

6.4.2 Estimating The Causal Effect After ATT Weight Adjustment

First, we will build the weight reference wts1 (which is the inverse propensity score) and add it to our data set chr_out1.

Code

chr_fin2$wts1 <- ifelse(chr_fin2$dentist_n==1, 1, chr_fin2$ps1/(1-chr_fin2$ps1))

Code

chrwt1.design <- survey::svydesign(ids=~1, weights=~wts1, data= chr_fin2) 

adjout1.wt1 <- survey::svyglm(smoking_prop ~ dentist_n, design= chrwt1.design)

wt_att_results1 <- tidy(adjout1.wt1, conf.int = TRUE) |> filter(term == "dentist_n")

wt_att_results1

# A tibble: 1 × 7
  term      estimate std.error statistic p.value conf.low conf.high
  <chr>        <dbl>     <dbl>     <dbl>   <dbl>    <dbl>     <dbl>
1 dentist_n   -0.434      1.09    -0.398   0.691    -2.57      1.71

After ATT weight adjustment, the estimated impact of low dentist proportions on smoking prevalence in US counties is decrease of -0.43 percentage points (95% CI: -2.57, 1.71).

6.5 Use Double Robust Adjustment

Code

chr.dr.design <- survey::svydesign(ids=~1, weights=~wts1, data= chr_fin2) 

dr.out1.wt1 <- survey::svyglm(smoking_prop ~ dentist_n + linps1, design= chr.dr.design)

dr_att_out1 <- tidy(dr.out1.wt1, conf.int = TRUE) |> filter(term == "dentist_n")

dr_att_out1

# A tibble: 1 × 7
  term      estimate std.error statistic p.value conf.low conf.high
  <chr>        <dbl>     <dbl>     <dbl>   <dbl>    <dbl>     <dbl>
1 dentist_n    0.723     0.453      1.60   0.111   -0.166      1.61

After Double Robust adjustment, the estimated impact of low dentist proportions on smoking prevalence in US counties is an increase of 0.72 percentage points (95% CI: -0.17, 1.61). These results are slightly different from the ATT Weighting results.

6.6 Compare Results

Code

chr_results <- tibble(
    analysis = c("Unadjusted", "PS 1:1 Match", "TWANG ATT WT", "Double Robust"),
    estimate = c(2.36, 0.78, -0.43, 0.72),
    conf.low = c(1.99, 0.39, -2.57, -0.17),
    conf.high = c(2.73, 1.16, 1.71, 1.61))

ggplot(chr_results, aes(x = analysis, y = estimate)) +
    geom_errorbar(aes(ymax = conf.high, ymin = conf.low), width = 0.5) + 
    geom_label(aes(label = estimate), size = 5)

We can see that propensity score adjustment decreased the estimated affect of dentist proportions on smoking prevalence in US counties (from 2.36% to about 0.7-0.8%) . While two of the three adjustments present close estimates, the 1:1 matching with replacement adjustment displays the smallest confidence interval, the best balance, and largest effective size (278 after 1:1 matching compared to 163 after ATT weighting).

6.7 Sensitivity Analysis

The estimate produced from the 1:1 Matching with replacement has a statistically significant p-value. Therefore, we will need to perform a sensitivity analysis.

Code

Rc <- match1$mdata$Y[match1$mdata$Tr==0]
Rt <- match1$mdata$Y[match1$mdata$Tr==1] 


psens(Rt, Rc, Gamma = 5, GammaInc = 0.25)


 Rosenbaum Sensitivity Test for Wilcoxon Signed Rank P-Value 
 
Unconfounded estimate ....  NaN 

 Gamma Lower bound Upper bound
  1.00         NaN         NaN
  1.25         NaN         NaN
  1.50         NaN         NaN
  1.75         NaN         NaN
  2.00         NaN         NaN
  2.25         NaN         NaN
  2.50         NaN         NaN
  2.75         NaN         NaN
  3.00         NaN         NaN
  3.25         NaN         NaN
  3.50         NaN         NaN
  3.75         NaN         NaN
  4.00         NaN         NaN
  4.25         NaN         NaN
  4.50         NaN         NaN
  4.75         NaN         NaN
  5.00         NaN         NaN

 Note: Gamma is Odds of Differential Assignment To
 Treatment Due to Unobserved Factors

Our conclusions regarding the impact of low dentists proportions increasing smoking prevalence in US counties hold true in absence of biases in the study. This conclusion is maintained in the presence of a biases level between L= 1.00 and L= 1.25. This means that only small hidden bias is needed to change our conclusions.

7 Outcome 2 Analysis

7.1 Unadjusted Outcome 2 `drinking_prop`

We will observe the the distribution of the outcome drinking_prop among the two exposure groups dentist_grp.

Code

df_stats(drinking_prop ~ dentist_grp, data = chr_fin2) |>
  gt() |> fmt_number(mean:sd, decimals = 2) |>
  opt_stylize(style = 5, color = "pink")

response	dentist_grp	min	Q1	median	Q3	max	mean	sd	n	missing
drinking_prop	High	10.140012	16.79649	18.88142	20.74658	29.44012	18.83	3.24	777	0
drinking_prop	Low	9.265156	14.09065	16.03230	18.32060	27.81331	16.26	2.91	777	0

The low dentist group has a lower mean of excessive drinking prevalence (16.3 vs 18.8).

7.1.1 Linear Regression Model for Outcome 2

We will build a linear regression model for outcome 2 drinking_prop and the exposure dentist_grp without including the covariates.

Code

unadj.out2 <- lm(drinking_prop ~ dentist_grp, data= chr_fin2)

model_parameters(unadj.out2, ci = 0.95) |> 
  gt() |> fmt_number(columns = -df_error, decimals = 3) |>
  opt_stylize(style = 6, color = "pink")

Parameter	Coefficient	SE	CI	CI_low	CI_high	t	df_error	p
(Intercept)	18.825	0.110	0.950	18.609	19.042	170.493	1552	0.000
dentist_grpLow	−2.563	0.156	0.950	−2.869	−2.256	−16.412	1552	0.000

The unadjusted model estimates that low dentist proportions decreases excessive drinking prevalence in US counties by 2.56 percentage points (95% CI: -2.87, -2.26).

7.2 1:1 Matching With Replacement

The binary exposure has an equal number of observations in each group (n=777). Therefore, we will use 1:1 matching with replacement.

Code

X <- chr_fin2$linps1 
Tr <- as.logical(chr_fin2$dentist_n)
match2 <- Match(Tr=Tr, X=X, M = 1, replace = TRUE, ties=FALSE)
summary(match2)


Estimate...  0 
SE.........  0 
T-stat.....  NaN 
p.val......  NA 

Original number of observations..............  1554 
Original number of treated obs...............  777 
Matched number of observations...............  777 
Matched number of observations  (unweighted).  777

checking if matching worked:

Code

n_distinct(match2$index.treated)

[1] 777

Code

n_distinct(match2$index.control)

[1] 269

After 1:1 matching with replacement, the effective size was reduced to 269 matched counties from the low dentist and high dentist groups.

7.2.1 Balance Assessment

Code

set.seed(500)
mb2 <- MatchBalance(dentist_n ~ poor_mental + unemployment  + income_inequality + social_score + 
                  diabetes  + insuff_sleep + highschool_grad + segregation +
                  below_18 + above_65 + ps1 + linps1, 
                  data= chr_fin2, 
                    match.out = match2, nboots=500)


***** (V1) poor_mental *****
                       Before Matching       After Matching
mean treatment........      5.354             5.354 
mean control..........     5.0152            5.3006 
std mean diff.........     52.617            8.2912 

mean raw eQQ diff.....    0.34041          0.078434 
med  raw eQQ diff.....    0.36298          0.068421 
max  raw eQQ diff.....    0.73693           0.73693 

mean eCDF diff........    0.16003          0.033382 
med  eCDF diff........    0.18919           0.02574 
max  eCDF diff........    0.24324           0.10296 

var ratio (Tr/Co).....       1.18            1.0018 
T-test p-value........ < 2.22e-16          0.081108 
KS Bootstrap p-value.. < 2.22e-16             0.002 
KS Naive p-value...... < 2.22e-16        0.00052946 
KS Statistic..........    0.24324           0.10296 


***** (V2) unemployment *****
                       Before Matching       After Matching
mean treatment........   0.035706          0.035706 
mean control..........   0.034857          0.035517 
std mean diff.........     7.1471            1.5883 

mean raw eQQ diff.....  0.0010747        0.00088258 
med  raw eQQ diff.....   0.001005        0.00061619 
max  raw eQQ diff.....   0.013165          0.018658 

mean eCDF diff........   0.025066          0.017423 
med  eCDF diff........   0.020592           0.01287 
max  eCDF diff........   0.059202          0.055341 

var ratio (Tr/Co).....     1.0862            1.1023 
T-test p-value........    0.15077            0.7428 
KS Bootstrap p-value..      0.146             0.178 
KS Naive p-value......    0.13128           0.18502 
KS Statistic..........   0.059202          0.055341 


***** (V3) income_inequality *****
                       Before Matching       After Matching
mean treatment........     4.6117            4.6117 
mean control..........     4.5373            4.6387 
std mean diff.........     7.7974           -2.8275 

mean raw eQQ diff.....    0.15959            0.1258 
med  raw eQQ diff.....    0.10178          0.079581 
max  raw eQQ diff.....     2.0677            2.5917 

mean eCDF diff........   0.037661          0.027845 
med  eCDF diff........   0.041184          0.028314 
max  eCDF diff........   0.073359          0.065637 

var ratio (Tr/Co).....     1.6836            1.5003 
T-test p-value........   0.085366           0.53569 
KS Bootstrap p-value..      0.032             0.078 
KS Naive p-value......   0.030552          0.070342 
KS Statistic..........   0.073359          0.065637 


***** (V4) social_score *****
                       Before Matching       After Matching
mean treatment........     10.845            10.845 
mean control..........     10.969            11.301 
std mean diff.........    -1.6808           -6.1888 

mean raw eQQ diff.....     1.7301           0.94784 
med  raw eQQ diff.....     1.2191            0.9656 
max  raw eQQ diff.....     21.371            21.456 

mean eCDF diff........   0.059115          0.049636 
med  eCDF diff........   0.052767          0.056628 
max  eCDF diff........    0.12999          0.099099 

var ratio (Tr/Co).....     2.0485            1.2142 
T-test p-value........    0.70099           0.19129 
KS Bootstrap p-value.. < 2.22e-16             0.002 
KS Naive p-value...... 3.9747e-06        0.00097071 
KS Statistic..........    0.12999          0.099099 


***** (V5) diabetes *****
                       Before Matching       After Matching
mean treatment........    0.11284           0.11284 
mean control..........   0.095932           0.11228 
std mean diff.........     71.142            2.3342 

mean raw eQQ diff.....   0.016906         0.0017748 
med  raw eQQ diff.....      0.018             0.001 
max  raw eQQ diff.....      0.033             0.045 

mean eCDF diff........    0.13194          0.013378 
med  eCDF diff........   0.099743          0.010296 
max  eCDF diff........    0.34234          0.048906 

var ratio (Tr/Co).....     1.5867            1.0398 
T-test p-value........ < 2.22e-16           0.57795 
KS Bootstrap p-value.. < 2.22e-16             0.238 
KS Naive p-value...... < 2.22e-16           0.31067 
KS Statistic..........    0.34234          0.048906 


***** (V6) insuff_sleep *****
                       Before Matching       After Matching
mean treatment........    0.35465           0.35465 
mean control..........    0.32925            0.3563 
std mean diff.........     68.689           -4.4658 

mean raw eQQ diff.....     0.0254          0.007618 
med  raw eQQ diff.....      0.027             0.006 
max  raw eQQ diff.....      0.032             0.028 

mean eCDF diff........    0.13895          0.041729 
med  eCDF diff........    0.11583          0.041184 
max  eCDF diff........     0.3166           0.10682 

var ratio (Tr/Co).....     1.1012           0.75614 
T-test p-value........ < 2.22e-16           0.36823 
KS Bootstrap p-value.. < 2.22e-16        < 2.22e-16 
KS Naive p-value...... < 2.22e-16        0.00028217 
KS Statistic..........     0.3166           0.10682 


***** (V7) highschool_grad *****
                       Before Matching       After Matching
mean treatment........    0.89305           0.89305 
mean control..........    0.86432           0.88934 
std mean diff.........      34.78               4.5 

mean raw eQQ diff.....   0.033037          0.012938 
med  raw eQQ diff.....   0.029953         0.0084777 
max  raw eQQ diff.....    0.37636           0.46344 

mean eCDF diff........    0.15082          0.041283 
med  eCDF diff........    0.16345          0.041184 
max  eCDF diff........    0.26641           0.11583 

var ratio (Tr/Co).....     1.2754            1.7386 
T-test p-value........ 6.2084e-13           0.31027 
KS Bootstrap p-value.. < 2.22e-16        < 2.22e-16 
KS Naive p-value...... < 2.22e-16        5.9379e-05 
KS Statistic..........    0.26641           0.11583 


***** (V8) segregation *****
                       Before Matching       After Matching
mean treatment........     45.162            45.162 
mean control..........     56.059            48.876 
std mean diff.........    -60.758           -20.706 

mean raw eQQ diff.....     10.918            4.6238 
med  raw eQQ diff.....     10.675            4.5646 
max  raw eQQ diff.....     17.966            9.8243 

mean eCDF diff........    0.18644          0.079899 
med  eCDF diff........    0.20721          0.070785 
max  eCDF diff........    0.33076           0.20463 

var ratio (Tr/Co).....     1.5551            1.2463 
T-test p-value........ < 2.22e-16        1.7141e-06 
KS Bootstrap p-value.. < 2.22e-16        < 2.22e-16 
KS Naive p-value...... < 2.22e-16        1.4809e-14 
KS Statistic..........    0.33076           0.20463 


***** (V9) below_18 *****
                       Before Matching       After Matching
mean treatment........    0.21521           0.21521 
mean control..........    0.21435           0.21636 
std mean diff.........     2.3763           -3.1678 

mean raw eQQ diff.....  0.0033666         0.0038545 
med  raw eQQ diff.....  0.0025852         0.0031975 
max  raw eQQ diff.....   0.071666          0.099059 

mean eCDF diff........   0.023328          0.028745 
med  eCDF diff........   0.023166          0.027027 
max  eCDF diff........   0.050193           0.07722 

var ratio (Tr/Co).....     1.0897           0.97856 
T-test p-value........    0.63248           0.53529 
KS Bootstrap p-value..       0.29              0.01 
KS Naive p-value......    0.28165          0.019447 
KS Statistic..........   0.050193           0.07722 


***** (V10) above_65 *****
                       Before Matching       After Matching
mean treatment........    0.21521           0.21521 
mean control..........    0.19577           0.22345 
std mean diff.........     39.892           -16.919 

mean raw eQQ diff.....   0.019441          0.015299 
med  raw eQQ diff.....    0.02113         0.0090949 
max  raw eQQ diff.....   0.055136          0.057585 

mean eCDF diff........    0.13214          0.062442 
med  eCDF diff........    0.13385          0.065637 
max  eCDF diff........    0.24453           0.15315 

var ratio (Tr/Co).....    0.95543           0.59933 
T-test p-value........ 1.3767e-14         0.0032109 
KS Bootstrap p-value.. < 2.22e-16        < 2.22e-16 
KS Naive p-value...... < 2.22e-16        2.4317e-08 
KS Statistic..........    0.24453           0.15315 


***** (V11) ps1 *****
                       Before Matching       After Matching
mean treatment........    0.68035           0.68035 
mean control..........    0.31965           0.68035 
std mean diff.........     155.79        -0.0011278 

mean raw eQQ diff.....    0.36069         0.0014012 
med  raw eQQ diff.....    0.40833        0.00091168 
max  raw eQQ diff.....    0.48292         0.0093827 

mean eCDF diff........    0.34969         0.0026134 
med  eCDF diff........    0.38867          0.001287 
max  eCDF diff........    0.56113          0.019305 

var ratio (Tr/Co).....    0.93354           0.99994 
T-test p-value........ < 2.22e-16           0.97131 
KS Bootstrap p-value.. < 2.22e-16                 1 
KS Naive p-value...... < 2.22e-16           0.99869 
KS Statistic..........    0.56113          0.019305 


***** (V12) linps1 *****
                       Before Matching       After Matching
mean treatment........     1.0321            1.0321 
mean control..........   -0.98732            1.0247 
std mean diff.........     140.76            0.5161 

mean raw eQQ diff.....     2.0194          0.021914 
med  raw eQQ diff.....     2.0486         0.0054645 
max  raw eQQ diff.....     2.7235            1.7689 

mean eCDF diff........    0.34969         0.0026134 
med  eCDF diff........    0.38867          0.001287 
max  eCDF diff........    0.56113          0.019305 

var ratio (Tr/Co).....    0.98184            1.0326 
T-test p-value........ < 2.22e-16          0.030659 
KS Bootstrap p-value.. < 2.22e-16                 1 
KS Naive p-value...... < 2.22e-16           0.99869 
KS Statistic..........    0.56113          0.019305 


Before Matching Minimum p.value: < 2.22e-16 
Variable Name(s): poor_mental social_score diabetes insuff_sleep highschool_grad segregation above_65 ps1 linps1  Number(s): 1 4 5 6 7 8 10 11 12 

After Matching Minimum p.value: < 2.22e-16 
Variable Name(s): insuff_sleep highschool_grad segregation above_65  Number(s): 6 7 8 10

7.2.2 Tabulating Standardized Differences

We will name the covariates used in our model.

Code

covnames2 <- c( "poor_mental", "unemployment", "income_inequality", "social_score", 
                  "diabetes", "insuff_sleep", "highschool_grad", "segregation",
                  "below_18", "above_65", "raw PS", "linear PS")

Then we will extract the standardized differences.

Code

pre.szd2 <- NULL; post.szd2 <- NULL
for(i in 1:length(covnames2)) {
  pre.szd2[i] <- mb2$BeforeMatching[[i]]$sdiff.pooled
  post.szd2[i] <- mb2$AfterMatching[[i]]$sdiff.pooled
}

Now, we can have our final table.

Code

match_szd2 <- tibble(covnames2, pre.szd2, post.szd2, row.names=covnames2)
match_szd2

# A tibble: 12 × 4
   covnames2         pre.szd2 post.szd2 row.names        
   <chr>                <dbl>     <dbl> <chr>            
 1 poor_mental          54.7    8.29    poor_mental      
 2 unemployment          7.29   1.59    unemployment     
 3 income_inequality     8.73  -2.83    income_inequality
 4 social_score         -1.95  -6.19    social_score     
 5 diabetes             78.8    2.33    diabetes         
 6 insuff_sleep         70.3   -4.47    insuff_sleep     
 7 highschool_grad      36.8    4.50    highschool_grad  
 8 segregation         -67.0  -20.7     segregation      
 9 below_18              2.43  -3.17    below_18         
10 above_65             39.4  -16.9     above_65         
11 raw PS              153.    -0.00113 raw PS           
12 linear PS           140.     0.516   linear PS

We will use these results to build the love plot to better visualize the standardized differences.

7.2.3 Visualizing Standardized Differences

We will use the cobalt method to build plots for the standardized differences and the variance ratios.

Code

b2 <- bal.tab(match2, 
             dentist_n ~ poor_mental + unemployment  + income_inequality + social_score + 
                  diabetes  + insuff_sleep + highschool_grad + segregation +
                  below_18 + above_65 + ps1 + linps1, 
             data= chr_fin2, stats = c("m", "v"), un = TRUE,
             quick = FALSE)
b2

Balance Measures
                     Type Diff.Un V.Ratio.Un Diff.Adj V.Ratio.Adj
poor_mental       Contin.  0.5262     1.1800   0.0829      0.9952
unemployment      Contin.  0.0715     1.0862   0.0159      1.0950
income_inequality Contin.  0.0780     1.6836  -0.0283      1.4904
social_score      Contin. -0.0168     2.0485  -0.0619      1.2062
diabetes          Contin.  0.7114     1.5867   0.0233      1.0329
insuff_sleep      Contin.  0.6869     1.1012  -0.0447      0.7511
highschool_grad   Contin.  0.3478     1.2754   0.0450      1.7271
segregation       Contin. -0.6076     1.5551  -0.2071      1.2381
below_18          Contin.  0.0238     1.0897  -0.0317      0.9721
above_65          Contin.  0.3989     0.9554  -0.1692      0.5954
ps1               Contin.  1.5579     0.9335  -0.0000      0.9933
linps1            Contin.  1.4076     0.9818   0.0052      1.0257

Sample sizes
                     Control Treated
All                    777.      777
Matched (ESS)          126.7     777
Matched (Unweighted)   269.      777
Unmatched              508.        0

As previously mentioned, the effective size after 1:1 matching with replacement is 269 counties. We lost 508 counties from the sample because they were unmatched.

Now we will build the love plot for the standardized differences:

Code

lplot5 <- love.plot(b2, threshold = .1, size = 3,
               var.order = "unadjusted", stars = "raw",
               title = "Standardized Differences and 1:1 Matching with Replacement")
lplot5

All covariates have improved standardized difference after 1:1 matching adjustment, with the exception for social_score.

Plot of Variance Ratios

Code

lplot6 <- love.plot(b2, stat = "v",
               threshold = 1.25, size = 3,
               var.order = "unadjusted",
               title = "Variance Ratios and 1:1 Matching with Replacement")
lplot6 + theme_bw()

Most of the covariates displayed improved variance ratios with the exception of highschool_grad, insuff_sleep, unemployment, and above_65.

7.2.4 Building New Data Frame with Matched Sample

Code

matches2 <- factor(rep(match2$index.treated, 2))
chr.matchedsample2 <- cbind(matches2, chr_fin2[c(match2$index.control, match2$index.treated),])

Code

chr.matchedsample2 |> count(dentist_grp)

  dentist_grp   n
1        High 777
2         Low 777

Because the 1:1 matching method we used includes replacement, it is only natural to retain the original number of observations in the exposure groups.

7.2.5 Rubin’s Rules After Matching

First, we will check Rubin’s Rule 1:

Code

rubin1.match2 <- with(chr.matchedsample2,
      abs(100*(mean(linps1[dentist_n==1])-mean(linps1[dentist_n==0]))/sd(linps1)))
rubin1.match2

[1] 0.5203845

The absolute value of the standardized difference of the linear propensity score before adjustment was 114.8%. The value after adjustment (52.03%) shows substantial improvement, however the value is above 10% and far from to 0%. In this case, we barely pass Rubin’s Rule 1.

Next, we will check Rubin’s Rule 2:

Code

rubin2.match2 <- with(chr.matchedsample2, var(linps1[dentist_n==1])/var(linps1[dentist_n==0]))
rubin2.match2

[1] 1.032566

The previous ratio of the variance of the linear propensity score was 0.98. This value also shows mild improvement where the value is closer to 1 (1.03) and remains below 2. We also pass Rubin’s Rule 2 and can proceed to use the results of this PS adjustment to measure the estimates.

7.2.6 Estimating Effect of Exposure on Outcome After 1:1 Matching

Code

matches2 <- factor(rep(match2$index.treated, 2))
chr.matchedsamplef2 <- cbind(matches2, chr_fin2[c(match2$index.control, match2$index.treated),])

Now, we can produce an estimate using the matched sample from chr_fin2.

Code

chr.matchedsamplef2$matches2.f <- as.factor(chr.matchedsamplef2$matches2)

matched_mix.mod2 <- lmer(drinking_prop ~ dentist_n + (1 | matches2.f), REML = TRUE, data=chr.matchedsamplef2)

tidy_matched_mix.mod2 <- broom.mixed::tidy(matched_mix.mod2, conf.int = TRUE, conf.level = 0.95) |> 
  filter(term == "dentist_n")

tidy_matched_mix.mod2 |> kable(digits = 3)

effect	group	term	estimate	std.error	statistic	conf.low	conf.high
fixed	NA	dentist_n	0.396	0.122	3.255	0.158	0.635

After 1:1 matching with replacement, our model estimates the effect of low dentists proportions on excessive drinking prevalence in US counties to increase by 0.40 percentage points (95% CI: 0.16, 0.64). Our confidence interval does not include 0, therefore, following up with a sensitivity analysis is a reasonable choice.

7.3 Use ATT Weighting Adjustment

We will start by creating a data frame of our data set.

Code

chrfin_df2 <- base::data.frame(chr_fin2)

Now, we will calculate the propensity score using the ps function from the twang package.

Code

ps.chr2 <- twang::ps(dentist_n ~ poor_mental + unemployment  + income_inequality + social_score + 
                  diabetes  + insuff_sleep + highschool_grad + segregation +
                  below_18 + above_65 + ps1 + linps1,
             data = chrfin_df2,
             n.trees = 5000,
             interaction.depth = 2,
             stop.method = c("es.mean"),
             estimand = "ATT",
             verbose = FALSE)

Checking the iteration:

Code

plot(ps.chr2)

We can see that we have run enough iterations to reach a stable estimate.

Observing the effective sample size of the weighted estimates:

Code

summary(ps.chr2)

            n.treat n.ctrl ess.treat ess.ctrl    max.es   mean.es    max.ks
unw             777    777       777 777.0000 1.5578591 0.5361848 0.5611326
es.mean.ATT     777    777       777 163.8941 0.1794596 0.0667433 0.1205385
            max.ks.p    mean.ks iter
unw               NA 0.26490776   NA
es.mean.ATT       NA 0.07668971 1111

The effective size after ATT weighting is 163 counties.

7.3.1 Checking the balance

Code

plot(ps.chr2, plots = 2)

Code

plot(ps.chr2, plots = 3)

We can see that the weighted estimates have an absolute standard differences are below 0.5 and are closer to 0 compared to the unadjusted estimates.

We can also assess the balance using the cobalt method.

Code

b3 <- cobalt::bal.tab(ps.chr2, full.stop.method = "es.mean.att", 
        stats = c("m", "v"), un = TRUE)

b3

Balance Measures
                      Type Diff.Un V.Ratio.Un Diff.Adj V.Ratio.Adj
prop.score        Distance  2.2674     0.7698   0.5561      0.6932
poor_mental        Contin.  0.5262     1.1800   0.0079      0.9499
unemployment       Contin.  0.0715     1.0862   0.0512      1.1563
income_inequality  Contin.  0.0780     1.6836   0.0068      1.3792
social_score       Contin. -0.0168     2.0485   0.0007      1.0241
diabetes           Contin.  0.7114     1.5867   0.1795      1.4289
insuff_sleep       Contin.  0.6869     1.1012   0.0630      0.9451
highschool_grad    Contin.  0.3478     1.2754  -0.0281      1.3991
segregation        Contin. -0.6076     1.5551  -0.0901      0.9325
below_18           Contin.  0.0238     1.0897  -0.0110      1.0725
above_65           Contin.  0.3989     0.9554  -0.0601      0.8466
ps1                Contin.  1.5579     0.9335   0.1684      0.8879
linps1             Contin.  1.4076     0.9818   0.1342      0.8905

Effective sample sizes
           Control Treated
Unadjusted  777.       777
Adjusted    163.89     777

We will visualize the standardized differences using the Love plot.

Code

lplot7 <- cobalt::love.plot(b3, 
               threshold = .1, size = 3, stars = "raw",
               title = "Standardized Diffs and TWANG ATT weighting")
lplot7

All of the variables have observable improved standardized difference after TWANG ATT weighting adjustment.

Visualizing the variance ratio:

Code

lplot8 <- cobalt::love.plot(b3, stat = "v",
               threshold = 1.25, size = 3, 
               title = "Variance Ratios: TWANG ATT weighting")
lplot8 + theme_bw()

Most of the variables show some improvement in the variance ration after adjustment, though it is not very satisfying.

7.3.2 Estimating The Causal Effect After ATT Weight Adjustment

We will use the weight design previously created wt1 and chrwt1.design.

Code

adjout2.wt1 <- survey::svyglm(drinking_prop ~ dentist_n, design= chrwt1.design)

wt_att_results2 <- tidy(adjout2.wt1, conf.int = TRUE) |> filter(term == "dentist_n")

wt_att_results2

# A tibble: 1 × 7
  term      estimate std.error statistic p.value conf.low conf.high
  <chr>        <dbl>     <dbl>     <dbl>   <dbl>    <dbl>     <dbl>
1 dentist_n     1.15     0.520      2.20  0.0276    0.126      2.17

After ATT weight adjustment, the estimated impact of low dentist proportions on excessive drinking prevalence in US counties is an increase of 1.15 percentage points (95% CI: 0.13, 2.17).

7.4 Use Double Robust Adjustment

We will use the previous double robust design chr.dr.design.

Code

dr.out2.wt1 <- survey::svyglm(drinking_prop ~ dentist_n + linps1, design= chr.dr.design)

dr_att_out2 <- tidy(dr.out2.wt1, conf.int = TRUE) |> filter(term == "dentist_n")

dr_att_out2

# A tibble: 1 × 7
  term      estimate std.error statistic p.value conf.low conf.high
  <chr>        <dbl>     <dbl>     <dbl>   <dbl>    <dbl>     <dbl>
1 dentist_n   -0.106     0.222    -0.477   0.633   -0.542     0.330

After Double Robust adjustment, the estimated impact of low dentist proportions on excessive drinking prevalence in US counties is a decrease of -0.11 percentage points (95% CI: -0.54, 0.33). These results are very different from the ATT Weighting results.

7.5 Compare Results

Code

chr_results2 <- tibble(
    analysis = c("Unadjusted", "PS 1:1 Match", "TWANG ATT WT", "Double Robust"),
    estimate = c(-2.56, 0.40, 1.15, -0.11),
    conf.low = c(-2.87, 0.16, 0.13, -0.54),
    conf.high = c(-2.26, 0.64, 2.17, 0.33))

ggplot(chr_results2, aes(x = analysis, y = estimate)) +
    geom_errorbar(aes(ymax = conf.high, ymin = conf.low), width = 0.5) + 
    geom_label(aes(label = estimate), size = 5)

We can see that propensity score adjustment inversed the estimated affect of dentist proportions on excessive drinking prevalence in US counties (from -2.56% to 0.4-1.2%) . While two of the three adjustments present close estimates, the 1:1 matching with replacement adjustment displays the smallest confidence interval, the best balance, and largest effective size (273 after 1:1 matching compared to 163 after ATT weighting).

7.6 Sensitivity Analysis

The estimate produced from the 1:1 Matching with replacement has a statistically significant p-value. Therefore, we will need to perform a sensitivity analysis.

Code

Rc2 <- match2$mdata$Y[match2$mdata$Tr==0]
Rt2 <- match2$mdata$Y[match2$mdata$Tr==1] 


psens(Rt2, Rc2, Gamma = 5, GammaInc = 0.25)


 Rosenbaum Sensitivity Test for Wilcoxon Signed Rank P-Value 
 
Unconfounded estimate ....  NaN 

 Gamma Lower bound Upper bound
  1.00         NaN         NaN
  1.25         NaN         NaN
  1.50         NaN         NaN
  1.75         NaN         NaN
  2.00         NaN         NaN
  2.25         NaN         NaN
  2.50         NaN         NaN
  2.75         NaN         NaN
  3.00         NaN         NaN
  3.25         NaN         NaN
  3.50         NaN         NaN
  3.75         NaN         NaN
  4.00         NaN         NaN
  4.25         NaN         NaN
  4.50         NaN         NaN
  4.75         NaN         NaN
  5.00         NaN         NaN

 Note: Gamma is Odds of Differential Assignment To
 Treatment Due to Unobserved Factors

Our conclusions regarding the impact of low dentists proportions increasing excessive drinking prevalence in US counties hold true in absence of biases in the study. This conclusion is maintained in the presence of a biases level between L= 1.00 and L= 1.25. This means that only small hidden bias is needed to change our conclusions.

8 Discussion

This study found that counties with low dentist proportions are associated with a small increase in smoking prevalence and excessive drinking prevalence among adults (0.78% and 0.4% respectively). Although these estimates provided a meaningful size effect, the sensitivity analysis concluded that our conclusions may change in the presence of small hidden bias (gamma between 1.0 and 1.25). The weak findings may be attributed to the existing low number of dentists per 1000 population in US counties. Despite the attempt to dichotomize the dentist variable by using the first and third quartiles (0.2 and 0.6 per 1000 population), the difference was not large enough to detect observable impact. It is also important to note that increased presence of dentists does not spontaneously reduce negative oral habits (smoking and excessive drinking). The quality care and accessibility of services is an important factor in improving oral health and other related health behaviors.

This project emphasized the importance of using confounders to assess selection biases in observational studies. The difference in the exposure’s effect estimates after propensity score adjustment was remarkable. While our initial belief was that having more covariates would improve the analysis, this project corrected that misconception. In our analysis, multiple covariates were dropped due to high correlation, collinearity, or for disrupting the propensity score values. Selecting and evaluating appropriate covariates was the most critical part for the propensity score analysis. To improve my study in the future, we will need to explore more variables from the County Health Rankings (CHR) as we may have missed other useful variables during the initial selection phase. Also, using the Health Professional Shortage Area classification (HPSA) instead of the number of dentists per population can provide a better assessment of the impact of dentist quantities on oral-related health behaviors in US counties.

9 Session Information

Code

sessionInfo()

R version 4.5.1 (2025-06-13 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 26100)

Matrix products: default
  LAPACK version 3.12.1

locale:
[1] LC_COLLATE=English_United States.utf8 
[2] LC_CTYPE=English_United States.utf8   
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.utf8    

time zone: America/New_York
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] lubridate_1.9.4    forcats_1.0.1      stringr_1.5.2      purrr_1.1.0       
 [5] tidyr_1.3.1        tibble_3.3.0       tidyverse_2.0.0    rbounds_2.2       
 [9] car_3.1-3          carData_3.0-5      twang_2.6.1        patchwork_1.3.2   
[13] mice_3.18.0        survival_3.8-3     broom_1.0.10       cobalt_4.6.1      
[17] arm_1.14-4         lme4_1.1-37        Matching_4.10-15   MASS_7.3-65       
[21] Epi_2.61           gt_1.1.0           mosaic_1.9.2       mosaicData_0.20.4 
[25] ggformula_1.0.0    Matrix_1.7-3       ggplot2_4.0.0      lattice_0.22-7    
[29] tableone_0.13.2    see_0.12.0         report_0.6.1       parameters_0.28.2 
[33] performance_0.15.2 modelbased_0.13.0  insight_1.4.2      effectsize_1.0.1  
[37] datawizard_1.3.0   correlation_0.8.8  bayestestR_0.17.0  easystats_0.7.5   
[41] naniar_1.1.0       knitr_1.50         Hmisc_5.2-4        magrittr_2.0.4    
[45] readr_2.1.5        dplyr_1.1.4        janitor_2.2.1     

loaded via a namespace (and not attached):
  [1] splines_4.5.1           rpart_4.1.24            lifecycle_1.0.4        
  [4] Rdpack_2.6.4            globals_0.18.0          vroom_1.6.6            
  [7] backports_1.5.0         survey_4.4-8            sass_0.4.10            
 [10] rmarkdown_2.30          yaml_2.3.10             DBI_1.2.3              
 [13] minqa_1.2.8             RColorBrewer_1.1-3      abind_1.4-8            
 [16] nnet_7.3-20             gdtools_0.4.4           labelled_2.15.0        
 [19] gbm_2.2.2               listenv_0.9.1           MatrixModels_0.5-4     
 [22] parallelly_1.45.1       codetools_0.2-20        xml2_1.4.0             
 [25] tidyselect_1.2.1        shape_1.4.6.1           farver_2.1.2           
 [28] broom.mixed_0.2.9.6     base64enc_0.1-3         jsonlite_2.0.0         
 [31] e1071_1.7-16            mitml_0.4-5             Formula_1.2-5          
 [34] ggridges_0.5.7          iterators_1.0.14        systemfonts_1.3.1      
 [37] foreach_1.5.2           tools_4.5.1             cmprsk_2.2-12          
 [40] Rcpp_1.1.0              glue_1.8.0              gridExtra_2.3          
 [43] pan_1.9                 xfun_0.52               chk_0.10.0             
 [46] mgcv_1.9-3              withr_3.0.2             numDeriv_2016.8-1.1    
 [49] fastmap_1.2.0           mitools_2.4             latticeExtra_0.6-31    
 [52] boot_1.3-31             digest_0.6.37           timechange_0.3.0       
 [55] R6_2.6.1                visdat_0.6.0            colorspace_2.1-2       
 [58] jpeg_0.1-11             utf8_1.2.6              generics_0.1.4         
 [61] fontLiberation_0.1.0    data.table_1.17.8       class_7.3-23           
 [64] htmlwidgets_1.6.4       pkgconfig_2.0.3         gtable_0.3.6           
 [67] S7_0.2.0                furrr_0.3.1             htmltools_0.5.8.1      
 [70] fontBitstreamVera_0.1.1 scales_1.4.0            png_0.1-8              
 [73] reformulas_0.4.1        snakecase_0.11.1        rstudioapi_0.17.1      
 [76] tzdb_0.5.0              coda_0.19-4.1           checkmate_2.3.3        
 [79] nlme_3.1-168            nloptr_2.2.1            proxy_0.4-27           
 [82] zoo_1.8-14              parallel_4.5.1          foreign_0.8-90         
 [85] pillar_1.11.1           grid_4.5.1              vctrs_0.6.5            
 [88] jomo_2.7-6              xtable_1.8-4            cluster_2.1.8.1        
 [91] htmlTable_2.4.3         evaluate_1.0.5          cli_3.6.5              
 [94] etm_1.1.2               compiler_4.5.1          rlang_1.1.6            
 [97] crayon_1.5.3            labeling_0.4.3          interp_1.1-6           
[100] plyr_1.8.9              fs_1.6.6                ggiraph_0.9.2          
[103] stringi_1.8.7           deldir_2.0-4            glmnet_4.1-10          
[106] mosaicCore_0.9.5        fontquiver_0.2.1        hms_1.1.4              
[109] bit64_4.6.0-1           future_1.67.0           haven_2.5.5            
[112] rbibutils_2.3           bit_4.6.0               xgboost_1.7.11.1

R Setup

Loading CHR Data

2 Data Management

2.1 Joining data sets

The Codebook

2.2 Renaming the variables

2.3 Categorizing Exposure dentist_grp

2.4 Numeric Summaries Before Imputation

2.5 Dealing with Missingness

2.5.1 Filter Complete Cases For Exposure dentist_prop

2.5.2 Single Imputation for Covariates

3 Final Variable Selection

3.1 Check Correlation

4 Table 1

5 Propensity Score Assessment

5.1 Calculating and Evaluating PS Balance

5.1.1 Saving Propesnsity Score

5.1.2 Comparing The PS Distribution of The Two Exposure Groups

5.1.3 Fixed Propensity Score

5.1.4 Saving Propesnsity Score

5.1.5 Comparing The PS Distribution of The Two Exposure Groups

5.1.5.1 Fixed Propensity Score - Part 2

5.1.5.1.1 poor_mental

5.1.5.1.2 Saving Propesnsity Score

5.1.5.1.3 Comparing The PS Distribution of The Two Exposure Groups

5.1.5.1.4 food_index

5.1.5.1.5 Saving Propesnsity Score

5.1.5.1.6 Comparing The PS Distribution of The Two Exposure Groups

5.1.5.1.7 uninsured

5.1.5.1.8 Saving Propesnsity Score

5.1.5.1.9 Comparing The PS Distribution of The Two Exposure Groups

5.1.5.1.10 income_inequality

5.1.5.1.11 Saving Propesnsity Score

5.1.5.1.12 Comparing The PS Distribution of The Two Exposure Groups

5.1.5.1.13 primary_care

5.1.5.1.14 Saving Propesnsity Score

5.1.5.1.15 Comparing The PS Distribution of The Two Exposure Groups

5.1.5.1.16 mental_care

5.1.5.1.17 Saving Propesnsity Score

5.1.5.1.18 Comparing The PS Distribution of The Two Exposure Groups

5.1.5.1.19 highschool_grad

5.1.5.1.20 Saving Propesnsity Score

5.1.5.1.21 Comparing The PS Distribution of The Two Exposure Groups

5.1.5.1.22 some_college

5.1.5.1.23 Saving Propesnsity Score

5.1.5.1.24 Comparing The PS Distribution of The Two Exposure Groups

5.1.5.1.25 unemployment

5.1.5.1.26 Saving Propesnsity Score

5.1.5.1.27 Comparing The PS Distribution of The Two Exposure Groups

5.1.5.1.28 social_score

5.1.5.1.29 Saving Propesnsity Score

5.1.5.1.30 Comparing The PS Distribution of The Two Exposure Groups

5.1.5.1.31 life_expectancy

5.1.5.1.32 Saving Propesnsity Score

5.1.5.1.33 Comparing The PS Distribution of The Two Exposure Groups

5.1.5.1.34 insuff_sleep

5.1.5.1.35 Saving Propesnsity Score

5.1.5.1.36 Comparing The PS Distribution of The Two Exposure Groups

5.1.5.1.37 home_ownership

5.1.5.1.38 Saving Propesnsity Score

5.1.5.1.39 Comparing The PS Distribution of The Two Exposure Groups

5.1.5.1.40 housing_burden

5.1.5.1.41 Saving Propesnsity Score

5.1.5.1.42 Comparing The PS Distribution of The Two Exposure Groups

5.1.5.1.43 below_18

5.1.5.1.44 Saving Propesnsity Score

5.1.5.1.45 Comparing The PS Distribution of The Two Exposure Groups

5.1.5.1.46 above_65

5.1.5.1.47 Saving Propesnsity Score

5.1.5.1.48 Comparing The PS Distribution of The Two Exposure Groups

5.1.5.1.49 asian

5.1.5.1.50 Saving Propesnsity Score

5.1.5.1.51 Comparing The PS Distribution of The Two Exposure Groups

5.1.5.1.52 female

5.1.5.1.53 Saving Propesnsity Score

5.1.5.1.54 Comparing The PS Distribution of The Two Exposure Groups

5.1.5.1.55 rural

5.1.5.1.56 Saving Propesnsity Score

5.1.5.1.57 Comparing The PS Distribution of The Two Exposure Groups

5.1.5.1.58 obesity

2.3 Categorizing Exposure `dentist_grp`

2.5.1 Filter Complete Cases For Exposure `dentist_prop`

5.1.5.1.1 `poor_mental`

5.1.5.1.4 `food_index`

5.1.5.1.7 `uninsured`

5.1.5.1.10 `income_inequality`

5.1.5.1.13 `primary_care`

5.1.5.1.16 `mental_care`

5.1.5.1.19 `highschool_grad`

5.1.5.1.22 `some_college`

5.1.5.1.25 `unemployment`

5.1.5.1.28 `social_score`

5.1.5.1.31 `life_expectancy`

5.1.5.1.34 `insuff_sleep`

5.1.5.1.37 `home_ownership`

5.1.5.1.40 `housing_burden`

5.1.5.1.43 `below_18`

5.1.5.1.46 `above_65`

5.1.5.1.49 `asian`

5.1.5.1.52 `female`

5.1.5.1.55 `rural`

5.1.5.1.58 `obesity`

5.1.5.1.61 `segregation`

6.2 Unadjusted Outcome 1 `smoking_prop`

7.1 Unadjusted Outcome 2 `drinking_prop`