Although theory and evidence have revealed numerous social determinants of educational attainment, we still know little about which determinants matter most, for whom, and in which contexts. Further, prior findings describe predominantly White samples and may not hold in racial/ethnic minority groups. Grounded in the Cultural Ecological Model, this project will conduct a comprehensive machine learning analysis of the large-scale Add Health national longitudinal study to: identify which social determinants measured during adolescence, including individual, family, school, and neighborhood factors, are most predictive of educational attainment in White, Black, and Latinx adults, respectively; interpret the prediction patterns of the key determinants within each racial/ethnic group, including their directionality, (non)linearity, and interactions; and reveal differences in the prediction models across these groups. Analyses will train random forest models on hundreds of social determinants from Add Health Waves I (grades 7-12) and II (1 year later) to predict the highest level of education attained by Wave IV (ages 24-32) in the White, Black, and Latinx groups. Key predictors will be identified with feature selection and interpreted with partial dependence plots. Cross-group model testing, in which a model trained on one group is evaluated on another, will reveal differences in predictions across racial/ethnic groups.
Machine Learning for Family Research