Shifan Jia
Title: Advanced Functional Data Analysis Methods: From Regression Modeling to Multimodal Learning
Date: April 29th 2026
Time: 10am
Location: Zoom
Supervised by: Haolun Shi & Tianyu Guan
Abstract:
Functional data analysis (FDA) treats observations as continuous curves or functions and is a topic of increasing interest in the statistics community. FDA has been widely applied in many fields, including economics, medicine, time-series analysis, and spatial data analysis. In this thesis, we propose novel models that address important challenges in modeling complex functional data, ranging from regression models to deep learning frameworks. We first introduce a function-on-function combined regression model that predicts a functional response through both a nonlinear dynamic effect of one functional predictor and a linear concurrent effect of another functional predictor. This model improves the flexibility of nonlinear modeling while preserving the interpretability of the linear concurrent effect. Second, we propose a variable-domain weighted scalar-on-function regression model to capture the dynamic effects of gestational weight gain (GWG) on newborn weight and to predict newborn weight. Because the length of gestation varies across women, the observed domain of the GWG trajectory differs among individuals, which motivates a variable-domain framework. In addition, our model incorporates heteroscedasticity in the error structure to account for the varying variances associated with different gestational ages at birth, thereby enhancing model flexibility and accuracy. Finally, we present a deep learning framework for survival modeling with multimodal inputs and multiple competing risks. It incorporates ideas from functional data analysis, such as B-spline basis representations, to estimate smooth continuous-time cause-specific density incidence functions, and it extends traditional statistical models by incorporating an attention-based shared encoder.