Abstract
Survival prediction most commonly relies on Cox Proportional Hazards (CPH) models, which are widely used in medical statistics and clinical practice. However, such models underperform when values of the predictor variables are missing. Bayesian networks (BNs) allow us to automatically construct a model of the most important risk factors and the relationships between them, and they can infer the likely values of missing data. We therefore propose a hybrid solution, which we call CSBN, consisting of a CPH model and a BN in which the predictor variables of the CPH model are child nodes of the BN. We learn the CPH and BN models separately, using standard techniques, with the only constraint being that the variables that are predictors in the CPH model are child nodes in the BN. This allows us to fuse the two models, using the predictors of the CPH model as the join points. We test our approach by comparing the performance of the CPH model against that of the hybrid CSBN model, on both complete data cases and cases with missing data. We measure survival prediction performance for both CPH and CSBN using the C-index and a normalised error function. For the CPH model, predictive error was significantly larger for missing data (±3120.8 days) than for complete data (±1171.5 days; p = 3.6e−07). This was also true for the CSBN: ±1387.3 days for missing data compared with ±1171.5 days for complete data (p = 0.01568). However, with missing data, the predictive error was significantly larger for the CPH model (±3120.8 days) than for the CSBN (±1171.5 days; p = 0.03274). In conclusion, the CSBN methodology provides a more effective method of predicting survival when using incomplete data.
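For readers unfamiliar with the C-index used as an evaluation metric above, the following is a minimal sketch of one common definition (Harrell's concordance index). The abstract does not state which variant the authors used, so the choice of Harrell's form, the function name `concordance_index`, and the toy data are all assumptions made for illustration only.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index (assumed variant; hypothetical helper).

    times       -- observed follow-up time per subject
    events      -- 1 if the event was observed, 0 if censored
    risk_scores -- model output where a higher score means higher risk
                   (for example, a Cox linear predictor)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A censored subject cannot be the earlier member of a pair.
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier event had higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (illustrative only, not from the paper).
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.4, 0.1]
toy_c_index = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 1.0 means the predicted risk ordering agrees with the observed ordering on every comparable pair, 0.5 corresponds to uninformative predictions, and 0.0 to a fully reversed ordering.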
Original language | English |
---|---|
Publication status | Published - 14 Sept 2018 |
Event | ECML/PKDD Workshop on Advanced Analytics and Learning on Temporal Data, Dublin, Ireland. Duration: 14 Sept 2018 → 14 Sept 2018. https://project.inria.fr/aaldt18/ |
Workshop
Workshop | ECML/PKDD Workshop on Advanced Analytics and Learning on Temporal Data |
---|---|
Abbreviated title | AALTD'18 |
Country/Territory | Ireland |
City | Dublin |
Period | 14/09/18 → 14/09/18 |