Khalida

Survival of Patient

This project aims to develop a predictive model that estimates the likelihood of a patient surviving one year after treatment. It is intended for a hospital that wants to improve patient care by identifying survival predictors in its historical data.

Project Overview

Despite previous efforts, the hospital has struggled to isolate the factors most strongly associated with improved survival rates. This project addresses that gap by training machine learning models to predict survival accurately, so that the hospital can move toward more effective treatment protocols.

Objectives

The main objective is to deliver a high-accuracy model that predicts one-year survival from features such as demographic details, medical history, and treatment factors.

Methodology

  • Data Cleaning: Initial data preparation included handling missing values through median and mode imputation, removing uncertain responses (e.g., “Cannot say” for smoking status), and identifying outliers using z-scores.
  • Model Selection and Training: A suite of machine learning algorithms was trained, including Decision Trees, AdaBoost, Gradient Boosting, Random Forest, Support Vector Machine (SVM), and Logistic Regression. Each model underwent hyperparameter tuning to enhance performance.
  • Hyperparameter Tuning: GridSearchCV was applied to search a grid of candidate settings for each model and select the best-performing combination.
  • Comparison with AutoML: For further validation, PyCaret’s AutoML was used to benchmark the models, giving an automated comparison of model performance and hyperparameters and an efficient way to evaluate many candidate models.
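
The data cleaning step above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column name smoking_status, the function name clean_patient_data, and the z-score threshold of 3 are assumptions made for the example. Outliers are flagged rather than dropped, since the write-up only mentions identifying them.

```python
import numpy as np
import pandas as pd


def clean_patient_data(df, z_threshold=3.0):
    """Hypothetical cleaning helper: filter, impute, then flag outliers."""
    # Remove uncertain responses; rows with a missing status are kept for imputation.
    df = df[df["smoking_status"] != "Cannot say"].copy()

    numeric_cols = df.select_dtypes(include="number").columns
    categorical_cols = df.columns.difference(numeric_cols)

    # Median imputation for numeric features.
    for col in numeric_cols:
        df[col] = df[col].fillna(df[col].median())

    # Mode imputation for categorical features.
    for col in categorical_cols:
        modes = df[col].mode()
        if not modes.empty:
            df[col] = df[col].fillna(modes.iloc[0])

    # Z-scores per numeric column; constant columns yield NaN and are never flagged.
    std = df[numeric_cols].std(ddof=0).replace(0, np.nan)
    z_scores = (df[numeric_cols] - df[numeric_cols].mean()) / std
    df["is_outlier"] = (z_scores.abs() > z_threshold).any(axis=1)

    return df.reset_index(drop=True)
```

On small samples a lower threshold may be needed, because the largest possible z-score is bounded by the sample size.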

Key Findings

After extensive hyperparameter tuning, Gradient Boosting reached an accuracy of 80%. PyCaret’s AutoML comparison was consistent with this result, supporting Gradient Boosting as a good fit for this data.
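
For illustration, tuning Gradient Boosting with GridSearchCV and measuring accuracy on held-out data might look like the sketch below. The synthetic dataset, the parameter grid, and the train/test split are assumptions chosen to keep the example self-contained; the project's actual data, grid, and evaluation protocol are not given here, so the printed accuracy will not match the 80% reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the hospital data (assumption: binary survival label).
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Illustrative grid; the values are placeholders, not the project's settings.
param_grid = {
    "n_estimators": [50, 100],
    "learning_rate": [0.05, 0.1],
    "max_depth": [2, 3],
}

# Exhaustive search over the grid with 3-fold cross-validation on the training set.
search = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid,
    scoring="accuracy",
    cv=3,
)
search.fit(X_train, y_train)

# The best estimator is refit on the full training set, then scored on held-out data.
test_accuracy = accuracy_score(y_test, search.predict(X_test))
print("best parameters:", search.best_params_)
print(f"held-out accuracy: {test_accuracy:.2f}")
```

Scoring on data that the search never saw gives a less optimistic estimate than the cross-validation score used to pick the parameters.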

Expected Outcomes

The project aims to deliver an accurate predictive tool for healthcare providers that supports decisions about patient care and could help personalize treatment based on predicted survival likelihood.