Primary Care Demand and Capacity

Forecasting


Introduction

Appointment volumes have been increasing in SNEE (Suffolk and North East Essex). To estimate future appointment demand in the SNEE footprint, this project uses historical appointments data, analyzes key patterns and trends, and develops a predictive machine learning model to forecast the number of GP appointments at a regional level within the NHS framework. The aim is to provide data-driven insights that help healthcare providers with resource allocation, planning, and operational management. Because patient numbers fluctuate for many reasons, forecasting future appointment numbers can help reduce both overbooking and underutilization of healthcare services. The ultimate goal is a reliable and efficient predictive model that accounts for historical data and other relevant factors.

Data source

The primary data for this analysis comes from GP appointment datasets published by NHS England, supplemented by population projections from the ONS.

| Dataset | Description | Website URL | Download Zip |
| --- | --- | --- | --- |
| NHS GP Appointments by Region | Spans Nov 2021 to Apr 2024 and provides current GP appointment data at sub-ICB level, including healthcare professional types, appointment counts, and months. | NHS GP Appointments by Region | Download |
| NHS GP Appointments Historical Data | Covers Oct 2019 to Mar 2022 and allows analysis of GP appointment trends, with details on healthcare professional types and appointment counts. | NHS GP Appointments Historical Data | Download |
| Population Projections for CCGs by ONS | Used for forward-looking appointment estimates in the model, based on 2018 projections revised in 2020. Data is at CCG (now sub-ICB) level. | ONS Population Projections | N/A |



Methodology

The predictive approach leverages machine learning techniques implemented through Scikit-learn, with particular focus on preprocessing large datasets, feature selection, and model evaluation using various metrics such as Mean Squared Error (MSE) and Root Mean Squared Error (RMSE).
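As a reference for the evaluation metrics named above (and the MAE reported in the model evaluation table), the following is a minimal sketch of how they are computed. The function names and the sample values are illustrative assumptions and are not taken from the project code, which would more likely rely on Scikit-learn's own metric functions.

```python
import math


def mse(actual, predicted):
    """Mean Squared Error: the average of the squared prediction errors."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error: square root of MSE, in the units of the target."""
    return math.sqrt(mse(actual, predicted))


def mae(actual, predicted):
    """Mean Absolute Error: the average of the absolute prediction errors."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up values purely for illustration (not project data).
actual = [3.0, 5.0, 7.0]
predicted = [2.0, 5.0, 9.0]

print(mse(actual, predicted), rmse(actual, predicted), mae(actual, predicted))
```

RMSE penalizes large errors more heavily than MAE, so comparing the two gives a rough indication of whether a few large misses dominate the error.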

Data Loading and Preprocessing:

Exploratory Data Analysis (EDA):

Modeling:

Forecasting:

Model Saving:



Primary care Appointments in England and SNEE-ICB

[Figures: primary care appointments in England and in SNEE-ICB]



Trend plot for Primary care Appointments/working-day in SNEE-ICB

[Figure: trend plot of primary care appointments per working day in SNEE-ICB]



Statistical Forecast

There is a clear 12-month seasonality, with the number of appointments peaking around October each year. When appointment counts are corrected for the number of working days, this seasonality becomes stronger.
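The working-day correction can be sketched as follows. This is a simplified illustration, not the project's implementation: it assumes a working day is any Monday to Friday and does not exclude bank holidays, and the monthly totals are made-up numbers. The function names are hypothetical.

```python
from datetime import date, timedelta


def working_days_in_month(year, month):
    """Count Monday-Friday days in a month (assumption: bank holidays are NOT excluded)."""
    day = date(year, month, 1)
    count = 0
    while day.month == month:
        if day.weekday() < 5:  # 0 = Monday ... 4 = Friday
            count += 1
        day += timedelta(days=1)
    return count


def appointments_per_working_day(monthly_totals):
    """Map each (year, month) key to its total divided by that month's working days."""
    return {
        (year, month): total / working_days_in_month(year, month)
        for (year, month), total in monthly_totals.items()
    }


# Hypothetical monthly appointment totals, for illustration only.
monthly_totals = {(2023, 9): 420_000, (2023, 10): 462_000}
rates = appointments_per_working_day(monthly_totals)

for (year, month), rate in sorted(rates.items()):
    print(year, month, rate)
```

Dividing by working days removes the effect of months simply having more or fewer weekdays, so the remaining month-to-month variation reflects demand rather than calendar length.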


Machine learning Model Evaluation

| Model | Best Params | RMSE | MAE |
| --- | --- | --- | --- |
| Linear Regression | {'model__copy_X': True, 'model__fit_intercept': True, 'model__n_jobs': None, 'preprocessor__pca__n_components': 4} | 0.002683 | 0.002065 |
| Lasso | {'model__alpha': 0.01, 'model__fit_intercept': True, 'model__max_iter': 1000, 'preprocessor__pca__n_components': 1} | 0.003091 | 0.002443 |
| Ridge | {'model__alpha': 0.1, 'model__fit_intercept': True, 'model__solver': 'auto', 'preprocessor__pca__n_components': 4} | 0.002672 | 0.002064 |
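The parameter names in the table (`preprocessor__pca__n_components`, `model__alpha`) suggest a Scikit-learn pipeline with a `preprocessor` step that contains a `pca` sub-step, followed by a `model` step, tuned by grid search. The sketch below shows one way such a setup could look. The scaler, the parameter grid, the cross-validation settings, and the synthetic data are all assumptions made for illustration; they are not the project's actual configuration or data.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for the real features and target (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = X @ np.array([0.5, -0.2, 0.1, 0.0, 0.3, 0.0]) + rng.normal(scale=0.05, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Step names are chosen to reproduce the parameter names seen in the table.
# The StandardScaler step is an assumption; the source only reveals the PCA step.
preprocessor = Pipeline([("scaler", StandardScaler()), ("pca", PCA())])
pipeline = Pipeline([("preprocessor", preprocessor), ("model", Ridge())])

# Hypothetical search grid.
param_grid = {
    "preprocessor__pca__n_components": [1, 2, 4],
    "model__alpha": [0.01, 0.1, 1.0],
}

search = GridSearchCV(
    pipeline, param_grid, scoring="neg_root_mean_squared_error", cv=5
)
search.fit(X_train, y_train)

# Evaluate the refitted best pipeline on the held-out data.
y_pred = search.predict(X_test)
rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))
mae = float(mean_absolute_error(y_test, y_pred))

print(search.best_params_, rmse, mae)
```

For monthly appointment data, a time-ordered split such as `TimeSeriesSplit` may be a better fit than the shuffled split used here, since it avoids training on months that come after the test period.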



Ridge Regression Models Outputs

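[Figure: Ridge regression diagnostic plots showing actual vs. predicted values (left), residuals vs. predicted values (middle), and a QQ plot of residuals (right)]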

Actual vs. Predicted Values (Left Plot):

Residuals vs. Predicted Values (Middle Plot):

QQ Plot of Residuals (Right Plot):
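For readers unfamiliar with these diagnostics, the sketch below shows how residuals and the points of a normal QQ plot can be derived from actual and predicted values. It uses only the Python standard library. The helper name `qq_points`, the plotting positions, and the sample values are assumptions for illustration and do not come from the project.

```python
from statistics import NormalDist, mean, stdev


def qq_points(residuals):
    """Return (theoretical quantile, standardized residual) pairs for a normal QQ plot."""
    n = len(residuals)
    mu, sigma = mean(residuals), stdev(residuals)
    standardized = sorted((r - mu) / sigma for r in residuals)
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    # Plotting positions (i - 0.5) / n avoid the infinite quantiles at 0 and 1.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, standardized))


# Made-up values purely for illustration (not project data).
actual = [0.101, 0.098, 0.105, 0.097, 0.110, 0.102]
predicted = [0.100, 0.099, 0.103, 0.099, 0.106, 0.102]

# Residual = actual - predicted; plotted against `predicted` in the middle plot.
residuals = [a - p for a, p in zip(actual, predicted)]
points = qq_points(residuals)

for theoretical_q, sample_q in points:
    print(round(theoretical_q, 3), round(sample_q, 3))
```

If the residuals are approximately normal, the pairs fall close to the line y = x. Systematic departures at the ends point to heavy tails or outliers, and a fan shape in the residuals vs. predicted plot points to heteroscedasticity.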



Conclusion

Overall, the model performs reasonably well but not perfectly. The residuals are generally well distributed, though there are signs of slight heteroscedasticity, along with some outliers or non-normality. The model’s predictions are close to the actual values but not exact, and there is room for improvement in how it handles extreme values or certain ranges of the data.