Hierarchical forecasting of hospital admissions- Classical forecast

June 09, 2021

Intro
Reconciliation
Dataset
Feature Engineering
Models
Evaluation
- ARIMA
- ARIMA vs ETS
Conclusion
- Error

Intro

The aim of this series of blog is do predict monthly admissions to Singapore public acute adult hospitals. The dataset starts from Jan 2016 and ends in Feb 2021. EDA for the dataset was explored in past posts ( part 1 ; part 2 ).

There are several approaches to forecast the admissions including

Summing up all the individual hospital admissions and forecasting the admissions at a national level
If the admission of all the hospitals is desired, either each hospital is forecasted individually or all the hospitals are forecasted as a global group
Forecasting at each hierarchical level. Every country has an organisation order to its public hospitals. In Singapore, there are 3 levels:

National level
|– Cluster level (Clusters are a network of hospitals based on geographical regions. There are 3 health clusters in Singapore.)
|– Hospital level (There are 8 public acute adult hospitals.)

Hierarchical modelling can provide figures to decision makers at each level of the healthcare system. This field of forecasting has gained much attention thanks to the M5 competition and soon there will be a comprehensive publications of innovations in hierarchical time series..

library(tidyverse)

# cleaned up dataset downloaded from  my github. Clean up of OG dataset done in 1st post
raw<- read_csv("https://raw.githubusercontent.com/notast/hierarchical-forecasting/main/stat_sg_CLEAN.csv") 

# convert to tsibble 
library(fpp3)
df_tsib<- raw %>%
  # monthly index- https://stackoverflow.com/questions/59538702/tsibble-how-do-you-get-around-implicit-gaps-when-there-are-none
  mutate(Date= yearmonth(as.character(Date))) %>%
  # if there are two index, need to use group_by and summarise to isolate a specific index 
  # https://www.mitchelloharawild.com/blog/feasts/
  as_tsibble(key= c(Hospital, Cluster), index= Date)

# A hierarchical established with `aggregate_key`.  
# LHS: parent/RHS: child 
(df_hts<- df_tsib %>% aggregate_key(Cluster/Hospital, Admission= sum(Admission, na.rm = T))%>%  
  # cant use between 
  mutate(Covid= case_when(
         Date==yearmonth("2020-01-01")~ "Yes", 
         Date==yearmonth("2020-02-01")~ "Yes",
         Date==yearmonth("2020-03-01")~ "Yes",
         Date==yearmonth("2020-04-01")~ "Yes",
         Date==yearmonth("2020-05-01")~ "Yes",
         Date==yearmonth("2020-06-01")~ "Yes",
         Date==yearmonth("2020-07-01")~ "Yes",
         T ~ "No")))

## # A tsibble: 744 x 5 [1M]
## # Key:       Cluster, Hospital [12]
##        Date Cluster      Hospital     Admission Covid
##       <mth> <chr*>       <chr*>           <dbl> <chr>
##  1 2016 Jan <aggregated> <aggregated>     26555 No   
##  2 2016 Feb <aggregated> <aggregated>     24898 No   
##  3 2016 Mar <aggregated> <aggregated>     28002 No   
##  4 2016 Apr <aggregated> <aggregated>     27488 No   
##  5 2016 May <aggregated> <aggregated>     27280 No   
##  6 2016 Jun <aggregated> <aggregated>     27724 No   
##  7 2016 Jul <aggregated> <aggregated>     28349 No   
##  8 2016 Aug <aggregated> <aggregated>     28640 No   
##  9 2016 Sep <aggregated> <aggregated>     27309 No   
## 10 2016 Oct <aggregated> <aggregated>     27790 No   
## # ... with 734 more rows

Two broad approaches were used for this project:

Classical approach which uses e.g. ETS and ARMIA.
Machine learning approach.

No approach is superior and it likely boils down to experimentation. This post will focus on the classical approach which is supported by the fpp3 meta-package, which follows tidyverse principles and is nicknamed the tidyverts .

Reconciliation

Traditionally, hierarchical forecasting using classical approach involved selecting a hierarchal level and forecast for that level before adding to higher levels or distributing it lower levels.. However, there are disadvantages to these techniques:

For bottoms-up approach: The bottom-level can be noisy and more difficult to predict.
For top-down approach: Traits of individual time series e.g. special events and different seasonal patterns are lost due to information aggregation and it produces less accurate forecasts at lower levels.

In reconciliation forecasting, all levels are individually forecasted as base forecasts and a regression model using ordinary linear squares (OLS) combines these predictions to give a set of optimal reconciled forecasts. fpp3 uses Minimum Trace (MINT) reconciliation technique which is an extension of the above technique that minimizes the variance of the reconciled forecasts. MINT uses all information from all levels of the hierarchy. Some information may be completely hidden or not easily identifiable at other levels e.g. seasonal difference will be smoothed at the superordinate level due to aggregation.

Dataset

The training set was from Jan 16 to Apr 20 (3 years, 4months) and the test set was from May 20 to Feb 21 (10 months) and the forecast future period would be Mar 21 to Dec 21 (10 months). The forecast horizon will be 10 months; in other words, to predict the admissions for the remaining of 2021.

Feature Engineering

Two external regressors were considered for some ARIMA models:

Heightened periods of Covid-19 between Jan 21- Jul 21.
Lag predictors of Covid peak periods. Though the peak Covid periods were over, individuals could have reframed from being unnecessarily admitted as they were afraid of the infectious nature associated with hospitals.

Models

Base models included:

ETS
ARIMA
ARIMA with Covid (peak period) as regressor
ARIMA with Covid regressor with 1 month lag
ARIMA with Covid regressor with 2 month lag
ARIMA with Covid regressor with 3 month lag

Three hierarchical forecasting techniques were used:

bottoms up bu
reconciliation using ordinary least square ols
reconciliation using minimum trace with sample covariance mint_cov

fun_reconcile<- function(R, M, B, BU="bu", OLS="ols", MINT="mint"){
  LHS= "Admission"
  RHS= R
  model_spec= as.formula(paste0(LHS, RHS, sep=""))
  
  df_hts %>% 
  filter( Date < yearmonth("2020 May")) %>%
  model(base = model_spec %>% M) %>% 
  reconcile(
  bu = bottom_up(base), ols = min_trace(base, method = "ols"), mint = min_trace(base, method = "mint_cov")) %>% 
#https://stackoverflow.com/questions/35023375/r-renaming-passed-columns-in-functions
    rename({{B}} :=base) %>% 
#https://www.tidyverse.org/blog/2020/02/glue-strings-and-tidy-eval/
    rename("{{B}}_{BU}":= bu) %>% rename("{{B}}_{OLS}" := ols) %>% rename("{{B}}_{MINT}" := mint) 
  } 

m_ets<- fun_reconcile("~ error() + trend() + season()", ETS, ets)
m_arima<- fun_reconcile("~ pdq() + PDQ()", ARIMA, arima)
m_arima_covid<- fun_reconcile("~ Covid", ARIMA, arima_covid)
m_arima_covidL1<- fun_reconcile("~ Covid +lag(Covid)", ARIMA, arima_covidL1)
m_arima_covidL2<- fun_reconcile("~ Covid +lag(Covid,1)", ARIMA, arima_covidL2)
m_arima_covidL3<- fun_reconcile("~ Covid +lag(Covid,2)", ARIMA, arima_covidL3)

# save models
save(m_ets, m_arima, m_arima_covid, m_arima_covidL1, m_arima_covidL2, m_arima_covidL3, file = "3bClassic")

Evaluation

ARIMA

The best ARIMA model class was selected using AICc. The best ARIMA model and ETS model were then evaluated against the test set using rmse and mae. AICc can only be used with models in the same class as the calculation for AICc for ARIMA and ETS are different

The best ARIMA model was one with an external regressor for Covid peak period, m_armia_covid, i.e. ARIMA(Admission ~ Covid).

fun_reconcile_glance<- function(mab){
  glance(mab) %>% 
  group_by(.model) %>% summarise(avg_aicc=mean(AICc), sdv_aicc=sd(AICc))
}

bind_rows(fun_reconcile_glance(m_arima), 
          fun_reconcile_glance(m_arima_covid),
          fun_reconcile_glance(m_arima_covidL1), 
          fun_reconcile_glance(m_arima_covidL2), 
          fun_reconcile_glance(m_arima_covidL3)) %>% 
  arrange(avg_aicc, sort=T)

## # A tibble: 20 x 3
##    .model             avg_aicc sdv_aicc
##    <chr>                 <dbl>    <dbl>
##  1 arima_covid            685.     81.0
##  2 arima_covid_bu         685.     81.0
##  3 arima_covid_mint       685.     81.0
##  4 arima_covid_ols        685.     81.0
##  5 arima_covidL3          705.     66.3
##  6 arima_covidL3_bu       705.     66.3
##  7 arima_covidL3_mint     705.     66.3
##  8 arima_covidL3_ols      705.     66.3
##  9 arima_covidL1          715.     67.9
## 10 arima_covidL1_bu       715.     67.9
## 11 arima_covidL1_mint     715.     67.9
## 12 arima_covidL1_ols      715.     67.9
## 13 arima_covidL2          715.     67.9
## 14 arima_covidL2_bu       715.     67.9
## 15 arima_covidL2_mint     715.     67.9
## 16 arima_covidL2_ols      715.     67.9
## 17 arima                  720.     90.4
## 18 arima_bu               720.     90.4
## 19 arima_mint             720.     90.4
## 20 arima_ols              720.     90.4

ARIMA vs ETS

ARIMA was better than ETS.
Reconciliation approaches (i.e. arima_covid_mint, arima_covid_ols) performed better as they likely captured information from all levels of the hierarchy.
MINT technique performed better than OLS .

### test period ### 
# need to create manually, horizon in forecast doesnt work w external regressors
# https://stackoverflow.com/questions/67436733/how-to-reconcile-forecasts-with-hierarchical-structure-and-exogenous-regressors
test_set<- df_hts %>% filter( Date < yearmonth("2020 May")) %>% new_data(10)%>% 
  mutate(Covid= case_when(
         Date==yearmonth("2020-05-01")~ "Yes", 
         Date==yearmonth("2020-06-01")~ "Yes",
         Date==yearmonth("2020-07-01")~ "Yes",
          T ~ "No"))

### function for accuracy ###
fun_acc<- function(Forecasted){
    Forecasted %>% accuracy(df_hts, measures = list(rmse=RMSE, mae=MAE))}

### forecast on testing set ### 
m_arima_covid_test<- m_arima_covid %>% forecast(test_set) %>% fun_acc()

## Warning in mapply(FUN = .f, ..., MoreArgs = MoreArgs, SIMPLIFY = SIMPLIFY):
## longer argument not a multiple of length of shorter

## Warning in fc[btm] <- NextMethod(): number of items to replace is not a multiple
## of replacement length

m_ets_test<- m_ets %>% forecast(test_set) %>% fun_acc()

## Warning in mapply(FUN = .f, ..., MoreArgs = MoreArgs, SIMPLIFY = SIMPLIFY):
## longer argument not a multiple of length of shorter

## Warning in mapply(FUN = .f, ..., MoreArgs = MoreArgs, SIMPLIFY = SIMPLIFY):
## number of items to replace is not a multiple of replacement length

### function for average accuracy ###
fun_acc_avg<- function(ACC){
  ACC %>%  group_by(.model) %>%  summarise(avg_rmse=mean(rmse), avg_mae=mean(mae))
}

### compare models against test ###   
bind_rows(fun_acc_avg(m_arima_covid_test), 
          fun_acc_avg(m_ets_test)) %>% 
   arrange(avg_rmse, sort=T)

## # A tibble: 8 x 3
##   .model           avg_rmse avg_mae
##   <chr>               <dbl>   <dbl>
## 1 arima_covid_mint     847.    745.
## 2 arima_covid_ols     1085.    937.
## 3 arima_covid         1117.    991.
## 4 arima_covid_bu      1142.   1043.
## 5 ets                 1951.   1788.
## 6 ets_ols             2193.   2028.
## 7 ets_bu              2343.   2146.
## 8 ets_mint            3045.   2814.

Performance for each level

The above accuracy was for across all the time series. The performance for individual levels were reviewed and visualized. Specific hierarchical level information is filtered out using is_aggregated as an argument for filter.

filter(!is_aggregated(level)) does not aggregate members at that specific level thus analysis for that level can be conducted.
filter(is_aggregated(level)) aggregates values from that level and provides the aggregated information for the next level.

L_hospital<- m_arima_covid %>% select(arima_covid_mint) %>% forecast(test_set) %>%
  filter(!is_aggregated(Hospital)) 

L_cluster<-m_arima_covid %>% select(arima_covid_mint) %>% forecast(test_set) %>%
  filter(is_aggregated(Hospital), !is_aggregated(Cluster))

L_national<-m_arima_covid %>% select(arima_covid_mint) %>% forecast(test_set) %>%
    filter(is_aggregated(Cluster), is_aggregated(Hospital))

### Function accuracy for each level 
fun_acc_lvl<- function(DF, L){
  DF %>% 
    # get the accuracy for each member of the level 
    fun_acc() %>% 
    # take the average 
    fun_acc_avg() %>%
    # add level id
    mutate(Level=paste(L), .before=1)  
}

While smaller rmse are favoured in general, care needs to be taken when appreciating the rmse for each level as the magnitude of admission differs for each hierarchical level, the superordinate levels have more admissions thus a larger rmse can be expected.

bind_rows(fun_acc_lvl(L_hospital, "Hospital"), 
          fun_acc_lvl(L_cluster, "Cluster"), 
          fun_acc_lvl(L_national, "National"))

## # A tibble: 3 x 4
##   Level    .model           avg_rmse avg_mae
##   <chr>    <chr>               <dbl>   <dbl>
## 1 Hospital arima_covid_mint     700.    637.
## 2 Cluster  arima_covid_mint    1123.    959.
## 3 National arima_covid_mint    1190.    971.

Hospital level

The model overfitted 4/8 hospitals (AH, CGH, KTPH, NTFGH); underfitted 1/8 hospital (TTSH), appropriately fitted for 3/8 hospitals (NUH, SGH, SKH).

L_hospital %>% autoplot(df_hts %>% filter(Date > yearmonth("2020-01-01")))  +
 facet_wrap(vars(Hospital), scales = "free_y")  + guides(x =  guide_axis(angle = 40))

Cluster level

The model underfitted NHG, relatively fitted NUHS well, slightly overfitted SHS.

 L_cluster%>% autoplot(df_hts %>% filter(Date > yearmonth("2020-01-01")))

National level

The model does generally well at the national level.

L_national %>% autoplot(df_hts %>% filter(Date > yearmonth("2020-01-01")))

Conclusion

The best classical approach was an ARIMA model with an external regressor for Covid without any lags ARIMA(Admission ~ Covid) as the base and the forecast reconciled using minimum trace technique with sample covariance mint_cov. This approach achieved an average RMSE of 847 on the testing set. In the next post, machine learning approaches would be used.

Error

The following errors were encountered during scripting.

1. cant use `between` in `tsibble`

 df_tsib %>% aggregate_key(Cluster/Hospital, Admission= sum(Admission, na.rm = T))%>%  
  mutate(Covid= ifelse(between(Date, yearmonth("2020-01-01"),yearmonth("2020-05-01")), 
                               "yes", "no")) %>% 
  tibble() %>% count(Covid)

## Warning: between() called on numeric vector with S3 class

## # A tibble: 1 x 2
##   Covid     n
##   <chr> <int>
## 1 no      744

df_tsib %>% aggregate_key(Cluster/Hospital, Admission= sum(Admission, na.rm = T))%>%  
  # cant use between 
  mutate(Covid= case_when(
         Date==yearmonth("2020-01-01")~ "Yes", 
         Date==yearmonth("2020-02-01")~ "Yes",
         Date==yearmonth("2020-03-01")~ "Yes",
         Date==yearmonth("2020-04-01")~ "Yes",
         Date==yearmonth("2020-05-01")~ "Yes",
         T ~ "No")) %>%
  tibble() %>% count(Covid)

## # A tibble: 2 x 2
##   Covid     n
##   <chr> <int>
## 1 No      684
## 2 Yes      60

2. `horizon` argument in in `forecast` doesn’t work with external regressors

m_arima_covid %>% forecast(h=1)

## Error: Problem with `mutate()` column `arima_covid`.
## i `arima_covid = (function (object, ...) ...`.
## x object 'Covid' not found
##   Unable to compute required variables from provided `new_data`.
##   Does your model require extra variables to produce forecasts?