Model Customization and Control

Users may customize any model training routine by editing a list of function arguments. For machine learning techniques, this means editing caret arguments: the tuning grid, the training control, the method, and the accuracy metric. For univariate time series forecasting, this means passing arguments to forecast package model functions. For imputing missing data, this means passing arguments to imputeTS package functions.

Univariate models from forecast

Models sourced from the forecast package all take slightly different arguments, so the forecast documentation should be a user's first stop when deciding what arguments to pass through to a function. By default, OOS uses the default forecast implementation of a given model. Here we show a brief example of how to dictate that the Arima function takes the form of an ARMA(1,1).

A brief example using an Arima model to forecast univariate time series:

# 1. create the central list of univariate model training arguments, forecast_univariate.control_panel
forecast_univariate.control_panel = instantiate.forecast_univariate.control_panel() 
# 2. select an item to edit, for example the Arima order to create an ARMA(1,1)   
# view default model arguments (there are none)  
forecast_univariate.control_panel$arguments[['Arima']] 
## NULL
# add our own function arguments  
forecast_univariate.control_panel$arguments[['Arima']]$order = c(1,0,1) 
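
How OOS consumes these arguments internally is not shown here. As a minimal sketch, assuming the stored list is simply forwarded to the model function, the pattern can be reproduced in base R with do.call. The control_panel list below is a hypothetical stand-in for the object returned by instantiate.forecast_univariate.control_panel(), and stats::arima stands in for the forecast package's Arima so that the snippet runs without extra packages.

```r
# Hypothetical stand-in for the control panel: one argument list per model name
control_panel <- list(arguments = list())
control_panel$arguments[["Arima"]]$order <- c(1, 0, 1)

# Simulated ARMA(1,1) series, used only for illustration
set.seed(1)
y <- arima.sim(model = list(ar = 0.5, ma = 0.3), n = 200)

# Forward the stored arguments to the fitting function (assumed pattern, not OOS source)
fit <- do.call(stats::arima, c(list(x = y), control_panel$arguments[["Arima"]]))

# The first two entries of fit$arma hold the AR and MA orders; the sixth holds d
fit$arma[c(1, 2, 6)]
```

If no arguments are stored for a model, the same call reduces to the function's defaults, which matches the default behavior described above.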

Data imputation with imputeTS

Methods used to impute missing data are sourced from the imputeTS package, and a user interfaces with these routines in the same manner as they would with functions from the forecast package. By default, OOS uses the imputeTS default implementation of a given method.
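
The same default-versus-override behavior can be sketched for imputation. The example below is illustrative only: impute_panel and impute_interpolation are hypothetical names, and the imputer is a small base-R function built on stats::approx rather than an imputeTS routine.

```r
# Hypothetical stand-in for an imputation control panel
impute_panel <- list(arguments = list())
impute_panel$arguments[["interpolation"]]$method <- "constant"

# Illustrative imputer: fills NA values by interpolating over observed positions;
# extra arguments are passed straight through to stats::approx
impute_interpolation <- function(x, ...) {
  observed <- !is.na(x)
  x[!observed] <- stats::approx(which(observed), x[observed],
                                xout = which(!observed), ...)$y
  x
}

y <- c(1, NA, 3, NA, NA, 6)

# Default implementation: linear interpolation
linear <- do.call(impute_interpolation, list(x = y))

# User-supplied arguments: carry the last observed value forward
custom <- do.call(impute_interpolation,
                  c(list(x = y), impute_panel$arguments[["interpolation"]]))
```

With the defaults the gaps are filled linearly, while the stored method = "constant" argument switches the same function to carrying the previous observation forward.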

Multivariate models from caret

Multivariate models are trained via the caret package, which means a user may access and edit:

  1. caret.engine: a string declaring the name of the caret recognized model to use
  2. tuning.grids: a data frame of candidate parameter values for a grid search training routine
  3. control: a list of variables controlling the parameter estimation (training) routine
  4. accuracy: a string denoting what accuracy metric to use in model (estimation) training

Note that all models access the same control and accuracy.
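
The split between per-model and shared settings can be sketched with a plain list. This is a hypothetical stand-in, not the object OOS returns: the real control entry is a full caret training-control list, and the "RMSE" default and "MAE" alternative are assumptions based on caret's usual regression metrics.

```r
# Hypothetical stand-in mirroring the four elements listed above
panel <- list(
  caret.engine = list(RF = "rf"),                     # per model: caret method name
  tuning.grids = list(RF = expand.grid(mtry = 1:4)),  # per model: grid search candidates
  control      = list(method = "cv", number = 5),     # shared: resampling settings
  accuracy     = "RMSE"                               # shared: assumed default metric
)

# Editing a shared entry changes the setting for every model
panel$control$number <- 10
panel$accuracy <- "MAE"

# Per-model entries are addressed by model key
panel$tuning.grids[["RF"]]
```

Because control and accuracy sit at the top level rather than under a model key, there is no way in this layout to give one model a different resampling scheme from another.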

A brief example using the Random Forest to combine forecasts:

# 1. create the central list of ML training arguments 
forecast_combinations.control_panel = instantiate.forecast_combinations.control_panel()  

# 2. select an item to edit, for example the random forest tuning grid   
# view RF model name 
forecast_combinations.control_panel$caret.engine[['RF']] 
## [1] "rf"
# view default tuning grid  
forecast_combinations.control_panel$tuning.grids[['RF']]  
##   mtry
## 1    1
## 2    2
## 3    3
## 4    4
# edit tuning grid   
forecast_combinations.control_panel$tuning.grids[['RF']] = expand.grid(mtry = c(1:6)) 
# and view result
forecast_combinations.control_panel$tuning.grids[['RF']]
##   mtry
## 1    1
## 2    2
## 3    3
## 4    4
## 5    5
## 6    6
# view default training control
forecast_combinations.control_panel$control
## $method
## [1] "cv"
## 
## $number
## [1] 5
## 
## $repeats
## [1] NA
## 
## $search
## [1] "grid"
## 
## $p
## [1] 0.75
## 
## $initialWindow
## NULL
## 
## $horizon
## [1] 1
## 
## $fixedWindow
## [1] TRUE
## 
## $skip
## [1] 0
## 
## $verboseIter
## [1] FALSE
## 
## $returnData
## [1] TRUE
## 
## $returnResamp
## [1] "final"
## 
## $savePredictions
## [1] FALSE
## 
## $classProbs
## [1] FALSE
## 
## $summaryFunction
## function (data, lev = NULL, model = NULL) 
## {
##     if (is.character(data$obs)) 
##         data$obs <- factor(data$obs, levels = lev)
##     postResample(data[, "pred"], data[, "obs"])
## }
## <bytecode: 0x00000000222779c0>
## <environment: namespace:caret>
## 
## $selectionFunction
## [1] "best"
## 
## $preProcOptions
## $preProcOptions$thresh
## [1] 0.95
## 
## $preProcOptions$ICAcomp
## [1] 3
## 
## $preProcOptions$k
## [1] 5
## 
## $preProcOptions$freqCut
## [1] 19
## 
## $preProcOptions$uniqueCut
## [1] 10
## 
## $preProcOptions$cutoff
## [1] 0.9
## 
## 
## $sampling
## NULL
## 
## $index
## NULL
## 
## $indexOut
## NULL
## 
## $indexFinal
## NULL
## 
## $timingSamps
## [1] 0
## 
## $predictionBounds
## [1] FALSE FALSE
## 
## $seeds
## [1] NA
## 
## $adaptive
## $adaptive$min
## [1] 5
## 
## $adaptive$alpha
## [1] 0.05
## 
## $adaptive$method
## [1] "gls"
## 
## $adaptive$complete
## [1] TRUE
## 
## 
## $trim
## [1] FALSE
## 
## $allowParallel
## [1] TRUE
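
The random forest grid has a single tuning parameter, mtry. For caret methods with several tuning parameters, expand.grid builds every combination of the candidate values. The sketch below uses the four parameters caret tunes for its "gbm" method; whether the control panel contains a 'GBM' entry is an assumption, so the final assignment is left as a comment.

```r
# Candidate values for the four parameters caret tunes for its "gbm" method
gbm_grid <- expand.grid(
  n.trees           = c(100, 200),
  interaction.depth = c(1, 2),
  shrinkage         = 0.1,
  n.minobsinnode    = 10
)

# One row per combination: 2 * 2 * 1 * 1
nrow(gbm_grid)

# Hypothetical assignment, assuming the control panel holds a 'GBM' entry:
# forecast_combinations.control_panel$tuning.grids[['GBM']] = gbm_grid
```

Grid size grows multiplicatively with each parameter, and every row is evaluated under the shared resampling scheme (five-fold cross-validation by default, per the control output above), so small grids keep training time manageable.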