Tools for working with H2O AutoML results

Functions that return a tibble describing model performance.

rank_results() ranks the average cross-validation performance of candidate models on each metric.
collect_metrics() computes average statistics of performance metrics for each model (summarized), or returns the raw values in each resample (unsummarized).
tidy() computes average performance for each model.
member_weights() computes member importance for stacked ensemble models, i.e., the relative importance of base models in the meta-learner. This is typically the coefficient magnitude in the second-level GLM.

extract_fit_engine() extracts a single candidate model from auto_ml() results. When id is NULL, it returns the leader model.

refit() re-fits an existing AutoML model to add more candidates. The model to be re-fitted must have been trained with the engine argument save_data = TRUE, and also with keep_cross_validation_predictions = TRUE if stacked ensembles are needed for later models.
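
The refit() requirements above can be sketched as follows. This is a minimal illustration, assuming an H2O server is already running and that agua and parsnip are installed. The 5-second budget and the mtcars data are placeholders, and how many candidates are added depends on the run.

```r
library(agua)
library(parsnip)

if (h2o_running()) {
  # save_data = TRUE is required for a later refit();
  # keep_cross_validation_predictions = TRUE is only needed if later
  # runs should be able to build stacked ensembles.
  auto_fit <- auto_ml() %>%
    set_engine(
      "h2o",
      max_runtime_secs = 5,
      save_data = TRUE,
      keep_cross_validation_predictions = TRUE
    ) %>%
    set_mode("regression") %>%
    fit(mpg ~ ., data = mtcars)

  # Continue the AutoML search on the existing fit to add candidates
  auto_fit <- refit(auto_fit)

  # Inspect the top candidates after re-fitting
  rank_results(auto_fit, n = 5)
}
```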

Usage

# S3 method for workflow
rank_results(x, ...)

# S3 method for `_H2OAutoML`
rank_results(x, ...)

# S3 method for H2OAutoML
rank_results(x, n = NULL, id = NULL, ...)

# S3 method for workflow
collect_metrics(x, ...)

# S3 method for `_H2OAutoML`
collect_metrics(x, ...)

# S3 method for H2OAutoML
collect_metrics(x, summarize = TRUE, n = NULL, id = NULL, ...)

# S3 method for `_H2OAutoML`
tidy(x, n = NULL, id = NULL, keep_model = TRUE, ...)

get_leaderboard(x, n = NULL, id = NULL)

member_weights(x, ...)

# S3 method for `_H2OAutoML`
extract_fit_parsnip(x, id = NULL, ...)

# S3 method for `_H2OAutoML`
extract_fit_engine(x, id = NULL, ...)

# S3 method for workflow
refit(object, ...)

# S3 method for `_H2OAutoML`
refit(object, verbosity = NULL, ...)

Arguments

...: Not used.
n: An integer for the number of top models to extract from AutoML results. Defaults to all.
id: A character vector of model ids to retrieve.
summarize: A logical; should metrics be summarized over resamples (TRUE) or returned as the values for each individual resample (FALSE)?
keep_model: A logical value indicating whether the actual model object should be retrieved from the server. Defaults to TRUE.
object, x: A fitted auto_ml() model or workflow.
verbosity: Verbosity of the backend messages printed during training. Must be one of NULL (live log disabled), "debug", "info", "warn", "error". Defaults to NULL.

Value

A tibble::tibble().

Details

H2O associates a unique id with each model in AutoML. This id can be used for model extraction and prediction: for example, extract_fit_engine(x, id = id) returns the model and predict(x, id = id) predicts with that model. extract_fit_parsnip(x, id = id) wraps the h2o model in a parsnip model object, which is discouraged.
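
A short sketch of id-based extraction and prediction, under some assumptions: an H2O server is running, the tibble returned by rank_results() has an id column, and predict() takes new_data in the usual parsnip way alongside id. The name top_id is only for illustration.

```r
library(agua)
library(parsnip)

if (h2o_running()) {
  auto_fit <- auto_ml() %>%
    set_engine("h2o", max_runtime_secs = 5) %>%
    set_mode("regression") %>%
    fit(mpg ~ ., data = mtcars)

  # Take the id from the first row of the ranking
  # (assumes an `id` column in the returned tibble)
  top_id <- rank_results(auto_fit, n = 1)$id[1]

  # The underlying h2o model object for that id
  h2o_model <- extract_fit_engine(auto_fit, id = top_id)

  # Predictions from that specific candidate
  predict(auto_fit, new_data = mtcars, id = top_id)
}
```

Leaving id as NULL in extract_fit_engine() falls back to the leader model, as described above.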

The algorithm column corresponds to the model family H2O uses for a particular model, including xgboost ("XGBOOST"), gradient boosting ("GBM"), random forest and variants ("DRF", "XRT"), generalized linear model ("GLM"), and neural network ("deeplearning"). See the details section in h2o::h2o.automl() for more information.
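
The algorithm column can be used to narrow results to one model family. The sketch below uses base R subsetting and assumes the returned tibble also has an id column. Which families appear depends on the run, so a filter may return zero rows.

```r
library(agua)
library(parsnip)

if (h2o_running()) {
  auto_fit <- auto_ml() %>%
    set_engine("h2o", max_runtime_secs = 5) %>%
    set_mode("regression") %>%
    fit(mpg ~ ., data = mtcars)

  ranked <- rank_results(auto_fit)

  # Keep only gradient boosting candidates
  ranked[ranked$algorithm == "GBM", ]

  # Count candidates per model family; rows repeat once per metric,
  # so deduplicate on id first (assumes an `id` column)
  table(unique(ranked[, c("id", "algorithm")])$algorithm)
}
```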

Examples

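if (h2o_running()) {
  # Fit a short AutoML run (5 seconds) on mtcars
  auto_fit <- auto_ml() %>%
    set_engine("h2o", max_runtime_secs = 5) %>%
    set_mode("regression") %>%
    fit(mpg ~ ., data = mtcars)

  # Rank the top 5 candidates on each metric
  rank_results(auto_fit, n = 5)
  # Raw metric values for each resample
  collect_metrics(auto_fit, summarize = FALSE)
  # Average performance for each model
  tidy(auto_fit)
  # Importance of base models in stacked ensembles
  member_weights(auto_fit)
}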