Combining Metrics with metric_set()
metric_set() combines multiple yardstick metrics into a single function that calculates all of them in one call. This avoids repeating the same arguments for every metric and integrates with tidymodels workflows.
Overview
Use when:
- You want to calculate multiple metrics on the same data
- You’re using tidymodels packages (tune, recipes, workflows)
- You want to avoid repeating metric calls
- You need consistent metric evaluation across resamples
Key benefits:
- Efficiency: Columns, grouping, and options are specified once rather than once per metric
- Convenience: One function call instead of many
- Integration: Works with the tune package for model tuning
- Consistency: All metrics are evaluated on the same data with the same options
Basic Usage
library(yardstick)
# Create a metric set
my_metrics <- metric_set(rmse, rsq, mae)
# Use it like any other metric function
my_metrics(data, truth = actual, estimate = predicted)
# Returns tibble with all metrics:
# .metric .estimator .estimate
# rmse standard 0.123
# rsq standard 0.891
# mae standard 0.098
Compatibility Rules
Metrics in a set must be compatible. You can mix:
✓ Valid Combinations
1. All numeric metrics:
numeric_metrics <- metric_set(rmse, mae, rsq, huber_loss)
2. Mix of class and class probability metrics:
class_metrics <- metric_set(accuracy, precision, recall, roc_auc, pr_auc)
3. Mix of survival metrics (any combination):
surv_metrics <- metric_set(
  concordance_survival, # Static
  brier_survival, # Dynamic
  brier_survival_integrated # Integrated
)
✗ Invalid Combinations
Cannot mix metric types:
# ERROR: Cannot mix numeric and classification
metric_set(rmse, accuracy)
# ERROR: Cannot mix classification and survival
metric_set(accuracy, concordance_survival)
Function Signatures by Type
The returned function has different arguments depending on metric types:
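One way to see the difference is to inspect the argument names of the function that metric_set() returns. This is a minimal sketch, assuming only that yardstick is installed; the variable names are illustrative.

```r
library(yardstick)

numeric_set <- metric_set(rmse, mae)
class_set <- metric_set(accuracy, roc_auc)

# Argument names of each generated function, in order
numeric_args <- names(formals(numeric_set))
class_args <- names(formals(class_set))

numeric_args
class_args

# For the numeric set, `estimate` comes before `...`;
# for the class/probability set, it comes after `...`
which(numeric_args == "estimate") < which(numeric_args == "...")
which(class_args == "estimate") > which(class_args == "...")
```

The position of estimate relative to ... is what decides whether it can be passed positionally.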
Numeric Metrics
regression_metrics <- metric_set(rmse, mae, rsq)
# Signature:
regression_metrics(
  data,
  truth,
  estimate,
  na_rm = TRUE,
  case_weights = NULL,
  ...
)
# Usage:
regression_metrics(test_data, truth = y, estimate = y_pred)
Class/Probability Metrics
class_metrics <- metric_set(accuracy, roc_auc, pr_auc)
# Signature:
class_metrics(
  data,
  truth,
  ..., # For probability columns
  estimate, # Must be named!
  estimator = NULL,
  na_rm = TRUE,
  event_level = yardstick_event_level(),
  case_weights = NULL
)
# Usage - note estimate is named:
class_metrics(
  test_data,
  truth = obs,
  VF:L, # Probability columns
  estimate = pred # Named argument!
)
Survival Metrics
surv_metrics <- metric_set(concordance_survival, brier_survival)
# Signature:
surv_metrics(
  data,
  truth,
  ..., # For survival predictions
  estimate, # Named for time predictions
  na_rm = TRUE,
  case_weights = NULL
)
Important: Named estimate Argument
⚠️ For class/probability and survival metric sets, you MUST name the estimate argument.
class_metrics <- metric_set(accuracy, roc_auc)
# ✓ Correct
class_metrics(data, truth = obs, estimate = pred)
# ✗ Wrong - pred is captured by ... and estimate is left unset
class_metrics(data, truth = obs, pred)
# Error (the exact message depends on the yardstick version)
Why? The estimate argument comes after ... in the signature, so unnamed arguments are captured by ... instead of being matched to estimate.
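The behaviour can be reproduced with the two_class_example data that ships with yardstick. This is a sketch: tryCatch() is used only to capture the failure for inspection, and no particular error message is assumed.

```r
library(yardstick)

class_metrics <- metric_set(accuracy, roc_auc)

# Named estimate: Class1 is picked up by `...` as the probability column
ok <- class_metrics(
  two_class_example,
  truth = truth,
  Class1,
  estimate = predicted
)
ok

# Unnamed estimate: `predicted` is swallowed by `...`, so the call fails
bad <- tryCatch(
  class_metrics(two_class_example, truth = truth, Class1, predicted),
  error = function(e) e
)
inherits(bad, "error")
```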
Working with Groups
Metric sets respect dplyr::group_by():
metrics <- metric_set(accuracy, kap, roc_auc)
# Compute metrics for each resample
hpc_cv |>
  group_by(Resample) |>
  metrics(truth = obs, VF:L, estimate = pred)
# Returns one row per metric per group, with the grouping column first:
# Resample .metric .estimator .estimate
# Fold01 accuracy multiclass 0.726
# Fold02 accuracy multiclass 0.712
# ... (remaining folds, then the kap rows, then the roc_auc rows)
Using metric_tweak() with metric_set()
Use metric_tweak() to set custom defaults for metrics before adding them to a set:
# Create tweaked version with custom parameter
f2_meas <- metric_tweak("f2_meas", f_meas, beta = 2)
mase12 <- metric_tweak("mase12", mase, m = 12)
# Add to metric set
my_metrics <- metric_set(
  precision,
  recall,
  f_meas, # Default beta = 1
  f2_meas # Custom beta = 2
)
my_metrics(data, truth = obs, estimate = pred)
# Both f_meas and f2_meas calculated with different beta values
Why this matters: Once metrics are in a set, you can’t change their parameters. Tweak them first.
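A runnable version of the same idea, using the two_class_example data that ships with yardstick. It is a sketch in which the object names are illustrative; the final line checks that the tweaked metric matches a direct f_meas() call with beta = 2.

```r
library(yardstick)

# F-measure that weights recall more heavily than precision
f2_meas <- metric_tweak("f2_meas", f_meas, beta = 2)

f_metrics <- metric_set(f_meas, f2_meas)

res <- f_metrics(two_class_example, truth = truth, estimate = predicted)
res

# The tweaked metric should agree with calling f_meas() directly with beta = 2
direct <- f_meas(two_class_example, truth = truth, estimate = predicted, beta = 2)
all.equal(res$.estimate[res$.metric == "f2_meas"], direct$.estimate)
```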
Complete Examples
Regression Workflow
library(yardstick)
library(dplyr)
# Define metric set
regression_metrics <- metric_set(
  rmse,
  mae,
  rsq,
  huber_loss
)
# Use on test data
results <- regression_metrics(
  solubility_test,
  truth = solubility,
  estimate = prediction
)
results
# .metric .estimator .estimate
# rmse standard 0.722
# mae standard 0.545
# rsq standard 0.879
# huber_loss standard 0.234
Classification with Probabilities
# Mix class and probability metrics
class_metrics <- metric_set(
  accuracy,
  precision,
  recall,
  f_meas,
  roc_auc,
  pr_auc
)
# Use with class probabilities
results <- class_metrics(
  two_class_example,
  truth = truth,
  Class1, # Probability column
  estimate = predicted
)
results
# .metric .estimator .estimate
# accuracy binary 0.838
# precision binary 0.819
# recall binary 0.880
# f_meas binary 0.849
# roc_auc binary 0.939
# pr_auc binary 0.946
Multiclass Classification
multi_metrics <- metric_set(
  accuracy,
  bal_accuracy,
  kap,
  roc_auc,
  precision,
  recall
)
# Specify macro averaging (used by every metric in the set that takes an
# estimator, not only precision/recall)
hpc_cv |>
  multi_metrics(
    truth = obs,
    VF:L, # Probability columns
    estimate = pred,
    estimator = "macro"
  )
Cross-Validation
library(rsample)
library(dplyr)
library(purrr)
library(tidyr)
# Define metrics once
cv_metrics <- metric_set(rmse, rsq, mae)
# Use across all folds
cv_results <- vfold_cv(training_data, v = 10) |>
  mutate(
    metrics = map(splits, function(split) {
      # Fit model and predict; fit_model() stands in for your own fitting code,
      # and predict() is assumed to return a numeric vector
      model <- fit_model(analysis(split))
      holdout <- assessment(split)
      holdout$.pred <- predict(model, holdout)
      # Calculate all metrics at once; estimate must be a column of the data
      cv_metrics(holdout, truth = outcome, estimate = .pred)
    })
  )
# Aggregate across folds (standard error = sd / sqrt(number of folds))
cv_results |>
  unnest(metrics) |>
  group_by(.metric) |>
  summarize(mean = mean(.estimate), se = sd(.estimate) / sqrt(n()))
With Groupwise Metrics
# Create groupwise metric
accuracy_diff <- new_groupwise_metric(
  fn = accuracy,
  name = "accuracy_diff",
  aggregate = function(x) diff(range(x$.estimate))
)
# Combine with regular metrics
fairness_metrics <- metric_set(
  accuracy,
  precision,
  recall,
  accuracy_diff(protected_attr) # Add groupwise metric
)
fairness_metrics(data, truth = obs, estimate = pred)
Creating Custom Metrics for metric_set()
To use your custom metric in a set, wrap it with the appropriate new_*_metric():
# Define your metric function
my_custom_metric <- function(data, truth, estimate, na_rm = TRUE, ...) {
  # Implementation
  # ...
  tibble(
    .metric = "my_custom",
    .estimator = "standard",
    .estimate = result
  )
}
# Wrap with new_*_metric() - required for metric_set()
my_custom_metric <- new_numeric_metric(
  my_custom_metric,
  direction = "maximize"
)
# Now it works in metric sets
my_metrics <- metric_set(rmse, mae, my_custom_metric)
Key requirements:
1. Must be wrapped with new_*_metric()
2. Must follow standard yardstick signature patterns
3. Must return standard yardstick output format
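For a complete version of the outline above, the sketch below defines a mean squared error metric and places it in a set. It is modelled on the vector-function plus data-frame-method pattern from yardstick's custom-metric documentation and assumes yardstick 1.2.0 or later, where check_numeric_metric(), yardstick_remove_missing(), yardstick_any_missing(), and numeric_metric_summarizer() are exported. The names mse, mse_impl, mse_vec, and the toy data frame are illustrative.

```r
library(yardstick)

# Core computation on plain vectors (case weights are ignored in this sketch)
mse_impl <- function(truth, estimate, case_weights = NULL) {
  mean((truth - estimate)^2)
}

# Vector interface: validates input and handles missing values
mse_vec <- function(truth, estimate, na_rm = TRUE, case_weights = NULL, ...) {
  check_numeric_metric(truth, estimate, case_weights)
  if (na_rm) {
    cleaned <- yardstick_remove_missing(truth, estimate, case_weights)
    truth <- cleaned$truth
    estimate <- cleaned$estimate
    case_weights <- cleaned$case_weights
  } else if (yardstick_any_missing(truth, estimate, case_weights)) {
    return(NA_real_)
  }
  mse_impl(truth, estimate, case_weights = case_weights)
}

# Generic, wrapped so that metric_set() recognises it as a numeric metric
mse <- function(data, ...) UseMethod("mse")
mse <- new_numeric_metric(mse, direction = "minimize")

# Data frame method: delegates column selection and grouping to yardstick
mse.data.frame <- function(data, truth, estimate, na_rm = TRUE,
                           case_weights = NULL, ...) {
  numeric_metric_summarizer(
    name = "mse",
    fn = mse_vec,
    data = data,
    truth = !!rlang::enquo(truth),
    estimate = !!rlang::enquo(estimate),
    na_rm = na_rm,
    case_weights = !!rlang::enquo(case_weights)
  )
}

# Toy data for illustration
df <- data.frame(y = c(1, 2, 3, 4), y_hat = c(1.5, 2, 2.5, 5))

my_metrics <- metric_set(rmse, mse)
res <- my_metrics(df, truth = y, estimate = y_hat)
res
```

Because mse is the square of rmse, comparing the two rows of res is a quick sanity check on the implementation.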
Using with tune Package
Metric sets integrate with tune for model tuning:
library(tune)
library(workflows)
# Define metrics for tuning
tune_metrics <- metric_set(
  rmse,
  rsq,
  mae
)
# Use in tune_grid()
tune_results <- tune_grid(
  workflow,
  resamples = cv_folds,
  grid = param_grid,
  metrics = tune_metrics # Pass metric set
)
# Every metric is computed for each candidate; rank candidates by one of them
show_best(tune_results, metric = "rmse")
Performance Benefits
Metric sets replace several near-identical calls with one. Each metric in the set is still evaluated on the data, so expect the gain mainly in code size and consistency rather than in run time:
# Repetitive - the same arguments are written three times
accuracy(data, truth, estimate)
precision(data, truth, estimate)
recall(data, truth, estimate)
# Concise - arguments are written once (estimate must be named)
metrics <- metric_set(accuracy, precision, recall)
metrics(data, truth = truth, estimate = estimate)
Specified once for all metrics:
- Input columns (truth, estimate, probability columns)
- Group-by operations
- Missing value handling
- Case weights
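Whether a set is noticeably faster than separate calls is easy to check empirically. The sketch below uses the two_class_example data shipped with yardstick. It first confirms that both routes return the same estimates and then times them. Timings vary by machine and yardstick version, so no expected numbers are shown, and the repetition count of 50 is arbitrary.

```r
library(yardstick)
library(dplyr)

metrics <- metric_set(accuracy, precision, recall)

# One call to the set
from_set <- metrics(two_class_example, truth = truth, estimate = predicted)

# The same three metrics computed one at a time
from_calls <- bind_rows(
  accuracy(two_class_example, truth = truth, estimate = predicted),
  precision(two_class_example, truth = truth, estimate = predicted),
  recall(two_class_example, truth = truth, estimate = predicted)
)

# Both routes should give the same estimates
all.equal(from_set$.estimate, from_calls$.estimate)

# Rough timing comparison
system.time(for (i in 1:50) {
  metrics(two_class_example, truth = truth, estimate = predicted)
})
system.time(for (i in 1:50) {
  accuracy(two_class_example, truth = truth, estimate = predicted)
  precision(two_class_example, truth = truth, estimate = predicted)
  recall(two_class_example, truth = truth, estimate = predicted)
})
```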
Advanced Patterns
Conditional Metrics
# Select metrics based on data
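metrics <- if (is_binary) {
  metric_set(accuracy, sensitivity, specificity)
} else {
  metric_set(accuracy, bal_accuracy, kap)
}
# Both branches hold only class metrics, so the same call works for either.
# A probability metric such as roc_auc would also need probability columns via ...
metrics(data, truth = obs, estimate = pred)
Parameterized Sets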
create_metric_set <- function(include_auc = TRUE) {
  base_metrics <- c(accuracy, precision, recall)
  if (include_auc) {
    base_metrics <- c(base_metrics, list(roc_auc))
  }
  do.call(metric_set, base_metrics)
}
# Use
metrics <- create_metric_set(include_auc = TRUE)
Multiple Tweaked Versions
# Different F-measures
f0.5 <- metric_tweak("f0.5_meas", f_meas, beta = 0.5)
f1 <- f_meas
f2 <- metric_tweak("f2_meas", f_meas, beta = 2)
# All in one set
f_metrics <- metric_set(f0.5, f1, f2)
Troubleshooting
Error: Cannot mix metric types
# Error
metric_set(rmse, accuracy)
Solution: Keep metrics of compatible types together.
Error: estimate not found
# Wrong
class_metrics(data, truth, pred)
Solution: Name the estimate argument:
class_metrics(data, truth, estimate = pred)
Error: Metric doesn’t work in set
my_metric <- function(data, truth, estimate) { ... }
metric_set(rmse, my_metric) # Error
Solution: Wrap custom metrics:
my_metric <- new_numeric_metric(my_metric, direction = "minimize")
metric_set(rmse, my_metric) # Works
Best Practices
- Define once, use everywhere: Create metric sets at the top of your analysis
- Name your sets: Use descriptive names like classification_metrics, not metrics
- Use with groups: Leverage group-aware behavior for cross-validation
- Tweak before combining: Set custom parameters with metric_tweak() first
- Keep compatible types: Don’t mix numeric, class, and survival metrics
- Named estimate: Always name the estimate argument for class/survival metrics
- Integration: Use with the tune package for consistent tuning metrics
Common Metric Sets
# Standard regression
regression_std <- metric_set(rmse, mae, rsq)
# Regression with alternatives
regression_robust <- metric_set(mae, huber_loss, mape)
# Binary classification
binary_clf <- metric_set(
  accuracy, sensitivity, specificity,
  roc_auc, pr_auc
)
# Multiclass classification
multiclass_clf <- metric_set(
  accuracy, bal_accuracy, kap,
  roc_auc # Uses hand_till method for multiclass
)
# Survival analysis
survival_std <- metric_set(
  concordance_survival,
  brier_survival,
  brier_survival_integrated
)
# Fairness analysis
fairness_set <- metric_set(
  accuracy,
  demographic_parity(group),
  equal_opportunity(group)
)
See Also
- Metric System - Understanding basic metric architecture
- Groupwise Metrics - Creating disparity metrics
- metric_tweak() - Customizing metric parameters
- ?metric_set - Full documentation