Extension Development: Best Practices

Context: This guide is for extension development: creating new packages that extend the tidymodels framework.

Key principle: Never use internal functions (those accessed with :::).

Guide to writing high-quality R code for tidymodels extension packages.


Best Practices

Code Style

Use base pipe

# Good
recipe(mpg ~ ., data = mtcars) |>
  step_center(all_numeric_predictors())

# Avoid
recipe(mpg ~ ., data = mtcars) %>%
  step_center(all_numeric_predictors())

The base pipe |> is faster, built-in, and the tidymodels standard.

Anonymous functions

# Single line: use backslash notation
map(x, \(i) i + 1)

# Multi-line: use function()
map(x, function(i) {
  result <- complex_computation(i)
  result + 1
})

For-loops over map()

# Preferred (better error messages)
for (col in columns) {
  new_data[[col]] <- transform(new_data[[col]])
}

# Avoid (harder to debug)
new_data[columns] <- map(columns, \(col) transform(new_data[[col]]))

Why prefer for-loops:

  • Better error messages (shows which iteration failed)

  • More familiar to most R users

  • Easier to debug with browser()

  • Consistent with tidymodels style

Minimal comments

# Good: code is self-documenting
means <- colMeans(data)
centered <- sweep(data, 2, means, "-")

# Avoid: over-commenting obvious code
# Calculate column means
means <- colMeans(data)
# Subtract means from each column
centered <- sweep(data, 2, means, "-")

Write clear code that doesn’t need comments. Add comments only for:

  • Complex algorithms

  • Non-obvious optimization tricks

  • Warnings about edge cases

Error Messages

Use cli functions

# Good: cli provides better formatting
if (invalid) {
  cli::cli_abort("{.arg param} must be positive, not {.val {param}}.")
}

if (risky) {
  cli::cli_warn("Column{?s} {.var {col_names}} returned Inf or NaN.")
}

# Avoid: base R error functions
stop("param must be positive")
warning("columns returned Inf or NaN")

cli formatting syntax

# Argument names
cli::cli_abort("{.arg your_param} must be numeric.")

# Code/function names
cli::cli_abort("Use {.code binary} estimator for two classes.")

# Values
cli::cli_abort("Expected 3 columns, got {.val {ncol(data)}}.")

# Variable names
cli::cli_warn("Column{?s} {.var {col_names}} {?has/have} missing values.")

# Pluralization
cli::cli_abort("Found {length(x)} error{?s}.")  # Handles 1 vs many

Error message guidelines

  • Be specific about what’s wrong

  • Tell users what they can do to fix it

  • Include actual values when helpful

  • Use proper English grammar

# Good
cli::cli_abort(
  "{.arg threshold} must be between 0 and 1, not {.val {threshold}}."
)

# Avoid
stop("Invalid threshold")

Documentation Standards

Be explicit

#' @param threshold Threshold value for classification. Must be a numeric
#'   value between 0 and 1. Default is 0.5.

Include:

  • Type (numeric, logical, character, factor)

  • Valid range or options

  • Default value

  • Effect on function behavior
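Putting those four points together, a complete @param entry might read as follows (the parameter name and behavior here are illustrative, not from a real package):

```r
#' @param trim A single logical: should extreme values be trimmed before
#'   the statistic is computed? Must be `TRUE` or `FALSE`; defaults to
#'   `FALSE`. When `TRUE`, the most extreme 5% of values are removed,
#'   making the estimate more robust to outliers.
```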

US English

  • Use American spelling: “normalize” not “normalise”

  • Use sentence case: “Calculate the mean” not “calculate the mean”

  • Be consistent throughout

Wrap roxygen at 80 characters

#' This is a long line that should be wrapped to ensure it doesn't exceed the
#' 80-character limit for better readability in various text editors.

Include practical examples

#' @examples
#' # Basic usage
#' metric_name(data, truth, estimate)
#'
#' # With grouped data
#' data |>
#'   dplyr::group_by(fold) |>
#'   metric_name(truth, estimate)

Show realistic use cases, not just minimal examples.

Don’t use dynamic roxygen code

# Bad: calling non-exported functions
#' @return Range: `r metric_range()`  # metric_range() not exported

# Good: static documentation
#' @return Range: 0 to 1

Performance

Vectorization over loops

Always prefer vectorized operations:

# Good: vectorized
errors <- truth - estimate
squared_errors <- errors^2
mean(squared_errors)

# Bad: loop
total <- 0
for (i in seq_along(truth)) {
  total <- total + (truth[i] - estimate[i])^2
}
total / length(truth)

Vectorized functions:

  • Arithmetic: +, -, *, /, ^

  • Comparisons: ==, !=, >, <, >=, <=

  • Logical: &, |, !

  • Math: abs(), sqrt(), log(), exp(), sin(), cos()

  • Aggregations: sum(), mean(), max(), min(), median()

Use matrix operations

Efficient per-class calculations:

# Good: matrix operations
confusion_matrix <- yardstick_table(truth, estimate)
tp <- diag(confusion_matrix)
fp <- colSums(confusion_matrix) - tp
fn <- rowSums(confusion_matrix) - tp

# Bad: looping over classes
tp <- numeric(n_classes)
for (i in seq_len(n_classes)) {
  tp[i] <- confusion_matrix[i, i]
}

Use colSums() and rowSums():

# Good
class_totals <- colSums(confusion_matrix)

# Avoid
class_totals <- apply(confusion_matrix, 2, sum)  # Slower

Avoid repeated computations

General principle: Calculate once, use many times.

# Good: compute once in prep() for recipe steps
prep.step_yourname <- function(x, training, ...) {
  means <- colMeans(training[col_names])  # Computed once, stored
}

# Good: validate once at entry point
metric_vec <- function(truth, estimate, ...) {
  check_numeric_metric(truth, estimate, case_weights)  # Validate once
  metric_impl(truth, estimate, ...)  # Trust the data
}

# Good: pre-compute before loops
levels_list <- levels(truth)
n_levels <- length(levels_list)
for (i in seq_len(n_levels)) {
  # Use pre-computed values
}

# Bad: recomputing unnecessarily
for (i in seq_len(length(levels(truth)))) {
  levels_list <- levels(truth)  # Redundant!
}

Handle case weights efficiently

Convert hardhat weights once:

# Good: convert once at the start
if (!is.null(case_weights)) {
  if (inherits(case_weights, c("hardhat_importance_weights",
                               "hardhat_frequency_weights"))) {
    case_weights <- as.double(case_weights)
  }
  # Now use case_weights multiple times
}

# Bad: converting repeatedly
if (!is.null(case_weights)) {
  result1 <- weighted.mean(x, as.double(case_weights))
  result2 <- weighted.mean(y, as.double(case_weights))  # Converting again!
}

Profile before optimizing

Focus optimization where it matters:

  1. Start with clear, correct code
  2. Profile with profvis::profvis() if performance is an issue
  3. Optimize the actual bottlenecks
  4. Don’t prematurely optimize

# Profile your code
profvis::profvis({
  for (i in 1:100) {
    your_function(data)
  }
})

When performance doesn’t matter

Don’t optimize unnecessarily:

  • Functions typically called only once or a few times per evaluation

  • Calculation is usually fast compared to model fitting

  • Readability and correctness are more important

Do optimize when:

  • Function called thousands of times (tuning, cross-validation)

  • Working with very large datasets (millions of observations)

  • Profiling shows the function is the bottleneck
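When in doubt, measure before rewriting. As one way to do that, a quick microbenchmark with the bench package (a suggested dependency for this sketch, not something tidymodels requires) compares two implementations of the same computation:

```r
x <- runif(1e5)

# Compare the loop and vectorized versions of the same calculation;
# bench::mark() also checks that both return the same result
bench::mark(
  loop = {
    total <- 0
    for (i in seq_along(x)) total <- total + x[i]^2
    total
  },
  vectorized = sum(x^2)
)
```

Only rewrite the loop if the difference actually matters at the scale your function is called.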

Code Validation

Validate early

step_yourname <- function(recipe, ..., your_param = 1) {
  # Validate parameters early
  if (!is.numeric(your_param) || your_param <= 0) {
    cli::cli_abort("{.arg your_param} must be a positive number.")
  }

  # ... rest of function
}

prep.step_yourname <- function(x, training, info = NULL, ...) {
  # Validate data early
  col_names <- recipes_eval_select(x$terms, training, info)
  check_type(training[, col_names], types = c("double", "integer"))

  # ... rest of function
}

Give actionable error messages

# Good: tells user what to do
cli::cli_abort(
  c(
    "Columns {.var {bad_cols}} must be numeric.",
    "i" = "Convert them with {.code as.numeric()}."
  )
)

# Avoid: vague errors
stop("Invalid columns")

Memory Management

Don’t store entire datasets

# Good: store only necessary parameters
prep.step_center <- function(x, training, ...) {
  means <- colMeans(training[col_names])  # Just means, not data
  # Return step with means stored
}

# Bad: storing entire training set
prep.step_center <- function(x, training, ...) {
  # Return step with training data stored (memory leak!)
}

Consider memory usage for large data

  • Store statistics/parameters, not raw data

  • Use sparse matrices when appropriate

  • Consider memory-mapped files for very large data
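As a sketch of the sparse-matrix point: a dummy-coded factor is mostly zeros, so the Matrix package (assumed available here) can store it far more compactly than a dense matrix:

```r
library(Matrix)

f <- factor(sample(letters[1:5], 1e4, replace = TRUE))
dense  <- model.matrix(~ f - 1)          # 10,000 x 5, one 1 per row
sparse <- Matrix(dense, sparse = TRUE)   # stores only the non-zero entries

object.size(dense) > object.size(sparse) # sparse uses much less memory
```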

Code Formatting

After writing code, format it:

# Format the current package from a terminal using the Air CLI
air format .

Or use RStudio: Code → Reformat Code (Cmd/Ctrl + Shift + A)

Version Control

Commit messages

# Good: descriptive commits
"Add support for multiclass metrics"
"Fix NA handling in case weights"
"Update documentation examples"

# Avoid: vague commits
"Fix bug"
"Update code"
"Changes"

Commit frequency

  • Commit after each logical unit of work

  • Commit working, tested code

  • Don’t commit broken code (except on branches)