Evaluate Model Fit with Descriptive and Likelihood-Based Metrics
Source: R/choice_mod_eval.R
Computes a variety of descriptive and likelihood-based statistics for evaluating the fit of a model to observed data. This includes common descriptive error metrics, a concordance coefficient, and information criteria (AIC, BIC) based on a beta-distributed error model.
Arguments
- observed: A numeric vector of observed values (e.g., choice proportions). Values must lie within the closed interval [0, 1].
- predicted: A numeric vector of predicted values. Must be the same length as observed, and also constrained to the closed interval [0, 1].
- k: An integer specifying the number of free parameters used to generate the predicted values. The function incorporates an error model that adds one to this value (see Details).
- epsilon: A small continuity correction used to constrain values strictly within the open interval (0, 1). Defaults to 0.001. This is necessary for the beta error model, whose density is undefined at 0 and 1.
- ...: Additional arguments passed to internal functions.
Value
An object of class "choice_mod_eval" containing:
desc_stats
A data frame of descriptive fit statistics: sample size, R-squared, mean bias, RMSE, MAE, median absolute error, and the concordance correlation coefficient.
info_criteria
A data frame listing the number of parameters, the estimated \(\phi\) parameter (the precision of the beta distribution), the log-likelihood, AIC, and BIC.
residuals
A numeric vector of residuals (observed - predicted), not printed by default.
Details
The residual-based coefficient of determination (\(R^2\)) is calculated as: $$R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}$$ where \(y_i\) are the observed values, \(\hat{y}_i\) are the predicted values, and \(\bar{y}\) is the mean of the observed values.
While this form of \(R^2\) is widely used as a descriptive measure of fit, it should be interpreted with caution in the context of nonlinear models. Unlike in linear regression, \(R^2\) does not have a clear interpretation in terms of explained variance, nor is it guaranteed to fall between 0 and 1. In particular, it can take on negative values when the model fits worse than the mean. As such, it is best used here as a rough, supplementary indicator of model performance rather than a definitive measure of fit: effectively, a "pseudo-\(R^2\)."
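As a concrete check, the pseudo-\(R^2\) above can be computed directly in base R. The values here mirror those in the Examples section; note that the formula returns 0.95 for that data, consistent with the printed output.

```r
# Sketch: the residual-based R^2 described above, computed directly.
# Negative values are possible when predictions fit worse than the mean.
obs  <- c(0.2, 0.4, 0.6, 0.8)
pred <- c(0.25, 0.35, 0.65, 0.75)

ss_res <- sum((obs - pred)^2)          # residual sum of squares
ss_tot <- sum((obs - mean(obs))^2)     # total sum of squares about the mean
r_squared <- 1 - ss_res / ss_tot
r_squared                              # 0.95 for this data
```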
For the likelihood-based metrics in $info_criteria, a beta-distributed error model for choice proportions is incorporated. This adds one fitted parameter (\(\phi\)), which is shared across all observations. Consequently, AIC and BIC are computed with k + 1 free parameters.
The error model assumes that each observed value \(y_i\) is drawn independently from a Beta distribution with a mean equal to the choice model's prediction \(\mu_i\) and a precision parameter \(\phi\): $$y_i \sim \text{Beta}(\alpha_i, \beta_i)$$ with: $$\alpha_i = \mu_i \cdot \phi,\quad \beta_i = (1 - \mu_i) \cdot \phi$$ where \(\mu_i\) is the predicted value and \(\phi\) is the precision parameter.
The total log-likelihood is computed as: $$\log L = \sum_i \log \left[ \text{Beta}(y_i | \alpha_i, \beta_i) \right]$$
The AIC and BIC are then computed as:
$$\text{AIC} = 2k - 2\log L,\quad \text{BIC} = \log(n) \cdot k - 2\log L$$
where \(k\) is the number of estimated parameters and \(n\) is the number of observations. Note that, because of the beta error model, the function adds 1 to the supplied value of k (i.e., it uses k + 1).
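The likelihood and information criteria above can be sketched in base R. In the actual function \(\phi\) is estimated from the data (which is why one extra parameter is counted); here it is treated as a fixed, assumed value purely for illustration.

```r
# Sketch of the beta error-model likelihood and information criteria.
# `phi` is assumed fixed here; choice_mod_eval() estimates it, which is
# why k + 1 parameters are counted.
obs  <- c(0.2, 0.4, 0.6, 0.8)
pred <- c(0.25, 0.35, 0.65, 0.75)
k    <- 1                  # free parameters used to generate `pred`
phi  <- 80                 # assumed precision (not the fitted value)

alpha <- pred * phi        # Beta shape parameters from the
beta  <- (1 - pred) * phi  #   mean/precision parameterisation above
logL  <- sum(dbeta(obs, alpha, beta, log = TRUE))

k_total <- k + 1           # +1 for the estimated phi
n   <- length(obs)
aic <- 2 * k_total - 2 * logL
bic <- log(n) * k_total - 2 * logL
```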
Observed and predicted values are assumed to lie strictly within the unit interval (0, 1). Values near 0 or 1 are adjusted using the epsilon parameter to avoid undefined behaviour under the beta distribution. The choice of epsilon can strongly influence the log-likelihood, particularly when values approach 0 or 1. Users are encouraged to check the sensitivity of the output to this value if model selection is a focus.
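A sensitivity check of the kind suggested above can be sketched as follows. The `loglik_at()` helper and the fixed `phi` value are hypothetical, not part of the package; the point is only to show how the continuity correction interacts with boundary values.

```r
# Sketch of an epsilon sensitivity check. Boundary values are squeezed
# into (0, 1) before the beta log-likelihood is evaluated.
obs  <- c(0, 0.4, 0.6, 1)         # includes boundary values
pred <- c(0.05, 0.35, 0.65, 0.95)
phi  <- 80                        # assumed precision, fixed for illustration

loglik_at <- function(eps) {
  o <- pmin(pmax(obs, eps), 1 - eps)    # continuity correction
  p <- pmin(pmax(pred, eps), 1 - eps)
  sum(dbeta(o, p * phi, (1 - p) * phi, log = TRUE))
}

sapply(c(0.01, 0.001, 1e-4), loglik_at)  # compare across epsilon choices
```

With boundary observations, smaller epsilon values push the corrected data further into the tails of the beta density, so the log-likelihood can change substantially with this setting.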
When the sample size is less than 30, a message is printed to remind users that information criteria may be unstable or overly sensitive in small-sample settings.
See also
Comprehensive guidance on interpreting the outputs and applying this
function is provided in the following companion vignettes:
vignette("eval_descriptive", package = "SiGN")
vignette("eval_info-theoretic", package = "SiGN")
Examples
obs <- c(0.2, 0.4, 0.6, 0.8)
pred <- c(0.25, 0.35, 0.65, 0.75)
result <- choice_mod_eval(obs, pred)
#> ℹ Sample size is small (n < 30); AIC and BIC may be unstable or less reliable for model comparison.
#> Warning: ⚠ The ratio of observations to free parameters is low (n = 4, k = 1). Information criteria such as AIC and BIC may be unreliable when n/k ≤ 10. Consider simplifying your model or increasing sample size.
result
#> $desc_stats
#> n r_squared mean_bias rmse mae median_ae ccc
#> 1 4 0.95 -1.387779e-17 0.05 0.05 0.05 0.972973
#>
#> $info_criteria
#> n_parameters phi logLik AIC BIC
#> 1 1 79.54191 6.409243 -10.81849 -11.43219
#>
#> Use `object$residuals` to access the residuals.
result$residuals # Access residuals directly
#> [1] -0.05 0.05 -0.05 0.05