library(mars)

mars_rf() is the package's built-in random-forest workflow for exploratory, structure-aware meta-analysis. This first release is deliberately upfront about what it does and where it still approximates:

  • it supports univariate, multivariate, and multilevel data structures
  • it uses package-internal tree fitting rather than rpart or ranger
  • it uses structure-aware node fitting and prediction
  • it still uses approximate split screening for speed, rather than full likelihood optimization at every candidate split
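
To make the last bullet concrete, here is a minimal sketch of what approximate split screening can look like. It is not the mars implementation: the helper screen_split(), the inverse-variance weights, and the toy vectors y, v, and x are all assumptions made for illustration. Each candidate cut is scored by the reduction in weighted sum of squares, which is far cheaper than refitting a likelihood model on both sides of every cut.

```r
# Hypothetical helper (not part of mars): score one candidate cut on a
# numeric moderator x by the drop in inverse-variance-weighted sum of squares.
screen_split <- function(y, v, x, cut) {
  w <- 1 / v  # assumed weights: inverse sampling variances
  wss <- function(idx) {
    if (!any(idx)) return(0)
    mu <- sum(w[idx] * y[idx]) / sum(w[idx])  # weighted node mean
    sum(w[idx] * (y[idx] - mu)^2)             # weighted within-node SS
  }
  parent <- wss(rep(TRUE, length(y)))
  parent - wss(x <= cut) - wss(x > cut)       # gain from splitting at `cut`
}

# Toy data, invented purely for this sketch
y <- c(0.10, 0.12, 0.08, 0.40, 0.45, 0.38)  # effect sizes
v <- c(0.02, 0.03, 0.02, 0.04, 0.03, 0.02)  # sampling variances
x <- c(1, 2, 3, 10, 11, 12)                 # one numeric moderator

# Screen every candidate cut cheaply, then keep the best one
cuts <- c(1.5, 2.5, 6.5, 10.5, 11.5)
gains <- vapply(cuts, function(cc) screen_split(y, v, x, cc), numeric(1))
best_cut <- cuts[which.max(gains)]
best_cut
```

In a full likelihood search, each candidate cut would instead require fitting a meta-analytic model in both child nodes; a screening step like this one trades some accuracy in the choice of split for speed.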

This vignette shows one example for each supported path.

Univariate Example

rf_uni <- mars_rf_univariate(
  data = teacher_expectancy,
  formula = yi ~ year + weeks + factor(setting) + factor(tester),
  studyID = "study",
  variance = "vi",
  varcov_type = "univariate",
  num_trees = 10,
  seed = 123
)

summary(rf_uni)
#> Random forest meta-analysis
#> Structure: univariate 
#> Trees: 10 
#> Predictors: year, weeks, factor.setting., factor.tester. 
#> OOB coverage: 100 %
#> OOB RMSE: 0.2547 
#> OOB R-squared: -0.3615 
#> 
#> Top variable importance:
#>        predictor importance
#>            weeks  24.179760
#>             year   7.242460
#>   factor.tester.   3.907405
#>  factor.setting.   0.000000
head(rf_importance(rf_uni))
#>         predictor importance
#> 1           weeks  0.6844047
#> 2            year  0.2049968
#> 3  factor.tester.  0.1105985
#> 4 factor.setting.  0.0000000
predict(rf_uni, newdata = teacher_expectancy[1:5, , drop = FALSE])
#> [1] 0.06393069 0.04316834 0.04041068 0.11715802 0.13678954
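
The two importance tables above report the same quantities on different scales. As a quick check, using only the numbers printed by summary(rf_uni), dividing each raw importance by the total reproduces the rf_importance() column. The vector name raw is introduced here for illustration, and whether the package performs exactly this rescaling internally is an assumption; the printed values are merely consistent with it.

```r
# Raw importances copied from the summary(rf_uni) output above
raw <- c(
  weeks = 24.179760,
  year = 7.242460,
  factor.tester. = 3.907405,
  factor.setting. = 0
)

# Rescale so the importances sum to 1
share <- raw / sum(raw)
round(share, 7)
```

The rounded shares agree with the rf_importance(rf_uni) output to the printed precision, so the second table can be read as each predictor's proportion of total split-based importance.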

Multivariate Example

rf_multi <- mars_rf_multivariate(
  data = becker09,
  studyID = "ID",
  effectID = "numID",
  sample_size = "N",
  effectsize_type = "cor",
  varcov_type = "weighted",
  variable_names = c(
    "Cognitive_Performance",
    "Somatic_Performance",
    "Selfconfidence_Performance",
    "Somatic_Cognitive",
    "Selfconfidence_Cognitive",
    "Selfconfidence_Somatic"
  ),
  multivariate_covs = ~ Team,
  num_trees = 10,
  seed = 123
)

summary(rf_multi)
#> Random forest meta-analysis
#> Structure: multivariate 
#> Trees: 10 
#> Predictors: effect_id, Team 
#> OOB coverage: 100 %
#> OOB RMSE: 0.2709 
#> OOB R-squared: 0.6072 
#> 
#> Top variable importance:
#>  predictor importance
#>  effect_id   3301.379
#>       Team    224.976
head(rf_importance(rf_multi))
#>   predictor importance
#> 1 effect_id 0.93620156
#> 2      Team 0.06379844
predict(rf_multi, newdata = rf_multi$data[1:6, , drop = FALSE])
#> [1] -0.08298025 -0.11205409  0.18601101  0.33692012 -0.30669645 -0.28303643

Multilevel Example

rf_multi_level <- mars_rf_multilevel(
  data = school,
  formula = effect ~ year + (1 | district/study),
  studyID = "district",
  variance = "var",
  varcov_type = "multilevel",
  num_trees = 10,
  seed = 123
)

summary(rf_multi_level)
#> Random forest meta-analysis
#> Structure: multilevel 
#> Trees: 10 
#> Predictors: year 
#> OOB coverage: 100 %
#> OOB RMSE: 0.2949 
#> OOB R-squared: -0.7787 
#> 
#> Top variable importance:
#>  predictor importance
#>       year          0
rf_multi_level$random_tau
#> [1] 0.06506195 0.03273651
predict(rf_multi_level, newdata = school[1:5, , drop = FALSE])
#> [1] 0.07806810 0.07132870 0.17304830 0.04687979 0.17574037

Interpretation Notes

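  • rf_importance() reports split-based importance accumulated across trees. In the examples above its values sum to 1, whereas summary() prints the same importances on their raw scale.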
  • OOB metrics are useful for quick exploration, but they are still forest-style diagnostics rather than formal confirmatory model-fit statistics.
  • The multivariate and multilevel paths are structure-aware and use likelihood-based node refinement where possible, but the full split search is still approximate.
  • For confirmatory estimation and formal inferential output, use mars() and related likelihood-based modeling functions.
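
Two of the examples above report a negative OOB R-squared, which can look like an error. The sketch below assumes the usual unweighted definition, 1 - SSE/SST; the vignette does not say whether mars weights this statistic, so the helper oob_r2() and the toy vectors are illustrative only. Under that definition, R-squared is 0 when predictions are no better than the overall mean and negative when they are worse.

```r
# Hypothetical helper (not part of mars): unweighted R-squared
oob_r2 <- function(y, pred) {
  1 - sum((y - pred)^2) / sum((y - mean(y))^2)
}

# Toy values, invented purely for this sketch
y <- c(0.1, 0.3, 0.2, 0.4)
good <- c(0.12, 0.28, 0.22, 0.38)  # predictions close to the observed values
bad <- c(0.4, 0.1, 0.4, 0.1)       # predictions worse than the overall mean

oob_r2(y, good)                      # close to 1
oob_r2(y, rep(mean(y), length(y)))   # exactly 0: the mean-only baseline
oob_r2(y, bad)                       # negative
```

Read this way, a negative OOB R-squared from a 10-tree forest suggests that out-of-bag predictions did not beat a constant prediction for that dataset, which fits the exploratory framing of these diagnostics.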