The function ggstatsplot::ggcoefstats generates dot-and-whisker plots for regression models whose results have been saved in a tidy data frame (produced with the broom package or its mixed-effects modeling counterpart, the broom.mixed package).

By default, the plot displays 95% confidence intervals for the regression coefficients. The function currently supports only those classes of object that are supported by the broom package. For an exhaustive list, see: https://broom.tidyverse.org/articles/available-methods.html

In this vignette, we will see examples of how to use this function. We will try to cover as many classes of objects as possible. Unfortunately, there is no single dataset that will be helpful for carrying out all types of regression analyses and, therefore, we will use various datasets to explore data-specific hypotheses using regression models.

Note: The following demo uses the pipe operator (%>%). If you are not familiar with this operator, here is a good explanation: http://r4ds.had.co.nz/pipes.html

General structure of the plots

Although the statistical models displayed in the plot may differ based on the class of models being investigated, there are a few aspects of the plot that will be invariant across models:

  • The dot-whisker plot contains a dot representing each estimate and whiskers representing its confidence interval (95% is the default). The estimates can be effect sizes (for tests that depend on the F statistic) or regression coefficients (for tests with t and z statistics), etc. The function will, by default, display a helpful x-axis label that should clear up what estimates are being displayed. The confidence intervals can sometimes be asymmetric if bootstrapping was used.

  • The caption contains diagnostic information, when available, that can be useful for model selection: the smaller the Akaike’s Information Criterion (AIC) and the Bayesian Information Criterion (BIC) values, the “better” the model is. Additionally, the higher the log-likelihood value, the “better” the model fit.

  • The output of this function will be a ggplot2 object and, thus, it can be further modified (e.g., change themes, etc.) with ggplot2 functions.
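The points above can be sketched with a small stand-in model. This is only an illustration under stated assumptions: the mtcars data and the model formula are not part of this vignette, and the plot is assumed to be an ordinary ggplot object, as described in the last bullet.

```r
library(ggstatsplot)
library(ggplot2)

# stand-in model (not from this vignette): fuel efficiency ~ weight + horsepower
mod <- stats::lm(formula = mpg ~ wt + hp, data = mtcars)

# diagnostics of the kind summarized in the plot caption
c(
  AIC = stats::AIC(mod),
  BIC = stats::BIC(mod),
  logLik = as.numeric(stats::logLik(mod))
)

# dot-and-whisker plot of the regression coefficients
p <- ggstatsplot::ggcoefstats(x = mod)

# because the output is a ggplot object, ggplot2 functions can modify it
p +
  ggplot2::theme_bw() +
  ggplot2::labs(y = "predictor")
```

Any other ggplot2 layer, scale, or theme can be added to p in the same way.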

Most of the regression models that are supported in the broom and broom.mixed packages with tidy and glance methods are also supported by ggcoefstats. For example:

aareg, anova, aov, aovlist, Arima, bglmerMod, bigglm, biglm, blmerMod, bracl, brglm2, brmsfit, btergm, cch, clm, clmm, confusionMatrix, coxph, drc, emmGrid, epi.2by2, ergm, felm, fitdistr, glmerMod, glmmTMB, gls, gam, Gam, gamlss, garch, glm, glmmadmb, glmmPQL, glmRob, glmrob, gmm, ivreg, lm, lm.beta, lmerMod, lmodel2, lmRob, lmrob, mcmc, MCMCglmm, mclogit, mmclogit, mediate, mjoint, mle2, mlm, multinom, negbin, nlmerMod, nlrq, nls, orcutt, plm, polr, ridgelm, rjags, rlm, rlmerMod, rq, speedglm, speedlm, stanreg, survreg, svyglm, svyolr, tobit, wblm, etc.

In the following examples, we will try out a number of regression models and, additionally, we will also see how we can change different aspects of the plot itself.

omnibus ANOVA (aov)

For this analysis, let’s use the movies_long dataset, which provides information about IMDB ratings, budget, length, MPAA ratings (R-rated, PG, or PG-13), and genre for a number of movies. Let’s say our hypothesis is that the IMDB ratings for a movie are predicted by a multiplicative effect of the genre and the MPAA rating it got. Let’s carry out an omnibus ANOVA to see if this is the case.

As this plot shows, there is no interaction effect between these two factors.

Note that we can also use this function for model selection. You can try out different models with the code below and see how the AIC, BIC, and log-likelihood values change. Looking at the model diagnostics, you should be able to see that the model with only genre as the predictor of ratings seems to perform almost equally well as more complicated additive and multiplicative models. Although there is certainly some improvement with additive and multiplicative models, it is by no means convincing enough for us to abandon a simpler model.
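The kind of comparison described here can be sketched in base R. The built-in warpbreaks data is used purely as a stand-in for movies_long (an assumption made for self-containment); it also has one numeric outcome and two factors, so a single-factor, an additive, and a multiplicative model can be compared in the same way.

```r
# three candidate models of increasing complexity (stand-in data: warpbreaks)
models <- list(
  single = stats::aov(formula = breaks ~ tension, data = warpbreaks),
  additive = stats::aov(formula = breaks ~ wool + tension, data = warpbreaks),
  multiplicative = stats::aov(formula = breaks ~ wool * tension, data = warpbreaks)
)

# collect the diagnostics used for model selection
diagnostics <- data.frame(
  model = names(models),
  AIC = sapply(models, stats::AIC),
  BIC = sapply(models, stats::BIC),
  logLik = sapply(models, function(m) as.numeric(stats::logLik(m))),
  row.names = NULL
)

diagnostics
```

Because the models are nested, the log-likelihood can only increase with complexity; AIC and BIC penalize the extra parameters, which is why they are the more useful guides when deciding whether a simpler model is good enough.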

linear model (lm)

Now that we have figured out that the movie’s genre explains a fair amount of the variation in how people rate the movie on IMDB, let’s run a linear regression model to see how different types of genres compare with each other using Action movies as our comparison point.

As can be seen from the regression coefficients, compared to action movies, comedies and action comedies are not rated significantly better. All three of the “drama” types (pure, action, or comedy dramas) have statistically significantly higher regression coefficients. This finding holds even with our more conservative 99% confidence intervals.
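How a reference level and a stricter confidence level enter such a model can be sketched in base R. The iris data and the choice of "setosa" as the comparison group are stand-ins (assumptions, not the movie data discussed above).

```r
# stand-in data: set the comparison (reference) level of the factor explicitly
iris_df <- iris
iris_df$Species <- stats::relevel(iris_df$Species, ref = "setosa")

# each coefficient is a difference from the reference level
mod_lm <- stats::lm(formula = Sepal.Length ~ Species, data = iris_df)

# intervals at the default and at the more conservative level
ci_95 <- stats::confint(mod_lm, level = 0.95)
ci_99 <- stats::confint(mod_lm, level = 0.99)

# estimates alongside their 99% confidence intervals
cbind(estimate = stats::coef(mod_lm), ci_99)
```

A coefficient whose 99% interval excludes zero is significant at the 0.01 level. In ggcoefstats, the corresponding setting is the conf.level argument, as used elsewhere in this vignette.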

linear mixed-effects model (lmer/lmerMod)

Now let’s say we want to see how a movie’s budget relates to how well the movie is rated on IMDB (e.g., more money, better ratings?). But we have reasons to believe that the relationship between these two variables might differ across genres (e.g., budget might be a good predictor of ratings for animation or action movies, as more money can help with better visual effects and animation, but this may not be true for dramas), so we don’t want to use stats::lm. In this case, therefore, we will run a linear mixed-effects model (using lme4::lmer, with p-values generated using the sjstats::p_values function) with random slopes that vary by genre.

As can be seen from these plots, although there seems to be a really small correlation between budget and rating in a linear model, this effect is not significant once we take into account the hierarchical structure of the data.

Note that for mixed-effects models, only the fixed effects are shown because there are no confidence intervals for random-effects terms. If you would like to see these terms, you can pass the same object that you entered as the x argument of ggcoefstats to broom::tidy.
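A minimal sketch of inspecting both kinds of terms follows. It rests on two assumptions: the lme4::sleepstudy data is a stand-in for the movie data, and the tidy method for lmerMod objects is taken from the broom.mixed package.

```r
library(lme4)

# stand-in model: random intercepts and random slopes of Days for each Subject
mod_lmer <- lme4::lmer(
  formula = Reaction ~ Days + (Days | Subject),
  data = lme4::sleepstudy
)

# fixed effects: the terms a coefficient plot displays
fixed_terms <- lme4::fixef(mod_lmer)
fixed_terms

# random-effects terms (standard deviations and correlations)
random_terms <- broom.mixed::tidy(x = mod_lmer, effects = "ran_pars")
random_terms
```

Setting effects = "fixed" instead returns the fixed-effects rows in the same tidy format.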

non-linear mixed-effects model (nlmer/nlmerMod)

non-linear least-squares model (nls)

So far we have been assuming a linear relationship between movie budget and rating. But what if we also want to explore the possibility of a non-linear relationship? In that case, we can run a non-linear least-squares regression. Note that you need to choose some non-linear function, which should be based on prior exploratory data analysis (y ~ k/x + c is implemented here, but you can try out other non-linear functions, e.g., y ~ k * exp(-b*x)).

This analysis shows that there is indeed a possible non-linear association between rating and budget (non-linear regression term k is significant), at least with the particular non-linear function we used.
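To make the form of such a model concrete, here is a small sketch that fits y ~ k/x + c with stats::nls. The data are simulated, and the "true" values k = 5 and c = 2 are made up for illustration; they are not estimates from the movie data.

```r
set.seed(123)

# simulated data following y = k / x + c plus noise (k = 5, c = 2 are made up)
df_sim <- data.frame(x = seq(from = 1, to = 20, length.out = 100))
df_sim$y <- 5 / df_sim$x + 2 + stats::rnorm(n = 100, sd = 0.2)

# non-linear least squares; `start` supplies initial guesses for the parameters
mod_nls <- stats::nls(
  formula = y ~ k / x + c,
  data = df_sim,
  start = list(k = 1, c = 0)
)

# estimates, standard errors, t statistics, and p-values for k and c
summary(mod_nls)$coefficients
```

For other functional forms, such as an exponential decay, only the formula and the start list need to change; poorly chosen starting values are the most common reason for nls failing to converge.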

generalized linear model (glm)

In all the analyses carried out thus far, the outcome variable (y in y ~ x) has been continuous. When the outcome variable is nominal/categorical (a factor), we can use the generalized linear model, which accommodates responses that are numeric vectors, factors, etc.

To explore this model, we will use the Titanic dataset, which tabulates information on the fate of passengers on the fatal maiden voyage of the ocean liner Titanic, summarized according to economic status (class), sex, age, and survival. Let’s say we want to know what was the strongest predictor of whether someone survived the Titanic disaster.

As can be seen from the regression coefficients, all entered predictors were significant predictors of the outcome. More specifically, being female was associated with a higher likelihood of survival (compared to being male). On the other hand, being an adult was associated with a decreased likelihood of survival (compared to being a child).
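A minimal sketch of such a logistic regression follows, assuming the aggregated Titanic table that ships with base R (the data object used to produce the plot in this vignette may be organized differently). The cell counts enter the model as frequency weights.

```r
# base-R Titanic table as a data frame: one row per cell, with a Freq column
titanic_df <- as.data.frame(Titanic)

# logistic regression of survival on sex and age, weighted by cell counts
mod_glm <- stats::glm(
  formula = Survived ~ Sex + Age,
  data = titanic_df,
  weights = Freq,
  family = stats::binomial(link = "logit")
)

# coefficients are on the log-odds scale, relative to the reference levels
# (Male for Sex, Child for Age)
summary(mod_glm)$coefficients
```

A positive coefficient for SexFemale and a negative coefficient for AgeAdult correspond to the pattern described above.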

Note: A few things to keep in mind for glm models:

  • The exact statistic will depend on the family used. Below we will see a host of different function calls to glm with a variety of different families.

  • Some families will have a t statistic associated with them, while others a z statistic. The function will figure this out for you.
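The difference between the two kinds of statistic can be checked directly in base R. This sketch uses the standard example data from the glm documentation; the object names are only illustrative.

```r
# Poisson family: dispersion is fixed at 1, so z statistics are reported
df_counts <- data.frame(
  treatment = gl(n = 3, k = 3, length = 9),
  outcome = gl(n = 3, k = 1, length = 9),
  counts = c(18, 17, 15, 20, 10, 20, 25, 13, 12)
)
mod_poisson <- stats::glm(
  formula = counts ~ outcome + treatment,
  data = df_counts,
  family = stats::poisson(link = "log")
)

# Gamma family: dispersion is estimated, so t statistics are reported
df_clotting <- data.frame(
  u = c(5, 10, 15, 20, 30, 40, 60, 80, 100),
  lot1 = c(118, 58, 42, 35, 27, 25, 21, 19, 18)
)
mod_gamma <- stats::glm(
  formula = lot1 ~ log(u),
  data = df_clotting,
  family = stats::Gamma(link = "inverse")
)

# the third column name reveals which statistic each family uses
stat_poisson <- colnames(summary(mod_poisson)$coefficients)[3]
stat_gamma <- colnames(summary(mod_gamma)$coefficients)[3]
c(poisson = stat_poisson, gamma = stat_gamma)
```

In general, families with a fixed dispersion parameter (binomial, Poisson) report z statistics, while families whose dispersion is estimated (Gaussian, Gamma, and the quasi families) report t statistics.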

# creating dataframes to use for regression analyses
library(ggstatsplot)

# dataframe #1
(
  df.counts <-
    base::data.frame(
      treatment = gl(n = 3, k = 3, length = 9),
      outcome = gl(n = 3, k = 1, length = 9),
      counts = c(18, 17, 15, 20, 10, 20, 25, 13, 12)
    ) %>%
    tibble::as_tibble(x = .)
)
#> # A tibble: 9 x 3
#>   treatment outcome counts
#>   <fct>     <fct>    <dbl>
#> 1 1         1           18
#> 2 1         2           17
#> 3 1         3           15
#> 4 2         1           20
#> 5 2         2           10
#> 6 2         3           20
#> 7 3         1           25
#> 8 3         2           13
#> 9 3         3           12

# dataframe #2
(df.clotting <- data.frame(
  u = c(5, 10, 15, 20, 30, 40, 60, 80, 100),
  lot1 = c(118, 58, 42, 35, 27, 25, 21, 19, 18),
  lot2 = c(69, 35, 26, 21, 18, 16, 13, 12, 12)
) %>%
  tibble::as_tibble(x = .))
#> # A tibble: 9 x 3
#>       u  lot1  lot2
#>   <dbl> <dbl> <dbl>
#> 1     5   118    69
#> 2    10    58    35
#> 3    15    42    26
#> 4    20    35    21
#> 5    30    27    18
#> 6    40    25    16
#> 7    60    21    13
#> 8    80    19    12
#> 9   100    18    12

# dataframe #3
x1 <- stats::rnorm(50)
y1 <- stats::rpois(n = 50, lambda = exp(1 + x1))
(df.3 <- data.frame(x = x1, y = y1) %>%
  tibble::as_tibble(x = .))
#> # A tibble: 50 x 2
#>          x     y
#>      <dbl> <int>
#>  1  1.56      12
#>  2  0.0705     5
#>  3  0.129      3
#>  4  1.72      14
#>  5  0.461      8
#>  6 -1.27       0
#>  7 -0.687      0
#>  8 -0.446      4
#>  9  1.22      11
#> 10  0.360      2
#> # ... with 40 more rows

# dataframe #4
x2 <- stats::rnorm(50)
y2 <- rbinom(
  n = 50,
  size = 1,
  prob = stats::plogis(x2)
)

(df.4 <- data.frame(x = x2, y = y2) %>%
  tibble::as_tibble(x = .))
#> # A tibble: 50 x 2
#>          x     y
#>      <dbl> <int>
#>  1 -0.779      1
#>  2 -0.375      1
#>  3 -0.319      1
#>  4  0.0845     0
#>  5 -0.768      1
#>  6 -0.626      0
#>  7 -0.901      0
#>  8  0.664      1
#>  9  0.300      1
#> 10  0.0749     1
#> # ... with 40 more rows

# combining all plots in a single plot
ggstatsplot::combine_plots(
  # Family: Poisson
  ggstatsplot::ggcoefstats(
    x = stats::glm(
      formula = counts ~ outcome + treatment,
      data = df.counts,
      family = stats::poisson(link = "log")
    ),
    title = "Family: Poisson",
    stats.label.color = "black"
  ),
  # Family: Gamma
  ggstatsplot::ggcoefstats(
    x = stats::glm(
      formula = lot1 ~ log(u),
      data = df.clotting,
      family = stats::Gamma(link = "inverse")
    ),
    title = "Family: Gamma",
    stats.label.color = "black"
  ),
  # Family: Quasi
  ggstatsplot::ggcoefstats(
    x = stats::glm(
      formula = y ~ x,
      family = quasi(variance = "mu", link = "log"),
      data = df.3
    ),
    title = "Family: Quasi",
    stats.label.color = "black"
  ),
  # Family: Quasibinomial
  ggstatsplot::ggcoefstats(
    x = stats::glm(
      formula = y ~ x,
      family = stats::quasibinomial(link = "logit"),
      data = df.4
    ),
    title = "Family: Quasibinomial",
    stats.label.color = "black"
  ),
  # Family: Quasipoisson
  ggstatsplot::ggcoefstats(
    x = stats::glm(
      formula = y ~ x,
      family = stats::quasipoisson(link = "log"),
      data = df.4
    ),
    title = "Family: Quasipoisson",
    stats.label.color = "black"
  ),
  # Family: Gaussian
  ggstatsplot::ggcoefstats(
    x = stats::glm(
      formula = Sepal.Length ~ Species,
      family = stats::gaussian(link = "identity"),
      data = iris
    ),
    title = "Family: Gaussian",
    stats.label.color = "black"
  ),
  labels = c("(a)", "(b)", "(c)", "(d)", "(e)", "(f)"),
  ncol = 2,
  title.text = "Exploring models with different `glm` families",
  title.color = "blue"
)

generalized linear mixed-effects model (glmer/glmerMod)

In the previous example, we saw how being female and being a child were predictive of surviving the Titanic disaster. But in that analysis, we didn’t take into account one important factor: the class in which passengers were traveling. Naively, we have reasons to believe that the effects of sex and age might depend on class (maybe first-class passengers were given priority during the rescue?). To take into account this hierarchical structure of the data, we can run a generalized linear mixed-effects model with a random slope for class.

As we had expected, once we take into account the differential relationship that might exist between survival and predictors across different passenger classes, only the sex factor remains a significant predictor. In other words, being female was the strongest predictor of whether someone survived the tragedy that befell the Titanic.
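One way such a model could be specified is sketched below. Several things here are assumptions rather than the exact analysis behind the plot: the base-R Titanic table is expanded to one row per passenger, the random-effects structure lets the effects of sex and age vary by class, and the optimizer settings are just one reasonable choice. With only four passenger classes, convergence or singular-fit messages would not be surprising.

```r
library(lme4)

# expand the aggregated Titanic table to one row per passenger
titanic_counts <- as.data.frame(Titanic)
titanic_long <- titanic_counts[
  rep(seq_len(nrow(titanic_counts)), times = titanic_counts$Freq),
  c("Class", "Sex", "Age", "Survived")
]

# logistic mixed-effects model: effects of Sex and Age may vary across classes
mod_glmer <- lme4::glmer(
  formula = Survived ~ Sex + Age + (Sex + Age | Class),
  data = titanic_long,
  family = stats::binomial(link = "logit"),
  control = lme4::glmerControl(optimizer = "Nelder_Mead", calc.derivs = FALSE)
)

# fixed effects on the log-odds scale
lme4::fixef(mod_glmer)
```

The fitted object has class glmerMod, which is one of the classes listed as supported earlier in this vignette.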

generalized linear mixed models using Template Model Builder (glmmTMB)

The glmmTMB package allows for flexibly fitting generalized linear mixed models (GLMMs) and extensions. Model objects from this package are also supported.

generalized linear mixed models using AD Model Builder (glmmadmb)

Another option is to use the glmmADMB package.

repeated measures ANOVA (aovlist)

Let’s now consider an example of a repeated measures design where we want to run an omnibus ANOVA with a specific error structure. To carry out this analysis, we will first have to convert the iris dataset from wide to long format such that there is one column corresponding to attribute (which part of the flower is being measured: sepal or petal?) and one column corresponding to the measure used (length or width?). Note that this is a within-subjects design since the same flower has both measures for both attributes. The question we are interested in is how much of the variance in measurements is explained by each of these factors and by their interaction.

As revealed by this analysis, all effects of this model are significant. But most of the variance is explained by the attribute, with the next most important explanatory factor being the measure used. Very little of the variation in measurement is accounted for by the interaction between these two factors.
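The reshaping and the error structure described above can be sketched in base R. The column names (id, attribute, measure, value) and the exact form of the Error() term are assumptions made for this illustration.

```r
# wide-to-long conversion: one row per flower x attribute x measure
n_flowers <- nrow(iris)
iris_long <- data.frame(
  id = factor(rep(seq_len(n_flowers), times = 4)),
  attribute = factor(rep(c("Sepal", "Sepal", "Petal", "Petal"), each = n_flowers)),
  measure = factor(rep(c("Length", "Width", "Length", "Width"), each = n_flowers)),
  value = c(iris$Sepal.Length, iris$Sepal.Width, iris$Petal.Length, iris$Petal.Width)
)

# repeated measures ANOVA: both factors vary within each flower (id)
mod_rm <- stats::aov(
  formula = value ~ attribute * measure + Error(id / (attribute * measure)),
  data = iris_long
)

# one ANOVA table per error stratum
summary(mod_rm)
```

Because of the Error() term, the result is an aovlist object rather than a plain aov object.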

fit a linear model with multiple group fixed effects (felm)

Models of class felm from the lfe package are also supported. This method is used to fit linear models with multiple group fixed effects, similarly to lm. It uses the method of alternating projections to sweep out multiple group effects from the normal equations before estimating the remaining coefficients with OLS.

analysis of factorial experiments (mixed)

# setup
set.seed(123)
library(afex)
data(sleepstudy)

# data
sleepstudy$mygrp <- sample(1:5, size = 180, replace = TRUE)
sleepstudy$mysubgrp <- NA
for (i in 1:5) {
  filter_group <- sleepstudy$mygrp == i
  sleepstudy$mysubgrp[filter_group] <-
    sample(1:30, size = sum(filter_group), replace = TRUE)
}

# linear model
m1 <- afex::mixed(Reaction ~ Days + (1 + Days | Subject),
  data = sleepstudy
)

# linear mixed-effects model
m2 <- afex::mixed(Reaction ~ Days + (1 | mygrp / mysubgrp) + (1 | Subject),
  data = sleepstudy
)

# plot
ggstatsplot::combine_plots(
  ggstatsplot::ggcoefstats(m1, title = "linear model (`afex` package)"),
  ggstatsplot::ggcoefstats(m2, title = "linear mixed-effects model (`afex` package)"),
  labels = c("(a)", "(b)")
)

mixed conditional logit models (mmclogit)

Cox proportional hazards regression model (coxph)

A fitted proportional hazards regression model, as implemented in the survival package, can also be displayed in a dot-whisker plot.
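A minimal sketch of fitting such a model is given below. The lung data from the survival package and the choice of predictors are assumptions made for illustration only.

```r
library(survival)

# Cox proportional hazards model: survival time as a function of age and sex
mod_cox <- survival::coxph(
  formula = Surv(time, status) ~ age + sex,
  data = survival::lung
)

# coefficients are log hazard ratios; exp(coef) gives the hazard ratios
summary(mod_cox)$coefficients
```

The resulting coxph object can then be supplied as the x argument of ggcoefstats.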

tobit regression (tobit)

# setup
set.seed(123)
library(AER)
data("Affairs", package = "AER")

# model
m1 <-
  AER::tobit(affairs ~ age + yearsmarried + religiousness + occupation + rating,
    data = Affairs
  )

generalized additive models with integrated smoothness estimation (gam)

Important: These model outputs contain both parametric and smooth terms. ggcoefstats only displays the parametric terms.
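The distinction between the two kinds of terms can be seen in the model summary. This sketch assumes the mgcv package and uses mtcars as stand-in data, with one smooth term and two parametric terms.

```r
library(mgcv)

# one smooth term, s(hp), and two parametric terms, wt and am
mod_gam <- mgcv::gam(formula = mpg ~ s(hp) + wt + am, data = mtcars)

gam_summary <- summary(mod_gam)

# parametric terms: the kind of rows a coefficient plot can display
gam_summary$p.table

# smooth terms: summarized by effective degrees of freedom, not one coefficient
gam_summary$s.table
```

A smooth term has no single estimate with a confidence interval, which is why only the parametric terms lend themselves to a dot-and-whisker display.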

linear model using generalized least squares (gls)

The nlme package provides a function to fit a linear model using generalized least squares. The errors are allowed to be correlated and/or have unequal variances.
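As a sketch of what correlated errors look like in practice, the following follows the standard example from the nlme::gls documentation, with an AR(1) correlation structure within each mare in the Ovary data. The dataset choice is an assumption for illustration.

```r
library(nlme)

# generalized least squares with AR(1) errors within each Mare
mod_gls <- nlme::gls(
  model = follicles ~ sin(2 * pi * Time) + cos(2 * pi * Time),
  data = nlme::Ovary,
  correlation = nlme::corAR1(form = ~ 1 | Mare)
)

# coefficient table: estimates, standard errors, t-values, and p-values
summary(mod_gls)$tTable
```

Unequal variances can be modeled in a similar way through the weights argument of gls (for example, with nlme::varIdent).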

TERGM by bootstrapped pseudolikelihood or MCMC MLE (btergm)

generalized autoregressive conditional heteroscedastic (garch)

Bayesian generalized (non-)linear multivariate multilevel models (brmsfit)

# setup
set.seed(123)
library(brms)

# prior
bprior1 <- prior(student_t(5, 0, 10), class = b) +
  prior(cauchy(0, 2), class = sd)

# model
fit1 <- brms::brm(
  formula = count ~ Age + Base * Trt + (1 | patient),
  data = epilepsy,
  family = poisson(),
  prior = bprior1,
  silent = TRUE
)
#> 
#> SAMPLING FOR MODEL '7fc0444ee315785f19884bc3c9b2a5ab' NOW (CHAIN 1).
#> Chain 1: 
#> Chain 1: Gradient evaluation took 0 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1: 
#> Chain 1: 
#> Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
#> Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
#> Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
#> Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
#> Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
#> Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
#> Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
#> Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
#> Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
#> Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
#> Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
#> Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
#> Chain 1: 
#> Chain 1:  Elapsed Time: 10.108 seconds (Warm-up)
#> Chain 1:                5.617 seconds (Sampling)
#> Chain 1:                15.725 seconds (Total)
#> Chain 1: 
#> 
#> SAMPLING FOR MODEL '7fc0444ee315785f19884bc3c9b2a5ab' NOW (CHAIN 2).
#> Chain 2: 
#> Chain 2: Gradient evaluation took 0 seconds
#> Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
#> Chain 2: Adjust your expectations accordingly!
#> Chain 2: 
#> Chain 2: 
#> Chain 2: Iteration:    1 / 2000 [  0%]  (Warmup)
#> Chain 2: Iteration:  200 / 2000 [ 10%]  (Warmup)
#> Chain 2: Iteration:  400 / 2000 [ 20%]  (Warmup)
#> Chain 2: Iteration:  600 / 2000 [ 30%]  (Warmup)
#> Chain 2: Iteration:  800 / 2000 [ 40%]  (Warmup)
#> Chain 2: Iteration: 1000 / 2000 [ 50%]  (Warmup)
#> Chain 2: Iteration: 1001 / 2000 [ 50%]  (Sampling)
#> Chain 2: Iteration: 1200 / 2000 [ 60%]  (Sampling)
#> Chain 2: Iteration: 1400 / 2000 [ 70%]  (Sampling)
#> Chain 2: Iteration: 1600 / 2000 [ 80%]  (Sampling)
#> Chain 2: Iteration: 1800 / 2000 [ 90%]  (Sampling)
#> Chain 2: Iteration: 2000 / 2000 [100%]  (Sampling)
#> Chain 2: 
#> Chain 2:  Elapsed Time: 22.917 seconds (Warm-up)
#> Chain 2:                54.186 seconds (Sampling)
#> Chain 2:                77.103 seconds (Total)
#> Chain 2: 
#> 
#> SAMPLING FOR MODEL '7fc0444ee315785f19884bc3c9b2a5ab' NOW (CHAIN 3).
#> Chain 3: 
#> Chain 3: Gradient evaluation took 0 seconds
#> Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
#> Chain 3: Adjust your expectations accordingly!
#> Chain 3: 
#> Chain 3: 
#> Chain 3: Iteration:    1 / 2000 [  0%]  (Warmup)
#> Chain 3: Iteration:  200 / 2000 [ 10%]  (Warmup)
#> Chain 3: Iteration:  400 / 2000 [ 20%]  (Warmup)
#> Chain 3: Iteration:  600 / 2000 [ 30%]  (Warmup)
#> Chain 3: Iteration:  800 / 2000 [ 40%]  (Warmup)
#> Chain 3: Iteration: 1000 / 2000 [ 50%]  (Warmup)
#> Chain 3: Iteration: 1001 / 2000 [ 50%]  (Sampling)
#> Chain 3: Iteration: 1200 / 2000 [ 60%]  (Sampling)
#> Chain 3: Iteration: 1400 / 2000 [ 70%]  (Sampling)
#> Chain 3: Iteration: 1600 / 2000 [ 80%]  (Sampling)
#> Chain 3: Iteration: 1800 / 2000 [ 90%]  (Sampling)
#> Chain 3: Iteration: 2000 / 2000 [100%]  (Sampling)
#> Chain 3: 
#> Chain 3:  Elapsed Time: 8.572 seconds (Warm-up)
#> Chain 3:                5.484 seconds (Sampling)
#> Chain 3:                14.056 seconds (Total)
#> Chain 3: 
#> 
#> SAMPLING FOR MODEL '7fc0444ee315785f19884bc3c9b2a5ab' NOW (CHAIN 4).
#> Chain 4: 
#> Chain 4: Gradient evaluation took 0 seconds
#> Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
#> Chain 4: Adjust your expectations accordingly!
#> Chain 4: 
#> Chain 4: 
#> Chain 4: Iteration:    1 / 2000 [  0%]  (Warmup)
#> Chain 4: Iteration:  200 / 2000 [ 10%]  (Warmup)
#> Chain 4: Iteration:  400 / 2000 [ 20%]  (Warmup)
#> Chain 4: Iteration:  600 / 2000 [ 30%]  (Warmup)
#> Chain 4: Iteration:  800 / 2000 [ 40%]  (Warmup)
#> Chain 4: Iteration: 1000 / 2000 [ 50%]  (Warmup)
#> Chain 4: Iteration: 1001 / 2000 [ 50%]  (Sampling)
#> Chain 4: Iteration: 1200 / 2000 [ 60%]  (Sampling)
#> Chain 4: Iteration: 1400 / 2000 [ 70%]  (Sampling)
#> Chain 4: Iteration: 1600 / 2000 [ 80%]  (Sampling)
#> Chain 4: Iteration: 1800 / 2000 [ 90%]  (Sampling)
#> Chain 4: Iteration: 2000 / 2000 [100%]  (Sampling)
#> Chain 4: 
#> Chain 4:  Elapsed Time: 7.43 seconds (Warm-up)
#> Chain 4:                6.643 seconds (Sampling)
#> Chain 4:                14.073 seconds (Total)
#> Chain 4:

# plot
ggstatsplot::ggcoefstats(
  x = fit1,
  exclude.intercept = FALSE,
  conf.method = "HPDinterval",
  title = "Bayesian generalized (non-)linear \nmultivariate multilevel models",
  subtitle = "using `brms` package"
)
#> Note: No model diagnostics information available, so skipping caption.
#> Note: No p-values and/or statistic available for the model object;
#> skipping labels with stats.

Let’s see another example where we use brms to run the same model on multiple datasets:

# setup
set.seed(123)
library(brms)
library(mice)

# data
imp <- mice(nhanes2)
#> 
#>  iter imp variable
#>   1   1  bmi  hyp  chl
#>   1   2  bmi  hyp  chl
#>   1   3  bmi  hyp  chl
#>   1   4  bmi  hyp  chl
#>   1   5  bmi  hyp  chl
#>   2   1  bmi  hyp  chl
#>   2   2  bmi  hyp  chl
#>   2   3  bmi  hyp  chl
#>   2   4  bmi  hyp  chl
#>   2   5  bmi  hyp  chl
#>   3   1  bmi  hyp  chl
#>   3   2  bmi  hyp  chl
#>   3   3  bmi  hyp  chl
#>   3   4  bmi  hyp  chl
#>   3   5  bmi  hyp  chl
#>   4   1  bmi  hyp  chl
#>   4   2  bmi  hyp  chl
#>   4   3  bmi  hyp  chl
#>   4   4  bmi  hyp  chl
#>   4   5  bmi  hyp  chl
#>   5   1  bmi  hyp  chl
#>   5   2  bmi  hyp  chl
#>   5   3  bmi  hyp  chl
#>   5   4  bmi  hyp  chl
#>   5   5  bmi  hyp  chl

# fit the model using mice and lm
fit_imp1 <- with(lm(bmi ~ age + hyp + chl), data = imp)

# fit the model using brms
fit_imp2 <- brms::brm_multiple(
  formula = bmi ~ age + hyp + chl, data = imp,
  chains = 1
)
#> 
#> SAMPLING FOR MODEL 'd4eb97f08850361da56832c67c1bb81f' NOW (CHAIN 1).
#> Chain 1: 
#> Chain 1: Gradient evaluation took 0 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1: 
#> Chain 1: 
#> Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
#> Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
#> Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
#> Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
#> Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
#> Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
#> Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
#> Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
#> Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
#> Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
#> Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
#> Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
#> Chain 1: 
#> Chain 1:  Elapsed Time: 0.203 seconds (Warm-up)
#> Chain 1:                0.051 seconds (Sampling)
#> Chain 1:                0.254 seconds (Total)
#> Chain 1: 
#> 
#> SAMPLING FOR MODEL 'd4eb97f08850361da56832c67c1bb81f' NOW (CHAIN 1).
#> Chain 1: 
#> Chain 1: Gradient evaluation took 0 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1: 
#> Chain 1: 
#> Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
#> Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
#> Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
#> Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
#> Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
#> Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
#> Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
#> Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
#> Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
#> Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
#> Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
#> Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
#> Chain 1: 
#> Chain 1:  Elapsed Time: 0.256 seconds (Warm-up)
#> Chain 1:                0.065 seconds (Sampling)
#> Chain 1:                0.321 seconds (Total)
#> Chain 1: 
#> 
#> SAMPLING FOR MODEL 'd4eb97f08850361da56832c67c1bb81f' NOW (CHAIN 1).
#> Chain 1: 
#> Chain 1: Gradient evaluation took 0 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1: 
#> Chain 1: 
#> Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
#> Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
#> Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
#> Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
#> Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
#> Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
#> Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
#> Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
#> Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
#> Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
#> Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
#> Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
#> Chain 1: 
#> Chain 1:  Elapsed Time: 0.171 seconds (Warm-up)
#> Chain 1:                0.035 seconds (Sampling)
#> Chain 1:                0.206 seconds (Total)
#> Chain 1: 
#> 
#> SAMPLING FOR MODEL 'd4eb97f08850361da56832c67c1bb81f' NOW (CHAIN 1).
#> Chain 1: 
#> Chain 1: Gradient evaluation took 0 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1: 
#> Chain 1: 
#> Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
#> Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
#> Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
#> Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
#> Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
#> Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
#> Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
#> Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
#> Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
#> Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
#> Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
#> Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
#> Chain 1: 
#> Chain 1:  Elapsed Time: 0.22 seconds (Warm-up)
#> Chain 1:                0.041 seconds (Sampling)
#> Chain 1:                0.261 seconds (Total)
#> Chain 1: 
#> 
#> SAMPLING FOR MODEL 'd4eb97f08850361da56832c67c1bb81f' NOW (CHAIN 1).
#> Chain 1: 
#> Chain 1: Gradient evaluation took 0 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1: 
#> Chain 1: 
#> Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
#> Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
#> Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
#> Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
#> Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
#> Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
#> Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
#> Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
#> Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
#> Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
#> Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
#> Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
#> Chain 1: 
#> Chain 1:  Elapsed Time: 0.221 seconds (Warm-up)
#> Chain 1:                0.046 seconds (Sampling)
#> Chain 1:                0.267 seconds (Total)
#> Chain 1:

# plot
ggstatsplot::ggcoefstats(
  x = fit_imp2,
  title = "Same `brms` model with multiple datasets",
  conf.level = 0.99
)
#> Note: No model diagnostics information available, so skipping caption.
#> Note: No p-values and/or statistic available for the model object;
#> skipping labels with stats.

dataframes (tbl_df, tbl, data.frame)

Sometimes you don’t have a model object but a custom dataframe that you want to display using this function. If a dataframe is to be plotted, it must contain columns named term (names of predictors) and estimate (corresponding estimates of coefficients or other quantities of interest). Other optional columns are conf.low and conf.high (for confidence intervals), and p.value. You will also have to specify the type of statistic relevant for regression models ("t", "z", "f") in case you want to display statistical labels.
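A minimal sketch of assembling such a dataframe by hand is shown below. The stand-in model and object names are assumptions; only the term, estimate, conf.low, and conf.high columns described above are created, so no statistical labels are expected in the resulting plot.

```r
library(ggstatsplot)

# stand-in model used only as a source of numbers
mod <- stats::lm(formula = mpg ~ wt + hp, data = mtcars)
ci <- stats::confint(mod, level = 0.95)

# custom dataframe with the column names ggcoefstats looks for
df_custom <- data.frame(
  term = names(stats::coef(mod)),
  estimate = unname(stats::coef(mod)),
  conf.low = unname(ci[, 1]),
  conf.high = unname(ci[, 2]),
  stringsAsFactors = FALSE
)

df_custom

# plot the dataframe instead of a model object
ggstatsplot::ggcoefstats(x = df_custom)
```

The same structure applies when the numbers come from a published table or another program rather than from a model fitted in R.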

meta-analysis

If the estimates you are displaying come from multiple studies, you can also use this function to carry out a random-effects meta-analysis (as implemented in the metafor package; see metafor::rma()).

The dataframe you enter must contain, at a minimum, the following three columns: term, estimate, and std.error.

Or you can also provide a dataframe containing all the other relevant information for additionally displaying labels with statistical information.
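The underlying computation can be sketched directly with metafor::rma(). The study labels, estimates, and standard errors below are made-up numbers for illustration only.

```r
library(metafor)

# hypothetical study-level results (made-up numbers, for illustration only)
df_meta <- data.frame(
  term = paste("study", 1:5),
  estimate = c(0.11, 0.25, 0.19, 0.32, 0.08),
  std.error = c(0.05, 0.09, 0.07, 0.12, 0.06)
)

# random-effects meta-analysis: estimates weighted by their precision
mod_meta <- metafor::rma(
  yi = estimate,
  sei = std.error,
  data = df_meta,
  method = "REML"
)

# pooled estimate with its confidence interval and heterogeneity statistics
summary(mod_meta)
```

The three columns used here (term, estimate, std.error) are exactly the minimum required columns listed above.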

And much more…

This vignette was meant to give a comprehensive account of the regression models supported by ggcoefstats. The list of supported models will keep expanding as additional tidiers are added to the broom and broom.mixed packages: https://broom.tidyverse.org/articles/available-methods.html

Note that not all models supported by broom will be supported by ggcoefstats. In particular, classes of objects for which there is no column for estimate (e.g., kmeans, optim, muhaz, survdiff, zoo, etc.) are not supported.

Suggestions

If you find any bugs or have any suggestions/remarks, please file an issue on GitHub: https://github.com/IndrajeetPatil/ggstatsplot/issues