Dot-and-whisker plots for regression analyses

ggcoefstats(
  x,
  output = "plot",
  statistic = NULL,
  conf.int = TRUE,
  conf.level = 0.95,
  k = 2L,
  exclude.intercept = TRUE,
  exponentiate = FALSE,
  effsize = "eta",
  meta.analytic.effect = FALSE,
  meta.type = "parametric",
  bf.message = TRUE,
  sort = "none",
  xlab = "regression coefficient",
  ylab = "term",
  title = NULL,
  subtitle = NULL,
  caption = NULL,
  only.significant = FALSE,
  point.args = list(size = 3, color = "blue"),
  errorbar.args = list(height = 0),
  vline = TRUE,
  vline.args = list(size = 1, linetype = "dashed"),
  stats.labels = TRUE,
  stats.label.color = NULL,
  stats.label.args = list(size = 3, direction = "y"),
  package = "RColorBrewer",
  palette = "Dark2",
  ggtheme = ggplot2::theme_bw(),
  ggstatsplot.layer = TRUE,
  ...
)

Arguments

x

A model object to be tidied, or a tidy data frame containing results from a regression model. The function internally uses parameters::model_parameters or broom::tidy to get a tidy data frame. If a data frame is supplied, it must contain columns named term (names of predictors) and estimate (corresponding estimates of coefficients or other quantities of interest).
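As a rough sketch of the data frame route, the columns can be assembled by hand from a fitted model using base R. The model and the object names (mod, coefs, ci, df) are illustrative assumptions; only term and estimate are stated above as required, and the remaining columns simply mirror those used in the Examples section.

```r
# Illustrative sketch: build a tidy data frame by hand (object names are assumptions)
mod <- lm(mpg ~ wt + hp, data = mtcars)
coefs <- summary(mod)$coefficients   # matrix: Estimate, Std. Error, t value, Pr(>|t|)
ci <- confint(mod, level = 0.95)     # matrix: lower and upper confidence limits

df <- data.frame(
  term      = rownames(coefs),
  estimate  = coefs[, "Estimate"],
  std.error = coefs[, "Std. Error"],
  statistic = coefs[, "t value"],
  p.value   = coefs[, "Pr(>|t|)"],
  conf.low  = ci[, 1],
  conf.high = ci[, 2],
  df.error  = df.residual(mod),
  row.names = NULL,
  stringsAsFactors = FALSE
)

# Pass the data frame instead of the model; `statistic` names the test statistic
if (requireNamespace("ggstatsplot", quietly = TRUE)) {
  p <- ggstatsplot::ggcoefstats(x = df, statistic = "t")
}
```

Because the statistic cannot be inferred from a bare data frame, the statistic argument is set explicitly here.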

output

Character describing the expected output from this function: "plot" (visualization of regression coefficients), "tidy" (tidy data frame of results from broom::tidy), "glance" (object from broom::glance), or "augment" (object from broom::augment).

statistic

Which statistic is to be displayed in the label: "t", "f", "z", or "chi". This is relevant if the x argument is a data frame.

conf.int

Logical. Decides whether to display confidence intervals as error bars (Default: TRUE).

conf.level

Numeric deciding level of confidence or credible intervals (Default: 0.95).

k

Number of digits after decimal point (should be an integer) (Default: k = 2L).

exclude.intercept

Logical that decides whether the intercept should be excluded from the plot (Default: TRUE).

exponentiate

If TRUE, the x-axis will be logarithmic (Default: FALSE). Note that exponents for the coefficient estimates and associated standard errors plus confidence intervals are computed by the underlying tidying packages (broom/parameters) and not done by ggcoefstats. So this might not work if the underlying packages don't support exponentiation.
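A minimal sketch of where exponentiation is useful, assuming a logistic regression on the built-in mtcars data (the model and object names are illustrative, not taken from the package documentation). The hand-computed odds ratios show the transformation that the tidying packages are asked to perform.

```r
# Logistic regression: coefficients are on the log-odds scale
mod_logit <- glm(am ~ wt, data = mtcars, family = binomial())

# Exponentiating by hand gives odds ratios; with exponentiate = TRUE the
# equivalent computation is delegated to broom/parameters, not done here
odds_ratios <- exp(coef(mod_logit))

# Request exponentiated estimates in the plot
if (requireNamespace("ggstatsplot", quietly = TRUE)) {
  p <- ggstatsplot::ggcoefstats(x = mod_logit, exponentiate = TRUE)
}
```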

effsize

Character describing the effect size to be displayed: "eta" (default) or "omega". This argument is relevant only for model objects of class aov, anova, aovlist, Gam, and manova.

meta.analytic.effect

Logical that decides whether a subtitle with results from a meta-analysis via linear (mixed-effects) models is to be displayed (Default: FALSE). If TRUE, input to the subtitle argument will be ignored. This is mostly relevant when a data frame with estimates and their standard errors is entered.

meta.type

Type of statistics used to carry out random-effects meta-analysis. If "parametric" (default), metafor::rma function will be used. If "robust", metaplus::metaplus function will be used. If "bayes", metaBMA::meta_random function will be used.

bf.message

Logical that decides whether results from running a Bayesian meta-analysis assuming that the effect size d varies across studies with standard deviation t (i.e., a random-effects analysis) should be displayed in caption. Defaults to TRUE.

sort

If "none" (default) do not sort, "ascending" sort by increasing coefficient value, or "descending" sort by decreasing coefficient value.

xlab, ylab

Labels for x- and y- axis variables, respectively (Defaults: "regression coefficient" and "term").

title

The text for the plot title.

subtitle

The text for the plot subtitle. The input to this argument will be ignored if meta.analytic.effect is set to TRUE.

caption

Text to display as the plot caption. This argument is relevant only when output = "plot".

only.significant

If TRUE, only stats labels for significant effects are shown (Default: FALSE). This can be helpful when a large number of regression coefficients are to be displayed in a single plot. Relevant only when the output is a plot.

point.args

Additional arguments that will be passed to ggplot2::geom_point geom. Please see documentation for that function to know more about these arguments.

errorbar.args

Additional arguments that will be passed to ggplot2::geom_errorbarh geom. Please see documentation for that function to know more about these arguments.

vline

Logical. Decides whether to display a vertical line (Default: TRUE).

vline.args

Additional arguments that will be passed to ggplot2::geom_vline geom. Please see documentation for that function to know more about these arguments.

stats.labels

Logical. Decides whether the statistic and p-values for each coefficient are to be attached to each dot as a text label using ggrepel (Default: TRUE).

stats.label.color

Color for the labels. If set to NULL, colors will be chosen from the specified package (Default: "RColorBrewer") and palette (Default: "Dark2").

stats.label.args

Additional arguments that will be passed to ggrepel::geom_label_repel geom. Please see documentation for that function to know more about these arguments.
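The four *.args arguments (point.args, errorbar.args, vline.args, stats.label.args) each take a named list that is forwarded to the corresponding geom. The following is a sketch only: the model and the specific aesthetic values are illustrative assumptions, and since it is not stated here whether a partial list is merged with the defaults, each list is written out in full.

```r
# Illustrative model (an assumption, not from the package documentation)
mod <- lm(mpg ~ wt + hp, data = mtcars)

if (requireNamespace("ggstatsplot", quietly = TRUE)) {
  p <- ggstatsplot::ggcoefstats(
    x = mod,
    point.args = list(size = 4, color = "darkred"),       # -> ggplot2::geom_point
    errorbar.args = list(height = 0.2),                   # -> ggplot2::geom_errorbarh
    vline.args = list(size = 1, linetype = "dotted"),     # -> ggplot2::geom_vline
    stats.label.args = list(size = 3, direction = "y")    # -> ggrepel::geom_label_repel
  )
}
```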

package

Name of the package from which the given palette is to be extracted. The available palettes and packages can be checked by running View(paletteer::palettes_d_names).

palette

Name of the palette to be extracted from the given package (Default: "Dark2"). The available palettes and packages can be checked by running View(paletteer::palettes_d_names).

ggtheme

A ggplot2 theme, supplied as a function call (Default: ggplot2::theme_bw()). Any of the ggplot2 themes, or themes from extension packages, are allowed (e.g., ggthemes::theme_fivethirtyeight(), hrbrthemes::theme_ipsum_ps(), etc.).

ggstatsplot.layer

Logical that decides whether theme_ggstatsplot theme elements are to be displayed along with the selected ggtheme (Default: TRUE). theme_ggstatsplot is an opinionated theme layer that overrides some aspects of the selected ggtheme.

...

Additional arguments passed to the tidying method. For more, see parameters::model_parameters and broom::tidy.

Value

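If output = "plot", a plot with the regression coefficients' point estimates as dots with confidence interval whiskers and other statistical details included as labels. For the other values of output, the corresponding data frame ("tidy", "glance", or "augment") is returned instead.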

Note

All rows of regression estimates where either of the following quantities is NA will be removed if labels are requested: estimate, statistic, p.value.
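The filtering described in this note can be approximated in base R as follows. This is a sketch of equivalent behaviour under stated assumptions, not the package's actual implementation, and the toy values are made up purely for illustration.

```r
# Toy tidy data frame with one incomplete row (values are invented for illustration)
df <- data.frame(
  term      = c("a", "b", "c"),
  estimate  = c(0.5, 1.2, -0.3),
  statistic = c(2.1, NA, -1.4),
  p.value   = c(0.04, NA, 0.17),
  stringsAsFactors = FALSE
)

# Keep only rows where estimate, statistic, and p.value are all present
keep <- stats::complete.cases(df[, c("estimate", "statistic", "p.value")])
df_labelled <- df[keep, ]
```

Checking a hand-built data frame this way before plotting makes it clear which terms would be dropped when labels are requested.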

References

https://indrajeetpatil.github.io/ggstatsplot/articles/web_only/ggcoefstats.html

Examples

# \donttest{
# for reproducibility
set.seed(123)

# -------------- with model object --------------------------------------

# model object
mod <- lm(formula = mpg ~ cyl * am, data = mtcars)

# to get a plot
ggstatsplot::ggcoefstats(x = mod, output = "plot")

# to get a tidy dataframe
ggstatsplot::ggcoefstats(x = mod, output = "tidy")
#> # A tibble: 3 x 10
#>   term   estimate std.error conf.low conf.high statistic df.error  p.value
#>   <fct>     <dbl>     <dbl>    <dbl>     <dbl>     <dbl>    <int>    <dbl>
#> 1 cyl       -1.98     0.449    -2.89    -1.06      -4.40       28 0.000141
#> 2 am        10.2      4.30      1.36    19.0        2.36       28 0.0253
#> 3 cyl:am    -1.31     0.707    -2.75     0.143     -1.85       28 0.0755
#>   significance
#>   <chr>
#> 1 ***
#> 2 *
#> 3 ns
#>   label
#>   <chr>
#> 1 list(~widehat(italic(beta))==-1.98, ~italic(t)(28)==-4.40, ~italic(p)==1.41e-~
#> 2 list(~widehat(italic(beta))==10.18, ~italic(t)(28)==2.36, ~italic(p)==0.025)
#> 3 list(~widehat(italic(beta))==-1.31, ~italic(t)(28)==-1.85, ~italic(p)==0.076)

# to get a glance summary
ggstatsplot::ggcoefstats(x = mod, output = "glance")
#> # A tibble: 1 x 13
#>   r.squared adj.r.squared sigma statistic       p.value    df loglik   aic   bic
#>       <dbl>         <dbl> <dbl>     <dbl>         <dbl> <dbl>  <dbl> <dbl> <dbl>
#> 1     0.785         0.762  2.94      34.1 0.00000000173     3  -77.8  166.  173.
#>   deviance df.residual  nobs  rmse
#>      <dbl>       <int> <int> <dbl>
#> 1     242.          28    32  2.75

# to get augmented dataframe
ggstatsplot::ggcoefstats(x = mod, output = "augment")
#> # A tibble: 32 x 10
#>    .rownames           mpg   cyl    am .fitted .resid .std.resid   .hat .sigma
#>    <chr>             <dbl> <dbl> <dbl>   <dbl>  <dbl>      <dbl>  <dbl>  <dbl>
#>  1 Mazda RX4          21       6     1    21.4 -0.364    -0.131  0.106    2.99
#>  2 Mazda RX4 Wag      21       6     1    21.4 -0.364    -0.131  0.106    2.99
#>  3 Datsun 710         22.8     4     1    27.9 -5.13     -1.86   0.117    2.80
#>  4 Hornet 4 Drive     21.4     6     0    19.0  2.38      0.842  0.0735   2.96
#>  5 Hornet Sportabout  18.7     8     0    15.1  3.63      1.29   0.0784   2.90
#>  6 Valiant            18.1     6     0    19.0 -0.919    -0.325  0.0735   2.99
#>  7 Duster 360         14.3     8     0    15.1 -0.768    -0.272  0.0784   2.99
#>  8 Merc 240D          24.4     4     0    23.0  1.43      0.563  0.255    2.98
#>  9 Merc 230           22.8     4     0    23.0 -0.171    -0.0672 0.255    2.99
#> 10 Merc 280           19.2     6     0    19.0  0.181     0.0639 0.0735   2.99
#>      .cooksd
#>        <dbl>
#>  1 0.000510
#>  2 0.000510
#>  3 0.114
#>  4 0.0141
#>  5 0.0353
#>  6 0.00209
#>  7 0.00157
#>  8 0.0271
#>  9 0.000387
#> 10 0.0000811
#> # ... with 22 more rows

# -------------- with custom dataframe -----------------------------------

# creating a dataframe
df <- structure(
  list(
    term = structure(
      c(3L, 4L, 1L, 2L, 5L),
      .Label = c("Africa", "Americas", "Asia", "Europe", "Oceania"),
      class = "factor"
    ),
    estimate = c(
      0.382047603321706, 0.780783111514665, 0.425607573765058,
      0.558365541235078, 0.956473848429961
    ),
    std.error = c(
      0.0465576338644502, 0.0330218199731529, 0.0362834986178494,
      0.0480571500648261, 0.062215818388157
    ),
    statistic = c(
      8.20590677855356, 23.6444603038067, 11.7300588415607,
      11.6187818146078, 15.3734833553524
    ),
    conf.low = c(
      0.290515146096969, 0.715841986960399, 0.354354575031406,
      0.46379116008131, 0.827446138277154
    ),
    conf.high = c(
      0.473580060546444, 0.845724236068931, 0.496860572498711,
      0.652939922388847, 1.08550155858277
    ),
    p.value = c(
      3.28679518728519e-15, 4.04778497135963e-75, 7.59757330804449e-29,
      5.45155840151592e-26, 2.99171217913312e-13
    ),
    df.error = c(394L, 358L, 622L, 298L, 22L)
  ),
  row.names = c(NA, -5L),
  class = c("tbl_df", "tbl", "data.frame")
)

# plotting the dataframe
ggstatsplot::ggcoefstats(
  x = df,
  statistic = "t",
  meta.analytic.effect = TRUE,
  k = 3
)
#> Warning: There were 8 divergent transitions after warmup. See
#> http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#> to find out why this is a problem and how to eliminate them.
#> Warning: Examine the pairs() plot to diagnose sampling problems
# }