R/ggbetweenstats.R
ggbetweenstats.Rd
A combination of box and violin plots along with jittered data points for between-subjects designs, with statistical details included in the plot as a subtitle.
ggbetweenstats(
  data,
  x,
  y,
  plot.type = "boxviolin",
  type = "parametric",
  pairwise.comparisons = FALSE,
  pairwise.annotation = "p.value",
  pairwise.display = "significant",
  p.adjust.method = "holm",
  effsize.type = "unbiased",
  partial = TRUE,
  effsize.noncentral = TRUE,
  bf.prior = 0.707,
  bf.message = TRUE,
  results.subtitle = TRUE,
  xlab = NULL,
  ylab = NULL,
  caption = NULL,
  title = NULL,
  subtitle = NULL,
  stat.title = NULL,
  sample.size.label = TRUE,
  k = 2,
  var.equal = FALSE,
  conf.level = 0.95,
  nboot = 100,
  tr = 0.1,
  sort = "none",
  sort.fun = mean,
  axes.range.restrict = FALSE,
  mean.label.size = 3,
  mean.label.fontface = "bold",
  mean.label.color = "black",
  notch = FALSE,
  notchwidth = 0.5,
  linetype = "solid",
  outlier.tagging = FALSE,
  outlier.shape = 19,
  outlier.label = NULL,
  outlier.label.color = "black",
  outlier.color = "black",
  outlier.coef = 1.5,
  mean.plotting = TRUE,
  mean.ci = FALSE,
  mean.size = 5,
  mean.color = "darkred",
  point.jitter.width = NULL,
  point.jitter.height = 0,
  point.dodge.width = 0.6,
  ggtheme = ggplot2::theme_bw(),
  ggstatsplot.layer = TRUE,
  package = "RColorBrewer",
  palette = "Dark2",
  direction = 1,
  ggplot.component = NULL,
  return = "plot",
  messages = TRUE
)
data  A dataframe (or a tibble) from which the specified variables are to be taken. Matrices and tables are not accepted.

x  The grouping variable from the dataframe 
y  The response (a.k.a. outcome or dependent) variable from the
dataframe 
plot.type  Character describing the type of plot. Currently supported
plots are 
type  Type of statistic expected ( 
pairwise.comparisons  Logical that decides whether pairwise comparisons
are to be displayed (default: FALSE).
pairwise.annotation  Character that decides the annotations to use for
pairwise comparisons. Either 
pairwise.display  Decides which pairwise comparisons to display.
Available options are 
p.adjust.method  Adjustment method for p-values for multiple
comparisons. Possible methods are: 
effsize.type  Type of effect size needed for parametric tests. The
argument can be 
partial  Logical that decides if partial eta-squared or omega-squared
are returned (Default: TRUE).
effsize.noncentral  Logical indicating whether to use non-central
t-distributions for computing the confidence interval for Cohen's d
or Hedges' g (Default: TRUE).
bf.prior  A number between 
bf.message  Logical that decides whether to display the Bayes Factor in
favor of the null hypothesis. This argument is relevant only for
parametric tests (Default: TRUE).
results.subtitle  Decides whether the results of statistical tests are
to be displayed as a subtitle (Default: TRUE).
xlab, ylab  Labels for 
caption  The text for the plot caption. 
title  The text for the plot title. 
subtitle  The text for the plot subtitle. Will work only if

stat.title  A character describing the test being run, which will be
added as a prefix in the subtitle. The default is 
sample.size.label  Logical that decides whether sample size information
should be displayed for each level of the grouping variable 
k  Number of digits after the decimal point (should be an integer)
(Default: 2).
var.equal  a logical variable indicating whether to treat the
variances in the samples as equal. If 
conf.level  Scalar between 0 and 1. If unspecified, the defaults return

nboot  Number of bootstrap samples for computing the confidence interval
for the effect size (Default: 100).
tr  Trim level for the mean when carrying out 
sort  If 
sort.fun  The function used to sort (default: mean).
axes.range.restrict  Logical that decides whether to restrict the axes
values ranges to 
mean.label.size, mean.label.fontface, mean.label.color  Aesthetics for
the label displaying the mean. Defaults: 3, "bold", "black", respectively.
notch  A logical. If 
notchwidth  For a notched box plot, width of the notch relative to the
body (default 0.5).
linetype  Character strings ( 
outlier.tagging  Decides whether outliers should be tagged (Default:
FALSE).
outlier.shape  Hiding the outliers can be achieved by setting

outlier.label  Label to put on the outliers that have been tagged. This
can't be the same as 
outlier.label.color  Color for the label to put on the outliers that
have been tagged (Default: "black").
outlier.color  Default aesthetics for outliers (Default: "black").
outlier.coef  Coefficient for outlier detection using Tukey's method.
With Tukey's method, outliers are below (1st Quartile) or above (3rd
Quartile) 
mean.plotting  Logical that decides whether the mean is to be highlighted
and its value displayed (Default: TRUE).
mean.ci  Logical that decides whether 
mean.size  Point size for the data point corresponding to the mean
(Default: 5).
mean.color  Color for the data point corresponding to the mean (Default:
"darkred").
point.jitter.width  Numeric specifying the degree of jitter in 
point.jitter.height  Numeric specifying the degree of jitter in 
point.dodge.width  Numeric specifying the amount to dodge in the 
ggtheme  A function, 
ggstatsplot.layer  Logical that decides whether 
package  Name of package from which the palette is desired as string or symbol. 
palette  If a character string (e.g., 
direction  Either 
ggplot.component  A 
return  Character that describes what is to be returned: can be

messages  Decides whether messages (references, notes, and warnings) are
to be displayed (Default: TRUE).
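The outlier.coef argument refers to Tukey's method. As a rough illustration only, the base R sketch below flags values that fall more than coef times the interquartile range below the first quartile or above the third quartile. The helper name tukey_outliers is hypothetical, and the quantile definition used here (R's default) is an assumption; ggbetweenstats may compute its fences differently.

```r
# Hypothetical helper: flag outliers with Tukey's fences.
# `coef` plays the role of `outlier.coef` (default 1.5 in the usage above).
tukey_outliers <- function(x, coef = 1.5) {
  # First and third quartiles (R's default quantile type; an assumption here)
  q <- stats::quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  iqr <- q[2] - q[1]
  lower <- q[1] - coef * iqr
  upper <- q[2] + coef * iqr
  # Keep only the non-missing values outside the fences
  x[!is.na(x) & (x < lower | x > upper)]
}

x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)
tukey_outliers(x)              # 100 is the only value outside the fences
tukey_outliers(x, coef = 100)  # a very wide fence flags nothing
```

A larger coef widens the fences, so fewer points are tagged.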
For parametric tests, Welch's ANOVA/t-test are used as a default (i.e., var.equal = FALSE).
References:
ANOVA: Delacre, Leys, Mora, & Lakens, PsyArXiv, 2018
t-test: Delacre, Lakens, & Leys, International Review of Social Psychology, 2017
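To see what the var.equal switch changes, the sketch below runs the corresponding base R tests directly on mtcars. It is not a claim about which functions ggbetweenstats calls internally; stats::t.test and stats::oneway.test are used only as familiar stand-ins, and the choice of cyl as a three-level grouping variable is an assumption made for illustration.

```r
# Two groups: Welch's t-test (unequal variances) vs. Student's t-test
welch   <- stats::t.test(mpg ~ am, data = mtcars, var.equal = FALSE)
student <- stats::t.test(mpg ~ am, data = mtcars, var.equal = TRUE)

welch$parameter    # Welch-Satterthwaite degrees of freedom (fractional)
student$parameter  # pooled degrees of freedom: n - 2 = 30

# More than two groups: Welch's one-way ANOVA
welch_anova <- stats::oneway.test(mpg ~ factor(cyl), data = mtcars, var.equal = FALSE)
welch_anova$p.value
```

With var.equal = FALSE the degrees of freedom are adjusted downward when group variances differ, which is why the Welch variant reports fewer degrees of freedom than the pooled test.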
If robust tests are selected, the following tests are used:
ANOVA: one-way ANOVA on trimmed means (see ?WRS2::t1way)
t-test: Yuen's test for trimmed means (see ?WRS2::yuen)
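The robust tests operate on trimmed means, with the trim level set by tr (0.1 in the usage above). The sketch below first shows the effect of trimming using base R only, then calls the two WRS2 functions named above if that package happens to be installed. The toy vector and the use of mtcars with cyl and am as grouping factors are assumptions for illustration.

```r
# A toy sample with one extreme value
x <- c(2, 4, 4, 5, 6, 7, 8, 9, 10, 50)

mean(x)              # ordinary mean, pulled upward by the extreme value
mean(x, trim = 0.1)  # drops the lowest and highest 10% before averaging

# Optional: run the robust tests themselves when WRS2 is available
if (requireNamespace("WRS2", quietly = TRUE)) {
  set.seed(123)
  cars <- transform(mtcars, am = factor(am), cyl = factor(cyl))
  print(WRS2::yuen(mpg ~ am, data = cars, tr = 0.1))     # two groups
  print(WRS2::t1way(mpg ~ cyl, data = cars, tr = 0.1))   # three groups
}
```

With ten observations and trim = 0.1, one value is removed from each tail, so the trimmed mean is the average of the middle eight values.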
For more about how the effect size measures (for non-parametric tests) and
their confidence intervals are computed, see ?rcompanion::wilcoxonR.
For repeated measures designs, use ggwithinstats.
https://indrajeetpatil.github.io/ggstatsplot/articles/web_only/ggbetweenstats.html
# \donttest{
# to get reproducible results from bootstrapping
set.seed(123)
library(ggstatsplot)

# simple function call with the defaults
ggstatsplot::ggbetweenstats(
  data = mtcars,
  x = am,
  y = mpg,
  title = "Fuel efficiency by type of car transmission",
  caption = "Transmission (0 = automatic, 1 = manual)"
)
#> Note: Shapiro-Wilk Normality Test for mpg: p-value = 0.123
#>
#> Note: Bartlett's test for homogeneity of variances for factor am: p-value = 0.072
#>

# more detailed function call
ggstatsplot::ggbetweenstats(
  data = datasets::morley,
  x = Expt,
  y = Speed,
  type = "np",
  plot.type = "box",
  conf.level = 0.99,
  xlab = "The experiment number",
  ylab = "Speed-of-light measurement",
  pairwise.comparisons = TRUE,
  pairwise.annotation = "p.value",
  p.adjust.method = "fdr",
  outlier.tagging = TRUE,
  outlier.label = Run,
  nboot = 10,
  ggtheme = ggplot2::theme_grey(),
  ggstatsplot.layer = FALSE
)
#> Warning: extreme order statistics used as endpoints
#> Note: 99% CI for effect size estimate was computed with 10 bootstrap samples.
#>
#>
#> # A tibble: 10 x 5
#>    group1 group2     W p.value significance
#>    <chr>  <chr>  <dbl>   <dbl> <chr>
#>  1 1      2      3.24    0.369 ns
#>  2 1      3      3.55    0.295 ns
#>  3 1      4      4.37    0.145 ns
#>  4 1      5      4.13    0.145 ns
#>  5 2      3      0.289   1.000 ns
#>  6 2      4      2.15    0.918 ns
#>  7 2      5      1.65    0.962 ns
#>  8 3      4      1.88    0.961 ns
#>  9 3      5      2.25    0.918 ns
#> 10 4      5      0.673   1.000 ns
#> Note: Shapiro-Wilk Normality Test for Speed-of-light measurement: p-value = 0.514
#>
#> Note: Bartlett's test for homogeneity of variances for factor The experiment number: p-value = 0.021
#>
# }
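The second call above switches p.adjust.method from the default "holm" to "fdr". As a small, self-contained illustration of how those two corrections differ, the sketch below applies base R's stats::p.adjust to a made-up vector of raw p-values. The numbers are illustrative and are not taken from the morley example.

```r
# Made-up raw p-values from five hypothetical pairwise comparisons
p_raw <- c(0.01, 0.02, 0.03, 0.04, 0.05)

# Holm: step-down control of the family-wise error rate
p_holm <- stats::p.adjust(p_raw, method = "holm")
p_holm  # 0.05 0.08 0.09 0.09 0.09

# "fdr" (Benjamini-Hochberg): control of the false discovery rate
p_fdr <- stats::p.adjust(p_raw, method = "fdr")
p_fdr   # 0.05 0.05 0.05 0.05 0.05
```

For these values Holm yields larger adjusted p-values than fdr, so fewer comparisons would be labeled significant under the default.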