FAQ

Here are some of the common questions that have been asked in GitHub issues and on social media platforms.

I just want the plot, not the statistical details. How can I turn them off?

All ggstatsplot functions that display results from statistical analysis in a subtitle have a results.subtitle argument. Setting it to FALSE returns only the plot, without the statistical details.
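
As a minimal sketch (the choice of ggbetweenstats and the mtcars dataset is only for illustration), turning the subtitle off could look like this:

```r
# for reproducibility (the jittered points involve randomness)
set.seed(123)
library(ggstatsplot)

# the usual plot, but without the statistical details in the subtitle
p <- ggbetweenstats(
  data = mtcars,
  x = cyl,
  y = mpg,
  results.subtitle = FALSE
)

p
```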

What statistical test was carried out?

If you are not sure which statistical test produced the results shown in the subtitle of the plot, the best way to find out is to either look at the documentation for the function you used or check out the associated vignette. For example, if you used ggbetweenstats, the details of the tests are listed in the corresponding summary table. Such summary tables are available for each function.

Does ggstatsplot work with plotly?

The plotly R graphing library makes it easy to produce interactive web graphics via ‘plotly.js’. ggstatsplot functions are compatible with plotly.

# for reproducibility
set.seed(123)
library(ggstatsplot)
library(plotly)

# creating ggplot object with `ggstatsplot`
p <-
  ggstatsplot::ggbetweenstats(
    data = mtcars,
    x = cyl,
    y = mpg
  )

# converting to plotly object
plotly::ggplotly(p, width = 480, height = 480)

How can I use grouped_ functions with more than one group?

Currently, the grouped_ variants of functions only support repeating the analysis across a single grouping variable. Often, you have to run the same analysis across a combination of two or more grouping variables. This can be easily achieved using the purrr package.

Here is an example:

# setup
set.seed(123)
library(tidyverse, warn.conflicts = FALSE)
library(ggstatsplot)

# creating a list by splitting dataframe by combination of two different
# grouping variables
df_list <- mpg %>%
  dplyr::filter(drv %in% c("4", "f"), fl %in% c("p", "r")) %>%
  split(x = ., f = list(.$drv, .$fl), drop = TRUE)

# checking if the length of the list is 4
length(df_list)
#> [1] 4

# running correlation analyses between `displ` and `hwy` for each subset
# this will return a *list* of plots
plot_list <-
  purrr::pmap(
    .l = list(
      data = df_list,
      x = "displ",
      y = "hwy",
      results.subtitle = FALSE,
      marginal.type = "densigram"
    ),
    .f = ggstatsplot::ggscatterstats
  )

# arrange the list in a single plot
ggstatsplot::combine_plots(
  plotlist = plot_list,
  nrow = 2,
  labels = c("(i)", "(ii)", "(iii)", "(iv)")
)
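
Because the labels (i) to (iv) do not say which subset each panel shows, it can help to inspect the names of the split list. The sketch below repeats the subsetting from the example above in base R (the intermediate name df is introduced here only for illustration). Since purrr::pmap preserves the order of its inputs, the plots in plot_list follow the same order as these names.

```r
library(ggplot2) # provides the `mpg` dataset

# same subsetting and splitting as above, written in base R
df <- subset(mpg, drv %in% c("4", "f") & fl %in% c("p", "r"))
df_list <- split(df, list(df$drv, df$fl), drop = TRUE)

# each element is named "<drv>.<fl>"
names(df_list)

# number of observations in each subset
sapply(df_list, nrow)
```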

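How can I include statistical expressions in facet labels?

One approach is to extract the subtitle expression for each level of the grouping variable, use these expressions as the labels of the faceting factor, and let ggplot2::label_parsed render them in the facet strips.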

set.seed(123)
library(ggplot2)
library(ggstatsplot)

# data
mtcars1 <- mtcars
statistics <-
  grouped_ggbetweenstats(
    data = mtcars1,
    x = cyl,
    y = mpg,
    grouping.var = am,
    output = "subtitle"
  )
mtcars1$am <-
  factor(mtcars1$am, levels = c(0, 1), labels = statistics)

# plot
mtcars1 %>%
  ggplot(aes(x = cyl, y = mpg)) +
  geom_jitter() +
  facet_wrap(
    vars(am),
    ncol = 1,
    strip.position = "top",
    labeller = ggplot2::label_parsed
  )

Can you customize which pairs are shown in pairwise comparisons?

Currently, for ggbetweenstats and ggwithinstats, you can display only significant comparisons, only non-significant comparisons, or all comparisons. But what if you are interested in just one particular comparison?

Here is a workaround using ggsignif:

set.seed(123)
library(ggstatsplot)
library(ggsignif)

# displaying only one comparison
ggbetweenstats(mtcars, cyl, wt) +
  geom_signif(comparisons = list(c("4", "6")))
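
The same workaround extends to a handful of selected comparisons. The following is a sketch rather than a recommended recipe: the two pairs are chosen arbitrarily, and the p-values come from the test that geom_signif itself runs (here a t-test, set explicitly), so they are not adjusted for multiple comparisons and may differ from what ggstatsplot reports.

```r
set.seed(123)
library(ggstatsplot)
library(ggsignif)

# displaying two selected comparisons, stacked at different heights
p <- ggbetweenstats(mtcars, cyl, wt) +
  geom_signif(
    comparisons = list(c("4", "6"), c("6", "8")),
    test = "t.test",         # ggsignif's default is "wilcox.test"
    step_increase = 0.1,     # vertical offset between successive brackets
    map_signif_level = TRUE  # show significance stars instead of p-values
  )

p
```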

How can I access the dataframe with results from pairwise comparisons?

library(ggstatsplot)
library(ggplot2)

# way-1: extract the dataframe from the environment of the built plot

p <- ggbetweenstats(mtcars, cyl, wt, pairwise.comparisons = TRUE)

pb <- ggplot_build(p)

pb$plot$plot_env$df_pairwise
#> # A tibble: 3 x 11
#>   group1 group2 mean.difference    se t.value    df p.value significance label
#>   <chr>  <chr>            <dbl> <dbl>   <dbl> <dbl>   <dbl> <chr>        <chr>
#> 1 4      8                1.71  0.188    6.44  23.0   0     ***          list~
#> 2 6      4               -0.831 0.154    3.81  16.0   0.008 **           list~
#> 3 6      8                0.882 0.172    3.62  19.0   0.008 **           list~
#> # ... with 2 more variables: test.details <chr>, p.value.adjustment <chr>

# way-2: compute the same comparisons directly with `pairwiseComparisons`

library(pairwiseComparisons)

pairwise_comparisons(mtcars, cyl, wt)
#> # A tibble: 3 x 11
#>   group1 group2 mean.difference    se t.value    df p.value significance label
#>   <chr>  <chr>            <dbl> <dbl>   <dbl> <dbl>   <dbl> <chr>        <chr>
#> 1 4      8                1.71  0.188    6.44  23.0   0     ***          list~
#> 2 6      4               -0.831 0.154    3.81  16.0   0.008 **           list~
#> 3 6      8                0.882 0.172    3.62  19.0   0.008 **           list~
#> # ... with 2 more variables: test.details <chr>, p.value.adjustment <chr>

How can I remove a particular geom layer from the plot?

Sometimes you may not want a particular geom layer to be displayed. You can remove it using gginnards.

For example, let’s say we want to remove the geom_point() layer from the default ggbetweenstats plot.

# needed libraries
library(ggstatsplot)
library(gginnards)

# plot with all geoms
p <-
  ggbetweenstats(
    data = iris,
    x = Species,
    y = Sepal.Length,
    mean.plotting = FALSE
  )

# delete geom corresponding to points
gginnards::delete_layers(x = p, match_type = "GeomPoint")

This can be helpful when you want to replace a layer with a new one that uses aesthetic specifications of your liking.

set.seed(123)

# needed libraries
library(ggstatsplot)
library(gginnards)
library(ggplot2)

# basic plot without mean tagging
p <-
  ggwithinstats(
    data = bugs_long,
    x = condition,
    y = desire,
    mean.plotting = FALSE
  )

# delete the geom_point layer
p <- gginnards::delete_layers(x = p, match_type = "GeomPoint")

# add a new layer for points with a different shape
p + geom_point(shape = 23, aes(color = condition))
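
If you are unsure which class name to pass to match_type, one option is to list the geom class of every layer in the plot first. This sketch reads the layers element of the ggplot object directly, which is an assumption about the object's structure rather than part of the gginnards interface; the helper name layer_geoms is introduced here only for illustration.

```r
library(ggstatsplot)

p <- ggbetweenstats(data = iris, x = Species, y = Sepal.Length)

# class name of the geom used in each layer, in drawing order
layer_geoms <- vapply(
  p$layers,
  function(layer) class(layer$geom)[1],
  character(1)
)

layer_geoms
```

Any of the returned names (for example "GeomPoint") can then be supplied as match_type.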

Suggestions

If you find any bugs or have any suggestions/remarks, please file an issue on GitHub: https://github.com/IndrajeetPatil/ggstatsplot/issues