
You can cite this package/vignette as:


  Patil, I. (2021). Visualizations with statistical details: The
  'ggstatsplot' approach. Journal of Open Source Software, 6(61), 3167,
  doi:10.21105/joss.03167

A BibTeX entry for LaTeX users is

  @Article{,
    doi = {10.21105/joss.03167},
    url = {https://doi.org/10.21105/joss.03167},
    year = {2021},
    publisher = {{The Open Journal}},
    volume = {6},
    number = {61},
    pages = {3167},
    author = {Indrajeet Patil},
    title = {{Visualizations with statistical details: The {'ggstatsplot'} approach}},
    journal = {{Journal of Open Source Software}},
  }

Introduction

This vignette describes how to carry out pairwise comparisons using the `pairwise_comparisons()` function from ggstatsplot.

Summary of types of statistical analyses

The following tables contain a brief summary of the currently supported pairwise comparison tests:

Between-subjects design

Type            Equal variance?  Test                       p-value adjustment?  Function used
Parametric      No               Games-Howell test          Yes                  PMCMRplus::gamesHowellTest
Parametric      Yes              Student's t-test           Yes                  stats::pairwise.t.test
Non-parametric  No               Dunn test                  Yes                  PMCMRplus::kwAllPairsDunnTest
Robust          No               Yuen's trimmed means test  Yes                  WRS2::lincon
Bayesian        NA               Student's t-test           No                   BayesFactor::ttestBF

Within-subjects design

Type            Test                       p-value adjustment?  Function used
Parametric      Student's t-test           Yes                  stats::pairwise.t.test
Non-parametric  Durbin-Conover test        Yes                  PMCMRplus::durbinAllPairsTest
Robust          Yuen's trimmed means test  Yes                  WRS2::rmmcp
Bayesian        Student's t-test           No                   BayesFactor::ttestBF

Examples

Here we will see specific examples of how to use this function with different:

  • designs (between or within subjects)
  • statistics (parametric, non-parametric, robust, Bayesian)
  • p-value adjustment methods
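As a quick reference, the values accepted by the `p.adjust.method` argument are the adjustment methods supported by base R's `stats::p.adjust()`:

```r
## the supported `p.adjust.method` values come from base R
stats::p.adjust.methods
## contains: "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none"
```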

Between-subjects design

## for reproducibility
set.seed(123)
library(ggstatsplot)
library(statsExpressions) ## for data

## parametric
## if `var.equal = TRUE`, then Student's *t*-test will be run
pairwise_comparisons(
  data            = ggplot2::msleep,
  x               = vore,
  y               = brainwt,
  type            = "parametric",
  var.equal       = TRUE,
  paired          = FALSE,
  p.adjust.method = "bonferroni"
)
#> # A tibble: 6 × 6
#>   group1  group2  p.value test.details     p.value.adjustment
#>   <chr>   <chr>     <dbl> <chr>            <chr>             
#> 1 carni   herbi     1     Student's t-test Bonferroni        
#> 2 carni   insecti   1     Student's t-test Bonferroni        
#> 3 carni   omni      1     Student's t-test Bonferroni        
#> 4 herbi   insecti   1     Student's t-test Bonferroni        
#> 5 herbi   omni      0.979 Student's t-test Bonferroni        
#> 6 insecti omni      1     Student's t-test Bonferroni        
#>   label                                       
#>   <chr>                                       
#> 1 list(~italic(p)[Bonferroni-corrected]==1.00)
#> 2 list(~italic(p)[Bonferroni-corrected]==1.00)
#> 3 list(~italic(p)[Bonferroni-corrected]==1.00)
#> 4 list(~italic(p)[Bonferroni-corrected]==1.00)
#> 5 list(~italic(p)[Bonferroni-corrected]==0.98)
#> 6 list(~italic(p)[Bonferroni-corrected]==1.00)

## if `var.equal = FALSE`, then Games-Howell test will be run
pairwise_comparisons(
  data            = ggplot2::msleep,
  x               = vore,
  y               = brainwt,
  type            = "parametric",
  var.equal       = FALSE,
  paired          = FALSE,
  p.adjust.method = "bonferroni"
)
#> # A tibble: 6 × 11
#>   group1  group2  statistic p.value alternative method            distribution
#>   <chr>   <chr>       <dbl>   <dbl> <chr>       <chr>             <chr>       
#> 1 carni   herbi        2.17       1 two.sided   Games-Howell test q           
#> 2 carni   insecti     -2.17       1 two.sided   Games-Howell test q           
#> 3 carni   omni         1.10       1 two.sided   Games-Howell test q           
#> 4 herbi   insecti     -2.41       1 two.sided   Games-Howell test q           
#> 5 herbi   omni        -1.87       1 two.sided   Games-Howell test q           
#> 6 insecti omni         2.19       1 two.sided   Games-Howell test q           
#>   p.adjustment test.details      p.value.adjustment
#>   <chr>        <chr>             <chr>             
#> 1 none         Games-Howell test Bonferroni        
#> 2 none         Games-Howell test Bonferroni        
#> 3 none         Games-Howell test Bonferroni        
#> 4 none         Games-Howell test Bonferroni        
#> 5 none         Games-Howell test Bonferroni        
#> 6 none         Games-Howell test Bonferroni        
#>   label                                       
#>   <chr>                                       
#> 1 list(~italic(p)[Bonferroni-corrected]==1.00)
#> 2 list(~italic(p)[Bonferroni-corrected]==1.00)
#> 3 list(~italic(p)[Bonferroni-corrected]==1.00)
#> 4 list(~italic(p)[Bonferroni-corrected]==1.00)
#> 5 list(~italic(p)[Bonferroni-corrected]==1.00)
#> 6 list(~italic(p)[Bonferroni-corrected]==1.00)

## non-parametric
pairwise_comparisons(
  data            = ggplot2::msleep,
  x               = vore,
  y               = brainwt,
  type            = "nonparametric",
  paired          = FALSE,
  p.adjust.method = "none"
)
#> # A tibble: 6 × 11
#>   group1  group2  statistic p.value alternative method               
#>   <chr>   <chr>       <dbl>   <dbl> <chr>       <chr>                
#> 1 carni   herbi       0.582  0.561  two.sided   Dunn's all-pairs test
#> 2 carni   insecti     1.88   0.0595 two.sided   Dunn's all-pairs test
#> 3 carni   omni        1.14   0.254  two.sided   Dunn's all-pairs test
#> 4 herbi   insecti     1.63   0.102  two.sided   Dunn's all-pairs test
#> 5 herbi   omni        0.717  0.474  two.sided   Dunn's all-pairs test
#> 6 insecti omni        1.14   0.254  two.sided   Dunn's all-pairs test
#>   distribution p.adjustment test.details p.value.adjustment
#>   <chr>        <chr>        <chr>        <chr>             
#> 1 z            none         Dunn test    None              
#> 2 z            none         Dunn test    None              
#> 3 z            none         Dunn test    None              
#> 4 z            none         Dunn test    None              
#> 5 z            none         Dunn test    None              
#> 6 z            none         Dunn test    None              
#>   label                              
#>   <chr>                              
#> 1 list(~italic(p)[uncorrected]==0.56)
#> 2 list(~italic(p)[uncorrected]==0.06)
#> 3 list(~italic(p)[uncorrected]==0.25)
#> 4 list(~italic(p)[uncorrected]==0.10)
#> 5 list(~italic(p)[uncorrected]==0.47)
#> 6 list(~italic(p)[uncorrected]==0.25)

## robust
pairwise_comparisons(
  data            = ggplot2::msleep,
  x               = vore,
  y               = brainwt,
  type            = "robust",
  paired          = FALSE,
  p.adjust.method = "fdr"
)
#> # A tibble: 6 × 10
#>   group1  group2  estimate conf.level conf.low conf.high p.value
#>   <chr>   <chr>      <dbl>      <dbl>    <dbl>     <dbl>   <dbl>
#> 1 carni   herbi   -0.0323        0.95  -0.248     0.184    0.790
#> 2 carni   insecti  0.0451        0.95  -0.0484    0.139    0.552
#> 3 carni   omni     0.00520       0.95  -0.114     0.124    0.898
#> 4 herbi   insecti  0.0774        0.95  -0.133     0.288    0.552
#> 5 herbi   omni     0.0375        0.95  -0.182     0.257    0.790
#> 6 insecti omni    -0.0399        0.95  -0.142     0.0625   0.552
#>   test.details              p.value.adjustment
#>   <chr>                     <chr>             
#> 1 Yuen's trimmed means test FDR               
#> 2 Yuen's trimmed means test FDR               
#> 3 Yuen's trimmed means test FDR               
#> 4 Yuen's trimmed means test FDR               
#> 5 Yuen's trimmed means test FDR               
#> 6 Yuen's trimmed means test FDR               
#>   label                                
#>   <chr>                                
#> 1 list(~italic(p)[FDR-corrected]==0.79)
#> 2 list(~italic(p)[FDR-corrected]==0.55)
#> 3 list(~italic(p)[FDR-corrected]==0.90)
#> 4 list(~italic(p)[FDR-corrected]==0.55)
#> 5 list(~italic(p)[FDR-corrected]==0.79)
#> 6 list(~italic(p)[FDR-corrected]==0.55)

## Bayesian
pairwise_comparisons(
  data   = ggplot2::msleep,
  x      = vore,
  y      = brainwt,
  type   = "bayes",
  paired = FALSE
)
#> # A tibble: 6 × 20
#>   group1  group2  term       effectsize      estimate conf.level conf.low
#>   <chr>   <chr>   <chr>      <chr>              <dbl>      <dbl>    <dbl>
#> 1 carni   herbi   Difference Bayesian t-test  -0.376        0.95  -1.33  
#> 2 carni   insecti Difference Bayesian t-test   0.0339       0.95  -0.0447
#> 3 carni   omni    Difference Bayesian t-test  -0.0447       0.95  -0.233 
#> 4 herbi   insecti Difference Bayesian t-test   0.353        0.95  -0.826 
#> 5 herbi   omni    Difference Bayesian t-test   0.364        0.95  -0.289 
#> 6 insecti omni    Difference Bayesian t-test  -0.0766       0.95  -0.353 
#>   conf.high    pd rope.percentage prior.distribution prior.location prior.scale
#>       <dbl> <dbl>           <dbl> <chr>                       <dbl>       <dbl>
#> 1     0.525 0.800           0.171 cauchy                          0       0.707
#> 2     0.126 0.812           0.136 cauchy                          0       0.707
#> 3     0.151 0.688           0.214 cauchy                          0       0.707
#> 4     1.53  0.742           0.178 cauchy                          0       0.707
#> 5     1.11  0.848           0.164 cauchy                          0       0.707
#> 6     0.146 0.743           0.170 cauchy                          0       0.707
#>    bf10 method          log_e_bf10 n.obs expression  
#>   <dbl> <chr>                <dbl> <int> <list>      
#> 1 0.540 Bayesian t-test     -0.617    29 <expression>
#> 2 0.718 Bayesian t-test     -0.332    14 <expression>
#> 3 0.427 Bayesian t-test     -0.851    26 <expression>
#> 4 0.540 Bayesian t-test     -0.616    25 <expression>
#> 5 0.571 Bayesian t-test     -0.560    37 <expression>
#> 6 0.545 Bayesian t-test     -0.606    22 <expression>
#>   label                         test.details    
#>   <chr>                         <chr>           
#> 1 list(~log[e](BF['01'])==0.62) Student's t-test
#> 2 list(~log[e](BF['01'])==0.33) Student's t-test
#> 3 list(~log[e](BF['01'])==0.85) Student's t-test
#> 4 list(~log[e](BF['01'])==0.62) Student's t-test
#> 5 list(~log[e](BF['01'])==0.56) Student's t-test
#> 6 list(~log[e](BF['01'])==0.61) Student's t-test
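Note that the `label` column reports log_e(BF01), the evidence in favor of the null hypothesis, which is simply the negative of the `log_e_bf10` column, since BF01 = 1 / BF10. A quick arithmetic check against the first row of the output above:

```r
## BF01 = 1 / BF10, so on the log scale: log(BF01) = -log(BF10)
log_e_bf10 <- -0.617 ## `log_e_bf10` for carni vs. herbi above
log_e_bf01 <- -log_e_bf10
round(log_e_bf01, 2)
## 0.62, matching the first `label` entry
```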

Within-subjects design

## for reproducibility
set.seed(123)

## parametric
pairwise_comparisons(
  data            = bugs_long,
  x               = condition,
  y               = desire,
  subject.id      = subject,
  type            = "parametric",
  paired          = TRUE,
  p.adjust.method = "BH"
)
#> # A tibble: 6 × 6
#>   group1 group2  p.value test.details     p.value.adjustment
#>   <chr>  <chr>     <dbl> <chr>            <chr>             
#> 1 HDHF   HDLF   1.06e- 3 Student's t-test FDR               
#> 2 HDHF   LDHF   7.02e- 2 Student's t-test FDR               
#> 3 HDHF   LDLF   3.95e-12 Student's t-test FDR               
#> 4 HDLF   LDHF   6.74e- 2 Student's t-test FDR               
#> 5 HDLF   LDLF   1.99e- 3 Student's t-test FDR               
#> 6 LDHF   LDLF   6.66e- 9 Student's t-test FDR               
#>   label                                    
#>   <chr>                                    
#> 1 list(~italic(p)[FDR-corrected]==1.06e-03)
#> 2 list(~italic(p)[FDR-corrected]==0.07)    
#> 3 list(~italic(p)[FDR-corrected]==3.95e-12)
#> 4 list(~italic(p)[FDR-corrected]==0.07)    
#> 5 list(~italic(p)[FDR-corrected]==1.99e-03)
#> 6 list(~italic(p)[FDR-corrected]==6.66e-09)

## non-parametric
pairwise_comparisons(
  data            = bugs_long,
  x               = condition,
  y               = desire,
  subject.id      = subject,
  type            = "nonparametric",
  paired          = TRUE,
  p.adjust.method = "BY"
)
#> # A tibble: 6 × 11
#>   group1 group2 statistic  p.value alternative
#>   <chr>  <chr>      <dbl>    <dbl> <chr>      
#> 1 HDHF   HDLF        4.78 1.44e- 5 two.sided  
#> 2 HDHF   LDHF        2.44 4.47e- 2 two.sided  
#> 3 HDHF   LDLF        8.01 5.45e-13 two.sided  
#> 4 HDLF   LDHF        2.34 4.96e- 2 two.sided  
#> 5 HDLF   LDLF        3.23 5.05e- 3 two.sided  
#> 6 LDHF   LDLF        5.57 4.64e- 7 two.sided  
#>   method                                                                
#>   <chr>                                                                 
#> 1 Durbin's all-pairs test for a two-way balanced incomplete block design
#> 2 Durbin's all-pairs test for a two-way balanced incomplete block design
#> 3 Durbin's all-pairs test for a two-way balanced incomplete block design
#> 4 Durbin's all-pairs test for a two-way balanced incomplete block design
#> 5 Durbin's all-pairs test for a two-way balanced incomplete block design
#> 6 Durbin's all-pairs test for a two-way balanced incomplete block design
#>   distribution p.adjustment test.details        p.value.adjustment
#>   <chr>        <chr>        <chr>               <chr>             
#> 1 t            none         Durbin-Conover test BY                
#> 2 t            none         Durbin-Conover test BY                
#> 3 t            none         Durbin-Conover test BY                
#> 4 t            none         Durbin-Conover test BY                
#> 5 t            none         Durbin-Conover test BY                
#> 6 t            none         Durbin-Conover test BY                
#>   label                                   
#>   <chr>                                   
#> 1 list(~italic(p)[BY-corrected]==1.44e-05)
#> 2 list(~italic(p)[BY-corrected]==0.04)    
#> 3 list(~italic(p)[BY-corrected]==5.45e-13)
#> 4 list(~italic(p)[BY-corrected]==0.05)    
#> 5 list(~italic(p)[BY-corrected]==5.05e-03)
#> 6 list(~italic(p)[BY-corrected]==4.64e-07)

## robust
pairwise_comparisons(
  data            = bugs_long,
  x               = condition,
  y               = desire,
  subject.id      = subject,
  type            = "robust",
  paired          = TRUE,
  p.adjust.method = "hommel"
)
#> # A tibble: 6 × 11
#>   group1 group2 estimate conf.level conf.low conf.high     p.value  p.crit
#>   <chr>  <chr>     <dbl>      <dbl>    <dbl>     <dbl>       <dbl>   <dbl>
#> 1 HDHF   HDLF      1.03        0.95   0.140      1.92  0.00999     0.0127 
#> 2 HDHF   LDHF      0.454       0.95  -0.104      1.01  0.0520      0.025  
#> 3 HDHF   LDLF      1.95        0.95   1.09       2.82  0.000000564 0.00851
#> 4 HDLF   LDHF     -0.676       0.95  -1.61       0.256 0.0520      0.05   
#> 5 HDLF   LDLF      0.889       0.95   0.0244     1.75  0.0203      0.0169 
#> 6 LDHF   LDLF      1.35        0.95   0.560      2.14  0.000102    0.0102 
#>   test.details              p.value.adjustment
#>   <chr>                     <chr>             
#> 1 Yuen's trimmed means test Hommel            
#> 2 Yuen's trimmed means test Hommel            
#> 3 Yuen's trimmed means test Hommel            
#> 4 Yuen's trimmed means test Hommel            
#> 5 Yuen's trimmed means test Hommel            
#> 6 Yuen's trimmed means test Hommel            
#>   label                                       
#>   <chr>                                       
#> 1 list(~italic(p)[Hommel-corrected]==9.99e-03)
#> 2 list(~italic(p)[Hommel-corrected]==0.05)    
#> 3 list(~italic(p)[Hommel-corrected]==5.64e-07)
#> 4 list(~italic(p)[Hommel-corrected]==0.05)    
#> 5 list(~italic(p)[Hommel-corrected]==0.02)    
#> 6 list(~italic(p)[Hommel-corrected]==1.02e-04)

## Bayesian
pairwise_comparisons(
  data       = bugs_long,
  x          = condition,
  y          = desire,
  subject.id = subject,
  type       = "bayes",
  paired     = TRUE,
  bf.prior   = 0.77
)
#> # A tibble: 6 × 20
#>   group1 group2 term       effectsize      estimate conf.level conf.low
#>   <chr>  <chr>  <chr>      <chr>              <dbl>      <dbl>    <dbl>
#> 1 HDHF   HDLF   Difference Bayesian t-test    1.10        0.95   0.492 
#> 2 HDHF   LDHF   Difference Bayesian t-test    0.453       0.95  -0.0703
#> 3 HDHF   LDLF   Difference Bayesian t-test    2.13        0.95   1.63  
#> 4 HDLF   LDHF   Difference Bayesian t-test   -0.653       0.95  -1.29  
#> 5 HDLF   LDLF   Difference Bayesian t-test    0.980       0.95   0.379 
#> 6 LDHF   LDLF   Difference Bayesian t-test    1.66        0.95   1.15  
#>   conf.high    pd rope.percentage prior.distribution prior.location prior.scale
#>       <dbl> <dbl>           <dbl> <chr>                       <dbl>       <dbl>
#> 1    1.73   1               0     cauchy                          0        0.77
#> 2    0.931  0.954           0.193 cauchy                          0        0.77
#> 3    2.64   1               0     cauchy                          0        0.77
#> 4    0.0745 0.968           0.169 cauchy                          0        0.77
#> 5    1.60   0.999           0     cauchy                          0        0.77
#> 6    2.15   1               0     cauchy                          0        0.77
#>       bf10 method          log_e_bf10 n.obs expression  
#>      <dbl> <chr>                <dbl> <int> <list>      
#> 1 3.95e+ 1 Bayesian t-test      3.68     88 <expression>
#> 2 5.42e- 1 Bayesian t-test     -0.612    88 <expression>
#> 3 1.22e+10 Bayesian t-test     23.2      88 <expression>
#> 4 6.50e- 1 Bayesian t-test     -0.430    88 <expression>
#> 5 1.72e+ 1 Bayesian t-test      2.84     88 <expression>
#> 6 4.78e+ 6 Bayesian t-test     15.4      88 <expression>
#>   label                           test.details    
#>   <chr>                           <chr>           
#> 1 list(~log[e](BF['01'])==-3.68)  Student's t-test
#> 2 list(~log[e](BF['01'])==0.61)   Student's t-test
#> 3 list(~log[e](BF['01'])==-23.22) Student's t-test
#> 4 list(~log[e](BF['01'])==0.43)   Student's t-test
#> 5 list(~log[e](BF['01'])==-2.84)  Student's t-test
#> 6 list(~log[e](BF['01'])==-15.38) Student's t-test

Using pairwise_comparisons() with ggsignif

Example-1: between-subjects

## needed libraries
set.seed(123)
library(ggplot2)
library(ggsignif)

## converting to factor
mtcars$cyl <- as.factor(mtcars$cyl)

## creating a basic plot
p <- ggplot(mtcars, aes(cyl, wt)) +
  geom_boxplot()

## using the `pairwise_comparisons()` function to create a dataframe with results
set.seed(123)
(df <-
  pairwise_comparisons(mtcars, cyl, wt) %>%
  dplyr::mutate(groups = purrr::pmap(.l = list(group1, group2), .f = c)) %>%
  dplyr::arrange(group1))
#> # A tibble: 3 × 12
#>   group1 group2 statistic   p.value alternative method            distribution
#>   <chr>  <chr>      <dbl>     <dbl> <chr>       <chr>             <chr>       
#> 1 4      6           5.39 0.00831   two.sided   Games-Howell test q           
#> 2 4      8           9.11 0.0000124 two.sided   Games-Howell test q           
#> 3 6      8           5.12 0.00831   two.sided   Games-Howell test q           
#>   p.adjustment test.details      p.value.adjustment
#>   <chr>        <chr>             <chr>             
#> 1 none         Games-Howell test Holm              
#> 2 none         Games-Howell test Holm              
#> 3 none         Games-Howell test Holm              
#>   label                                      groups   
#>   <chr>                                      <list>   
#> 1 list(~italic(p)[Holm-corrected]==8.31e-03) <chr [2]>
#> 2 list(~italic(p)[Holm-corrected]==1.24e-05) <chr [2]>
#> 3 list(~italic(p)[Holm-corrected]==8.31e-03) <chr [2]>

## using `geom_signif` to display results
## (note that you can choose not to display all comparisons)
p +
  ggsignif::geom_signif(
    comparisons = list(df$groups[[1]]),
    annotations = df$label[[1]],
    test        = NULL,
    na.rm       = TRUE,
    parse       = TRUE
  )
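The chunk above displays only the first comparison. To show all three at once, you can pass the full list of groups and stack the brackets manually; this is a sketch in which the `y_position` heights are illustrative values chosen to sit above the `wt` range, not taken from the original output:

```r
## displaying all pairwise comparisons, with manually stacked brackets
p +
  ggsignif::geom_signif(
    comparisons = df$groups,
    annotations = df$label,
    y_position  = c(5.5, 5.9, 6.3), ## illustrative; adjust to your data range
    test        = NULL,
    na.rm       = TRUE,
    parse       = TRUE
  )
```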

Example-2: within-subjects

## needed libraries
library(ggplot2)
library(ggstatsplot)
library(ggsignif)

## creating a basic plot
p <- ggplot(WRS2::WineTasting, aes(Wine, Taste)) +
  geom_boxplot()

## using the `pairwise_comparisons()` function to create a dataframe with results
set.seed(123)
(df <-
  pairwise_comparisons(
    WRS2::WineTasting,
    Wine,
    Taste,
    subject.id = Taster,
    type = "bayes",
    paired = TRUE
  ) %>%
  dplyr::mutate(groups = purrr::pmap(.l = list(group1, group2), .f = c)) %>%
  dplyr::arrange(group1))
#> # A tibble: 3 × 21
#>   group1 group2 term       effectsize      estimate conf.level conf.low
#>   <chr>  <chr>  <chr>      <chr>              <dbl>      <dbl>    <dbl>
#> 1 Wine A Wine B Difference Bayesian t-test  0.00721       0.95  -0.0404
#> 2 Wine A Wine C Difference Bayesian t-test  0.0755        0.95   0.0107
#> 3 Wine B Wine C Difference Bayesian t-test  0.0693        0.95   0.0331
#>   conf.high    pd rope.percentage prior.distribution prior.location prior.scale
#>       <dbl> <dbl>           <dbl> <chr>                       <dbl>       <dbl>
#> 1    0.0569 0.624          0.404  cauchy                          0       0.707
#> 2    0.136  0.990          0.0103 cauchy                          0       0.707
#> 3    0.112  1.00           0      cauchy                          0       0.707
#>     bf10 method          log_e_bf10 n.obs expression  
#>    <dbl> <chr>                <dbl> <int> <list>      
#> 1  0.235 Bayesian t-test      -1.45    22 <expression>
#> 2  3.71  Bayesian t-test       1.31    22 <expression>
#> 3 50.5   Bayesian t-test       3.92    22 <expression>
#>   label                          test.details     groups   
#>   <chr>                          <chr>            <list>   
#> 1 list(~log[e](BF['01'])==1.45)  Student's t-test <chr [2]>
#> 2 list(~log[e](BF['01'])==-1.31) Student's t-test <chr [2]>
#> 3 list(~log[e](BF['01'])==-3.92) Student's t-test <chr [2]>

## using `geom_signif` to display results
p +
  ggsignif::geom_signif(
    comparisons      = df$groups,
    map_signif_level = TRUE,
    tip_length       = 0.01,
    y_position       = c(6.5, 6.65, 6.8),
    annotations      = df$label,
    test             = NULL,
    na.rm            = TRUE,
    parse            = TRUE
  )