class: center, middle, inverse, title-slide

# `ggstatsplot`: `ggplot2` Based Plots with Statistical Details
## An Introductory Tutorial
### Indrajeet Patil (Center for Humans and Machines, Max Planck Institute, Berlin)
### 2021-03-01
---

<style type="text/css">
body, td {
  font-size: 16px;
}
code.r{
  font-size: 14px;
}
</style>

---
layout: true

# Plan

---

- Why `ggstatsplot`?
- Primary functions
- Customizability
- Benefits
- Misconceptions
- Limitations and Future scope

---
layout: false
class: inverse, center, middle

# Why *ggstatsplot*?

---
layout: true

# Raison d'être

---

--

.right-column[
.font120[
Current count of packages on the Comprehensive R Archive Network (**CRAN**)
**<font color="red"> > 16,000</font>**
]

.footnote[<https://cran.r-project.org/web/packages/>]
]

--

.left-column[

]

--

.right-column[.font120[
Short answer: <br>
`ggstatsplot` provides a collection of <font color="blue">*information-rich*</font> plots with <font color="blue">*statistical details*</font> and is suitable for scholarly publications and **faster** (exploratory) data analysis.
]
]

---
layout: true

# Simpler/faster data analysis workflow

---

--

.img-center[

]

.footnote[[(Grolemund & Wickham, *R for Data Science*, 2017)](https://r4ds.had.co.nz/)]

--

<br> <br> <br> <br> <br> <br> <br> <br>

In a typical **exploratory** data analysis workflow, <font color="blue">data visualization</font> and <font color="blue">statistical modeling</font> are two different phases: visualization informs modeling, and modeling can suggest a different visualization, and so on and so forth.

--

The central idea of `ggstatsplot` is simple: combine these two phases into one!

---
layout: true
class: center

# Information-rich graphic is worth a thousand words

---

.img-center[

]

.footnote[[(Matejka & Fitzmaurice, *Autodesk Research*, 2017)](https://www.autodeskresearch.com/publications/samestats)]

<br> <br> <br> <br> <br> <br> <br> <br> <br>

**Graphical** summaries can reveal problems not visible from **numerical** statistics.

"I plotted my data and what I found surprised me!"
- BuzzFeed

---
layout: false

# Ready-made plot = no customization

--

The **grammar of graphics** is a powerful framework [(Wilkinson, 2011)](https://www.google.com/books/edition/_/iI1kcgAACAAJ?hl=en&sa=X&ved=2ahUKEwiGl8rJ2KztAhWyElkFHa8NAvkQre8FMBR6BAgMEAc) and can help you make an infinite number of graphics, each tailored to your specific data visualization problem!

But...

--

.pull-left[

]

.pull-right[

]

--

`\(\sum_{time}\)` (Needed time ↑ + Likelihood to graphically explore data ↓) = Avoidance habit

---
layout: false
class: inverse, center, middle

# And a LOT more!

...but we will come back to that later

Let's get started first!

---
layout: false

# Installation

--

Install the stable version (latest: `0.7.0`) of `ggstatsplot` from [CRAN](https://cran.r-project.org/web/packages/ggstatsplot/index.html):

```r
install.packages("ggstatsplot")
```

--

You can get the development version of the package from [GitHub](https://github.com/IndrajeetPatil/ggstatsplot):

```r
remotes::install_github("IndrajeetPatil/ggstatsplot")
```

--

Load the needed packages:

```r
library(ggstatsplot)
library(ggplot2)
```

---
layout: false
class: inverse, center, middle

# Primary functions

---
layout: false
class: inverse, center, middle

# Hypothesis about group differences

*ggbetweenstats*, *ggwithinstats*: multiple groups

*gghistostats*, *ggdotplotstats*: single group

---
layout: true

# ggbetweenstats - For between group comparisons

---

.left-code[

```r
ggbetweenstats(
  data = movies_long,
  x = mpaa,
  y = rating
)
```

.font70[
The function chooses the test internally

- *t*-test if **2** groups
- ANOVA if **> 2** groups

**Defaults** return <br>
✅ raw data + distributions <br>
✅ descriptive statistics <br>
✅ inferential statistics <br>
✅ effect size + CIs <br>
✅ pairwise comparisons <br>
✅ Bayesian hypothesis-testing <br>
✅ Bayesian estimation <br>
]
]

.right-plot[
<img src="ggstatsplot_presentation_files/figure-html/ggbetweenstats_1-1.png" width="100%" style="display: block; margin: auto;" />
]
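
.font70[
The rule "*t*-test for 2 groups, ANOVA for more than 2" can be sketched in base R. This is only an illustration that assumes the `stats` package; `choose_test()` is a hypothetical helper and is not how `ggstatsplot` is implemented internally.
]

```r
# Hypothetical helper (NOT part of ggstatsplot): pick a test from the
# number of groups, mirroring the rule stated on this slide.
choose_test <- function(data, x, y) {
  groups <- droplevels(as.factor(data[[x]]))
  stopifnot(nlevels(groups) >= 2)

  # build the formula `y ~ x` from the two column names
  f <- stats::reformulate(x, response = y)

  if (nlevels(groups) == 2) {
    # two groups: Welch's t-test (the default of stats::t.test)
    list(test = "t-test", result = stats::t.test(f, data = data))
  } else {
    # more than two groups: Welch's one-way ANOVA
    # (the default of stats::oneway.test)
    list(test = "ANOVA", result = stats::oneway.test(f, data = data))
  }
}

choose_test(mtcars, x = "am", y = "wt")$test  # `am` has 2 groups
choose_test(mtcars, x = "cyl", y = "wt")$test # `cyl` has 3 groups
```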
---
layout: true

# ggbetweenstats - pairwise comparisons

---

.left-code[

```r
ggbetweenstats(
  data = movies_long,
  x = mpaa,
  y = rating,
* type = "np",
* pairwise.display = "ns"
)
```

.font70[
Changing the `type` of test

✅ `"p"` → **parametric** (default) <br>
✅ `"np"` → **non-parametric** <br>
✅ `"r"` → **robust** <br>
✅ `"bf"` → **Bayesian**

Changing pairwise comparisons displayed

ℹ️ `"ns"` → only **non-significant** <br>
ℹ️ `"s"` → only **significant** <br>
ℹ️ `"all"` → **all**
]
]

.right-plot[
<img src="ggstatsplot_presentation_files/figure-html/ggbetweenstats_2-1.png" width="100%" style="display: block; margin: auto;" />
]

---
layout: true

# ggbetweenstats - outlier tagging

---

.left-code[

```r
ggbetweenstats(
  data = movies_long,
  x = mpaa,
  y = rating,
  type = "r",
  pairwise.comparisons = FALSE,
* outlier.tagging = TRUE,
* outlier.label = title
)
```

.font70[
Outliers are flagged using [Tukey's fences](https://en.wikipedia.org/wiki/Outlier#Tukey's_fences), which are based on the interquartile range.

Centrality measures

✅ `"p"` → `\(\mu_{mean}\)` <br>
✅ `"np"` → `\(\mu_{median}\)` <br>
✅ `"r"` → `\(\mu_{trimmed}\)` <br>
✅ `"bf"` → `\(\mu_{MAP}\)`
]
]

.right-plot[
<img src="ggstatsplot_presentation_files/figure-html/ggbetweenstats_3-1.png" width="100%" style="display: block; margin: auto;" />
]

---
layout: true

# ggwithinstats - repeated measures equivalent

---

.left-code[

```r
ggwithinstats(
  data = WRS2::WineTasting,
  x = Wine,
  y = Taste
)
```

.font70[
**Defaults** return <br>
✅ raw data + distributions <br>
✅ descriptive statistics <br>
✅ inferential statistics <br>
✅ effect size + CIs <br>
✅ pairwise comparisons <br>
✅ Bayesian hypothesis-testing <br>
✅ Bayesian estimation <br>

Changing the `type` of test

✅ `"p"` → **parametric** (default) <br>
✅ `"np"` → **non-parametric** <br>
✅ `"r"` → **robust** <br>
✅ `"bf"` → **Bayesian**
]
]

.right-plot[
<img src="ggstatsplot_presentation_files/figure-html/ggwithinstats_1-1.png" width="100%" style="display: block; margin: auto;" />
]

---
layout: true

# gghistostats - Distribution of a numeric variable

---

.left-code[

```r
gghistostats(
  data = movies_long,
  x = budget,
* test.value = 30
)
```

.font70[
**Defaults** return <br>
✅ counts + proportion for bins <br>
✅ descriptive statistics <br>
✅ inferential statistics <br>
✅ effect size + CIs <br>
✅ Bayesian hypothesis-testing <br>
✅ Bayesian estimation <br>

Centrality measures

✅ `"p"` → `\(\mu_{mean}\)` <br>
✅ `"np"` → `\(\mu_{median}\)` <br>
✅ `"r"` → `\(\mu_{trimmed}\)` <br>
✅ `"bf"` → `\(\mu_{MAP}\)`
]
]

.right-plot[
<img src="ggstatsplot_presentation_files/figure-html/gghistostats_1-1.png" width="100%" style="display: block; margin: auto;" />
]

---
layout: true

# ggdotplotstats - Labeled numeric variable

---

.left-code[

```r
ggdotplotstats(
  data = movies_long,
  x = budget,
  y = genre,
* test.value = 30
)
```

.font70[
**Defaults** return <br>
✅ descriptive statistics <br>
✅ inferential statistics <br>
✅ effect size + CIs <br>
✅ Bayesian hypothesis-testing <br>
✅ Bayesian estimation <br>

Centrality measures

✅ `"p"` → `\(\mu_{mean}\)` <br>
✅ `"np"` → `\(\mu_{median}\)` <br>
✅ `"r"` → `\(\mu_{trimmed}\)` <br>
✅ `"bf"` → `\(\mu_{MAP}\)`
]
]

.right-plot[
<img src="ggstatsplot_presentation_files/figure-html/ggdotplotstats_1-1.png" width="100%" style="display: block; margin: auto;" />
]

---
layout: false
class: inverse, center, middle

# Hypothesis about correlation

<br>
*ggscatterstats*: Two numeric variables <br>
*ggcorrmat*: Multiple numeric variables

---
layout: true

# ggscatterstats - Two numeric variables

---

.left-code[

```r
ggscatterstats(
  data = movies_long,
  x = budget,
  y = rating
)
```

.font70[
**Defaults** return <br>
✅ raw data + distributions <br>
✅ marginal distributions <br>
✅ inferential statistics <br>
✅ effect size + CIs <br>
✅ Bayesian hypothesis-testing <br>
✅ Bayesian estimation <br>

Changing the `type` of test

✅ `"p"` → **parametric** (default) <br>
✅ `"np"` → **non-parametric** <br>
✅ `"r"` → **robust** <br>
✅ `"bf"` →
**Bayesian**
]
]

.right-plot[
<img src="ggstatsplot_presentation_files/figure-html/ggscatterstats_1-1.png" width="100%" style="display: block; margin: auto;" />
]

---
layout: true

# ggscatterstats - conditional point tagging

---

.left-code[

```r
ggscatterstats(
  data = movies_long,
  x = budget,
  y = rating,
  type = "r",
* label.var = title,
* label.expression = budget > 150
*   & rating > 7.5,
* marginal.type = "boxplot"
)
```

.font70[
Changing the marginal type

ℹ️ **histogram** <br>
ℹ️ **boxplot** <br>
ℹ️ **density** <br>
ℹ️ **violin** <br>
ℹ️ **densigram**
]
]

.right-plot[
<img src="ggstatsplot_presentation_files/figure-html/ggscatterstats_2-1.png" width="100%" style="display: block; margin: auto;" />
]

---
layout: true

# ggcorrmat - multiple numeric variables

---

.left-code[

```r
ggcorrmat(dplyr::starwars)
```

.font70[
**Defaults** return <br>
✅ effect size + significance <br>
✅ careful handling of `NA`s

Changing the `type` of test

✅ `"p"` → **parametric** (default) <br>
✅ `"np"` → **non-parametric** <br>
✅ `"r"` → **robust** <br>
✅ `"bf"` → **Bayesian**
]
]

.right-plot[
<img src="ggstatsplot_presentation_files/figure-html/ggcorrmat_1-1.png" width="100%" style="display: block; margin: auto;" />
]

---
layout: true

# ggcorrmat - getting a dataframe

---

.font70[
Besides the default `output = "plot"`, this function can also return a <font color="blue">dataframe</font>:
]

.font50[

```r
library(ggplot2) # for data
options(digits = 2)

ggcorrmat(
  data = dplyr::select(msleep, sleep_rem, awake, brainwt),
  type = "bayes",
* output = "dataframe"
)
```

<div class="kable-table">
<table> <thead> <tr> <th style="text-align:left;"> parameter1 </th> <th style="text-align:left;"> parameter2 </th> <th style="text-align:right;"> estimate </th> <th style="text-align:right;"> conf.level </th> <th style="text-align:right;"> conf.low </th> <th style="text-align:right;"> conf.high </th> <th style="text-align:right;"> pd </th> <th style="text-align:right;"> rope.percentage </th>
<th style="text-align:left;"> prior.distribution </th> <th style="text-align:right;"> prior.location </th> <th style="text-align:right;"> prior.scale </th> <th style="text-align:right;"> bayes.factor </th> <th style="text-align:left;"> method </th> <th style="text-align:right;"> n.obs </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> sleep_rem </td> <td style="text-align:left;"> awake </td> <td style="text-align:right;"> -0.73 </td> <td style="text-align:right;"> 0.95 </td> <td style="text-align:right;"> -0.82 </td> <td style="text-align:right;"> -0.63 </td> <td style="text-align:right;"> 1.00 </td> <td style="text-align:right;"> 0.00 </td> <td style="text-align:left;"> beta </td> <td style="text-align:right;"> 1.4 </td> <td style="text-align:right;"> 1.4 </td> <td style="text-align:right;"> 3.0e+09 </td> <td style="text-align:left;"> Bayesian Pearson correlation </td> <td style="text-align:right;"> 61 </td> </tr> <tr> <td style="text-align:left;"> sleep_rem </td> <td style="text-align:left;"> brainwt </td> <td style="text-align:right;"> -0.21 </td> <td style="text-align:right;"> 0.95 </td> <td style="text-align:right;"> -0.41 </td> <td style="text-align:right;"> 0.01 </td> <td style="text-align:right;"> 0.93 </td> <td style="text-align:right;"> 0.19 </td> <td style="text-align:left;"> beta </td> <td style="text-align:right;"> 1.4 </td> <td style="text-align:right;"> 1.4 </td> <td style="text-align:right;"> 6.5e-01 </td> <td style="text-align:left;"> Bayesian Pearson correlation </td> <td style="text-align:right;"> 48 </td> </tr> <tr> <td style="text-align:left;"> awake </td> <td style="text-align:left;"> brainwt </td> <td style="text-align:right;"> 0.34 </td> <td style="text-align:right;"> 0.95 </td> <td style="text-align:right;"> 0.17 </td> <td style="text-align:right;"> 0.53 </td> <td style="text-align:right;"> 1.00 </td> <td style="text-align:right;"> 0.02 </td> <td style="text-align:left;"> beta </td> <td style="text-align:right;"> 1.4 </td> <td 
style="text-align:right;"> 1.4 </td> <td style="text-align:right;"> 7.3e+00 </td> <td style="text-align:left;"> Bayesian Pearson correlation </td> <td style="text-align:right;"> 56 </td> </tr> </tbody> </table>
</div>
]

.font70[
Partial correlations are also supported! Just set `partial = TRUE`.
]

---
layout: false
class: inverse, center, middle

# Hypothesis about composition of categorical variables

<br>
*ggpiestats*: If you like pie charts <br>
*ggbarstats*: Otherwise

---
layout: true

# ggpiestats - goodness-of-fit test

---

.left-code[

```r
ggpiestats(
  data = as.data.frame(Titanic),
  x = Class,
* counts = Freq,
* label = "both"
)
```

.font70[
**Defaults** return <br>
✅ descriptive statistics <br>
✅ inferential statistics <br>
✅ effect size + CIs <br>
✅ Bayesian hypothesis-testing <br>
🤔 Bayesian estimation <br>

**Note** <br>
If the data is already in tabulated (counts) format, use the `counts` argument.
]
]

.right-plot[
<img src="ggstatsplot_presentation_files/figure-html/ggpiestats_1-1.png" width="100%" style="display: block; margin: auto;" />
]

---
layout: true

# ggpiestats - association between categorical variables

---

.left-code[

```r
# let's use a subset of the data
ggpiestats(
  data = dplyr::filter(
    movies_long,
    genre %in% c("Drama", "Comedy")
  ),
  x = mpaa,
  y = genre
)
```

.font70[
**Defaults** return <br>
✅ descriptive statistics <br>
✅ inferential statistics <br>
✅ effect size + CIs <br>
✅ Goodness-of-fit tests <br>
✅ Bayesian hypothesis-testing <br>
✅ Bayesian estimation <br>

Test by design

- `paired = FALSE` → Pearson's `\(\chi^2\)`
- `paired = TRUE` → McNemar's `\(\chi^2\)`
]
]

.right-plot[
<img src="ggstatsplot_presentation_files/figure-html/ggpiestats_2-1.png" width="100%" style="display: block; margin: auto;" />
]

---
layout: true

# ggbarstats - association between categorical variables

---

.left-code[

```r
ggbarstats(
  data = dplyr::filter(
    movies_long,
    genre %in% c("Drama", "Comedy")
  ),
  x = mpaa,
  y = genre
)
```

.font70[
**Defaults** return <br>
✅ descriptive
statistics <br>
✅ inferential statistics <br>
✅ effect size + CIs <br>
✅ Goodness-of-fit tests <br>
✅ Bayesian hypothesis-testing <br>
✅ Bayesian estimation <br>

Test by design

- `paired = FALSE` → Pearson's `\(\chi^2\)`
- `paired = TRUE` → McNemar's `\(\chi^2\)`
]
]

.right-plot[
<img src="ggstatsplot_presentation_files/figure-html/ggbarstats_1-1.png" width="100%" style="display: block; margin: auto;" />
]

---
layout: false
class: inverse, center, middle

# Hypothesis about regression coefficients

*ggcoefstats*: Regression model object

---
layout: true

# ggcoefstats

---

.left-code[

```r
# model
mod <- stats::lm(
  formula = rating ~ mpaa,
  data = movies_long
)

# plot
ggcoefstats(
  x = mod,
  title = "IMDB rating by MPAA rating"
)
```

.font70[
**Defaults** return <br>
✅ estimate + CIs <br>
✅ inferential statistics <br>
✅ model summary (AIC + BIC)
]
]

.right-plot[
<img src="ggstatsplot_presentation_files/figure-html/ggcoefstats_1-1.png" width="100%" style="display: block; margin: auto;" />
]

---
layout: true

# ggcoefstats: Supported models

---

.font60[
`aareg`, `afex_aov`, `anova`, `anova.mlm`, `aov`, `aovlist`, `Arima`, `bam`, `bayesx`, `bayesGARCH`, `BBmm`, `BBreg`, `bcplm`, `betamfx`, `betaor`, `BFBayesFactor`, `bglmerMod`, `bife`, `bigglm`, `biglm`, `blavaan`, `bmlm`, `blmerMod`, `bracl`, `brglm`, `brglm2`, `brmsfit`, `brmultinom`, `btergm`, `cch`, `censReg`, `cgam`, `cgamm`, `cglm`, `clm`, `clm2`, `clmm`, `clmm2`, `coeftest`, `complmrob`, `confusionMatrix`, `coxme`, `coxph`, `coxph.penal`, `cpglm`, `cpglmm`, `crch`, `crq`, `crr`, `DirichReg`, `drc`, `eglm`, `elm`, `emmGrid`, `epi.2by2`, `ergm`, `feis`, `felm`, `fitdistr`, `fixest`, `flexsurvreg`, `gam`, `Gam`, `gamlss`, `garch`, `geeglm`, `glmc`, `glmerMod`, `glmmTMB`, `gls`, `glht`, `glm`, `glmm`, `glmmadmb`, `glmmPQL`, `glmRob`, `glmrob`, `glmx`, `gmm`, `HLfit`, `hurdle`, `ivFixed`, `ivprobit`, `ivreg`, `iv_robust`, `lavaan`, `lm`, `lm.beta`, `lmerMod`, `lmerModLmerTest`, `lmodel2`, `lmRob`, `lmrob`,
`lm_robust`, `logitmfx`, `logitor`, `logitsf`, `LORgee`, `lqm`, `lqmm`, `lrm`, `manova`, `maov`, `margins`, `mcmc`, `mcmc.list`, `MCMCglmm`, `mclogit`, `mice`, `mmclogit`, `mediate`, `metafor`, `merMod`, `merModList`, `metaplus`, `mixor`, `mjoint`, `mle2`, `mlm`, `multinom`, `negbin`, `negbinmfx`, `negbinirr`, `nlmerMod`, `nlrq`, `nlreg`, `nls`, `orcutt`, `orm`, `plm`, `poissonmfx`, `poissonirr`, `polr`, `ridgelm`, `riskRegression`, `rjags`, `rlm`, `rlmerMod`, `robmixglm`, `rq`, `rqs`, `rqss`, `rrvglm`, `scam`, `semLm`, `semLme`, `slm`, `speedglm`, `speedlm`, `stanfit`, `stanreg`, `summary.lm`, `survreg`, `svyglm`, `svyolr`, `tobit`, `truncreg`, `varest`, `vgam`, `vglm`, `wbgee`, `wblm`, `zeroinfl`, etc.
]

--

Thanks to [`easystats`](https://easystats.github.io/easystats/)!

---
layout: false
class: inverse, center, middle

# *grouped_* variants of all functions

Running the same function for all levels of a single grouping variable

---
layout: true

# *grouped_* functions

---

--

.left-code[

```r
grouped_ggpiestats(
  data = mtcars,
  x = cyl,
* grouping.var = am
)
```

.font70[
Available `grouped_` variants

- `grouped_ggbetweenstats`
- `grouped_ggwithinstats`
- `grouped_gghistostats`
- `grouped_ggdotplotstats`
- `grouped_ggscatterstats`
- `grouped_ggcorrmat`
- `grouped_ggpiestats`
- `grouped_ggbarstats`
]
]

.right-plot[
<img src="ggstatsplot_presentation_files/figure-html/grouped_1-1.png" width="100%" style="display: block; margin: auto;" />
]

---
layout: false
class: inverse, center, middle

# Customizability of *ggstatsplot*

"What if I don't like the default plots?" 🤔

---

<!-- exclude: false -->
<!-- layout: true -->
<!-- # Defaults -->
<!-- --- -->
<!-- exclude: false -->
<!-- .pull-left[ -->
<!-- .font100[ -->
<!-- The default plots in *<font color="blue">ggstatsplot</font>* are -->
<!-- **opinionated**, yes, but they try to follow best practices outlined in the data visualization research.
-->
<!-- ] -->
<!-- ] -->
<!-- .pull-right[ -->
<!--  -->
<!-- ] -->

---
layout: true

# Changing aesthetics (themes + palettes)

---

Aesthetic preferences are no excuse for not using `ggstatsplot`!

--

.left-code[

```r
ggbetweenstats(
  data = movies_long,
  x = mpaa,
  y = rating,
* ggtheme = hrbrthemes::theme_ipsum_tw(),
* palette = "Darjeeling2",
* package = "wesanderson"
)
```

.font70[
The default palette is **colorblind-friendly**.
]
]

.right-plot[
<img src="ggstatsplot_presentation_files/figure-html/ggbetweenstats_4-1.png" width="100%" style="display: block; margin: auto;" />
]

---
layout: true

# Further modification with *ggplot2*

---

You can modify `ggstatsplot` plots further using `ggplot2` functions.

.left-code[

```r
ggbetweenstats(
  data = mtcars,
  x = am,
  y = wt,
  type = "bayes"
) +
* scale_y_continuous(sec.axis = dup_axis())
```

.img-left-small[

]
]

.right-plot[
<img src="ggstatsplot_presentation_files/figure-html/ggbetweenstats_5-1.png" width="100%" style="display: block; margin: auto;" />
]

---
layout: true

# Too much information

---

`ggstatsplot` can be used to get **only plots**.

--

.left-code[

```r
# using `ggstatsplot` only for plot
ggbetweenstats(
  data = iris,
  x = Species,
  y = Sepal.Length,
  # turn off centrality measure
* centrality.plotting = FALSE,
  # turn off statistical analysis
* results.subtitle = FALSE,
  # turn off Bayesian message
* bf.message = FALSE,
  # turn off pairwise comparisons
* pairwise.comparisons = FALSE
)
```
]

.right-plot[
<img src="ggstatsplot_presentation_files/figure-html/only_plot-1.png" width="100%" style="display: block; margin: auto;" />
]

---
layout: true

# Expressions for custom plots

---

`ggstatsplot` can be used to get **only the expressions**.
--

.left-code[

```r
# using `ggstatsplot` for stats
results <- ggstatsplot::ggpiestats(
  data = Titanic_full,
  x = Survived,
  y = Sex,
* output = "subtitle"
)

# using `ggiraphExtra` for plot
*ggiraphExtra::ggSpine(
  data = Titanic_full,
  aes(x = Sex, fill = Survived),
  addlabel = TRUE,
  interactive = FALSE
) +
* labs(subtitle = results)
```
]

.right-plot[
<img src="ggstatsplot_presentation_files/figure-html/subtitle_1-1.png" width="100%" style="display: block; margin: auto;" />
]

---
layout: false
class: inverse, center, middle

# Why use *ggstatsplot*?

---
layout: false

# Supports different statistical approaches

--

Functions | Description | Parametric | Non-parametric | Robust | Bayesian
------- | ------------------ | ---- | ----- | ----| -----
`ggbetweenstats` | Between group comparisons | **<font color="green">Yes</font>** | **<font color="green">Yes</font>** | **<font color="green">Yes</font>** | **<font color="green">Yes</font>**
`ggwithinstats` | Within group comparisons | **<font color="green">Yes</font>** | **<font color="green">Yes</font>** | **<font color="green">Yes</font>** | **<font color="green">Yes</font>**
`gghistostats`, `ggdotplotstats` | Distribution of a numeric variable | **<font color="green">Yes</font>** | **<font color="green">Yes</font>** | **<font color="green">Yes</font>** | **<font color="green">Yes</font>**
`ggcorrmat` | Correlation matrix | **<font color="green">Yes</font>** | **<font color="green">Yes</font>** | **<font color="green">Yes</font>** | **<font color="green">Yes</font>**
`ggscatterstats` | Correlation between two variables | **<font color="green">Yes</font>** | **<font color="green">Yes</font>** | **<font color="green">Yes</font>** | **<font color="green">Yes</font>**
`ggpiestats`, `ggbarstats` | Association between categorical variables | **<font color="green">Yes</font>** | `NA` | `NA` | **<font color="green">Yes</font>**
`ggpiestats`, `ggbarstats` | Equal proportions for categorical variable levels | **<font
color="green">Yes</font>** | `NA` | `NA` | **<font color="green">Yes</font>**
`ggcoefstats` | Regression modeling | **<font color="green">Yes</font>** | **<font color="green">Yes</font>** | **<font color="green">Yes</font>** | **<font color="green">Yes</font>**
`ggcoefstats` | Random-effects meta-analysis | **<font color="green">Yes</font>** | `NA` | **<font color="green">Yes</font>** | **<font color="green">Yes</font>**

---
layout: false

# One package to access it all

--

.pull-left[
.font90[
**Load 'em up!**

📦 for the test itself (e.g. `stats`) <br>
📦 for effect sizes + CIs (e.g. `effectsize`) <br>
📦 for descriptives (e.g. `skimr`) <br>
📦 for pairwise comparisons (e.g. `multcomp`) <br>
📦 for Bayesian hypothesis testing (e.g. `BayesFactor`) <br>
📦 for Bayesian estimation (e.g. `bayestestR`) <br>
📦 . <br>
📦 . <br>
]

.img-left-small[

]
]

--

.pull-right[
.font90[
**Root of headache**

🤔 accepts dataframe, vectors, matrix? <br>
🤔 long/wide format data? <br>
🤔 works with `NA`s? <br>
🤔 returns list, dataframe, arrays? <br>
🤔 works with tibbles?
<br>

.img-right-small[

]
]
]

---
layout: false

# Toggling between statistical approaches

--

.pull-left[
**<font color="blue">Parametric</font>**

```r
# anova (Welch's)
ggbetweenstats(
  data = mtcars,
  x = cyl,
  y = wt,
* type = "p"
)

# correlation analysis (Pearson)
ggscatterstats(
  data = mtcars,
  x = wt,
  y = mpg,
* type = "p"
)

# one-sample t-test
gghistostats(
  data = mtcars,
  x = wt,
  test.value = 2,
* type = "p"
)
```
]

--

.pull-right[
**<font color="#ff6600">Non-parametric</font>**

```r
# anova equivalent (Kruskal-Wallis)
ggbetweenstats(
  data = mtcars,
  x = cyl,
  y = wt,
* type = "np"
)

# correlation analysis (Spearman)
ggscatterstats(
  data = mtcars,
  x = wt,
  y = mpg,
* type = "np"
)

# t-test equivalent (one-sample Wilcoxon)
gghistostats(
  data = mtcars,
  x = wt,
  test.value = 2,
* type = "np"
)
```
]

---
layout: false

# Results *in context* of the underlying data

--

.pull-left[
**Without ggstatsplot**

The 20 participants who received the drug intervention `\((M = 24.14)\)` compared to the 20 participants in the control group `\((M = 41.78)\)` demonstrated significantly lower weight (Welch's `\(t\)`-test: `\(t(19.03) = 2.33, p = .031\)`). The effect size `\((g_{Hedges} = 0.72, 95\% CI [0.09,1.35])\)` for this analysis was found to exceed Cohen's (1988) convention for a medium effect `\((> .50)\)`.
]

--

.pull-right[
**With ggstatsplot**

]

---
layout: false

# Best practices in statistical reporting

--

The expression template tries to follow the gold standard for statistical reporting.

<!-- --- -->
<!-- layout: false -->
<!-- # Statistically informed tests defaults -->
<!-- -- -->
<!-- The default tests follow the best practices. For example, -->
<!-- ✅ `ggbetweenstats` and `ggwithinstats` default to <font color="blue">Welch's *t*-test</font> and <font -->
<!-- color="blue">Welch's ANOVA</font> - and not Student's *t*-test and Fisher's -->
<!-- ANOVA - based on recent work (Delacre et al., -->
<!-- [2017](https://www.rips-irsp.com/article/10.5334/irsp.82/), -->
<!-- [2018](https://psyarxiv.com/wnezg)).
-->
<!-- ✅ Functions default to reporting unbiased effect size measures (Lakens, [2013](https://www.frontiersin.org/articles/10.3389/fpsyg.2013.00863/full)). -->
<!-- ✅ Whenever multiple tests are carried out, *p*-values are adjusted for them by default. -->
<!-- etc. -->

---
layout: false

# Avoiding reporting errors

--

> "half of all published psychology papers that use NHST contained at least one p-value that was inconsistent with its test statistic and degrees of freedom. One in eight papers contained a grossly inconsistent p-value that may have affected the statistical conclusion"
>
> [(Nuijten et al., *Behavior Research Methods*, 2016)](https://link.springer.com/article/10.3758/s13428-015-0664-2)

--

Since the plot and the statistical analysis are yoked together, the chances of making an error in reporting the results are minimized.

---
layout: false

# Making sense of null results

--

`\(p > 0.05\)`: The null hypothesis (`H0`) can't be rejected.

But can it be **accepted**?! NHST 🤫

--

> "In 72% of cases, nonsignificant results were misinterpreted, in that the authors inferred that the effect was absent. A Bayesian reanalysis revealed that fewer than 5% of the nonsignificant findings provided strong evidence (i.e., `\(BF_{01} > 10\)`) in favor of the null hypothesis over the alternative hypothesis."
>
> [(Aczel et al., *AMPPS*, 2018)](https://journals.sagepub.com/doi/pdf/10.1177/2515245918773742)

--

Juxtaposing frequentist and Bayesian statistics for the same analysis helps to properly interpret the null results.

---
layout: true

# A few additional benefits

---

--

- Minimal amount of code needed (typically only `data`, `x`, and `y`), which minimizes chances of error and makes for tidy scripts.

--

- Truly makes your figures worth a thousand words.

--

- No need to copy-paste results to the text editor (e.g., MS Word).

--

- Disembodied figures stand on their own and are easy to evaluate for the reader.

--

- More breathing room for theoretical discussion and other text.

--

- No need to worry about updating figures and statistical details separately.

---
layout: false
class: inverse, center, middle

# Misconceptions and limitations

---
layout: true

# Misconceptions about *ggstatsplot*

---

--

This package is...

--

❌ an alternative to learning `ggplot2` <br>

--

✅ (the more you know `ggplot2`, the better you can modify the defaults to your liking)

--

❌ meant to be used in talks/presentations <br>

--

✅ (default plots can be too complicated for effectively communicating results in time-constrained presentation settings, e.g. conference talks)

--

❌ meant for mass communication <br>

--

✅ LOL

--

❌ the only game in town <br>

--

✅ (open-source GUI software: [JASP](https://jasp-stats.org/) and [jamovi](https://www.jamovi.org/))

---
layout: true

# Limitations of *ggstatsplot*

---

--

- Limited kinds of <font color="blue">plots</font> available.

--

- Limited number of statistical <font color="blue">tests</font> available. This will **always** be the case. 🤷

--

- Expects a non-trivial level of statistical proficiency (but plots without statistics can still be useful).

--

- <font color="blue">Faceting</font> not implemented.

---
layout: true

# Overcoming these limitations

---

--

.pull-left[
.font90[
Contributions (big or small) welcome!
]

]

--

.pull-right[
.font90[
Ways in which you can [contribute](https://github.com/IndrajeetPatil/ggstatsplot)

- Read and correct any inconsistencies in the [documentation](https://indrajeetpatil.github.io/ggstatsplot/)
- Raise issues about bugs/features
- Either mention or cite software if used in a publication
- Star the GitHub repo (increases visibility)
- Review code
- Add new functionality
]
]

---
layout: false
class: inverse, center, middle

# Acknowledgments

--

Other developers

[Daniel Lüdecke](https://github.com/strengejacke), [Dominique Makowski](https://github.com/DominiqueMakowski), [Mattan S.
Ben-Shachar](https://github.com/mattansb)

--

Support

[Mina Cikara](http://www.intergroupneurosciencelaboratory.com/), [Fiery Cushman](http://cushmanlab.fas.harvard.edu/index.php), [Iyad Rahwan](https://rahwan.me/)

--

Community

Contributors to *ggstatsplot* & *rstats* users and developers

---
layout: false
class: inverse, center, middle

# Find me at...

.font100[
Twitter: [@patilindrajeets](http://twitter.com/patilindrajeets)

GitHub: [@IndrajeetPatil](http://github.com/IndrajeetPatil)

Website: [https://sites.google.com/site/indrajeetspatilmorality/](https://sites.google.com/site/indrajeetspatilmorality/)

Email: [patilindrajeet.science@gmail.com](mailto:patilindrajeet.science@gmail.com)
]

---
layout: false
class: inverse, center, middle

# The End

For more information, see https://indrajeetpatil.github.io/ggstatsplot/

---

<!-- ############### Removed slides ##################### -->
<!-- layout: false -->
<!-- # Consistent API = No cognitive fatigue -->
<!-- -- -->
<!-- .pull-left[ -->
<!-- ```{r lm, eval = FALSE} -->
<!-- stats::lm(formula = wt ~ mpg, data = mtcars) -->
<!-- ``` -->
<!-- ```{r cor, eval = FALSE} -->
<!-- stats::cor(x = mtcars$wt, y = mtcars$mpg) -->
<!-- ``` -->
<!-- ```{r cor.test, eval = FALSE} -->
<!-- stats::cor.test(formula = ~ wt + mpg, data = mtcars) -->
<!-- ``` -->
<!-- ] -->
<!-- -- -->
<!-- .img-left-small[ -->
<!--  -->
<!-- ] -->
<!-- -- -->
<!-- .pull-right[ -->
<!-- Functions in `ggstatsplot`- -->
<!-- ✅ expect **dataframe** <br> -->
<!-- ✅ expect **tidy** data <br> -->
<!-- ✅ have consistent API (`foo(data, x, ...)`) <br> -->
<!-- ] -->
<!-- -- -->
<!-- .img-right-small[ -->
<!--  -->
<!-- ] -->