Data frame and expression for distribution properties
Source: R/centrality-description.R
centrality_description.Rd
Parametric, non-parametric, robust, and Bayesian measures of centrality.
Usage
centrality_description(
data,
x,
y,
type = "parametric",
conf.level = NULL,
tr = 0.2,
digits = 2L,
...
)
Arguments
- data
A data frame (or a tibble) from which the specified variables are to be taken. Other data types (e.g., matrix, table, array) will not be accepted. Additionally, grouped data frames from {dplyr} should be ungrouped before they are entered as data.
- x
The grouping (or independent) variable in data.
- y
The response (or outcome or dependent) variable from data.
- type
A character specifying the type of statistical approach: "parametric", "nonparametric", "robust", or "bayes". You can specify just the initial letter.
- conf.level
Scalar between 0 and 1 giving the level for confidence/credible intervals (e.g., 0.95 for 95% intervals). If NULL (the default shown under Usage), no confidence intervals will be computed.
- tr
Trim level for the mean when carrying out robust tests (default: 0.2). If you get an error, try lowering the value of tr.
- digits
Number of digits for rounding or significant figures. May also be "signif" to return significant figures or "scientific" to return scientific notation. Control the number of digits by adding the value as a suffix, e.g. digits = "scientific4" for scientific notation with 4 decimal places, or digits = "signif5" for 5 significant figures (see also signif()).
- ...
Currently ignored.
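As a rough illustration of how these arguments combine, the sketch below calls the function with a one-letter type and non-default tr and digits. It assumes the {statsExpressions} package is installed; the values chosen for tr and digits are illustrative only, not recommendations.

```r
library(statsExpressions)

# "r" is shorthand for type = "robust" (only the initial letter is needed)
res <- centrality_description(
  data   = ToothGrowth,
  x      = supp,
  y      = len,
  type   = "r",
  tr     = 0.1, # lighter trimming than the 0.2 default (illustrative value)
  digits = 3L   # illustrative value for the digits argument
)

# one row per level of the grouping variable
res
```

Because conf.level is left at NULL here, no interval columns are requested.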
Details
This function describes the distribution of the y variable for each level of
the grouping variable x using a set of indices (e.g., measures of centrality,
dispersion, range, skewness, kurtosis). It additionally returns an expression
containing the specified centrality measure. Internally, the function relies
on datawizard::describe_distribution().
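The returned data frame can be inspected like any other tibble. The following sketch, which assumes {statsExpressions} is installed, pulls out the two label columns that appear in the example output on this page (expression and n.expression).

```r
library(statsExpressions)

res <- centrality_description(iris, Species, Sepal.Length)

# `expression` is a list column; each element is an unevaluated R
# language object describing the centrality measure for one group
res$expression[[1]]

# `n.expression` is a character column with the group name and sample size
res$n.expression
```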
Centrality measures
The table below summarizes, for each type of statistical approach, the centrality measure that is reported and the function used internally to compute it.
Type | Measure | Function used |
Parametric | mean | datawizard::describe_distribution() |
Non-parametric | median | datawizard::describe_distribution() |
Robust | trimmed mean | datawizard::describe_distribution() |
Bayesian | MAP (maximum a posteriori probability estimate) | datawizard::describe_distribution() |
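To make the four measures concrete, here is a base R sketch that computes rough counterparts for a single group. It is not the package's internal code: the MAP line in particular is only an approximation (the peak of a kernel density estimate), and its value may differ from what datawizard::describe_distribution() returns. The names y, dens, and measures are introduced purely for illustration.

```r
# response values for one group (the OJ supplement in ToothGrowth)
y <- ToothGrowth$len[ToothGrowth$supp == "OJ"]

# kernel density estimate, used below to locate the most probable value
dens <- density(y)

measures <- c(
  mean         = mean(y),                   # parametric
  median       = median(y),                 # non-parametric
  trimmed_mean = mean(y, trim = 0.2),       # robust: trims 20% from each tail
  map_approx   = dens$x[which.max(dens$y)]  # rough stand-in for the MAP
)

round(measures, 2)
```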
Citation
Patil, I. (2021). statsExpressions: R Package for Tidy Dataframes and Expressions with Statistical Details. Journal of Open Source Software, 6(61), 3236. https://doi.org/10.21105/joss.03236
Examples
# for reproducibility
set.seed(123)
# ----------------------- parametric -----------------------
centrality_description(iris, Species, Sepal.Length, type = "parametric")
#> # A tibble: 3 × 12
#> Species Sepal.Length std.dev iqr min max skewness kurtosis n.obs
#> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int>
#> 1 setosa 5.01 0.352 0.400 4.3 5.8 0.120 -0.253 50
#> 2 versicolor 5.94 0.516 0.7 4.9 7 0.105 -0.533 50
#> 3 virginica 6.59 0.636 0.750 4.9 7.9 0.118 0.0329 50
#> missing.obs expression n.expression
#> <int> <list> <chr>
#> 1 0 <language> "setosa\n(n = 50)"
#> 2 0 <language> "versicolor\n(n = 50)"
#> 3 0 <language> "virginica\n(n = 50)"
# ----------------------- non-parametric -------------------
centrality_description(mtcars, am, wt, type = "nonparametric")
#> # A tibble: 2 × 12
#> am wt mad iqr min max skewness kurtosis n.obs missing.obs
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int> <int>
#> 1 0 3.52 0.452 0.41 2.46 5.42 1.15 1.06 19 0
#> 2 1 2.32 0.682 0.942 1.51 3.57 0.269 -0.654 13 0
#> expression n.expression
#> <list> <chr>
#> 1 <language> "0\n(n = 19)"
#> 2 <language> "1\n(n = 13)"
# ----------------------- robust ---------------------------
centrality_description(ToothGrowth, supp, len, type = "robust")
#> # A tibble: 2 × 12
#> supp len std.dev iqr min max skewness kurtosis n.obs missing.obs
#> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int> <int>
#> 1 OJ 21.7 6.61 10.9 8.2 30.9 -0.580 -0.831 30 0
#> 2 VC 16.6 8.27 12.5 4.2 33.9 0.306 -0.700 30 0
#> expression n.expression
#> <list> <chr>
#> 1 <language> "OJ\n(n = 30)"
#> 2 <language> "VC\n(n = 30)"
# ----------------------- Bayesian -------------------------
centrality_description(sleep, group, extra, type = "bayes")
#> # A tibble: 2 × 11
#> group extra iqr min max skewness kurtosis n.obs missing.obs expression
#> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int> <int> <list>
#> 1 1 0.0579 2.8 -1.6 3.7 0.581 -0.630 10 0 <language>
#> 2 2 0.973 3.82 -0.1 5.5 0.386 -1.42 10 0 <language>
#> n.expression
#> <chr>
#> 1 "1\n(n = 10)"
#> 2 "2\n(n = 10)"