gtsummary package

Examples taken from: https://r-graph-gallery.com/package/gtsummary.html

Basic usage

data(iris)
library(gtsummary)

iris %>%
  tbl_summary()
Characteristic    N = 150¹
Sepal.Length      5.80 (5.10, 6.40)
Sepal.Width       3.00 (2.80, 3.30)
Petal.Length      4.35 (1.60, 5.10)
Petal.Width       1.30 (0.30, 1.80)
Species
    setosa        50 (33%)
    versicolor    50 (33%)
    virginica     50 (33%)
¹ Median (Q1, Q3); n (%)
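
The Median (Q1, Q3) and n (%) entries above can be cross-checked with base R. The sketch below assumes base R's default quantile definition, which gtsummary may not use internally, so small differences are possible on other datasets. The helper name `median_iqr` is hypothetical and not part of gtsummary.

```r
# Hypothetical helper: format a numeric vector as "median (Q1, Q3)"
median_iqr <- function(x, digits = 2) {
  q <- formatC(quantile(x, probs = c(0.25, 0.5, 0.75), names = FALSE),
               format = "f", digits = digits)
  paste0(q[2], " (", q[1], ", ", q[3], ")")
}

data(iris)

# Apply the helper to every numeric column of iris
num_cols <- iris[sapply(iris, is.numeric)]
stats <- sapply(num_cols, median_iqr)
print(stats)

# Counts and rounded percentages for the categorical column
counts <- table(iris$Species)
print(counts)
print(round(100 * prop.table(counts)))
```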

Regression model results

# load dataset
data(Titanic)
df = as.data.frame(Titanic)

# load library
library(gtsummary)

# create the model
model = glm(Survived ~ Age + Class + Sex + Freq, family=binomial, data=df)

# generate table
model %>%
  tbl_regression() %>% # regression summary table
  add_global_p() %>% # add a global p-value for each variable
  bold_labels() %>% # bold the variable labels
  italicize_levels() # italicize the factor levels
Characteristic    log(OR)    95% CI          p-value
Age                                          0.5
    Child
    Adult         0.62       -1.0, 2.4
Class                                        >0.9
    1st
    2nd           -0.03      -2.0, 2.0
    3rd           0.25       -1.8, 2.4
    Crew          0.27       -1.8, 2.4
Sex                                          0.6
    Male
    Female        -0.37      -1.9, 1.1
Freq              -0.01      -0.02, 0.00     0.2
Abbreviations: CI = Confidence Interval, OR = Odds Ratio
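
In `as.data.frame(Titanic)`, each of the 32 rows is one combination of Class, Sex, Age and Survived, and `Freq` holds the number of passengers in that combination. The model above is therefore fitted on 32 rows with `Freq` as a predictor, which may explain the wide confidence intervals. One possible alternative, offered as a sketch and not as the original author's approach, passes `Freq` as frequency weights. The name `weighted_model` is hypothetical, and `exponentiate = TRUE` is assumed here only to show odds ratios instead of log odds ratios.

```r
library(gtsummary)

data(Titanic)
df <- as.data.frame(Titanic)  # 32 rows: one per Class x Sex x Age x Survived cell

# Assumption: use the cell counts as frequency weights, not as a covariate
weighted_model <- glm(
  Survived ~ Age + Class + Sex,
  family = binomial,
  weights = Freq,
  data = df
)

# Same table pipeline as above, reporting odds ratios
weighted_model %>%
  tbl_regression(exponentiate = TRUE) %>%
  add_global_p() %>%
  bold_labels() %>%
  italicize_levels()
```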

Summarize table

# load packages
library(dplyr) # provides select() and the %>% pipe
library(gtsummary)

# load dataset and keep only a few columns
data(mtcars)
mtcars = mtcars %>%
  select(vs, mpg, drat, hp, gear)

# create summary table
mtcars %>%
  tbl_summary(
    by=vs, # group by the `vs` variable (dichotomous: 0 or 1)
    statistic = list(
      all_continuous() ~ "{mean} ({sd})", # will display: mean (standard deviation)
      all_categorical() ~ "{n} / {N} ({p}%)" # will display: n / N (percentage)
    )
  ) %>%
  add_overall() %>% # add a column with statistics for all observations
  add_p() %>% # add p-values comparing the `vs` groups
  bold_labels() %>% # bold the variable labels
  italicize_levels() # italicize the category levels
## The following warnings were returned during `add_p()`:
## ! For variable `drat` (`vs`) and "estimate", "statistic", "p.value",
##   "conf.low", and "conf.high" statistics: cannot compute exact p-value with
##   ties
## ! For variable `drat` (`vs`) and "estimate", "statistic", "p.value",
##   "conf.low", and "conf.high" statistics: cannot compute exact confidence
##   intervals with ties
## ! For variable `hp` (`vs`) and "estimate", "statistic", "p.value", "conf.low",
##   and "conf.high" statistics: cannot compute exact p-value with ties
## ! For variable `hp` (`vs`) and "estimate", "statistic", "p.value", "conf.low",
##   and "conf.high" statistics: cannot compute exact confidence intervals with
##   ties
## ! For variable `mpg` (`vs`) and "estimate", "statistic", "p.value", "conf.low",
##   and "conf.high" statistics: cannot compute exact p-value with ties
## ! For variable `mpg` (`vs`) and "estimate", "statistic", "p.value", "conf.low",
##   and "conf.high" statistics: cannot compute exact confidence intervals with
##   ties
Characteristic    Overall            0                  1                  p-value²
                  N = 32¹            N = 18¹            N = 14¹
mpg               20.1 (6.0)         16.6 (3.9)         24.6 (5.4)         <0.001
drat              3.60 (0.53)        3.39 (0.47)        3.86 (0.51)        0.013
hp                147 (69)           190 (60)           91 (24)            <0.001
gear                                                                       0.001
    3             15 / 32 (47%)      12 / 18 (67%)      3 / 14 (21%)
    4             12 / 32 (38%)      2 / 18 (11%)       10 / 14 (71%)
    5             5 / 32 (16%)       4 / 18 (22%)       1 / 14 (7.1%)
¹ Mean (SD); n / N (%)
² Wilcoxon rank sum test; Fisher’s exact test
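
The warnings printed by `add_p()` above come from the Wilcoxon rank sum test: `mpg`, `drat` and `hp` contain repeated values (ties). The base R sketch below reproduces the situation outside gtsummary for `mpg`. Requesting the normal approximation explicitly with `exact = FALSE` is one way to avoid asking for an exact p-value; whether the default call warns can depend on the R version.

```r
data(mtcars)

# Check that mpg contains repeated values (ties)
has_ties <- any(duplicated(mtcars$mpg))
print(has_ties)

# Ask for the normal approximation directly instead of an exact p-value
res <- wilcox.test(mpg ~ vs, data = mtcars, exact = FALSE)
print(res$p.value)
```

Within gtsummary, `add_p()` has a `test` argument for choosing a different test per variable, for example `test = all_continuous() ~ "t.test"`.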

Custom style of the table

data(iris)
library(gtsummary)
library(gt)

iris %>%
  tbl_summary(by=Species) %>%
  add_overall() %>% # add a column for all observations, ignoring `by`
  add_n() %>% # add a column with the number of non-missing observations
  modify_header(label ~ "**Variables from the dataset**") %>% # header of the label column
  modify_spanning_header(
    c("stat_0", "stat_1", "stat_2", "stat_3") ~
      "*Descriptive statistics of the iris flowers*, grouped by Species"
  ) %>% # header spanning the statistic columns
  as_gt() %>% # convert to a gt table so that gt functions can be applied
  gt::tab_source_note(gt::md("*The iris dataset is probably the **most famous** dataset in the world*"))
                                     Descriptive statistics of the iris flowers, grouped by Species
Variables from the dataset    N      Overall              setosa               versicolor           virginica
                                     N = 150¹             N = 50¹              N = 50¹              N = 50¹
Sepal.Length                  150    5.80 (5.10, 6.40)    5.00 (4.80, 5.20)    5.90 (5.60, 6.30)    6.50 (6.20, 6.90)
Sepal.Width                   150    3.00 (2.80, 3.30)    3.40 (3.20, 3.70)    2.80 (2.50, 3.00)    3.00 (2.80, 3.20)
Petal.Length                  150    4.35 (1.60, 5.10)    1.50 (1.40, 1.60)    4.35 (4.00, 4.60)    5.55 (5.10, 5.90)
Petal.Width                   150    1.30 (0.30, 1.80)    0.20 (0.20, 0.30)    1.30 (1.20, 1.50)    2.00 (1.80, 2.30)
The iris dataset is probably the most famous dataset in the world
¹ Median (Q1, Q3)
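
The column names `stat_0` to `stat_3` passed to `modify_spanning_header()` are internal gtsummary names, not the labels shown in the table. A short sketch of how they can be looked up, assuming the `show_header_names()` helper available in recent gtsummary versions (the object name `tbl` is only illustrative):

```r
library(gtsummary)

data(iris)
tbl <- iris %>%
  tbl_summary(by = Species) %>%
  add_overall()

# Print the internal column names next to their current headers
show_header_names(tbl)
```

With `add_overall()` applied, `stat_0` refers to the Overall column and `stat_1` to `stat_3` to the three Species columns.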