| Title: | Easy Publication-Ready Tables and Regression Analysis |
|---|---|
| Description: | Streamlines the creation of descriptive frequency tables ('Table 1'), diagnostic test accuracy evaluations (sensitivity, specificity, predictive values), and multi-outcome regression summaries. Features automatic tables, prevalence and odds ratio calculations, and seamless integration with 'flextable' for exporting results to 'Microsoft Word' and 'PowerPoint'. |
| Authors: | Matheus Trabuco Gonzalez [aut, cre] |
| Maintainer: | Matheus Trabuco Gonzalez <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 1.2.0 |
| Built: | 2026-05-21 09:24:09 UTC |
| Source: | https://github.com/matheustg-14/simtablr |
Convert tb Object to Flextable
    ## S3 method for class 'tb'
    as_flextable(x, ...)
| Argument | Description |
|---|---|
| x | A tb object. |
| ... | Additional arguments passed to |
A flextable object.
Extracts the performance metrics table as a plain data.frame.
    ## S3 method for class 'diag_test'
    as.data.frame(x, ...)
| Argument | Description |
|---|---|
| x | A diag_test object. |
| ... | Additional arguments (unused). |
A data.frame with columns Metric, Estimate, LowerCI,
UpperCI.
Convert tb to Data Frame
    ## S3 method for class 'tb'
    as.data.frame(x, ...)
| Argument | Description |
|---|---|
| x | A tb object. |
| ... | Additional arguments (unused). |
A data.frame with the formatted table.
Computes a 2x2 confusion matrix and comprehensive diagnostic performance metrics for a binary classification test, with exact binomial confidence intervals.
    diag_test(
      data,
      test,
      ref,
      positive = NULL,
      test_positive = NULL,
      conf.level = 0.95
    )
| Argument | Description |
|---|---|
| data | A data.frame containing |
| test | Unquoted name of the diagnostic test variable (must be binary). |
| ref | Unquoted name of the reference standard variable (must be binary). |
| positive | Character or numeric. Level representing "Positive" in the reference variable. If |
| test_positive | Character or numeric. Level representing "Positive" in the test variable. If |
| conf.level | Numeric. Confidence level for binomial CIs (0-1). Default: 0.95. |
|  | Ref + | Ref - |
|---|---|---|
| Test + | TP | FP |
| Test - | FN | TN |
Sensitivity (Recall) = TP / (TP + FN)
Specificity = TN / (TN + FP)
PPV (Precision) = TP / (TP + FP)
NPV = TN / (TN + FN)
Accuracy = (TP + TN) / Total
Prevalence = (TP + FN) / Total
Likelihood Ratio + = Sensitivity / (1 - Specificity)
Likelihood Ratio - = (1 - Sensitivity) / Specificity
Youden's Index = Sensitivity + Specificity - 1
F1 Score = 2 x (PPV x Sensitivity) / (PPV + Sensitivity)
Binomial CIs (exact Clopper-Pearson) are computed for the first six metrics. Likelihood Ratios, Youden's Index, and F1 Score do not have CIs.
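The formulas and intervals above can be reproduced with base R alone. The sketch below is an illustration under assumptions, not the package's internal code: the four cell counts are made up, and prop_ci() is a hypothetical helper that wraps stats::binom.test(), whose interval is the exact Clopper-Pearson interval.

```r
# Hypothetical 2x2 counts (not taken from the package or its data)
tp <- 72; fp <- 17; fn <- 18; tn <- 93
total <- tp + fp + fn + tn

# Hypothetical helper: proportion with an exact Clopper-Pearson CI
prop_ci <- function(x, n, conf.level = 0.95) {
  ci <- binom.test(x, n, conf.level = conf.level)$conf.int
  c(Estimate = x / n, LowerCI = ci[1], UpperCI = ci[2])
}

# The six metrics that are plain binomial proportions
with_ci <- rbind(
  Sensitivity = prop_ci(tp, tp + fn),
  Specificity = prop_ci(tn, tn + fp),
  PPV         = prop_ci(tp, tp + fp),
  NPV         = prop_ci(tn, tn + fn),
  Accuracy    = prop_ci(tp + tn, total),
  Prevalence  = prop_ci(tp + fn, total)
)

sens <- with_ci["Sensitivity", "Estimate"]
spec <- with_ci["Specificity", "Estimate"]
ppv  <- with_ci["PPV", "Estimate"]

# Derived metrics, reported without CIs
no_ci <- c(
  LR_pos = sens / (1 - spec),
  LR_neg = (1 - sens) / spec,
  Youden = sens + spec - 1,
  F1     = 2 * (ppv * sens) / (ppv + sens)
)

print(round(with_ci, 3))
print(round(no_ci, 3))
```

The last four metrics are ratios or sums of proportions rather than single binomial proportions, which is why an exact binomial interval does not apply to them directly.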
An object of class diag_test - a named list with:
$table: 2x2 table object (Test x Ref).
$stats: data.frame with columns Metric, Estimate, LowerCI,
UpperCI.
$labels: named list with ref_pos, ref_neg, test_pos, test_neg.
$sample_size: integer, total valid observations.
$conf.level: numeric, confidence level used.
print.diag_test(), as.data.frame.diag_test(),
plot.diag_test()
    set.seed(1)
    n <- 200
    ref <- factor(sample(c("No", "Yes"), n, replace = TRUE, prob = c(.55, .45)))
    tst <- ifelse(ref == "Yes",
                  ifelse(runif(n) < .80, "Yes", "No"),
                  ifelse(runif(n) < .85, "No", "Yes"))
    df <- data.frame(rapid_test = factor(tst), lab = ref)
    result <- diag_test(df, test = rapid_test, ref = lab,
                        positive = "Yes", test_positive = "Yes")
    print(result)
    as.data.frame(result)
A simulated dataset containing demographic, clinical, and outcome variables for 500 individuals. Designed for demonstrating table creation and diagnostic testing analysis.
    epitabl
A data frame with 500 rows and 19 variables:
Unique patient identifier
Age in years (Numeric)
Biological sex (Female, Male)
Body Mass Index in kg/m2 (Numeric, contains NAs)
Smoking status (Never, Former, Current)
Physical activity level (Low, Moderate, High)
Educational attainment (High School, Some College, College+)
Annual household income (<30k, 30-60k, 60k+)
Disease status - primary outcome (No, Yes)
Result of rapid diagnostic test (Negative, Positive)
Laboratory confirmation - gold standard (No, Yes)
Score 0-5 based on medical history
Count of primary care visits in past year
Count of specialist visits in past year
Count of emergency department visits in past year
Hospitalized in past year (No, Yes)
Systolic blood pressure in mmHg
Total cholesterol in mg/dL
Geographic region (North, South, East, West)
Simulated data for the SimtablR package.
    data(epitabl)
    # Basic description
    tb(epitabl, sex, disease)
Export regtab Results to CSV
    export_regtab_csv(x, file, ...)
| Argument | Description |
|---|---|
| x | A data.frame from regtab(). |
| file | File path. |
| ... | Additional arguments passed to |
Invisibly returns x.
Requires the openxlsx package.
    export_regtab_xlsx(x, file, ...)
| Argument | Description |
|---|---|
| x | A data.frame from regtab(). |
| file | File path (.xlsx). |
| ... | Additional arguments passed to |
Invisibly returns x.
Draws a fourfold display of the confusion matrix with sensitivity and specificity annotated on the bottom margin.
    ## S3 method for class 'diag_test'
    plot(x, col = c("#ffcccc", "#ccffcc"), main = "Confusion Matrix", ...)
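| Argument | Description |
|---|---|
| x | A diag_test object. |
| col | Character vector of length 2. Fill colours for the negative and positive quadrants respectively. Default: c("#ffcccc", "#ccffcc"). |
| main | Character. Plot title. Default: "Confusion Matrix". |
| ... | Additional arguments passed to |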
Invisibly returns x.
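A comparable figure can be drawn with base graphics. The sketch below assumes, without confirmation from the package source, that a call to graphics::fourfoldplot() followed by mtext() is close to what the method does; the counts are hypothetical.

```r
# Hypothetical 2x2 counts laid out as Test (rows) x Ref (columns)
tab <- matrix(c(72, 18, 17, 93), nrow = 2,
              dimnames = list(Test = c("Positive", "Negative"),
                              Ref  = c("Positive", "Negative")))

sens <- tab[1, 1] / sum(tab[, 1])  # TP / (TP + FN)
spec <- tab[2, 2] / sum(tab[, 2])  # TN / (TN + FP)

# Fourfold display with the default colours and title from the usage above
fourfoldplot(tab, color = c("#ffcccc", "#ccffcc"), main = "Confusion Matrix")

# Annotate sensitivity and specificity on the bottom margin
mtext(sprintf("Sensitivity: %.3f | Specificity: %.3f", sens, spec),
      side = 1, line = 1)
```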
Displays a formatted summary of the confusion matrix and all diagnostic performance metrics with confidence intervals.
    ## S3 method for class 'diag_test'
    print(x, digits = 3L, ...)
| Argument | Description |
|---|---|
| x | A diag_test object. |
| digits | Integer. Decimal places for metrics. Default: 3. |
| ... | Additional arguments (unused). |
Invisibly returns x.
Print Method for regtab Results
    ## S3 method for class 'regtab'
    print(x, ...)
| Argument | Description |
|---|---|
| x | A data.frame returned by regtab(). |
| ... | Additional arguments passed to |
Invisibly returns x.
Print Method for tb Objects
    ## S3 method for class 'tb'
    print(x, digits = NULL, ...)
| Argument | Description |
|---|---|
| x | A tb object. |
| digits | Number of decimal places to display. |
| ... | Additional arguments (unused). |
Invisibly returns x, called for side effects.
Fits generalized linear models (GLMs) for multiple outcome variables and generates a formatted wide-format table with point estimates and confidence intervals. Supports robust standard errors, automatic exponentiation for count/binary outcomes, and custom labeling for publication-ready tables.
    regtab(
      data,
      outcomes,
      predictors,
      family = poisson(link = "log"),
      robust = TRUE,
      exponentiate = NULL,
      labels = NULL,
      d = 2,
      conf.level = 0.95,
      include_intercept = FALSE,
      p_values = FALSE
    )
| Argument | Description |
|---|---|
| data | Data.frame containing all variables for analysis. |
| outcomes | Character vector of dependent variable names. Each outcome is modeled separately with the same set of predictors. |
| predictors | Formula or character string specifying predictors. Can be: |
| family | GLM family specification. Options: |
| robust | Logical. If TRUE (default), calculates heteroskedasticity-consistent (HC0) robust standard errors via the sandwich package. CIs are based on robust SEs. |
| exponentiate | Logical. If TRUE, exponentiates coefficients and CIs. If NULL (default), automatically detects: TRUE for Poisson/Binomial, FALSE for Gaussian. |
| labels | Named character vector for renaming outcome columns in output. Format: |
| d | Integer. Number of decimal places for rounding estimates and CIs. Default: 2. |
| conf.level | Numeric. Confidence level for intervals (0-1). Default: 0.95. |
| include_intercept | Logical. If TRUE, includes the intercept in the output table. Default: FALSE (typically excluded from publication tables). |
| p_values | Logical. If TRUE, adds p-values as a separate column. Default: FALSE. |
For each outcome, the function fits:
glm(outcome ~ predictors, family = family, data = data)
When robust = TRUE, the function:
Fits the model with standard GLM.
Computes sandwich covariance matrix (HC0 estimator).
Calculates Wald-type CIs based on robust SEs.
This provides protection against heteroskedasticity and mild model misspecification.
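The three steps listed above can be sketched in base R without the sandwich package. This is an illustration under assumptions rather than the code regtab() runs: the data are simulated, the variable names are hypothetical, and the score formula used here is specific to a Poisson model with a log link.

```r
set.seed(42)
n <- 300
# Simulated data; variable names are illustrative only
dat <- data.frame(age = rnorm(n, 50, 10),
                  sex = factor(sample(c("F", "M"), n, replace = TRUE)))
dat$visits <- rpois(n, lambda = exp(0.5 + 0.01 * dat$age))

# Step 1: fit the model with a standard GLM
fit <- glm(visits ~ age + sex, family = poisson(link = "log"), data = dat)

# Step 2: HC0 sandwich covariance = bread %*% meat %*% bread
X     <- model.matrix(fit)
score <- X * residuals(fit, type = "response")  # row i is x_i * (y_i - mu_i)
bread <- vcov(fit)                              # model-based covariance
meat  <- crossprod(score)                       # sum of outer products of scores
vc_hc0 <- bread %*% meat %*% bread

# Step 3: Wald-type CIs based on the robust standard errors
conf.level <- 0.95
z  <- qnorm(1 - (1 - conf.level) / 2)
se <- sqrt(diag(vc_hc0))
robust_ci <- cbind(Estimate = coef(fit),
                   Lower    = coef(fit) - z * se,
                   Upper    = coef(fit) + z * se)
print(round(robust_ci, 3))
```

If the sandwich package is installed, vc_hc0 should agree with sandwich::vcovHC(fit, type = "HC0") for this model up to numerical tolerance.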
Poisson regression: exp(beta) = Incidence Rate Ratio
IRR = 1: No association
IRR > 1: Increased rate
IRR < 1: Decreased rate
Logistic regression: exp(beta) = Odds Ratio
OR = 1: No association
OR > 1: Increased odds
OR < 1: Decreased odds
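A minimal sketch of the exponentiation step, using base R only. The data are simulated and the names dat, age, and event are hypothetical; confint.default() gives model-based Wald intervals, so this corresponds to the non-robust case.

```r
set.seed(7)
n <- 400
# Simulated binary outcome whose log-odds rise with age
dat <- data.frame(age = rnorm(n, 50, 10))
dat$event <- rbinom(n, 1, plogis(-3 + 0.05 * dat$age))

fit <- glm(event ~ age, family = binomial(), data = dat)

# Coefficients and Wald limits are on the log-odds scale;
# exponentiating moves all three to the odds-ratio scale
or_table <- exp(cbind(OR = coef(fit), confint.default(fit)))
print(round(or_table, 3))
```

The same transformation applied to a Poisson model with a log link yields incidence rate ratios instead of odds ratios.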
Returns a wide-format data.frame:
| Variable | Outcome1 | Outcome2 | ... |
|---|---|---|---|
| (Intercept) | 2.34 (1.89-2.91) | 1.98 (1.65-2.38) | ... |
| age | 1.05 (1.02-1.08) | 1.03 (1.01-1.06) | ... |
| sex | 0.87 (0.75-1.01) | 0.92 (0.81-1.05) | ... |
Each cell contains: "Estimate (Lower CI - Upper CI)"
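One way such a cell could be assembled is shown below. fmt_cell() is a hypothetical helper written for this illustration, not a function exported by the package; its d argument mirrors the rounding argument of regtab().

```r
# Hypothetical helper: build "Estimate (Lower-Upper)" with d decimal places
fmt_cell <- function(est, lower, upper, d = 2) {
  f <- function(v) formatC(v, format = "f", digits = d)
  paste0(f(est), " (", f(lower), "-", f(upper), ")")
}

fmt_cell(1.0512, 1.0201, 1.0833)
# returns "1.05 (1.02-1.08)"
```

Using formatC() with format = "f" keeps trailing zeros, so an estimate of 1.1 is shown as 1.10 and columns stay aligned.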
GLM uses complete cases by default. Observations with missing values in any variable are excluded from that specific model.
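The effect of complete-case handling can be checked with nobs(). The sketch below uses simulated data with hypothetical variable names and relies on R's default na.action setting (na.omit); it shows that two models fitted to the same data frame can use different numbers of rows.

```r
set.seed(3)
dat <- data.frame(x1 = rnorm(50), x2 = rnorm(50))
dat$y <- rpois(50, lambda = 2)
dat$x1[1:4]  <- NA   # 4 rows missing x1
dat$x2[3:10] <- NA   # 8 rows missing x2 (rows 3 and 4 overlap with x1)

fit_a <- glm(y ~ x1,      family = poisson(), data = dat)
fit_b <- glm(y ~ x1 + x2, family = poisson(), data = dat)

nobs(fit_a)  # 46: only rows with missing x1 are dropped
nobs(fit_b)  # 40: rows missing either predictor are dropped
```

Because the sample can differ between models, estimates for outcomes or predictor sets with different missingness patterns are not necessarily based on the same observations.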
If a model fails to converge or encounters errors:
A warning is issued with the outcome name and error message
That outcome column is skipped in the output
Other outcomes continue processing
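The behaviour described above can be approximated with tryCatch(). This is a sketch under assumptions, not the package's implementation: the data are simulated, and count_bad is a deliberately invalid outcome (negative values are rejected by the Poisson family) used to trigger an error.

```r
set.seed(11)
dat <- data.frame(x = rnorm(60),
                  count_ok  = rpois(60, 3),
                  count_bad = -1)   # invalid for a Poisson family
outcomes <- c("count_ok", "count_bad")

fits <- list()
for (out in outcomes) {
  fit <- tryCatch(
    glm(reformulate("x", response = out), family = poisson(), data = dat),
    error = function(e) {
      # Report which outcome failed and why, then carry on
      warning("Outcome '", out, "' skipped: ", conditionMessage(e),
              call. = FALSE)
      NULL
    }
  )
  if (!is.null(fit)) fits[[out]] <- fit
}

names(fits)  # only the outcome that fitted successfully remains
```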
A data.frame in wide format with:
Variable: Predictor names (first column)
Outcome columns: One column per outcome with formatted estimates and CIs
Can be directly exported to Excel, Word, or LaTeX for publication.
    # Create example data
    set.seed(456)
    n <- 500
    df <- data.frame(
      age = rnorm(n, 50, 10),
      sex = factor(sample(c("M", "F"), n, replace = TRUE)),
      treatment = factor(sample(c("A", "B"), n, replace = TRUE)),
      outcome1 = rpois(n, lambda = 5),
      outcome2 = rpois(n, lambda = 8),
      outcome3 = rpois(n, lambda = 3)
    )

    # Basic usage: Poisson regression for multiple outcomes
    regtab(df,
           outcomes = c("outcome1", "outcome2", "outcome3"),
           predictors = ~ age + sex + treatment,
           family = poisson(link = "log"))

    # With custom labels and no robust SEs
    regtab(df,
           outcomes = c("outcome1", "outcome2"),
           predictors = "age + sex",
           labels = c(outcome1 = "Primary Endpoint",
                      outcome2 = "Secondary Endpoint"),
           robust = FALSE)

    # Logistic regression with p-values
    df$binary_outcome <- rbinom(n, 1, 0.4)
    regtab(df,
           outcomes = "binary_outcome",
           predictors = ~ age + sex,
           family = binomial(),
           p_values = TRUE)
Creates comprehensive tables for categorical or continuous variables with formatting, statistical tests, prevalence ratios (PR), odds ratios (OR), and column stratification.
    tb(
      data,
      ...,
      m = FALSE,
      d = 1,
      format = TRUE,
      style = "n_pct",
      style.rp = "{rp} ({lower} - {upper})",
      style.or = "{or} ({lower} - {upper})",
      test = FALSE,
      subset = NULL,
      strat = NULL,
      rp = FALSE,
      or = FALSE,
      ref = NULL,
      conf.level = 0.95,
      var.type = NULL,
      stat.cont = "median"
    )
| Argument | Description |
|---|---|
| data | A data.frame or atomic vector. |
| ... | Variables to be tabulated. Accepts variable names and/or flags ( |
| m | Logical. Include missing values (NA) in the table. Default: FALSE. |
| d | Integer. Decimal places for percentages and statistics. Default: 1. |
| format | Logical. Render a formatted grid output. Default: TRUE. |
| style | Character. Format for displaying counts and percentages. Options: |
| style.rp | Character. Format string for Prevalence Ratio. Default: "{rp} ({lower} - {upper})". |
| style.or | Character. Format string for Odds Ratio. Default: "{or} ({lower} - {upper})". |
| test | Logical or Character. Performs a statistical test on 2x2+ tables. |
| subset | Logical expression for row filtering. |
| strat | Variable for column stratification. Disables PR/OR calculations. |
| rp | Logical. Calculate Prevalence Ratios (PR). Default: FALSE. |
| or | Logical. Calculate Odds Ratios (OR). Default: FALSE. |
| ref | Character or numeric. Reference level for PR/OR calculations. |
| conf.level | Numeric. Confidence level for intervals (0-1). Default: 0.95. |
| var.type | Named character vector specifying variable types, e.g. |
| stat.cont | Character. |
An object of class tb (a matrix with attributes).