Choosing Statistical and Epidemiologic Methods

Choosing a Statistical Method, with Automatic Internet Searching

Taken from ‘Outline for the Proposed JavaScript Open Source Version of StatCalc’

By Hemant Kulkarni, San Antonio, November 18, 2002. Searching revised June 2010.

The following tables outline the suggested matrix. Please remember that the suggested options are the ones that CAN be chosen in a given situation. These options do NOT HAVE to be chosen.

To do a search, if you have an Internet connection, click on an item within a table cell. In some browsers, a choice of search will appear as a funny little icon; in others, the search may happen automatically, and may include ALL the items in the cell. After you see the results, you may want to add a word like ‘method,’ ‘calculator,’ ‘equation,’ or ‘tutorial’ to narrow the search. To return from the search, use the browser's BACK button or close the browser window.

1. DESCRIPTIVE STUDIES

Situation	Epidemiologic results	Statistical methods
Time description	Scatter plots; Scatter plots with trend lines; Scatter plots with spline smoothing	Seasonality index
Place description	Spot maps	Clustering methods like k-means
Person description	Bar charts; Pie charts; Histograms; Box plots; Box and whisker plots; Stem and leaf diagrams; Hierarchical trees; Scatter plots	Mean; Proportion; Standard deviation; Standard errors; Median; Percentiles; Mode
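
As an illustration of the person-description statistics listed above, here is a minimal JavaScript sketch of the mean, sample standard deviation, standard error, median and percentiles. The function names and the sample data are hypothetical and are not taken from StatCalc. The percentile rule used (linear interpolation between order statistics) is only one of several conventions in use.

```javascript
// Descriptive summaries for a numeric sample (hypothetical helper names).
function mean(xs) {
  return xs.reduce((sum, x) => sum + x, 0) / xs.length;
}

// Sample standard deviation (n - 1 in the denominator).
function standardDeviation(xs) {
  const m = mean(xs);
  const sumSquares = xs.reduce((sum, x) => sum + (x - m) ** 2, 0);
  return Math.sqrt(sumSquares / (xs.length - 1));
}

// Standard error of the mean.
function standardError(xs) {
  return standardDeviation(xs) / Math.sqrt(xs.length);
}

// Percentile by linear interpolation between order statistics.
// This is an assumed convention; p runs from 0 to 100.
function percentile(xs, p) {
  const sorted = [...xs].sort((a, b) => a - b);
  const rank = (p / 100) * (sorted.length - 1);
  const lo = Math.floor(rank);
  const hi = Math.ceil(rank);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (rank - lo);
}

function median(xs) {
  return percentile(xs, 50);
}

// Made-up ages, for illustration only.
const exampleAges = [23, 35, 31, 42, 29, 35, 50];
console.log({
  mean: mean(exampleAges),
  median: median(exampleAges),
  sd: standardDeviation(exampleAges),
  se: standardError(exampleAges),
  p25: percentile(exampleAges, 25),
});
```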

2. CROSS-SECTIONAL STUDIES

Situation	Epidemiologic results	Statistical methods
Group comparison	Risk reduction with 95% confidence intervals; Odds ratios with 95% confidence intervals	chi square test of association; Fisher’s exact test; Φ coefficient; Cramer’s V; chi square test for linear trend; Student’s T test (unpaired); Mann-Whitney U test; Mantel-Haenszel chi square test; Unconditional multiple logistic regression; Generalized linear models; Bonferroni’s corrections
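
For a 2x2 group comparison, the chi square test of association and the Φ coefficient can be sketched as follows. This is a hedged illustration, not StatCalc code: the function name and cell labels are assumptions, no continuity correction is applied, and significance is judged only against the 5% critical value for one degree of freedom (3.841) rather than by computing a p-value.

```javascript
// Pearson chi square (uncorrected) and phi coefficient for a 2x2 table.
// Hypothetical cell layout:
//              outcome+   outcome-
//   group 1       a          b
//   group 2       c          d
function chiSquare2x2(a, b, c, d) {
  const n = a + b + c + d;
  // Product of the four marginal totals.
  const margins = (a + b) * (c + d) * (a + c) * (b + d);
  const diff = a * d - b * c;
  const chi2 = (n * diff * diff) / margins;
  // Phi keeps the sign of the association; phi squared times n equals chi2.
  const phi = diff / Math.sqrt(margins);
  // 3.841 is the 5% critical value of chi square with 1 degree of freedom.
  return { chi2, phi, significantAt05: chi2 > 3.841 };
}

// Made-up counts, for illustration only.
console.log(chiSquare2x2(10, 20, 30, 40));
```

With small expected cell counts, Fisher’s exact test from the same table cell is usually the preferred choice.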

3. CASE-CONTROL STUDIES

Situation	Epidemiologic results	Statistical methods
Unmatched case-control study; Single exposure variable expressed as Yes/No; One set of controls	Odds ratio with 95% confidence interval (Cornfield’s method, Woolf’s method, Exact method)	chi square test of association; Fisher’s exact test; Φ coefficient
Unmatched case-control study; Single exposure variable expressed as multiple, ordered categories; One set of controls	Odds ratio for each category compared to the reference category with 95% confidence interval (Cornfield’s method, Woolf’s method, Exact method)	chi square test of association; Fisher’s exact test; Cramer’s V; chi square test for linear trend
Unmatched case-control study; Single continuous variable; One set of controls	(none listed)	Student’s T test (unpaired); Mann-Whitney U test
Unmatched case-control study; Multiple exposure variables expressed as Yes/No; One set of controls	Odds ratio with 95% confidence interval for each variable separately (Cornfield’s method, Woolf’s method, Exact method); Adjusted ORs for combinations of variables; Stratified and summary OR with 95% confidence interval	chi square test of association; Fisher’s exact test; Φ coefficient; Unconditional multiple logistic regression; Mantel-Haenszel chi square test
Unmatched case-control study; Multiple exposure variables expressed as two or more, ordered categories; One set of controls	Odds ratio for each category compared to the reference category with 95% confidence interval for each variable separately (Cornfield’s method, Woolf’s method, Exact method); Adjusted ORs for combinations of variable categories	chi square test of association; Fisher’s exact test; Cramer’s V; chi square test for linear trend; Unconditional multiple logistic regression
Unmatched case-control study; Multiple continuous variables; One set of controls	(none listed)	Student’s T test (unpaired); Mann-Whitney U test; Unconditional multiple logistic regression
Unmatched case-control study; Combinations of variable types; One set of controls	Odds ratio for each category compared to the reference category with 95% confidence interval for each variable separately (Cornfield’s method, Woolf’s method, Exact method); Adjusted ORs for combinations of variable categories; Stratified and summary OR with 95% confidence interval	chi square test of association; Fisher’s exact test; Φ coefficient; Cramer’s V; chi square test for linear trend; Student’s T test (unpaired); Mann-Whitney U test; Mantel-Haenszel chi square test; Unconditional multiple logistic regression; Generalized linear models
Unmatched case-control study; Single exposure variable expressed as Yes/No; Multiple sets of controls	Odds ratio with 95% confidence interval (Cornfield’s method, Woolf’s method, Exact method); Stratified and summary OR with 95% confidence interval	chi square test of association; Fisher’s exact test; Φ coefficient; Mantel-Haenszel chi square test
Unmatched case-control study; Single exposure variable expressed as multiple, ordered categories; Multiple sets of controls	Odds ratio for each category compared to the reference category with 95% confidence interval (Cornfield’s method, Woolf’s method, Exact method); For each set of controls separately	chi square test of association; Fisher’s exact test; Cramer’s V; chi square test for linear trend; Bonferroni’s correction for multiple comparisons
Unmatched case-control study; Single continuous variable; Multiple sets of controls	(none listed)	Analysis of variance with Tukey’s pairwise comparison tests; Kruskal-Wallis test with Mann-Whitney test for pairwise comparisons after Bonferroni correction
Unmatched case-control study; Multiple exposure variables expressed as Yes/No; Multiple sets of controls	Odds ratio with 95% confidence interval for each variable separately (Cornfield’s method, Woolf’s method, Exact method); Adjusted ORs for combinations of variables; Stratified and summary OR with 95% confidence interval	chi square test of association; Fisher’s exact test; Φ coefficient; Mantel-Haenszel test; Unconditional multiple logistic regression
Unmatched case-control study; Multiple exposure variables expressed as two or more, ordered categories; Multiple sets of controls	Odds ratio for each category compared to the reference category with 95% confidence interval for each variable separately (Cornfield’s method, Woolf’s method, Exact method); Adjusted ORs for combinations of variable categories	chi square test of heterogeneity; Fisher’s exact test; Cramer’s V; chi square test for linear trend; Unconditional multiple logistic regression
Unmatched case-control study; Multiple continuous variables; Multiple sets of controls	(none listed)	Analysis of variance with Tukey’s pairwise comparison tests; Kruskal-Wallis test with Mann-Whitney test for pairwise comparisons after Bonferroni correction; Polytomous multiple logistic regression
Unmatched case-control study; Combinations of variable types; Multiple sets of controls	Odds ratio for each category compared to the reference category with 95% confidence interval for each variable separately (Cornfield’s method, Woolf’s method, Exact method); Adjusted ORs for combinations of variable categories; Stratified and summary ORs	chi square test of association; Fisher’s exact test; Φ coefficient; Cramer’s V; chi square test for linear trend; Student’s T test (unpaired); Mann-Whitney U test; Unconditional multiple logistic regression; Analysis of variance and Tukey’s pairwise comparisons; Kruskal-Wallis test; Generalized linear models
Matched case-control study; Single exposure variable expressed as Yes/No; One set of controls	Odds ratio with 95% confidence interval (McNemar’s method)	McNemar’s chi square test of association
Matched case-control study; Single exposure variable expressed as multiple, ordered categories; One set of controls	Odds ratio for each category compared to the reference category with 95% confidence interval by collapsing into 2X2 tables (McNemar’s method)	McNemar’s chi square test of association; Breslow-Day test
Matched case-control study; Single continuous variable; One set of controls	(none listed)	Student’s T test (paired); Wilcoxon signed-rank test
Matched case-control study; Multiple exposure variables expressed as Yes/No; One set of controls	Odds ratio with 95% confidence interval for each variable separately (McNemar’s method); Adjusted ORs for combinations of variables	McNemar’s chi square test of association; Conditional multiple logistic regression
Matched case-control study; Multiple exposure variables expressed as two or more, ordered categories; One set of controls	Odds ratio for each category compared to the reference category with 95% confidence interval for each variable separately (McNemar’s method); Adjusted ORs for combinations of variable categories	McNemar’s chi square test of association; Breslow-Day test; Conditional multiple logistic regression
Matched case-control study; Multiple continuous variables; One set of controls	(none listed)	Student’s T test (paired); Wilcoxon signed-rank test; Conditional multiple logistic regression
Matched case-control study; Combinations of variable types; One set of controls	Odds ratio for each category compared to the reference category with 95% confidence interval for each variable separately (McNemar’s method); Adjusted ORs for combinations of variable categories	McNemar’s chi square test of association; Breslow-Day test; Student’s T test (paired); Wilcoxon signed-rank test; Conditional multiple logistic regression; Generalized linear models
Matched case-control study; Single exposure variable expressed as Yes/No; Multiple sets of controls	Odds ratio with 95% confidence interval (McNemar’s method); Stratified and summary OR with 95% confidence interval	McNemar’s chi square test of association; Mantel-Haenszel chi square test
Matched case-control study; Single exposure variable expressed as multiple, ordered categories; Multiple sets of controls	Odds ratio for each category compared to the reference category with 95% confidence interval by collapsing into 2X2 tables (McNemar’s method); For each set of controls separately	McNemar’s chi square test of association; Breslow-Day test; Bonferroni’s correction for multiple comparisons
Matched case-control study; Single continuous variable; Multiple sets of controls	(none listed)	Friedman’s analysis of variance; Kruskal-Wallis test with Mann-Whitney test for pairwise comparisons after Bonferroni correction
Matched case-control study; Multiple exposure variables expressed as Yes/No; Multiple sets of controls	Odds ratio with 95% confidence interval for each variable separately (McNemar’s method); Adjusted ORs for combinations of variables; Stratified and summary OR with 95% confidence interval	McNemar’s chi square test of association; Mantel-Haenszel test; Conditional multiple logistic regression
Matched case-control study; Multiple exposure variables expressed as two or more, ordered categories; Multiple sets of controls	Odds ratio for each category compared to the reference category with 95% confidence interval for each variable separately (McNemar’s method); Adjusted ORs for combinations of variable categories	chi square test of heterogeneity; Breslow-Day test; Conditional multiple logistic regression
Matched case-control study; Multiple continuous variables; Multiple sets of controls	(none listed)	Friedman’s analysis of variance; Kruskal-Wallis test with Mann-Whitney test for pairwise comparisons after Bonferroni correction; Polytomous multiple logistic regression
Matched case-control study; Combinations of variable types; Multiple sets of controls	Odds ratio for each category compared to the reference category with 95% confidence interval for each variable separately (McNemar’s method); Adjusted ORs for combinations of variable categories; Stratified and summary ORs	McNemar’s chi square test of association; Breslow-Day test; Student’s T test (paired); Wilcoxon signed-rank test; Conditional multiple logistic regression; Friedman’s analysis of variance; Kruskal-Wallis test; Mantel-Haenszel test; Generalized linear models
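
For the simplest case-control situations (one Yes/No exposure, one set of controls), the odds ratio and its 95% confidence interval can be sketched as below. The unmatched version uses Woolf’s logit method; the matched version uses the discordant-pair ratio associated with McNemar’s method, together with the uncorrected McNemar chi square. The function and parameter names are hypothetical, the counts are made up, and neither function guards against zero cells, which the exact method handles better.

```javascript
// Unmatched study, Woolf's (logit) method. Hypothetical cell labels:
//   a = exposed cases,    b = exposed controls
//   c = unexposed cases,  d = unexposed controls
function oddsRatioWoolf(a, b, c, d, z = 1.96) {
  const or = (a * d) / (b * c);
  // Standard error of ln(OR).
  const se = Math.sqrt(1 / a + 1 / b + 1 / c + 1 / d);
  const logOr = Math.log(or);
  return {
    or,
    lower: Math.exp(logOr - z * se),
    upper: Math.exp(logOr + z * se),
  };
}

// Matched pairs: only discordant pairs carry information.
//   caseOnly    = pairs where the case is exposed and the control is not
//   controlOnly = pairs where the control is exposed and the case is not
function oddsRatioMcNemar(caseOnly, controlOnly, z = 1.96) {
  const or = caseOnly / controlOnly;
  const se = Math.sqrt(1 / caseOnly + 1 / controlOnly);
  const logOr = Math.log(or);
  // McNemar chi square without continuity correction (1 degree of freedom).
  const chi2 = (caseOnly - controlOnly) ** 2 / (caseOnly + controlOnly);
  return {
    or,
    lower: Math.exp(logOr - z * se),
    upper: Math.exp(logOr + z * se),
    chi2,
  };
}

// Made-up counts, for illustration only.
console.log(oddsRatioWoolf(20, 10, 80, 90));
console.log(oddsRatioMcNemar(30, 10));
```

Stratified, adjusted and multi-category analyses from the table need Mantel-Haenszel or logistic regression methods and are beyond this sketch.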

4. COHORT STUDIES

Situation	Epidemiologic results	Statistical methods
Outcome is number of events in the exposed and unexposed groups where the event can occur only once in a single individual	Relative risk with 95% confidence interval; Excess fraction (also called risk reduction); Etiologic fraction (also called attributable risk); Population attributable risk; Population attributable risk proportion	chi square test of association; Fisher’s exact test; Φ coefficient; chi square test of linear trend for multiple ordered categorical variables; Unconditional multiple logistic regression; Generalized linear models
Outcome is number of events in the exposed and unexposed groups where the event can occur multiple times in a single individual	Relative hazards and their 95% confidence intervals	Kaplan-Meier survival curves; Log-rank test or Wilcoxon test, depending on whether the proportional hazards assumption is violated; Cox proportional hazards model; Poisson multiple regression
Outcome is time to events in the exposed and unexposed groups	Relative hazards and their 95% confidence intervals	Kaplan-Meier survival curves; Log-rank test or Wilcoxon test, depending on whether the proportional hazards assumption is violated; Cox proportional hazards model
Outcome is a continuous variable for the exposed and unexposed groups	Relative risks and their 95% confidence intervals using clinically meaningful cut-offs	chi square test of association; Fisher’s exact test; Φ coefficient; chi square test of linear trend for multiple ordered categorical variables; Unconditional multiple logistic regression; Generalized linear models; Time series analyses
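
For the first cohort situation (an event that can occur only once per individual), the relative risk and some related measures can be sketched as follows. The function name and cell labels are hypothetical and the counts are made up. The confidence interval uses the usual log-based approximation. Textbooks differ on the names and definitions of the attributable measures, so the ones computed here (risk difference, population attributable risk and its proportion) are stated explicitly in the comments rather than mapped onto the table’s terms.

```javascript
// Cohort 2x2 table. Hypothetical cell labels:
//   a = exposed with event,    b = exposed without event
//   c = unexposed with event,  d = unexposed without event
function cohortMeasures(a, b, c, d, z = 1.96) {
  const riskExposed = a / (a + b);
  const riskUnexposed = c / (c + d);
  const riskTotal = (a + c) / (a + b + c + d);

  const relativeRisk = riskExposed / riskUnexposed;
  // Standard error of ln(RR), log-based approximation.
  const se = Math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d));
  const logRr = Math.log(relativeRisk);

  // Risk difference: risk in exposed minus risk in unexposed.
  const riskDifference = riskExposed - riskUnexposed;
  // Population attributable risk: overall risk minus risk in unexposed.
  const populationAttributableRisk = riskTotal - riskUnexposed;

  return {
    relativeRisk,
    lower: Math.exp(logRr - z * se),
    upper: Math.exp(logRr + z * se),
    riskDifference,
    populationAttributableRisk,
    // Share of the overall risk that the exposure accounts for.
    populationAttributableRiskProportion: populationAttributableRisk / riskTotal,
  };
}

// Made-up counts, for illustration only.
console.log(cohortMeasures(30, 70, 10, 90));
```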

5. CLINICAL AND PREVENTIVE TRIALS

Situation	Epidemiologic results	Statistical methods
Outcome is number of events in the exposed and unexposed groups where the event can occur only once in a single individual	Relative risk with 95% confidence interval; Excess fraction (also called risk reduction); Etiologic fraction (also called attributable risk); Number needed to treat (NNT); Vaccine efficacy with its 95% confidence interval	chi square test of association; Fisher’s exact test; Φ coefficient; chi square test of linear trend for multiple ordered categorical variables; Unconditional multiple logistic regression; Generalized linear models
Outcome is number of events in the exposed and unexposed groups where the event can occur multiple times in a single individual	Relative hazards and their 95% confidence intervals; Vaccine efficacy with 95% confidence interval	Kaplan-Meier survival curves; Log-rank test or Wilcoxon test, depending on whether the proportional hazards assumption is violated; Cox proportional hazards model; Poisson multiple regression; Sequential trial with interim analyses using the alpha-spending function; Generalized linear models
Outcome is time to events in the exposed and unexposed groups	Relative hazards and their 95% confidence intervals; Vaccine efficacy and its 95% confidence interval	Kaplan-Meier survival curves; Log-rank test or Wilcoxon test, depending on whether the proportional hazards assumption is violated; Cox proportional hazards model
Outcome is a continuous variable for the exposed and unexposed groups	Relative risks and their 95% confidence intervals using clinically meaningful cut-offs	chi square test of association; Fisher’s exact test; Φ coefficient; chi square test of linear trend for multiple ordered categorical variables; Unconditional multiple logistic regression; Generalized linear models; Time series analyses
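
Two of the trial results listed above, number needed to treat and vaccine efficacy, follow directly from the risks in the two arms. The sketch below is an illustration under stated assumptions: the function and parameter names are hypothetical, the counts are made up, efficacy is taken as one minus the relative risk, and its confidence interval is obtained by transforming the log-based interval for the relative risk.

```javascript
// Two-arm trial with a once-only event (hypothetical parameter names).
function trialMeasures(eventsTreated, nTreated, eventsControl, nControl, z = 1.96) {
  const riskTreated = eventsTreated / nTreated;
  const riskControl = eventsControl / nControl;

  const relativeRisk = riskTreated / riskControl;
  // Absolute risk reduction: control risk minus treated risk.
  const absoluteRiskReduction = riskControl - riskTreated;
  // Number needed to treat is the reciprocal of the absolute risk reduction.
  const numberNeededToTreat = 1 / absoluteRiskReduction;

  // Log-based interval for RR, then efficacy = 1 - RR.
  const se = Math.sqrt(
    1 / eventsTreated - 1 / nTreated + 1 / eventsControl - 1 / nControl
  );
  const logRr = Math.log(relativeRisk);
  const rrLower = Math.exp(logRr - z * se);
  const rrUpper = Math.exp(logRr + z * se);

  return {
    relativeRisk,
    absoluteRiskReduction,
    numberNeededToTreat,
    efficacy: 1 - relativeRisk,
    // The upper RR limit gives the lower efficacy limit, and vice versa.
    efficacyLower: 1 - rrUpper,
    efficacyUpper: 1 - rrLower,
  };
}

// Made-up counts: 10 events among 1000 vaccinated, 50 among 1000 on placebo.
console.log(trialMeasures(10, 1000, 50, 1000));
```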

6. SCREENING TEST PERFORMANCE EVALUATION

Situation	Epidemiologic results	Statistical methods
Dichotomous (Positive/Negative) result compared with a reference standard	Sensitivity; Specificity; Positive predictivity; Negative predictivity; Accuracy; Likelihood ratio of a positive test; Likelihood ratio of a negative test; Cohen’s kappa; Entropy; Bias index	chi square test of association; Fisher’s exact test; Φ coefficient
Multiple categorical test outcome compared with a reference standard	Likelihood ratios for each categorical test outcome; Receiver operating characteristic curve	chi square test of association; Fisher’s exact test; Φ coefficient; chi square test of linear trend; Area under the ROC curve and its 95% confidence interval by Wilcoxon’s method
Continuous test outcome compared with a reference standard	Likelihood ratios for pre-defined categories of the outcome; Receiver operating characteristic curve	Area under the ROC curve and its 95% confidence interval; Choice of optimum operating point (OOP); Cost-of-error comparisons
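
For a dichotomous test compared with a reference standard, most of the listed results come from the four cells of a 2x2 table. A minimal sketch follows. The function name and cell labels are hypothetical and the counts are made up. Entropy and the bias index are omitted, and so are confidence intervals.

```javascript
// Screening test versus reference standard (hypothetical cell labels):
//   tp = test positive, disease present   fp = test positive, disease absent
//   fn = test negative, disease present   tn = test negative, disease absent
function screeningMeasures(tp, fp, fn, tn) {
  const n = tp + fp + fn + tn;
  const sensitivity = tp / (tp + fn);
  const specificity = tn / (fp + tn);

  // Cohen's kappa: observed agreement corrected for chance agreement.
  const observed = (tp + tn) / n;
  const expected =
    ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n);

  return {
    sensitivity,
    specificity,
    positivePredictivity: tp / (tp + fp),
    negativePredictivity: tn / (fn + tn),
    accuracy: observed,
    likelihoodRatioPositive: sensitivity / (1 - specificity),
    likelihoodRatioNegative: (1 - sensitivity) / specificity,
    kappa: (observed - expected) / (1 - expected),
  };
}

// Made-up counts, for illustration only.
console.log(screeningMeasures(90, 20, 10, 180));
```

The predictive values depend on the prevalence of disease in the sample, so they do not transfer to populations with a different prevalence.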