A QUICK STATS OVERVIEW
Statistics is not a magic act. At its core, it is the effort to understand variation. In this incomplete overview I’ll define some basic statistics terminology and provide a flow chart to help you understand the type of information provided by some common statistical tests. I am by no means a mathematician; my goal is a conceptual understanding, not a mathematical one. I think we’ll have plenty of opportunity to hash out the details in our study group discussions. If we get into complex multivariate models I’ll try to post additional information (if I understand what is going on in the article…).
I. Levels of Measurement – There are four levels of measurement. These levels are a hierarchy with nominal at the lower end and ratio at the top. As you move from higher to lower you lose information and have less ability to mathematically and statistically manipulate the data.
Nominal: Qualitative data with no natural ordering. Characteristics or attributes may be arbitrarily coded with numbers, but the numbers provide no information about the variable. For instance, you might have the categories male and female and code them "1" and "2" respectively.
Ordinal: Qualitative data with an ordering, although the distances between categories are not necessarily equal. For instance, you might measure ADLs using a scoring system of 1=dependent, 2=max assist, 3=mod assist, 4=min assist, 5=independent.
Interval: Quantitative data in which there is a clear ordering and the distance between values is known and equal, but there is no meaningful zero point (absence of the attribute). For instance, temperature measured in Fahrenheit.
Ratio: Quantitative data with a true, meaningful zero point (complete absence of the attribute). For instance, weight or height.
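The practical difference between interval and ratio data can be seen with a little arithmetic. The sketch below is only an illustration with made-up numbers; the function names are my own. It shows that "twice as much" survives a change of units for weight (ratio level) but not for temperature (interval level).

```python
def fahrenheit_to_celsius(f):
    """Convert a Fahrenheit temperature to Celsius."""
    return (f - 32) * 5 / 9


def pounds_to_kilograms(lb):
    """Convert a weight in pounds to kilograms."""
    return lb * 0.45359237


# Interval scale: the zero point is arbitrary, so ratios change with the unit.
ratio_f = 80 / 40
ratio_c = fahrenheit_to_celsius(80) / fahrenheit_to_celsius(40)

# Ratio scale: zero means "none of the attribute", so ratios are preserved.
ratio_lb = 200 / 100
ratio_kg = pounds_to_kilograms(200) / pounds_to_kilograms(100)

print(f"Temperature ratio: {ratio_f:.1f} in Fahrenheit, {ratio_c:.1f} in Celsius")
print(f"Weight ratio: {ratio_lb:.1f} in pounds, {ratio_kg:.1f} in kilograms")
```

So 80 degrees is not meaningfully "twice as hot" as 40 degrees, but 200 pounds really is twice as heavy as 100 pounds.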
II. Measures of Central Tendency
Mode: numeric value that occurs most frequently.
Median: the point on the numeric scale above which and below which 50% of the cases fall.
Mean: sum of scores divided by total number of scores, arithmetic average.
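As a quick illustration of the three definitions above, here is a minimal sketch using Python's built-in statistics module. The scores are invented for the example; note how the single extreme score pulls the mean upward while the mode and median stay put.

```python
import statistics

# Hypothetical scores, made up for illustration (note the outlier, 20).
scores = [2, 3, 3, 4, 5, 5, 5, 7, 20]

mode = statistics.mode(scores)      # value that occurs most frequently
median = statistics.median(scores)  # middle value once the scores are sorted
mean = statistics.mean(scores)      # sum of scores divided by number of scores

print(f"mode={mode}, median={median}, mean={mean}")
```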
III. Measures of Variability
Range: highest score minus the lowest score.
Standard deviation: used with interval or ratio level data, summarizes the average amount of deviation of all values of a distribution from the mean.
Variance: the average squared deviation from the mean; the standard deviation is its square root. Variance is not typically reported in research but is often used in inferential statistical tests.
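The relationship among these three measures can be sketched in a few lines. The data are invented, and I am assuming the sample versions of the formulas (dividing by n - 1), which is what statistics.variance and statistics.stdev compute; the population versions are statistics.pvariance and statistics.pstdev.

```python
import statistics

# Hypothetical scores, made up for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

value_range = max(scores) - min(scores)    # highest score minus lowest score
variance = statistics.variance(scores)     # sample variance (divides by n - 1)
std_dev = statistics.stdev(scores)         # square root of the sample variance

print(f"range={value_range}, variance={variance:.3f}, sd={std_dev:.3f}")
```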
IV. Inferential Statistics (how to draw conclusions about a population from a sample)
Null hypothesis: the statement that there is no true relationship between variables and any observed relationship is due to chance alone.
Type I Error: Reject the null hypothesis when it is true. The probability of committing a Type I error is the level of significance of a given statistical test.
Type II Error: Fail to reject (accept) the null hypothesis when it is false.
Parametric vs. Nonparametric tests: Parametric tests require measurement on at least an interval scale and typically make a number of other assumptions about the distribution of the variables. Nonparametric tests can be applied to data collected on nominal or ordinal scales and make far fewer assumptions.
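The statement above that the level of significance is the probability of a Type I error can be checked by simulation. This is a rough sketch under assumptions of my own choosing: every simulated study draws 30 scores from a population where the null hypothesis really is true (mean 0, standard deviation 1), and a simple two-sided z-test with known standard deviation stands in for whatever test a real study would use.

```python
import random
from statistics import NormalDist, mean

random.seed(0)  # fixed seed so the run is repeatable

ALPHA = 0.05   # level of significance
N = 30         # sample size of each simulated study
TRIALS = 2000  # number of simulated studies


def z_test_p_value(sample, mu0=0.0, sigma=1.0):
    """Two-sided p-value for a z-test with a known population sigma."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


false_rejections = 0
for _ in range(TRIALS):
    # The null hypothesis is true here: the population mean really is 0.
    sample = [random.gauss(0.0, 1.0) for _ in range(N)]
    if z_test_p_value(sample) < ALPHA:
        false_rejections += 1  # rejected a true null: a Type I error

type_i_rate = false_rejections / TRIALS
print(f"Observed Type I error rate: {type_i_rate:.3f}")
```

The observed rate should land close to 0.05, the chosen level of significance.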
V. Some tables and charts about the tests themselves:
What question do you want to answer?
How well does one variable predict another? → Regression analysis
What is the degree that two variables are related?
  Ratio/interval data → Pearson Product Moment Correlation
  Ordinal data → Spearman Rank Correlation
Is there a difference between/among group means?
  Nominal data → Chi square
  Ordinal data → Kruskal-Wallis One-way ANOVA by ranks (if more than 2 groups/comparisons)
    → Wilcoxon Signed-Ranks test (if only 2 groups/comparisons, paired data)
    → Mann-Whitney U Test or Wilcoxon Rank Sum test (if only 2 groups/comparisons, unpaired data)
  Ratio/interval data → t-test (if only 2 groups/comparisons, may be done paired or unpaired)
    → ANOVA (if more than 2 groups/comparisons)
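For those who find code easier to follow than a chart, the decision path can be sketched as a small function. The function name, argument names and labels are hypothetical, invented for this sketch. For ordinal data I have assumed the standard usage: Kruskal-Wallis for more than two groups, Wilcoxon Signed-Ranks for two paired groups, and Mann-Whitney U for two unpaired groups.

```python
LEVELS = ("nominal", "ordinal", "interval", "ratio")


def suggest_test(question, level, groups=2, paired=False):
    """Suggest a test name for a question and level of measurement.

    question: "predict", "relate", or "differ"
    level:    "nominal", "ordinal", "interval", or "ratio"
    """
    if level not in LEVELS:
        raise ValueError(f"unknown level of measurement: {level!r}")
    if question == "predict":
        return "Regression analysis"
    if question == "relate":
        if level in ("interval", "ratio"):
            return "Pearson Product Moment Correlation"
        if level == "ordinal":
            return "Spearman Rank Correlation"
        raise ValueError("the chart lists no correlation for nominal data")
    if question == "differ":
        if level == "nominal":
            return "Chi square"
        if level == "ordinal":
            if groups > 2:
                return "Kruskal-Wallis One-way ANOVA by ranks"
            return "Wilcoxon Signed-Ranks test" if paired else "Mann-Whitney U Test"
        # Interval or ratio data
        if groups > 2:
            return "ANOVA"
        return "paired t-test" if paired else "unpaired t-test"
    raise ValueError(f"unknown question: {question!r}")


print(suggest_test("differ", "ordinal", groups=2, paired=True))
print(suggest_test("differ", "ratio", groups=3))
```

This is only a memory aid; a real choice of test also depends on the assumptions listed for each test.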
Statistical tests commonly used
Test | Assumptions/Requirements | What will it tell me?
t-test | Ratio/interval data; paired or unpaired data; 2 groups to be compared | Is there a difference between group means?
ANOVA | Ratio/interval data; comparison among 3 or more group means; paired or unpaired data; repeated measures possible | Is there a difference among group means?
Bonferroni test | Following ANOVA; within subjects, between subjects or mixed design | Where did differences occur among group means?
Scheffe test | Following ANOVA (post hoc); within or between subjects design | Where did differences occur among group means?
Tukey test | Following ANOVA (post hoc); within or between subjects design | Where did differences occur among group means?
Dunnett test | Following ANOVA (post hoc); within or between subjects design; comparisons only to 1 group mean | Where did differences occur among group means?
Linear Regression Analysis | Ratio/interval data; homogeneity of variance; linearity | Can I predict one variable based on another?
Pearson Product Moment Correlation | Ratio/interval data; linear relationship | What is the degree of relationship between two variables?
Spearman Rank Correlation | Ratio/interval/ordinal data; monotonic relationship (need not be linear) | What is the degree of relationship between two variables?
Chi Square | Nominal/ordinal data | Is the distribution of observed frequencies different from the expected frequencies?
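To make the two correlation rows of the table concrete, here is a minimal sketch that computes both coefficients by hand with made-up data. The helper names are my own. Spearman's coefficient is simply Pearson's coefficient applied to the ranks of the scores, which is why it can be used with ordinal data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product moment correlation of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical data: y always rises with x, but not in a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

print(f"Pearson r    = {pearson(x, y):.3f}")
print(f"Spearman rho = {spearman(x, y):.3f}")
```

Because y always increases with x, the ranks agree perfectly and Spearman's coefficient is 1, while Pearson's coefficient falls slightly short of 1 because the relationship is curved rather than linear.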