A QUICK STATS OVERVIEW

Statistics is not a magic act. At its core, it is the effort to understand variation. In this incomplete overview I’ll define some basic statistics terminology and attempt to provide a flow chart that helps you understand the type of information provided by some common statistical tests. I am by no means a mathematician; my goal is a conceptual understanding, not a mathematical one. I think we’ll have plenty of opportunity to hash out the details in our study group discussions. If we get into complex multivariate models I’ll try to post additional information (if I understand what is going on in the article…).

I. Levels of Measurement – There are four levels of measurement. These levels are a hierarchy with nominal at the lower end and ratio at the top. As you move from higher to lower you lose information and have less ability to mathematically and statistically manipulate the data.

Nominal: Qualitative data with no natural ordering. Characteristics or attributes may be arbitrarily coded numerically, but the numbers provide no information about the variable. For instance, you might have the categories male and female and code them "1" and "2" respectively.

Ordinal: Qualitative data with an ordering. For instance, you might measure ADLs using a scoring system of 1=dependent, 2=max assist, 3=mod assist, 4=min assist, 5=independent. The scores are ordered, but the distance between adjacent scores is not necessarily equal.

Interval: Quantitative data in which there is a clear ordering and the distance between objects is specified but there is no meaningful zero point (absence of attribute). For instance, temperature measured in Fahrenheit: 0 degrees does not mean an absence of temperature.

Ratio: Quantitative data which have a true, meaningful zero point (absence of attribute). For instance, height or weight.
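To see why the missing zero point matters, here is a small Python sketch (the function name f_to_kelvin and the two temperatures are made up for illustration). Fahrenheit is an interval scale, so a ratio of two Fahrenheit readings is not meaningful; Kelvin has a true zero, so a ratio of Kelvin readings is.

```python
def f_to_kelvin(deg_f):
    """Convert Fahrenheit (interval scale) to Kelvin (ratio scale)."""
    return (deg_f - 32) * 5 / 9 + 273.15

cold, warm = 10.0, 20.0  # two illustrative Fahrenheit readings

# Dividing interval-level values suggests "twice as hot", which is misleading.
naive_ratio = warm / cold

# On a scale with a true zero the ratio is meaningful, and it is close to 1.
true_ratio = f_to_kelvin(warm) / f_to_kelvin(cold)

print(round(naive_ratio, 3), round(true_ratio, 3))
```

The difference between the two readings (10 degrees) is meaningful on either scale; only the ratio depends on having a real zero.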

II. Measures of Central Tendency

Mode: numeric value that occurs most frequently.

Median: the point on the numeric scale above which and below which 50% of the cases fall.

Mean: sum of scores divided by total number of scores, arithmetic average.
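As a quick illustration, the three measures can be computed with Python’s standard statistics module. The scores below are made up for the example.

```python
import statistics

scores = [2, 3, 3, 5, 7, 8, 14]  # made-up scores for illustration

mode = statistics.mode(scores)      # most frequent value
median = statistics.median(scores)  # middle value of the sorted scores
mean = statistics.mean(scores)      # sum of scores / number of scores

print(mode, median, mean)
```

Notice that the single high score (14) pulls the mean above the median, which is one reason the median is often preferred for skewed data.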

III. Measures of Variability

Range: highest score minus the lowest score.

Standard deviation: used with interval or ratio level data, summarizes the average amount of deviation of all values of a distribution from the mean.

Variance: the square of the standard deviation (the value before the square root is taken). Variance is not typically reported in research but is often used in inferential statistical tests.
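Here is a minimal sketch of these three measures using Python’s standard statistics module and made-up scores. It assumes the sample versions of variance and standard deviation (dividing by n - 1); the population versions, statistics.pvariance and statistics.pstdev, divide by n instead.

```python
import statistics

scores = [2, 3, 3, 5, 7, 8, 14]  # made-up scores for illustration

value_range = max(scores) - min(scores)  # highest minus lowest score
variance = statistics.variance(scores)   # sample variance (divides by n - 1)
std_dev = statistics.stdev(scores)       # square root of the sample variance

print(value_range, round(variance, 3), round(std_dev, 3))
```

Squaring std_dev gives back variance, which is the relationship described in the definitions above.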

IV. Inferential Statistics (how to draw conclusions about a population from a sample)

Null hypothesis: the statement that there is no true relationship between variables and any observed relationship is due to chance alone.

Type I Error: Reject the null hypothesis when it is true. The probability of committing a Type I error is the level of significance of a given statistical test.

Type II Error: Accept (fail to reject) the null hypothesis when it is false.

Parametric vs. Nonparametric tests: Parametric tests require measurement on at least an interval scale and typically make a number of other assumptions about the distribution of the variables. Nonparametric tests are applied to data collected on nominal or ordinal scales and make far fewer assumptions.
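The link between the level of significance and Type I error can be checked with a small simulation. This is only a sketch under stated assumptions: both samples are drawn from the same normal population (so the null hypothesis is true by construction), a simple two-sample z-test with a known standard deviation stands in for a real test, and the function names, sample size, and seed are made up for illustration.

```python
import random
from statistics import NormalDist, fmean

def two_sample_z_pvalue(a, b, sigma):
    """Two-sided p-value for a z-test with a known, common sigma."""
    se = sigma * (1 / len(a) + 1 / len(b)) ** 0.5
    z = (fmean(a) - fmean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

def type_i_error_rate(trials=2000, n=20, alpha=0.05, seed=1):
    """Fraction of trials that reject a null hypothesis that is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Both groups come from the same population: no true difference.
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(0, 1) for _ in range(n)]
        if two_sample_z_pvalue(a, b, sigma=1.0) < alpha:
            rejections += 1  # a Type I error
    return rejections / trials

rate = type_i_error_rate()
print(rate)
```

With alpha set to 0.05, the printed rate should land near 0.05: about 1 test in 20 rejects the null hypothesis even though there is no real difference.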

 

V. Some tables and charts about the tests themselves:

What question do you want to answer?

How well does one variable predict another? → Regression analysis

What is the degree to which two variables are related?

Ratio/Interval data? → Pearson Product Moment Correlation

Ordinal data? → Spearman Rank Correlation

Is there a difference between/among group means?

Nominal data? → Chi square

Ordinal data? → Kruskal-Wallis One-way ANOVA by ranks (if more than 2 groups/comparisons)

→ Wilcoxon Signed-Ranks test (if only 2 groups/comparisons, paired data)

→ Mann-Whitney U Test or Wilcoxon Rank Sum test (if only 2 groups/comparisons, unpaired data)

Ratio/interval data? → t-test (if only 2 groups/comparisons, may be done paired or unpaired)

→ ANOVA (if more than 2 groups/comparisons)
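The "difference between/among groups" branch of the flow chart can also be written as a small lookup function. This is a hypothetical helper (choose_test and its parameter names are made up for illustration), and it follows the conventional usage: Kruskal-Wallis and ANOVA for more than 2 groups, and the Wilcoxon, Mann-Whitney, and t-tests for exactly 2 groups.

```python
def choose_test(level, n_groups, paired=False):
    """Suggest a test for comparing groups.

    level: "nominal", "ordinal", or "interval/ratio"
    n_groups: number of groups or comparisons (2 or more)
    paired: True if the same subjects are measured in each group
    """
    if n_groups < 2:
        raise ValueError("need at least 2 groups to compare")
    if level == "nominal":
        return "Chi square"
    if level == "ordinal":
        if n_groups > 2:
            # Like the flow chart, this ignores pairing for more than 2 groups.
            return "Kruskal-Wallis one-way ANOVA by ranks"
        return "Wilcoxon signed-ranks test" if paired else "Mann-Whitney U test"
    if level == "interval/ratio":
        if n_groups > 2:
            return "ANOVA"
        return "paired t-test" if paired else "unpaired t-test"
    raise ValueError(f"unknown level of measurement: {level}")

print(choose_test("ordinal", 2, paired=True))
print(choose_test("interval/ratio", 3))
```

A lookup like this is only a starting point; each test still has its own assumptions that need to be checked against the data.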

Statistical tests commonly used

Test: t-test
Assumptions/Requirements: Ratio/interval data; paired or unpaired data; 2 groups to be compared
What will it tell me? Is there a difference between group means?

Test: ANOVA
Assumptions/Requirements: Ratio/interval data; comparison among 3 or more group means; paired or unpaired data; repeated measures possible
What will it tell me? Is there a difference among group means?

Test: Bonferroni test
Assumptions/Requirements: Following ANOVA; within subjects, between subjects or mixed design
What will it tell me? Where did differences occur among group means?

Test: Scheffe test
Assumptions/Requirements: Following ANOVA (post hoc); within or between subjects design
What will it tell me? Where did differences occur among group means?

Test: Tukey test
Assumptions/Requirements: Following ANOVA (post hoc); within or between subjects design
What will it tell me? Where did differences occur among group means?

Test: Dunnett test
Assumptions/Requirements: Following ANOVA (post hoc); within or between subjects design; comparisons only to 1 group mean
What will it tell me? Where did differences occur among group means?

Test: Linear Regression Analysis
Assumptions/Requirements: Ratio/interval data; homogeneity of variance; linearity
What will it tell me? Can I predict one variable based on another?

Test: Pearson Product Moment Correlation
Assumptions/Requirements: Ratio/interval data; linear relationship
What will it tell me? What is the degree of relationship between two variables?

Test: Spearman Rank Correlation
Assumptions/Requirements: Ratio/interval/ordinal data; monotonic relationship (consistently increasing or decreasing, not necessarily linear)
What will it tell me? What is the degree of relationship between two variables?

Test: Chi Square
Assumptions/Requirements: Nominal/ordinal data
What will it tell me? Is the distribution of observed frequencies different from the expected frequencies?
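To make the chi square question concrete, here is a minimal sketch of the statistic itself, the sum of (observed - expected)^2 / expected over the categories. The counts are made up (say, 50 coin flips with 30 heads), the function name is hypothetical, and 3.841 is the standard critical value for 1 degree of freedom at the 0.05 level.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [30, 20]  # made-up counts: 30 heads, 20 tails
expected = [25, 25]  # counts expected if the coin is fair

chi2 = chi_square_statistic(observed, expected)

# Critical value for df = 1 (number of categories - 1) at alpha = 0.05.
CRITICAL_DF1_ALPHA_05 = 3.841
significant = chi2 > CRITICAL_DF1_ALPHA_05

print(chi2, significant)
```

Here the statistic is 2.0, which is below 3.841, so these made-up counts would not be judged different from the expected frequencies at the 0.05 level.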