statistics

Statistics deals with the collection, organisation, analysis, interpretation, and presentation of data.

types

  • Descriptive statistics summarise data
  • Inferential statistics draw conclusions or make predictions about a population from a sample of data

data types

type         examples
binary       yes or no
categorical  film genre
count        number of pushups
number       42
ordinal      low, medium, high

terms

mean

The arithmetic mean is the sum of all values in a given set divided by the number of values in that set:

\[ \bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i = \frac{x_1 + x_2 + \dots + x_N}{N} \]
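
The definition above can be sketched in a few lines of Python. This is only an illustration: the function name `arithmetic_mean` and the sample `data` are made up for this example, and `statistics.fmean` from the standard library is used purely as a cross-check.

```python
from statistics import fmean


def arithmetic_mean(values):
    """Sum of all values divided by the number of values."""
    if not values:
        raise ValueError("the mean of an empty set is undefined")
    return sum(values) / len(values)


# Hypothetical sample data, chosen only for illustration
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(arithmetic_mean(data))  # 5.0
print(fmean(data))            # 5.0, the standard library agrees
```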

Other types of means:

  • The geometric mean is the Nth root of the product of all N values in a given set
  • The harmonic mean is the reciprocal of the arithmetic mean of the reciprocals of the values in a given set. It is useful for averaging rates
  • The root mean square is the square root of the arithmetic mean of the squares of the values in a given set
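
A rough Python sketch of these three means, assuming non-empty sets and, for the geometric and harmonic means, strictly positive values. The function names are invented for this example. The speeds in the harmonic example illustrate the "average rate" case: equal distances driven at 40 and 60 average out to about 48, not 50.

```python
import math


def geometric_mean(values):
    # N-th root of the product of all N values (assumes positive values)
    return math.prod(values) ** (1 / len(values))


def harmonic_mean(values):
    # Reciprocal of the arithmetic mean of the reciprocals (assumes no zeros)
    return len(values) / sum(1 / v for v in values)


def root_mean_square(values):
    # Square root of the arithmetic mean of the squared values
    return math.sqrt(sum(v * v for v in values) / len(values))


print(geometric_mean([2, 8]))    # 4.0
print(harmonic_mean([40, 60]))   # approximately 48
print(root_mean_square([1, 7]))  # 5.0
```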

standard deviation

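The standard deviation (σ) of a set measures how spread out its values are around the mean. It is the square root of the variance. The formula below is the sample form, which divides by N - 1 rather than N: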

\[ \sigma = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2} \]
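
As a minimal sketch, the formula can be computed step by step in Python. The name `sample_std` and the sample `data` are assumptions made for this example. `statistics.stdev` from the standard library also divides by N - 1, so it serves as a cross-check.

```python
import math
from statistics import stdev


def sample_std(values):
    """Standard deviation using the N - 1 divisor from the formula above."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed")
    mean = sum(values) / n
    # Sum of squared deviations from the mean
    squared_deviations = sum((v - mean) ** 2 for v in values)
    return math.sqrt(squared_deviations / (n - 1))


# Hypothetical sample data: mean 5, squared deviations sum to 32
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(sample_std(data))  # approximately 2.138, i.e. sqrt(32 / 7)
print(stdev(data))       # same value from the standard library
```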

other

  • The median is the middle value in a sorted set; when the set has an even number of values, it is the mean of the two middle values
  • The mode is the most frequent value in a set
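
Both can be sketched in Python as follows. The function names are made up for this example, and two conventions are assumed: an even-sized set takes the mean of its two middle values as the median, and when several values tie for most frequent, the one seen first is returned.

```python
from collections import Counter


def median(values):
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single middle value
        return ordered[mid]
    # Even count: mean of the two middle values (assumed convention)
    return (ordered[mid - 1] + ordered[mid]) / 2


def mode(values):
    # most_common(1) returns a list holding one (value, count) pair
    return Counter(values).most_common(1)[0][0]


print(median([5, 1, 3]))                   # 3
print(median([4, 1, 3, 2]))                # 2.5
print(mode(["drama", "comedy", "drama"]))  # drama
```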

correlation

Covariance measures how one variable varies together with another variable. A positive covariance indicates that the two variables tend to move in the same direction; conversely, a negative covariance indicates that they tend to move in opposite directions.

Correlation is covariance normalised by the standard deviations of the two variables, so its values always range from -1 to 1. A correlation of 1 is a perfect positive linear relationship, -1 is a perfect negative linear relationship, and 0 means no linear relationship.
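
A small Python sketch of both quantities, assuming the sample (N - 1) form of covariance and the Pearson correlation coefficient, which is the usual meaning of "correlation" here. The function names and sample data are invented for this example.

```python
import math


def covariance(xs, ys):
    """Sample covariance of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    # covariance(v, v) is the variance of v, so its square root is the
    # standard deviation
    std_x = math.sqrt(covariance(xs, xs))
    std_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (std_x * std_y)


# Hypothetical data: ys rises with xs, zs falls as xs rises
xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]
zs = [8, 6, 4, 2]

print(covariance(xs, ys))   # positive: the variables move together
print(correlation(xs, ys))  # approximately 1.0
print(correlation(xs, zs))  # approximately -1.0
```

Because the covariance depends on the units of both variables, its size is hard to interpret on its own; dividing by the standard deviations removes the units, which is why the correlation is the easier number to compare across data sets.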