# Statistics jargon made simple (08 Feb 18)

## Like any other specialisation, quantitative research has its own technical terms. This short glossary explains them in simple language.

**Base**

The number of respondents on which a table of results is based.

**Census**

The study of all units in the population/universe.

**Cluster sampling**

Sampling based on the selection of groups of closely located units, followed by the further selection of individual units (or smaller groups) from the groups selected at the first stage. Commonly used for face-to-face surveys as a way of minimising interviewers’ travel time and costs.

**Confidence levels**

The probability that the estimate obtained from a sample survey is not more than a certain distance from the true figure for the population. The 95% confidence level is the most commonly used.

**Convenience sampling**

Sampling units in an unscientific way, based on whichever people or units are most easily or conveniently available.

**Cross tabulations**

Tables which present the results broken down by subgroups or characteristics of interest.

**Deadwood**

Cases in the sample which should not have been included, e.g. addresses where the dwelling has been demolished. The selected sample minus the deadwood is known as the eligible sample.

**Eligible sample**

The selected sample minus the deadwood.

**Flow sample**

A sampling method in which respondents are selected as they become of interest to the research, rather than from a pre-existing list, e.g. people visiting a housing office over a period of time.

**Frequency counts**

Tables giving the answers of all respondents to a question (sometimes called ‘hole counts’).

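**Margin of error**

In survey and sampling practice, this is often used as an alternative term for “Confidence Interval”. Strictly speaking, it is half the width of that interval (the “plus or minus” figure quoted alongside a result). Further alternative terms that may be used for this include “Statistical Reliability” and “Tolerance Margin”.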
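As a rough illustration, here is a minimal Python sketch of the usual textbook approximation for the margin of error on a percentage. It assumes a simple random sample and a 95% confidence level (a multiplier of 1.96). The function name `margin_of_error` and the figures are made up for the example.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a proportion p (0 to 1)
    from a simple random sample of n respondents.
    z = 1.96 corresponds to the 95% confidence level."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical survey: 50% answer "yes" out of 1,000 respondents
moe = margin_of_error(0.5, 1000)
print(f"+/- {moe * 100:.1f} percentage points")  # +/- 3.1 percentage points
```

So a result of 50% from 1,000 respondents would be quoted as roughly 50% plus or minus 3 points. The approximation does not apply to quota or convenience samples.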

**Mean**

The average value of a list of numbers: add up the values and divide by the number of observations.

**Median**

The middle value of a list of numbers when sorted (or the average of the two middle values when the list has an even number of entries).

**Mode**

The most frequently occurring value (or range).
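The three averages can be compared side by side with a short Python sketch using the standard `statistics` module. The household sizes below are invented purely for illustration.

```python
import statistics

# Hypothetical survey answers: number of people in each household
household_sizes = [1, 2, 2, 3, 4, 2, 5, 1]

mean_size = statistics.mean(household_sizes)      # total divided by the number of observations
median_size = statistics.median(household_sizes)  # middle of the sorted list (average of the two middle values here)
mode_size = statistics.mode(household_sizes)      # most frequently occurring value

print(mean_size, median_size, mode_size)  # 2.5 2.0 2
```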

**Non-probability based sampling techniques**

These are methods where units are selected in a subjective and non-random way, often by the interviewer, and where the probability of selection of each unit in the population cannot be determined.

**Non-response**

Members of the sample which are eligible for inclusion but fail to provide any data, e.g. where no one can be contacted or the individual refuses to take part. These must be distinguished from deadwood.

**Normal distribution**

A specific shape of distribution curve for a variable shown as a frequency distribution, whereby the variable is most likely to take values near the middle of its range (i.e. its mean), and the chance of it taking values further away from the middle tails off on either side. Many features in nature tend to follow a “normal distribution”, such as the heights of a specific species of tree.

**Population (or Universe)**

The entire body of interest to the decision-maker. The units can be human or non-human.

**Probability**

A measure or estimate of how likely it is that something will happen or that a statement is true. Probabilities are given a value between 0 (0% chance, or will not happen) and 1 (100% chance, or will happen). The higher the probability, the more likely the event is to happen or, over a long series of samples, the more often such an event is expected to happen.

**Quota sampling**

A sampling strategy which allows the interviewer to choose any individuals provided that they conform in total to a pre-determined pattern, e.g. set percentages by sex and age. It is not possible to calculate the confidence levels for surveys conducted by this method. It is very commonly used in standard market and opinion poll research, and is also used in sampling for qualitative research.

**Random sampling**

A sampling strategy based on the principle that all population members have an equal (or at least pre-determined) chance of being selected. Specific individuals are selected in advance from a sampling frame. This method offers a better chance of a sample that is representative of the population, and allows sampling errors to be calculated.

**Sample fraction**

The proportion of the population which is sampled.

**Sample frame**

A list of the population units from which a sample is drawn.

**Sampling**

The process of selecting a fraction of the total number of units of interest to decision makers, for the ultimate purpose of being able to draw conclusions about the entire body of units (population/universe). You use sampling to learn about the views of a large group of people by speaking to a smaller number of them.

This is done on the assumption that the few we have interviewed have the same characteristics as the rest of the population, i.e. they REPRESENT the population of interest.

**Sampling error**

The extent to which estimates from a random sample differ from the value for the whole population being surveyed.

**Sampling frame**

A record of the population from which the sample is selected.

**Simple random sampling**

The act of selecting units from a population in a random way (“1 in n”), such that each unit has the same chance of selection as every other, for example drawing lottery balls.
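A simple random sample is easy to sketch in Python with the standard `random` module. The sampling frame of 100 households and the seed value are assumptions made for the example, and the seed is there only so the draw can be repeated.

```python
import random

# Hypothetical sampling frame of 100 households
sampling_frame = [f"household_{i}" for i in range(1, 101)]

rng = random.Random(2018)  # fixed seed so the draw can be reproduced
# Draw 10 units without replacement: a "1 in 10" sample
simple_random_sample = rng.sample(sampling_frame, k=10)

print(simple_random_sample)
```

Every household has the same chance of being picked, and none can be picked twice.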

**Standard deviation**

A measure of the spread of a group of observations as defined by a mathematical formula based on the aggregate of the distances between each observation and the group mean.

**Standard error**

A measure of the spread of the means of a set of samples taken from the same population. This is typically related to the Standard Deviation of the observations within the samples, and the number of observations within each sample.
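To show how the two measures relate, here is a small Python sketch. It assumes the common textbook estimate for a simple random sample, where the standard error of the mean is the sample standard deviation divided by the square root of the sample size. The scores are invented.

```python
import math
import statistics

# Hypothetical satisfaction scores (out of 10) from eight respondents
scores = [6, 7, 8, 5, 9, 7, 6, 8]

# Sample standard deviation: spread of the individual observations
sd = statistics.stdev(scores)

# Estimated standard error of the mean: spread we would expect
# between the means of repeated samples of this size
se = sd / math.sqrt(len(scores))

print(f"standard deviation = {sd:.2f}, standard error = {se:.2f}")
```

The standard error shrinks as the sample grows, which is why larger samples give tighter estimates.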

**Statistical reliability**

(See “Margin of Error”)

**Stratification**

The process of organising the sampling frame into different groups and then selecting a sample of appropriate size from each of these groups.

**Stratified random sampling**

Selecting a sample in a random way while controlling for certain characteristics of the population or for different sub-populations, for example by drawing a separate “1 in n” sample of males from the males in a population, and likewise for females.

**Proportionate Stratified Random Sampling**

As for stratified random sampling, but ensuring that the “1 in n” sampling fraction is the same for each sub-population.

**Disproportionate Stratified Random Sampling**

A different sampling fraction (1 in n) may be selected for different sub-populations, thus allowing over- and under-sampling (and so over- and under-representation) of certain groups.
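The difference between proportionate and disproportionate stratified sampling can be sketched in Python as follows. The helper `stratified_sample`, the two strata and the sampling fractions are all hypothetical and chosen only for illustration.

```python
import random

def stratified_sample(strata, fractions, rng):
    """Draw a simple random sample within each stratum.

    strata:    dict mapping stratum name -> list of units
    fractions: dict mapping stratum name -> sampling fraction (0 to 1)
    """
    sample = {}
    for name, units in strata.items():
        k = round(len(units) * fractions[name])  # units to draw from this stratum
        sample[name] = rng.sample(units, k)
    return sample

# Hypothetical population of 400 males and 600 females
strata = {
    "male": [f"m{i}" for i in range(400)],
    "female": [f"f{i}" for i in range(600)],
}
rng = random.Random(42)  # fixed seed so the draw can be reproduced

# Proportionate: the same 1 in 10 fraction in every stratum
proportionate = stratified_sample(strata, {"male": 0.10, "female": 0.10}, rng)

# Disproportionate: males over-sampled (1 in 5), females under-sampled (1 in 20)
disproportionate = stratified_sample(strata, {"male": 0.20, "female": 0.05}, rng)

print({name: len(units) for name, units in proportionate.items()})
print({name: len(units) for name, units in disproportionate.items()})
```

A disproportionate sample like the second one would normally be weighted back at the analysis stage before reporting figures for the whole population.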

**Area Sampling**

Effectively the same as “Cluster Sampling”, as applied to specific “areas” or “geographies”.

**Systematic sampling**

The act of selecting units from a population by ordering the population units in a certain way, then selecting every nth unit from that population, going in order, to appear in the sample. This can be carried out as part of a simple, stratified or clustered random sample.
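Selecting “every nth unit” can be sketched in a few lines of Python. The random starting point within the first interval is a common convention rather than something stated in the definition, and the function name `systematic_sample` and the numbered frame are made up for the example.

```python
import random

def systematic_sample(ordered_frame, n, rng):
    """Select every nth unit from an ordered frame,
    starting at a random position within the first n units."""
    start = rng.randrange(n)  # random start: 0 .. n-1
    return ordered_frame[start::n]

# Hypothetical ordered frame: addresses numbered 1 to 100
ordered_frame = list(range(1, 101))

every_tenth = systematic_sample(ordered_frame, 10, random.Random(1))
print(every_tenth)
```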

**Weighting**

A procedure for increasing the size of some groups in a sample to make it possible to subject them to separate analysis, and then adjusting the results to make the sample representative of the whole population. Weighting is the process of systematically making certain respondents more or less important in the analysis. It’s done to make the survey results more realistic, i.e. more like the real world, market place or segment that you are interested in.
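Here is a minimal Python sketch of one simple way to weight, assuming a single characteristic (age group) and known population shares. Each group's weight is its population share divided by its share of the sample. All the figures are invented for illustration.

```python
# Hypothetical population shares and achieved sample sizes by age group
population_share = {"under_35": 0.40, "35_plus": 0.60}
sample_counts = {"under_35": 100, "35_plus": 300}
# Hypothetical proportion answering "yes" in each group
yes_rate = {"under_35": 0.70, "35_plus": 0.50}

n = sum(sample_counts.values())

# Weight = population share / sample share, so under-represented
# groups count for more and over-represented groups for less
weights = {g: population_share[g] / (sample_counts[g] / n) for g in sample_counts}

unweighted_yes = sum(sample_counts[g] * yes_rate[g] for g in sample_counts) / n
weighted_yes = (
    sum(sample_counts[g] * weights[g] * yes_rate[g] for g in sample_counts)
    / sum(sample_counts[g] * weights[g] for g in sample_counts)
)

print(f"unweighted: {unweighted_yes:.2f}, weighted: {weighted_yes:.2f}")
# unweighted: 0.55, weighted: 0.58
```

Under-35s make up 25% of this sample but 40% of the population, so their answers are weighted up and the overall “yes” figure moves from 55% to 58%.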

Best

Rod

PS If this topic interests you, why not run a course on Statistics in house? Email me at rod@rodlaird.co.uk or phone **01494 772 458**. We cater for all levels from introductory to advanced.
