Handling data is a key part of a biologist's role. Raw data alone is meaningless; it is the job of the scientist to process and present the data in a way that allows people to understand it without distortion or obfuscation.
By using the right number of significant figures when reporting our numerical data, we communicate important information about our confidence in each digit. This prevents us, and the people making use of our data, from drawing incorrect conclusions from results or differences that are too small for our experimental design to measure accurately.
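As a sketch of the idea, the small helper below (a hypothetical function, not from any particular library) rounds a value to a chosen number of significant figures, so a reported result carries only the digits we actually trust:

```python
from math import floor, log10

def round_sig(x, sig=3):
    """Round x to `sig` significant figures (hypothetical helper)."""
    if x == 0:
        return 0.0
    # floor(log10(|x|)) gives the exponent of the leading digit
    return round(x, sig - 1 - floor(log10(abs(x))))

# e.g. a reading of 0.012345 mL trustworthy to 3 significant figures
print(round_sig(0.012345, 3))  # 0.0123
```

Reporting 0.0123 rather than 0.012345 signals that only the first three digits are meaningful.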
The arithmetic mean is a key first step in getting to grips, and helping others get to grips, with data from a set of repeated measurements or a sample of a population.
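A minimal sketch of the calculation, using hypothetical repeated measurements (here, leaf lengths in cm):

```python
# Arithmetic mean: the sum of the measurements divided by their count.
measurements = [4.2, 4.5, 4.1, 4.4, 4.3]  # hypothetical leaf lengths in cm
mean = sum(measurements) / len(measurements)
print(mean)  # approximately 4.3
```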
Presenting numerical data in tables and graphs is a vital skill for communicating our own results effectively. The ability to interpret them, in turn, allows us to understand rapidly what other scientists are trying to tell us and whether their data supports it.
Probability and chance are important considerations in experimental work and form the basis for effective statistical analysis of the results. Without understanding probability it is impossible to know whether an experimental result is surprising and interesting or merely coincidental.
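A quick simulation can make this concrete. The sketch below (hypothetical numbers, using only the standard library) estimates how often ten tosses of a fair coin give eight or more heads purely by chance; the exact binomial probability is 56/1024, about 0.055:

```python
import random

random.seed(1)  # fixed seed so the simulation is reproducible

# Estimate P(8 or more heads in 10 fair-coin tosses) by simulation.
trials = 100_000
hits = sum(
    1 for _ in range(trials)
    if sum(random.random() < 0.5 for _ in range(10)) >= 8
)
print(hits / trials)  # roughly 0.055
```

A result that feels striking ("8 heads out of 10!") still happens in roughly 1 run in 18 by chance alone, which is why intuition needs checking against probability.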
An understanding of sampling allows us to gather meaningful data from a huge population, or from a potentially never-ending series of laboratory assays, using only the time and effort realistically available to us.
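A simple random sample can be sketched as follows; the population of 10,000 numbered individuals is hypothetical:

```python
import random

random.seed(0)  # fixed seed for reproducibility

# Hypothetical population of 10,000 individuals, identified by number.
population = list(range(1, 10_001))

# Simple random sample of 50 individuals, drawn without replacement.
sample = random.sample(population, 50)
print(len(sample))  # 50
```

Provided the sample is drawn at random, its statistics (mean, proportion, and so on) can stand in for those of the whole population, within a quantifiable margin of error.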
Different descriptions of ‘central tendency’ or ‘average’ tell us slightly different things about our data. It is important to have a clear understanding of how the mean, median and mode are defined in order to know what they do (and, equally importantly, do not) tell us about a data set.
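The difference is easiest to see on a small hypothetical data set containing one outlier: the mean is dragged towards the outlier, while the median and mode are not.

```python
import statistics

data = [2, 3, 3, 4, 5, 5, 5, 40]  # hypothetical counts with one outlier

print(statistics.mean(data))    # 8.375 — pulled up by the outlier (40)
print(statistics.median(data))  # 4.5  — the middle of the ordered data
print(statistics.mode(data))    # 5    — the most frequent value
```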
Scatter diagrams are excellent visual representations of data being analysed for correlation.
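The number usually quoted alongside a scatter diagram is the Pearson correlation coefficient, r. The sketch below computes it from first principles for hypothetical paired measurements; in practice you would typically plot the points (for example with matplotlib) and compute r with a statistics library.

```python
# Hypothetical paired measurements, as plotted on a scatter diagram.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.0, 9.8]

n = len(x)
mx, my = sum(x) / n, sum(y) / n

# Pearson r: covariance divided by the product of the spreads.
cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
sx = sum((a - mx) ** 2 for a in x) ** 0.5
sy = sum((b - my) ** 2 for b in y) ** 0.5
r = cov / (sx * sy)
print(round(r, 3))  # close to 1: a strong positive correlation
```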
Understanding ‘order of magnitude’ ties in directly with the concept of powers and logs and allows scientists to quickly describe systems over a wide range of scales without creating confusion or introducing errors.
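The link to logs is direct: the order of magnitude of a quantity is essentially the power of ten in its scientific notation, which is the integer part of its base-10 logarithm. A hypothetical helper:

```python
from math import floor, log10

def order_of_magnitude(x):
    """Exponent of the power of ten just below |x| (hypothetical helper)."""
    return floor(log10(abs(x)))

print(order_of_magnitude(3.2e-6))  # -6, e.g. a bacterium's length in metres
print(order_of_magnitude(1.7))     #  0, e.g. a person's height in metres
```

Saying a bacterium is "six orders of magnitude smaller than a person" conveys the scale difference instantly, without counting zeros.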
Statistical tests allow us to put a number on exactly how unlikely our results are ‘by chance’ and hence how much confidence we have that what we are seeing is in fact not merely ‘chance’ but a feature of the system we are investigating.
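One way to see where that number comes from is a permutation test, sketched below with hypothetical control and treatment measurements. We repeatedly shuffle the group labels and ask how often a difference in means at least as large as the one observed arises by chance:

```python
import random

random.seed(42)  # fixed seed for reproducibility

# Hypothetical measurements from two groups of five.
control   = [4.1, 3.8, 4.0, 4.3, 3.9]
treatment = [4.8, 5.1, 4.9, 5.2, 4.7]
observed = sum(treatment) / 5 - sum(control) / 5

# Permutation test: shuffle the labels and recompute the difference.
pooled = control + treatment
n_perm = 10_000
count = 0
for _ in range(n_perm):
    random.shuffle(pooled)
    diff = sum(pooled[5:]) / 5 - sum(pooled[:5]) / 5
    if abs(diff) >= abs(observed):
        count += 1

p_value = count / n_perm
print(p_value)  # small: a difference this large is very unlikely by chance
```

A small p-value says the observed difference would rarely arise from random relabelling alone, giving us confidence it reflects a real feature of the system.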
Standard deviation is an extremely useful measure of dispersion, helping us to characterise the data we have gathered and the system we gathered it from.
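Python's standard library computes it directly; the measurements below are hypothetical. Note the distinction between the sample standard deviation (dividing by n − 1, used when the data are a sample from a larger population) and the population standard deviation (dividing by n):

```python
import statistics

data = [4.2, 4.5, 4.1, 4.4, 4.3]  # hypothetical repeated measurements

print(statistics.stdev(data))   # sample standard deviation, ~0.158
print(statistics.pstdev(data))  # population standard deviation, ~0.141
```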
Knowing the uncertainty in our measurements, and knowing how this then contributes to the uncertainties in the results of processing our data, protects us from making claims or conclusions that are not actually supported by the data.
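As one common rule, when two independently measured quantities are added, their uncertainties combine in quadrature (the square root of the sum of squares). A sketch with hypothetical values:

```python
from math import sqrt

# Hypothetical measurements with their uncertainties.
x, dx = 12.3, 0.2    # e.g. 12.3 ± 0.2
y, dy = 4.56, 0.05   # e.g. 4.56 ± 0.05

# For independent uncertainties, z = x + y has dz = sqrt(dx² + dy²).
z = x + y
dz = sqrt(dx**2 + dy**2)
print(f"{z:.2f} ± {dz:.2f}")  # 16.86 ± 0.21
```

The combined uncertainty (±0.21) then tells us which digits of the result 16.86 we can honestly claim, linking back to significant figures.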