Why three large data sets for the MEI specification?
03 March 2020
This blog post was originally published on 1 August 2017. It covers the general questions raised during the reform about the approach taken to the new large data set requirement in the H630/H640 OCR B (MEI) A Level Mathematics specification.
See below for our updated notes on each of the published H630/H640 large data sets:
 Notes on the Large Data Set 2 (H630 in 2019 and H640 in 2020)
 Notes on the Large Data Set 3 (H630 in 2020 and H640 in 2021)
 Notes on the Large Data Set 4 (H630 in 2021 and H640 in 2022)
The large data sets associated with AS and A Levels in mathematics should serve two purposes: they are a teaching resource and they provide a context for setting examination questions.
OCR has issued three data sets for AS/A Level Mathematics B (MEI). Our hope is that teachers will use all three for teaching, but for each cohort of students just one will be the focus of some of the questions in the exam. Each data set will be clearly labelled as to when it is used.
| Exam series | June 2018 | June 2019 | June 2020 | June 2021 | June 2022 | June 2023 |
| --- | --- | --- | --- | --- | --- | --- |
| AS | 1 | 2 | 3 | 4 | 5 | 6 |
| A Level | 1 | 1 | 2 | 3 | 4 | 5 |

So if you teach A Level Maths over two years, then the class you start teaching in September 2017 will see some questions on data set 1 in their AS exams in 2018 (if they sit AS) and their A Level exams in 2019, as the following table demonstrates.
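| Publish | Start teaching | AS exam (if sat) | A Level exam | Data set |
| --- | --- | --- | --- | --- |
| June 2017 | Sept 2017 | June 2018 | June 2019 | 1 |
| 2017 | 2018 | 2019 | 2020 | 2 |
| 2017 | 2019 | 2020 | 2021 | 3 |
| 2019 | 2020 | 2021 | 2022 | 4 |
| 2020 | 2021 | 2022 | 2023 | 5 |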
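
For anyone who wants to check which data set applies to a particular class, the schedule above can be written as a small lookup. This is only an illustrative sketch: the function names are hypothetical, and the rules are simply read off the schedule for June 2018 to June 2023, so nothing beyond that range is assumed.

```python
# Illustrative sketch only: function names are hypothetical, and the rules
# are read off the published schedule for June 2018 to June 2023.

FIRST_SERIES, LAST_SERIES = 2018, 2023


def _check(exam_year: int) -> None:
    """Reject exam years that the published schedule does not cover."""
    if not FIRST_SERIES <= exam_year <= LAST_SERIES:
        raise ValueError(
            f"schedule only covers June {FIRST_SERIES} to June {LAST_SERIES}"
        )


def as_data_set(exam_year: int) -> int:
    """Data set number used in the AS exams sat in June of exam_year."""
    _check(exam_year)
    return exam_year - 2017


def a_level_data_set(exam_year: int) -> int:
    """Data set number used in the A Level exams sat in June of exam_year."""
    _check(exam_year)
    # Data set 1 is used for both the June 2018 and June 2019 A Level series.
    return max(1, exam_year - 2018)


def two_year_cohort(start_year: int) -> dict:
    """Exam years and data set for a class starting in September of start_year."""
    a_level_year = start_year + 2
    return {
        "as_exam": start_year + 1,
        "a_level_exam": a_level_year,
        "data_set": a_level_data_set(a_level_year),
    }


print(two_year_cohort(2017))
# {'as_exam': 2018, 'a_level_exam': 2019, 'data_set': 1}
```

For a two-year class, the AS and A Level lookups return the same data set number, which matches the single "Data set" entry for each cohort in the schedule.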

MEI and OCR have some experience of prerelease data from our Quantitative Problem Solving Core Maths qualification. The CIA World Factbook data set that forms the current prerelease for that qualification became the basis for our thinking and development for AS and A Level.
We tried to write different types of questions using that data set, based on A Level content. When doing this, we realised that things in some countries have changed quite a lot during the lifetime of the legacy mathematics specifications, so the data set would need to be updated from time to time. We didn’t want students learning about how things used to be in the world 15 years ago if that no longer reflected the current position.
We are aware that some students (and maybe teachers) do not currently enjoy the statistics in the legacy Mathematics A Levels. We think that may be because in mathematics the focus has been on learning statistical techniques without much idea of why you might want to use them.
The large data sets provide a place to use the techniques. As part of our FMSP work, MEI has been working on a project to produce videos to support teachers with the new A Levels. You can now watch these videos online, including some showing ideas for teaching statistics.
For the statistics videos, we had students and teachers working with different large data sets; the students were really enthusiastic about working with real data and the way this helped them to extend their understanding. We thought that working with more than one data set could encourage students to understand that the techniques they are learning are applicable to a wide variety of data.
The use of large data sets in teaching and examining A Level Mathematics is new – it is an opportunity to make the statistics students learn more similar to the ways they will use statistics in future study and work. We thought it was important to review the data sets used and to make sure they continued to be suitable for examining.
This needs a three-year cycle – two years for using the data set in teaching and a year to review and update if necessary. LDS 4 may be a refresh of the data from LDS 1, or a new data set, depending on the post-assessment review of the questions set in the live assessment. Similarly, LDS 5 will be the refresh or replacement of LDS 2, and so on.
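
The rotation can be stated as a one-line rule: each data set from LDS 4 onwards takes over from the one numbered three lower. The sketch below is illustrative only; `predecessor` is a hypothetical name, and the rule for numbers above 5 rests on the "and so on" in the paragraph above.

```python
from typing import Optional


def predecessor(lds: int) -> Optional[int]:
    """Return the number of the data set that LDS `lds` refreshes or replaces.

    LDS 1 to 3 are the initial data sets, so they have no predecessor.
    """
    if lds < 1:
        raise ValueError("data set numbers start at 1")
    return lds - 3 if lds > 3 else None


for n in range(1, 7):
    print(n, predecessor(n))
```

Whether a given successor is a refresh of the same data or an entirely new data set is decided by the review, so the function only records which slot in the cycle it fills.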
The data in the CIA World Factbook is grouped by country; we realised that data based on individuals would allow better teaching of distributions. There aren’t many publicly available data sets which contain ungrouped data on individuals. The NHANES data set, from American health surveys, is often used in statistics courses and it contains a wealth of data so we decided to use that as one data set.
We wanted to make the process of working with data manageable for teachers, educationally valuable for students and workable for examining. We decided that three data sets – one per cohort – updated on a rotating cycle would do the trick. In the first year of teaching the new specifications, teachers might choose to work with one data set. The next year, they could still use the lessons that had gone well while introducing the next data set, and so on.
Our hope is that teachers will use all the data sets for teaching, concentrating more on the examination data set nearer the end of the course. For students, working with more than one data set will help them see that statistics is about working with a variety of data sets.
Having got data about countries and data about (American) individuals, we thought it would be good to have some England-based data, and the London Datastore is a good place to find suitable data. So we ended up with the following three initial data sets, which we hope will appeal to students with different interests in terms of the other subjects they are taking.
 LDS_1: Data about countries
 LDS_2: Comparative data about the boroughs of London and the regions of England
 LDS_3: Health-based data about individuals
Planning to teach our Mathematics or Further Mathematics A Levels? Let us know so we can make sure you have everything you need. We’d love to hear your thoughts in ‘comments’ below. If you have any questions, get in touch with us by email at maths@ocr.org.uk or on Twitter @OCR_Maths.
About the author
Keith Proffitt is a Curriculum Developer for MEI. Keith has a BA in Mathematics and a PGCE in Secondary Mathematics. He taught in secondary schools for 25 years, including 13 as Head of Mathematics. He worked for OCR for over 5 years, which included being Qualifications Manager for the MEI A Level specifications in mathematics and further mathematics. He helped to develop the Further Pure with Technology unit and the Quantitative Methods qualifications while working for OCR. He joined MEI in April 2014.