Summative assessment in science

I have struggled for ages to work out how summative assessment should work in science. Moving from levels to grades between KS3 and KS4 always seems unsatisfactory: students have to switch from one assessment system to another, and the change of scale makes it difficult to track progress.

[Image: Rubrics in science assessment - stating is not always so easy]

Unlike many skills-based subjects, the longer you study science the harder it becomes, as students have to remember, and connect together, an ever-increasing amount of stuff.

A single assessment system seems the answer. But making this meaningful for Year 7 through to Year 11 is problematic. Should grades be based on percentages or should they be descriptors of attainment targets? Can a grade 4 be awarded to a single piece of work? If so, how does this compare to a grade 4 gained on a full examination paper at the end of Year 11? These two grades are clearly not the same.

Perhaps we should use descriptors of performance, where a grade describes the ability of a student to perform a specific cognitive process, such as describe or explain? This is difficult to achieve with a single assessment pathway for Years 7 to 11 because skills, processes and knowledge need to be mapped and described in a vast number of divisions, many of which risk being vague or meaningless. It is also easier to describe cell structure than to describe cell structure AND atomic structure, yet descriptors of cognition fail to account for the amount of content involved. Similarly, most children would find it easier to explain why polar bears have white fur than to state the number of atoms in a 250 ml bottle of water, and yet stating is often associated with low-level thinking.

What you measure reveals what you value

The assessment system that you use ultimately depends on what you want it to achieve. For example, you may want to know what your students would get if they sat a GCSE paper today. This would allow you to see how far students are from the end goal. However, this will only work if you have a clear idea of where students are expected to be at different points of their school career. Giving full GCSE papers to students in Years 7 to 9 is not advisable as it will sap motivation and result in all students getting similar grades. But using full assessments before everything has been taught – i.e. at the end of Year 10 – has some merit as it will provide you with a confident grade.

Alternatively, you may want to know how your students compare to other students in the class/year, for example if you want to set them. Here, a simple percentage from an assessment combined with teacher input would suffice. Or, you may want to know what specific skills and content your students have developed in specific topics.

These differing questions probably cannot be answered by one single assessment style and it may be that schools need to use different assessment methods throughout the year. Below I have summarised three distinct ways to assess students in science in a way that can produce a grade.

Option 1: How far are you from the end goal? Assigning grades and levels in science based on curriculum taught

Students sit an exam with questions carefully selected from GCSE papers. This can be done from Year 7 onward. Exam questions are modified to ensure the emphasis is not on recall alone. Grades 9-1 are awarded according to how much of the course has been taught. Download an illustration of the assessment system and adapt this for your exam board and specification. A disadvantage of this approach is that it caps student attainment and so can artificially restrict high-flyers. It also doesn’t give you much information about the cognition of your students. A potential advantage is that it’s quite a motivating system to be part of: students see their grades increase over time.

Example: A student scores 100% on an exam when they have covered only 20% of the content and skills. The grade awarded would be a G3 because, in a full Year 11 exam paper, the G grade boundary is 20% of the total marks.
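The arithmetic behind this example can be sketched in a few lines of Python. This is only an illustration, not a finished system: the function name `capped_grade` is my own, and every boundary except the 20% G boundary is a placeholder that you would need to replace with your exam board's figures. The sketch returns the letter only and does not attempt the sub-levels (such as the 3 in G3).

```python
# Hypothetical full-paper boundaries: minimum percentage of total marks per grade.
# Only the G boundary (20%) comes from the example in the text;
# all the other values are placeholders, not real exam board boundaries.
FULL_PAPER_BOUNDARIES = [
    ("A", 80), ("B", 70), ("C", 60), ("D", 50), ("E", 40), ("F", 30), ("G", 20),
]


def capped_grade(score_percent, course_taught_percent):
    """Scale a test score by the share of the course taught, then look up a grade."""
    # A perfect score on 20% of the course counts as 20% of a full paper.
    full_paper_equivalent = score_percent * course_taught_percent / 100
    for grade, minimum in FULL_PAPER_BOUNDARIES:
        if full_paper_equivalent >= minimum:
            return grade
    return "U"  # below the lowest boundary


print(capped_grade(100, 20))  # G
```

The cap is visible in the calculation: however well a student performs, the grade cannot exceed the boundary that matches the proportion of the course covered so far.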

Option 2: Can you state, describe, explain? Assigning grades based on descriptors of performance

Students are awarded a grade based on the cognitive demand of the questions they correctly answer. Exam questions are graded before the exam is sat (and adapted later if necessary) using this document of performance descriptors in science. I found it easier, and perhaps more reliable, to assign questions as A, C or F; distinguishing between an A and a B grade question is difficult and time-consuming (exam boards provide grade descriptions in the specification for A, C and F grades). With the new GCSE grading system you can use grades 8, 5 and 2.

With a little help from Excel, total scores can be converted into a grade, as illustrated below. This approach does have a drawback: as students study more, their overall grade can fall for reasons discussed above. Students may also perform differently in different areas but this best-fit approach allows a grade to be assigned that gives some information about what a student can and cannot do. I have tested this approach using a full exam paper and the method broadly recovers the boundaries set by the exam board.

Example: A student got all the F-grade questions correct and half the C-grade questions correct in their assessment. This student would be given a grade C3. Had they got all of the C-grade questions right, they would have got a C1. You can use Excel to fill in the other boundaries using combinations of F, C and A marks.
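For anyone who would rather script this than build it in Excel, here is one possible best-fit rule in Python. Treat it as a sketch under assumptions: only two points come from the example above (all F plus half of C gives C3, all F plus all of C gives C1). The 75% threshold for a sub-level 2, the rule that a tier must be fully secured before the next one counts, and the function name `best_fit_grade` are all my own inventions for illustration. It also ignores the in-between letters (B, D, E), which you would need to add from your own boundaries.

```python
def best_fit_grade(f_correct, c_correct, a_correct):
    """Return a best-fit grade from the fraction (0 to 1) correct in each tier.

    Assumed rule: work upwards through the F, C and A tiers. A tier counts
    once at least half of its questions are correct, and the next tier is
    only considered when the current one is fully correct.
    """
    grade = "U"
    for letter, fraction in [("F", f_correct), ("C", c_correct), ("A", a_correct)]:
        if fraction < 0.5:
            break  # not yet working at this tier
        if fraction >= 1.0:
            sub_level = 1
        elif fraction >= 0.75:  # hypothetical threshold, not from the text
            sub_level = 2
        else:
            sub_level = 3
        grade = f"{letter}{sub_level}"
        if fraction < 1.0:
            break  # tier not secure, so stop here
    return grade


print(best_fit_grade(1.0, 0.5, 0.0))  # C3
print(best_fit_grade(1.0, 1.0, 0.0))  # C1
```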

Option 3: An age-related scale. I think Daisy’s cracked it!

What does an A actually mean? Up until I met Daisy Christodoulou I believed an A meant something about the types of question a student could answer, i.e. it could be defined by grade descriptors. But I was wrong. An A grade simply means that a child is achieving in the top 10% of students of their age nationally (percentiles differ for each subject; see JCQ for GCSE and A Level percentiles). A C grade student is doing better than 60% of their peers but is not in the top 30%, otherwise they would get a B. It’s so simple it’s beautiful, and it works for Years 7-11. A summative grade is simply a way of expressing where a student sits on an age-related national distribution. Of course this will, to some extent, correlate with the types of question students can answer. A summative grade, then, is really a statistical phenomenon, with a bit of tweaking around the edges by the awarding bodies.

So how does this help us to grade a student in Year 7 without assessing every 11 year old in England? Let me try to explain how this could work in your school.

Some students will need to sit a nationally benchmarked assessment such as a GL progress test. All Year 7 students sit your school assessment. A simple graph can be drawn plotting the GL assessment result against the school assessment result. By extrapolation, you can then work out where every mark, and therefore every student, sits on the national, age-related distribution gained from the GL. And remember, if a student in Year 7 moves from a B to an A over five years, they have made a whole grade more progress than the average student nationally. That is a massive amount of progress, so this is not a model for low expectations. This is a model for meaningful tracking.
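The graph-and-read-off step can be sketched in Python. Several things here are assumptions rather than part of the method described: the calibration data are made up, the helper names `fit_line` and `national_percentile` are mine, and I have assumed the benchmark reports a standardised score with a national mean of 100 and a standard deviation of 15 that is roughly normally distributed. Check how your own benchmark test reports results before relying on this.

```python
from statistics import NormalDist, mean


def fit_line(xs, ys):
    """Least-squares line of best fit; returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return slope, y_bar - slope * x_bar


def national_percentile(school_mark, slope, intercept):
    """Convert a school mark to an estimated national percentile."""
    standardised = slope * school_mark + intercept
    # Assumption: standardised scores follow a normal curve, mean 100, SD 15.
    return NormalDist(mu=100, sigma=15).cdf(standardised) * 100


# Made-up results for the students who sat both assessments.
school_marks = [20, 35, 50, 65, 80]
benchmark_scores = [85, 92.5, 100, 107.5, 115]

slope, intercept = fit_line(school_marks, benchmark_scores)
print(round(national_percentile(50, slope, intercept), 1))  # 50.0
```

Once the line is fitted, every student who sat only the school assessment can be placed on the national distribution from their school mark alone. The fewer students in the calibration group, the less trustworthy the fitted line will be, particularly for marks outside the range that group achieved.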

A final thought

Summative assessment will always play some role in influencing teachers’ and students’ motivation and attitudes. The priority for examination boards is to implement an assessment system that is both reliable and valid. Schools need valid and reliable assessments too, but they also want an assessment system that motivates teachers and students. So, if in creating an assessment system you prioritise reliability and validity over motivation, there is a risk that you will prioritise the measurement over the aspects you wish to develop; in the long term this will negatively impact student progress.
