Mathew Hillier 5:00 PM http://transformingassessment.com
Laurine 5:18 PM that is soooo obvious! why did we not think of this before??? thank you already
Mathew Hillier 5:27 PM how does this fit into the trend towards criterion-referenced assessment (away from norm-referenced)?
Tim Hunt 5:27 PM This is definitely norm-referenced. You are comparing the students with each other. No absolute standards.
carmen tomas 1 5:27 PM also, where is validity? reliability does not solve the problem of making VALID judgements
Greg 5:28 PM what are the 'values'? 'parameter values' on an earlier slide?
carmen tomas 1 5:28 PM and how about helping students understand the basis of the judgements? i.e. your criteria?
carmen tomas 1 5:29 PM How many markers did the 17 judgements?
Tim Hunt 5:29 PM I guess this is very much doing assessment of learning, not assessment for learning.
Nicky Rushton 5:29 PM 28 markers. It was on a previous slide.
Tim Hunt 5:29 PM Well, assessment of performance, which is the learning outcome.
Anthea 5:30 PM Do the students have access to the portfolio to reflect on their work?
Philip Walker 5:30 PM Coming from Maths, I've been thinking about the norm-referencing question, as we are (almost) entirely criterion-referenced. But one diagram had grades, added as part of a later process: I'll be interested to hear how that process works.
Sue Gill 5:30 PM It would be interesting to see, if the students repeated the process, how their results compared to those of the markers.
Laurine 5:31 PM yes, that would be a great exercise in peer/self review
carmen tomas 1 5:31 PM again, this exercise masks the criteria - not sure it is great for fairness and transparency
Sue Gill 5:32 PM and how well the students understood the measures of quality
Tim Hunt 5:32 PM Yes, but look at some of the research on how teachers actually mark. They may ultimately have to record their marks on a rubric, but they often form an intuitive judgement, and then work out how to justify it on the mark scheme.
malik 5:32 PM What matters is the difference in quality between the portfolios. When placing portfolios on a measurement scale, the bias of individual judges is unimportant so long as they are able to perceive differences in quality against shared and equal criteria.
Matt Wingfield 1 5:32 PM The great thing about ACJ is that the judgements are highly valid and not influenced by a predetermined value scale, so you end up with a very real view of the value of each piece of work.
Sue Gill 5:33 PM Absolutely - they talk about reading some through to get an idea of the cohort
carmen tomas 1 5:33 PM the fact that current practice may be flawed does not mean it cannot be improved; it is not a reason to go back to norm-referenced assessment
Hue Hwa 5:33 PM will there be any bias for earlier pairs vs later pairs, once the judges have formed some expectation of performance?
Mathew Hillier 5:34 PM @Hue there are multiple rounds, so that would be cancelled out over time.
malik 5:34 PM i hope not
Greg 5:34 PM and giving students feedback that is specific to different elements of the work?
MarijeLesterhuis 5:34 PM in our project on comparative judgement we saw that judges become faster because they get a clearer view of what they think is important in a performance
Tim Hunt 5:35 PM I would love to try to trick this system, and see if it handles it. Take a really good piece of content, and get someone with really bad handwriting like me to re-create it in a really scruffy way.
Then create a really weak piece of work, and make a beautifully presented version. Then feed those in and see whether they get correctly graded, or whether it shows that markers are considering things that should not be assessed.
Laurine 5:35 PM I'd have thought that giving a grade on a 'live' performance would be rather norm-referenced... certainly has been in my experience, as staff have had to rank them anyway; even with criteria, some achieve it 'better' than others
Matt Wingfield 1 5:36 PM I think that's very important... people get a clearer sense of what 'good' looks like much more quickly than in traditional marking processes. Particularly helpful in peer review contexts; it really empowers the students.
Anthea 5:36 PM Nice for them to self-reflect on quality
MarijeLesterhuis 5:37 PM it shows that peers generate rank-orders similar to those the teachers generate, because they are very well able to recognize quality
Mathew Hillier 5:37 PM https://digitalassess.wistia.com/projects/theneb3hqj
carmen tomas 1 5:38 PM Well, I would like to see the evidence base for these claims
Mathew Hillier 5:38 PM i will link to this video on the recording page too.
Tim Hunt 5:39 PM Finished watching it.
Paul McLaughlin #2 5:39 PM Finished
Mathew Hillier 5:39 PM the student peer part - now we see assessment for learning!
Maggie #2 5:40 PM Done
malik 5:40 PM finished!
Greg 5:40 PM but how are the assessment criteria made explicit / visible to students?
Tim Hunt 5:40 PM The comments seem to me very similar to what students say about criterion-referenced peer assessment
MarijeLesterhuis 5:41 PM the D-PAC project is missing: www.d-pac.be
María Fernández-Toro 5:41 PM The student mentioned that peers not only gave a judgement but also provided 1 positive comment and 1 criticism.
Phil Mills 5:41 PM I tend to prefer a benchmarking approach in which students get an idea of how the assessment is conducted on the same piece. If the peer assessors haven't been given an idea of what quality work looks like, how can they assess each other and provide feedback?
Matt Wingfield 1 5:41 PM I think the difference here is that the students understand the assessment requirements much more transparently than through detailed/lengthy assessment criteria sets, which they seem to find harder to interpret without a lot more support from faculty
carmen tomas 1 5:42 PM again, i do not think that these claims are based on a strong evidence base; i have seen very few studies on comparative judgement
malik 5:42 PM the student said collaborative feedback was constructive and important.
MarijeLesterhuis 5:43 PM the number of studies is growing, I think
Laurine 5:43 PM Carmen, how do you evaluate/assess your students, and in what discipline?
carmen tomas 1 5:43 PM i work with faculty to develop assessment; I am an assessment advisor and I research the judgement process
Laurine 5:44 PM thanks
carmen tomas 1 5:44 PM i know the comparative judgement literature is very scarce, although the literature on the use of criteria is also limited
Greg 5:44 PM re workload: for a given population of students n, how many comparative judgements are required to achieve the ranking?
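[A rough back-of-envelope answer to Greg's workload question, using the figures mentioned earlier in the session (17 rounds of judgement, 28 markers): if each round pairs every portfolio once, a round of n portfolios produces n/2 judgements, so r rounds produce r*n/2 judgements in total and each portfolio is judged r times. The sketch below is illustrative only; a real ACJ engine's adaptive scheduling may differ.]

    def acj_judgement_counts(n_portfolios: int, n_rounds: int):
        """Rough workload estimate, assuming each round pairs every portfolio once."""
        per_round = n_portfolios // 2                            # pairings per round
        total = per_round * n_rounds                             # judgements across all rounds
        per_portfolio = n_rounds                                 # each portfolio appears once per round
        full_pairwise = n_portfolios * (n_portfolios - 1) // 2   # exhaustive comparison, for contrast
        return total, per_portfolio, full_pairwise

    # Hypothetical cohort of 100 portfolios over the 17 rounds mentioned in the session:
    total, per_item, exhaustive = acj_judgement_counts(100, 17)
    print(total, per_item, exhaustive)  # 850 judgements, 17 per portfolio, vs 4950 for full pairwise

[The point of the adaptive rounds is exactly this gap: far fewer judgements than exhaustive pairwise comparison, while still producing a reliable rank order.]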
carmen tomas 1 5:44 PM that is why i think many more studies need to establish the evidence for the claims
María Fernández-Toro 5:45 PM It's much quicker to decide which one is best than to plough through criteria for each
carmen tomas 1 5:45 PM to me it all seems a bit premature to claim that this is better when only reliability is considered
Matt Wingfield 1 5:45 PM In terms of studies into ACJ, there are a number of them referenced here: https://en.wikipedia.org/wiki/Adaptive_comparative_judgement
Laurine 5:46 PM yes, and you can't do studies unless there is something TO study, I guess
Maggie #2 5:46 PM Thanks
Phil Mills 5:46 PM It would be quicker, but would anyone know why each one was ranked as better/worse?
Matt Wingfield 1 5:46 PM There have been quite a few, in a number of different studies
María Fernández-Toro 5:47 PM Are students aware of how their work was ranked? What about ethical issues re. those who find themselves at the bottom of the pile?
Laurine 5:47 PM Although if you have multiple reviews of music performances, how do we know what each reviewer considered 'good' or not?
carmen tomas 1 5:48 PM I think comparative judgement is interesting but needs more careful consideration - I also think that analytic rubrics need to be better understood against holistic judgement... I think it is premature to conclude that comparative judgement solves the problems with judgement
Tim Hunt 5:49 PM It shows how wide the grade boundaries are. If you look at the error bars, everyone from 110 to 150 is on the A/B boundary.
Greg 5:49 PM but how do i interpret the meaning of a parameter value?
Matt Wingfield 1 5:50 PM The parameter value is just an internal measure within the system, which can then be correlated to other marking scales if you want.
Greg 5:50 PM so it is a measure of quality?
MarijeLesterhuis 5:50 PM it is expressed in logits: the relative score of the portfolio compared to other portfolios - a relative measure
malik 5:51 PM 'Adaptive Comparative Judgement: Reliable Rank Order in Assessment'. Reference: https://cerp.aqa.org.uk/sites/default/files/pdf_upload/CERP_RP_CW_20062012_2.pdf
Tim Hunt 5:52 PM Although ACJ is relatively new, and so there is limited research, I think the way it works is very similar to how rating systems and tournaments work in sports. (A tournament is basically pair-wise comparisons.) And there is a lot of analysis of the reliability of those.
Ed Russell 5:52 PM I've been studying this for a while because I think it might be attractive for some of my colleagues who are strongly anti-norm-referencing, but also strongly opposed to the use of rubrics, preferring "holistic professional judgement" based on "complex tacit knowledge and criteria". The difference is that with ACJ you don't read and slap on a grade; rather, you get a unit-less score on, I believe, an interval scale. You then have to decide cut points for grades based on further judgements about the students' work. You could also plant some work of known grades into the process to guide this. Tim, I understand that in the first two or three rounds, ACJ uses what's known as "Swiss rules" in sport and games, then switches to a different approach.
Matt Wingfield 1 5:53 PM @Greg - the value is determined by the judges; the parameter value is simply used to represent relative value between the scripts.
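[To make the "parameter value in logits" discussion concrete: relative scores of this kind can be estimated with a Bradley-Terry-style pairwise model, where the probability that portfolio i beats portfolio j is a logistic function of the difference theta_i - theta_j. The sketch below fits such parameters by simple gradient ascent; it is a minimal illustration of the statistical idea, not Digital Assess's actual algorithm - the function name, learning rate and epoch count are all assumptions.]

    import math
    import random

    def fit_bradley_terry(judgements, n_items, lr=0.05, epochs=200):
        """Estimate relative quality (in logits) from pairwise judgements.

        judgements: list of (winner_index, loser_index) pairs.
        Returns one theta per item; only differences between thetas are meaningful.
        """
        judgements = list(judgements)
        theta = [0.0] * n_items
        for _ in range(epochs):
            random.shuffle(judgements)
            for winner, loser in judgements:
                # Model probability that the winner beats the loser
                p = 1.0 / (1.0 + math.exp(theta[loser] - theta[winner]))
                # Gradient step on the log-likelihood of this judgement
                theta[winner] += lr * (1.0 - p)
                theta[loser] -= lr * (1.0 - p)
            # Logits are relative, so centre the scale at zero
            mean = sum(theta) / n_items
            theta = [t - mean for t in theta]
        return theta

    # Toy example: portfolio 0 wins most of its comparisons, portfolio 2 loses most.
    judgements = [(0, 1), (0, 2), (1, 2), (0, 1), (0, 2), (1, 2), (1, 0)]
    print(fit_bradley_terry(judgements, 3))  # roughly descending thetas: 0 > 1 > 2

[As Ed Russell notes, the result is an interval scale with no units; turning it into grades is a separate step, which the chat turns to next.]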
Philip Walker 5:53 PM Planting work of known (borderline) grades sounds like exactly the way to solve the practical problem I was raising about how one can assign a grade once a ranking is established! Good point.
Tim Hunt 5:53 PM @Ed. Thanks, that is what I understand too.
Greg 5:54 PM @Matt, OK thanks, got it
malik 5:55 PM You could use a baseline portfolio first, then use comparative judgement
Tim Hunt 5:56 PM Are there any good review articles of the literature to date on this?
Nicky Rushton 5:57 PM Are judges asked to complete marking by a particular time? If not, how do you stop people waiting for a round to be completed before they start the next one?
- Natalie left the Main Room. (5:58 PM) -
María Fernández-Toro 5:58 PM I still have a query about the ethics of allowing a student to find him/herself at the bottom of the rank. The negative emotional response may well override the intended benefits of the approach. So what are the safeguards? (I presume there must be some.)
Paul McLaughlin #2 5:58 PM It is easier to make judgements at the start, as they are often clear distinctions - at the end it gets harder. So get in early!
Tim Hunt 5:59 PM @Maria, is that any more or less ethical than receiving a 10% mark? That would trigger a negative emotional response too!
Phil Mills 5:59 PM Thank you for the presentation
Tim Hunt 5:59 PM Also, you don't have to tell students their rank - just how that is translated into a mark like A, B, C, ...
Natalie Pustam - Digital Assess 6:00 PM http://link.springer.com/article/10.1007/s10798-011-9197-x
Ed Russell 6:01 PM I've been planning to try a formative peer assessment task for group presentations in a course I'm teaching in China in July. There'd be 8 groups, so for a full set of comparisons there should, by my calculations, be only 28. That's 7 per group. I'd get the groups to discuss the pairs as they do the comparisons, and record some feedback for the judged groups. The judgement process would be where most of the learning would occur. Question: is it practical for small numbers like that?
Greg 6:02 PM what does it take to access/use the ACJ system?
Matt Wingfield 1 6:03 PM @ed russell - it would also be good to connect you with the folk at the University of Edinburgh who have done a lot of work on peer review through ACJ
malik 6:03 PM 'Investigating the reliability of Adaptive Comparative Judgment'. Ref: http://www.cambridgeassessment.org.uk/Images/232694-investigating-the-reliability-of-adaptive-comparative-judgment.pdf
Natalie Pustam - Digital Assess 6:04 PM @Ed Russell if you send us your email address we can send you some more information
Tim Hunt 6:04 PM What is the licensing?
Matt Wingfield 1 6:05 PM Anyone who is interested in finding out more about licensing ACJ or trying it out, please do just drop me an e-mail at matt.wingfield[at]digitalassessgroup.com
Natalie Pustam - Digital Assess 6:05 PM Please send emails to Matt.wingfield[at]digitalassessgroup.com or Natalie.Pustam[at]digitalassessgroup.com
Tim Hunt 6:05 PM http://nomoremarking.com seems to be a similar open-source project
Greg 6:06 PM is there a minimum population size necessary for ACJ?
Natalie Pustam - Digital Assess 6:07 PM Or visit www.digitalassess.com where you can also find some brief videos explaining the products a bit more.
Philip Walker 6:08 PM Digital Assess will need to get their "server resource overage" problem (whatever that means) sorted first...!
Ashwak Abdulsalam Yousif 6:10 PM Thank you!
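[The grade-assignment step Philip Walker (5:53) and Ed Russell (5:52) discuss - planting work of known borderline grade to fix cut points on the logit scale - might look something like the sketch below. The anchor scripts, their logit values and the grade labels are all hypothetical, made up purely to illustrate the idea.]

    def assign_grades(scores, cut_points, bottom_grade="C"):
        """Map interval-scale logit scores to grades using cut points derived
        from planted scripts of known (borderline) grade.

        scores: {portfolio_id: logit estimate}
        cut_points: [(threshold_logit, grade)] in descending threshold order.
        """
        grades = {}
        for pid, s in scores.items():
            grade = bottom_grade
            for threshold, g in cut_points:
                if s >= threshold:
                    grade = g
                    break
            grades[pid] = grade
        return grades

    # Hypothetical anchors: a planted A/B borderline script landed at 1.2 logits,
    # and a planted B/C borderline script at -0.4 logits.
    cuts = [(1.2, "A"), (-0.4, "B")]
    print(assign_grades({"p1": 1.5, "p2": 0.3, "p3": -1.1}, cuts))
    # {'p1': 'A', 'p2': 'B', 'p3': 'C'}

[This also fits Tim Hunt's point at 5:59: students need only ever see the resulting grade, not their position in the rank order.]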
Nicky Rushton 6:10 PM Thank you for an interesting presentation.
malik 6:10 PM Thank you ALL!
Philip Walker 6:11 PM Definitely interesting; thank you!
Matt Wingfield 1 6:11 PM Thanks very much everyone
Mirjam Hauck 6:11 PM Thank you!
Ed Russell 6:11 PM Thank you for a very interesting webinar.
malik 6:11 PM done!
Hue Hwa 6:11 PM thank you
Ashwak Abdulsalam Yousif 6:11 PM Done
malik 6:11 PM cheers
Natalie Pustam - Digital Assess 6:11 PM Thank you everyone for your questions
Natalie Pustam - Digital Assess 6:12 PM We will, thanks Mathew
malik 6:12 PM :)
malik 6:13 PM it was great!
Matt Wingfield 1 6:15 PM Thanks Matthew