Intelligent Assessment

Thursday 27 June 2013

I've been reading (slowly!) the recent report from the US Department of Education, Office of Educational Technology, entitled "Expanding Evidence Approaches for Learning in a Digital World" (http://www.ed.gov/edblogs/technology/files/2013/02/Expanding-Evidence-Approaches.pdf).

The development of the report was led by Karen Cator, the Director of the Office of Educational Technology. Karen, who was previously at Apple Computers, was involved in the Centre for Assessment and Evaluation of Student Learning (www.caesl.org), for which I was managing director for seven years.

In Chapter 1 of the report there is an interesting discussion of the A/B testing method, which is a way of rapidly testing changes to the development of online learning materials in an iterative manner. It set me thinking about how educational measurement research needs to change its methodology as we move increasingly into digital formats. Traditionally, in educational research, we have designed large-scale studies that take several years to implement. That is fine for testing the major outcomes of an intervention, but is not an efficient way of making incremental changes that will improve learning in digital environments. We need to think about how we can structure design and development to rapidly try out different versions of assessment methods so that we can identify the most promising ones and focus on those.
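
To make the A/B idea concrete, here is a minimal sketch in Python of how two versions of an online task might be compared. Everything in it is my own assumption for illustration, not something taken from the report: the function name, the student counts, and the choice of a two-proportion z-test as the comparison.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Compare success rates of versions A and B.

    Returns (z, p_value), where p_value is the two-sided p-value
    under the normal approximation.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both groups need at least one student")
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled success rate under the hypothesis that A and B do not differ.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (p_b - p_a) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical numbers: students answering a follow-up item correctly
# after seeing version A or version B of the same learning task.
z, p = two_proportion_z_test(120, 400, 150, 400)
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

A small p-value would suggest the difference between versions is unlikely to be chance alone, which is the kind of quick signal that lets a team drop a weak version and iterate on the stronger one.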

Saturday 15 June 2013

Here is a paper from ETS that talks about the current state of the art in automated essay scoring.

http://www.ets.org/Media/Research/pdf/RD_Connections_21.pdf

Tuesday 30 April 2013

The debate about automated scoring rages on! This article addresses the issue of the accuracy and validity of computer-scoring of essays and other written responses.

The thing to remember is that automated scoring applies Natural Language Processing, using a variety of algorithms that look for patterns in the responses that correspond to those seen in essays that have been scored by humans. The system has to be trained on essays already rated by humans, so all it can do is look for features of the answer that correlate with those seen in the human-scored training samples. In other words, these systems are only as good as the samples they were trained on.
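
The training idea can be shown with a deliberately tiny toy in Python. This is a sketch under loud assumptions: the essays and scores are invented, the single feature (number of distinct words) is chosen only for simplicity, and this is not how e-Rater or any other real product works. Real systems use far richer linguistic features.

```python
from statistics import mean


def distinct_word_count(essay):
    """Toy feature (an assumption): how many different words the essay uses."""
    return len(set(essay.lower().split()))


def train(scored_essays):
    """Fit score = slope * feature + intercept by ordinary least squares.

    scored_essays is a list of (essay_text, human_score) pairs.
    """
    xs = [distinct_word_count(text) for text, _ in scored_essays]
    ys = [score for _, score in scored_essays]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("training essays all have the same feature value")
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, essay):
    """Score a new essay using only what was learned from human raters."""
    slope, intercept = model
    return slope * distinct_word_count(essay) + intercept


# Invented training sample: (essay, score given by a human rater).
human_scored = [
    ("good essay", 1),
    ("plants need light water", 2),
    ("plants need light water and soil", 3),
]
model = train(human_scored)
print(predict(model, "plants need light water soil and carbon dioxide"))
```

The model knows nothing except the relationship between the feature and the human scores in its training set, so any bias or blind spot in those scores is carried straight into its predictions.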

The other thing to bear in mind is that maybe the best use of the technology is not in high stakes testing, but in supporting teachers in the classroom by relieving their scoring load. It could also serve online classes like the MOOCs, where the student numbers prohibit human scoring.

I'm sure the debate will continue to rumble on, but automated scoring will improve as machine learning techniques get better, and I think that the technologies can be of real help to teachers.

Computer says no: automated essay grading in the world of MOOCs
PC Authority
ETS uses the e-Rater software, in conjunction with human assessors, to grade the Graduate Record Examinations (GRE) and Test Of English as a Foreign Language (TOEFL), without human intervention for practice tests. Both these tests are high stakes – the ...

Saturday 27 April 2013

Questioning the System

A colleague sent me a link to this video in which Sullibreezy questions the education system and the value of exams:

I Will Not Let An Exam Result Decide My Fate||Spoken Word
http://www.youtube.com/watch?v=D-eVF_G_p-Y

He raises some valid points about the purpose of education and the way that we assess learning. I like his questioning of why we treat all students as if they are the same when they differ greatly in their strengths and weaknesses. The question he doesn't answer is what education should be in the future. That is for us to decide.

Thursday 11 April 2013

A couple of days ago I was sitting in on the educational technology strand sessions of the 2013 NARST conference and getting frustrated!

A wonderful thing that is happening is that many researchers are developing interactive learning and serious gaming environments that are teaching kids to develop content knowledge and skills in science. And they are collecting oodles of log data from those environments.

The other good thing is that they are aware that they need to be assessing the learning as the kids go through the experience, but most are not well informed about how to do so. I keep seeing people trying to find the patterns of learning in the data to determine where they should embed assessments and what those should be measuring.

I wish that more of them would think about assessment using the Evidence Centred Design (ECD) approach. If they thought about what they want kids to know and be able to do, determined what evidence they would accept that students have that knowledge and those skills, and then designed tasks that will elicit that kind of evidence, they wouldn't need to be data mining to find out what to measure. They need educational measurement experts in their projects!
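
The chain from claims to evidence to tasks can be written down as a small data structure. The Python below is only my own illustrative sketch: the class names, the helper function and the science example are all hypothetical, and ECD itself does not prescribe any code.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Something students do in the environment that can produce evidence."""
    prompt: str


@dataclass
class Evidence:
    """An observable behaviour we would accept as support for a claim."""
    description: str
    tasks: list = field(default_factory=list)


@dataclass
class Claim:
    """What we want students to know or be able to do."""
    statement: str
    evidence: list = field(default_factory=list)


def unmeasured_claims(claims):
    """Return claims that have no evidence backed by at least one task."""
    return [c for c in claims if not any(e.tasks for e in c.evidence)]


# Hypothetical example for a science game.
controls_variables = Claim(
    "Student can design a controlled experiment",
    [Evidence(
        "Changes only one variable between trials",
        [Task("Find out which ramp surface slows the cart the most")],
    )],
)
explains_results = Claim(
    "Student can explain results using evidence",
    [Evidence("Cites trial data when justifying a conclusion")],
)

for claim in unmeasured_claims([controls_variables, explains_results]):
    print("No task yet for:", claim.statement)
```

Working forwards like this shows what the log data needs to capture before any is collected, rather than hunting for it afterwards.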

Thursday 14 March 2013

There is a new report issued in the US by the Gordon Commission (details below with links to the report). It is interesting to see what they are saying about the changing face of education assessment. There are some great thinkers on the commission including Bob Mislevy, John Behrens and Randy Bennett.

The Gordon Commission on the Future of Assessment in Education, established by ETS in January 2011 and given a two-year tenure, has released a public policy statement calling upon state and federal policymakers to commit to a long-term effort to develop assessments that place greater emphasis on providing timely and valuable information to students and teachers. See: http://www.gordoncommission.org/rsc/pdfs/gordon_commission_public_policy_report.pdf
See also the technical report TO ASSESS, TO TEACH, TO LEARN: A VISION FOR THE FUTURE OF ASSESSMENT: http://www.gordoncommission.org/rsc/pdfs/gordon_commission_technical_report.pdf
and its Executive Summary: http://www.gordoncommission.org/rsc/pdfs/technical_report_executive_summary.pdf

Thursday 28 February 2013

The promise of technology for education is that it can enable personalisation of learning. By that I mean that it will allow sequences of learning to be recommended for individual learners and that during the learning, a student's progress will be monitored and evaluated so that assistance can be given at the right time. At the end of the learning sequence, the student and her teacher will receive reports of what she has learned, with recommendations for next steps.

BUT, none of this is possible without assessment of learning at both the macro level (courses and sequences of instruction) and the micro level (what the student knows and can do right now, to allow for tutoring).

In this blog I want to explore the barriers and opportunities in this space.
