How NOT to design an effective assessment

It’s no secret that students hate taking assessments. They often feel anxious and stressed, and sometimes they feel that an assessment doesn’t really test what they know or have learned. Designing those assessments can be an equally daunting task for even the most experienced classroom teacher. Given the important decisions we make using assessment results, it’s essential to think about assessment design as early in the instructional process as possible. Assessment, instruction, and curriculum are inextricably linked, and challenges arise when they are not treated as such. When it comes to designing effective and meaningful assessments that produce valid results, there are three primary pitfalls that educators face.

Pitfall #1: Having an Ill-Defined Purpose

In architecture, there is a common understanding that “form follows function.” The same principle applies to assessment development. How an assessment fits within an overall evaluation program, and what purpose it serves, are fundamental questions for designers to consider. The role an assessment is expected to play in informing decision-makers (teachers, interventionists, principals) should guide its development. When the purpose is unclear, the resulting assessment will likely fail to yield data consistent with the intended use.

When the reason for an assessment is ill-defined, we often end up with an assessment that does not properly sample from the domain, targets learning goals at the wrong depths of knowledge, or is scored and scaled in ways inconsistent with its intended purpose. To design an assessment that supports valid inferences, designers need to consider four essential questions:

  • Who are the intended users of the student performance data from this assessment?
  • At what level of analysis will the data be used?
  • How frequently will students be assessed on this content, and in what contexts?
  • What action is expected in response to the information the assessment results provide?

Pitfall #2: Item Types That Are Misaligned with Learning Targets

The primary goal in designing an effective assessment is to assemble an instrument that provides the evidence needed to make instructional, programmatic, or resource-allocation decisions about students and curriculum in support of learning. In developing an assessment, designers must decide which items to include on a test, to the exclusion of other options that might be just as viable. The item-writing process must be well informed, and there must be clear congruence between each item and the learning target it intends to measure.

Consider the example of an item intended to measure a reasoning target: a question asking students to interpret a graph with respect to energy changes. In practice, this item is much more likely to measure a knowledge target, because it is a common example shown to first-year high school chemistry students, one they have likely memorized. Learning targets, in contrast to standards and other course-level outcomes, are the specific and measurable elements around which we teach and from which we build assessment items. When items are not aligned to learning targets, we begin measuring the wrong element of the domain and lose the validity of inferences drawn from the assessment’s results.

Alignment between learning targets and item types can be improved by considering three essential factors:

  • Ensure clarity of the learning target for both students and teachers.
  • Provide students the opportunity to engage with assessment content in contexts that differ from instruction.
  • Distinguish between cognitive demand and item difficulty when writing items.

Pitfall #3: Distribution of Items Is Inconsistent with an Assessment’s Use

Popular education literature often characterizes the distribution of items on assessments in a one-size-fits-all model, erroneously directing test authors to simply delete items that are too difficult or too easy. In reality, the expected distribution of items on an assessment should reflect the decisions that will be made from students’ scores.

Placement tests, end-of-year summative assessments, and tests used to identify students for remediation call for different distributions of item difficulty (how hard an item is for students) and discrimination (how sharply an item separates stronger from weaker performers). Most importantly, tests used for high-stakes purposes should measure performance near their cut scores with a high degree of accuracy. Unfortunately, this is often not considered in many K-12 assessment programs.

For example, a placement test for advanced studies (e.g., an honors course) must determine whether the right students are admitted to the course. To ensure that entering students have the prerequisite abilities, we should construct a test with a greater number of difficult items and fewer easy items. This lets us be confident that students who lack the knowledge or skills to succeed in the course will not score well on the assessment. The same logic applies, from the other direction, to a test that identifies students for targeted instruction or remediation: such an assessment needs additional items that minimally competent students will answer correctly and struggling students will not.

Ensuring that the items on a given assessment are distributed consistently with the assessment’s purpose is key, and considering three factors will help to ensure a more effective assessment (the sketch after this list illustrates the idea):

  • Placement: trending more difficult with high discrimination
  • Intervention: trending less difficult with high discrimination
  • General summative assessment: a wider distribution of difficulty, with discrimination a secondary concern
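
To make the difficulty-and-discrimination idea concrete, here is a minimal sketch using a two-parameter logistic (2PL) item response theory model. Nothing above prescribes this particular model, and the parameter values and cut score below are purely hypothetical; the sketch simply shows why an item whose difficulty sits near a placement cut score, paired with high discrimination, tells us the most about the students we are trying to sort.

    import math

    def p_correct(theta, a, b):
        # 2PL model: probability that a student at ability theta answers an
        # item correctly, given discrimination a and difficulty b
        return 1.0 / (1.0 + math.exp(-a * (theta - b)))

    def item_information(theta, a, b):
        # Fisher information: how much measurement precision the item
        # contributes at ability theta (peaks where theta equals b)
        p = p_correct(theta, a, b)
        return a ** 2 * p * (1.0 - p)

    cut = 1.0  # hypothetical honors-placement cut score, in ability units

    items = [
        ("hard, high-discrimination", 1.8, 1.0),
        ("easy, high-discrimination", 1.8, -1.0),
        ("hard, low-discrimination", 0.6, 1.0),
    ]

    for label, a, b in items:
        info = item_information(cut, a, b)
        print(f"{label} (a={a}, b={b}): information at cut = {info:.2f}")

Running this prints roughly 0.81 for the hard, high-discrimination item at the cut, versus under 0.1 for the other two: the hard, discriminating item provides nearly ten times the precision right where the placement decision is made. Shift the difficulty below a proficiency cut and the same arithmetic favors an intervention screener built from easier, still-discriminating items.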

If an assessment author lacks a clear purpose, misaligns item types with learning targets, or fails to consider the distribution of items relative to the assessment’s use, it becomes all too easy to produce an ineffective assessment from which invalid inferences will be drawn. By attending to all of the important elements in assessment development, however, designing an effective assessment is well within reach. Start by considering these three pitfalls as you plan and design your next assessment, so that you assess what you intend and produce results that support informed decisions.

Interested in learning more about designing meaningful assessments to drive student achievement? Register for our June 14th webinar today!