What are the qualities of performance based assessment?

An important characteristic of performance assessments is the close correspondence between the performance that is assessed and the performance of interest. Performance assessments should be designed to emulate the context in which the intended knowledge, skills or abilities (KSA) are to be applied. As indicated by the Standards for Educational and Psychological Testing (American Educational Research Association (AERA), American Psychological Association (APA) & National Council on Measurement in Education (NCME), 2014), “performance assessments require examinees to demonstrate the ability to perform tasks that are often complex in nature and generally require the test takers to demonstrate their abilities or skills in settings that closely resemble real-life situations” (p. 77). In accordance with the Standards, this chapter uses the following definition for performance tasks that was informed by the definition provided by Lane and Depascale (in press):

Performance tasks that may be used for high-stakes purposes are designed to closely reflect the performance of interest; require standardized directions, ancillary materials, and administration conditions; allow students to construct or perform an original response that reflects important disciplinary knowledge and skills; and yield student work or performances that are evaluated with predetermined scoring criteria and procedures applied in a standardized manner. Validity evidence and evidence of their psychometric quality should also be provided to support their use.

Although performance tasks in educational achievement tests typically measure cognitively complex skills, it is not necessary that they do so. For example, if fluency and accuracy in keyboarding are of interest, a task that examines speed and accuracy of keyboarding could be considered a performance task. Regardless, performance tasks are often associated with the measurement of cognitively complex skills because we often tend to value, and therefore need to measure, skills that are cognitively complex. Some tasks may be considered performance tasks when used for a particular purpose, but not for other purposes. For example, an extended constructed-response item requiring students to discuss the merits of a biological theory could be considered a performance task for a class in theoretical biology, but it would not be considered a performance task for evaluating students’ scientific investigation skills in a laboratory class. When referring to extended constructed-response item formats that can be a form of performance tasks, the Standards indicate that examinees must create their own responses, which may result in a few sentences, a paragraph, a diagram or a mathematical proof (AERA et al., 2014, pp. 217-218).

Performance tasks are contextualized and may assess the process used by students to solve the task or create a product, such as a sculpture or a persuasive essay. They may involve the use of hands-on activities, such as building a model or using scientific equipment, or they may require students to produce an original response to a constructed-response item, to write an essay or to write a position paper. They may require students to articulate their reasoning or to provide their own approaches to solving problems. They can include opportunities for self-reflection and collaboration as well as student choice. Performance assessments may also allow for a particular task to yield scores in more than one content domain. An example of a large-scale performance assessment that embodied many of these features was the Maryland State Performance Assessment Program (MSPAP; Maryland State Board of Education, 1995). MSPAP required collaborative efforts in that students worked together on solving tasks, such as science investigations, using ancillary materials. Tasks were also designed to produce scores in more than one content domain, which has practical as well as pedagogical appeal, but can lead to psychometric challenges, such as scores in one academic discipline being overly influenced by performance in another academic discipline. Other examples of large-scale K–12 educational performance tasks include sections and items developed for Advanced Placement, International Baccalaureate, PARCC and Smarter Balanced assessments and some state writing assessments.

When the resources an assessment requires do not exceed the resources available, the assessment can be said to be practical or feasible. Practicality concerns the adequacy of resources and how they are allocated in the design, development, and use of assessments. Resources to be considered are human resources, material resources, and time. Human resources include test designers, test writers, scorers, test administrators, data analysts, and clerical support. Material resources include space (rooms for test development and test administration), equipment (word processors, tape and video recorders, computers, scoring machines), and materials (paper, pictures, audio- and videotapes or disks, library resources). Time resources include the time available for the design, development, pilot testing, and other aspects of assessment development; assessment time (the time available to administer the assessment); and scoring and reporting time. All of these resources have cost implications as well.

In most assessment situations, these resources will not be unlimited. Thus, there will be inevitable trade-offs in balancing the quality standards discussed above with what is feasible with the available resources. Braun discussed a trade-off between validity and efficiency in the design of performance assessments. There may be a gain in validity because of better construct representation, greater authenticity, and more useful information. However, this gain comes at a cost: the expense of developing and scoring the assessment, the additional testing time required, and lower levels of reliability. The reader is referred to Bachman and Palmer (1996) for a discussion of issues in assessing practicality and balancing the qualities of assessments in language tests.

Bob Bickerton spoke about practicality issues in the adult education environment. He noted that the limited hours many ABE students attend class directly affect the practicality of obtaining the desired score gains: students in this population are unlikely to persist long enough to be posttested and, even if they are, are unlikely to show a gain as measured by the NRS. John Comings said his research indicated that for a student to achieve a 75 percent likelihood of making a gain of one grade level equivalent or one student performance level, he or she would have to receive 150 hours of instruction (Comings, Sum, and Uvin, 2000). Bickerton added that Massachusetts has calculated that it takes an average of 130 to 160 hours to complete one grade level equivalent or student performance level (see SMARTT ABE http://www.doe.mass.edu/acls [April 29, 2002]). The NRS defines six ABE levels and six ESOL levels. A comparison of the NRS levels with currently available standardized tests indicates that each NRS level spans approximately two grade level equivalents or student performance levels.
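The figures above can be combined into a rough back-of-the-envelope estimate. This is only a sketch under the stated assumptions: the 130-to-160-hour range is the Massachusetts estimate for one grade level equivalent, and the two-GLE span per NRS level is an approximation from the text.

```python
# Rough estimate of instructional hours needed to advance one NRS level,
# based on the figures cited above (assumptions, not established values).
HOURS_PER_GLE = (130, 160)   # Massachusetts SMARTT ABE estimate per grade level equivalent
GLES_PER_NRS_LEVEL = 2       # each NRS level spans approximately two GLEs

low = HOURS_PER_GLE[0] * GLES_PER_NRS_LEVEL
high = HOURS_PER_GLE[1] * GLES_PER_NRS_LEVEL
print(f"Estimated hours to advance one NRS level: {low}-{high}")
# → Estimated hours to advance one NRS level: 260-320
```

On these assumptions, advancing one NRS level would require roughly 260 to 320 hours of instruction, which helps explain why learners with limited attendance hours rarely show a measured level gain.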

What are the characteristics of performance-based assessment?

Performance-based assessments share the key characteristic of accurately measuring one or more specific course standards. They are also complex, authentic, process/product-oriented, open-ended, and time-bound.

What are some examples of performance assessment?

Examples of performance assessments include composing a few sentences in an open-ended short response, developing a thorough analysis in an essay, conducting a laboratory investigation, curating a portfolio of student work, and completing an original research paper.

What is the most important characteristic of a performance task?

Performance tasks yield a tangible product and/or performance that serve as evidence of learning. Unlike a selected-response item (e.g., multiple-choice or matching) that asks students to select from given alternatives, a performance task presents a situation that calls for learners to apply their learning in context.

What are the strengths of performance-based assessment?

This section considers some advantages of using performance-based assessments:

- Direct observation of student learning
- Good instructional alignment
- Interesting assessments
- Instructional feedback
- Measurement of multiple objectives and concepts
- Active student learning
- Higher-order thinking skills