|
Chapter Three
Desired Attributes of the Assessment
Assessment Issues
NAEP functions to provide information about (1) the knowledge and scientific understanding of the nation’s youth and (2) the features of science education programs that relate to high levels of student achievement. Consequently, the design plan for the NAEP Science Assessment includes strategies for measurement in both of these areas. Research has produced practical and theoretical knowledge that is important to the assessment design process.
The Nature of Testing and the Nature of Knowledge and Learning
Except for early assessments, NAEP Science Assessments have consisted primarily of short-answer, paper-and-pencil questions that were mostly multiple choice. The previous assessments tended to focus on discrete components of science, each of which was usually learned independently of the others. Hence, tests were made up of independent items, each comparable to the others and weighted the same.
This Framework is based on a different view. It holds that scientific knowledge should be organized to provide a structure that connects and creates meaning for factual information, and that this organization is influenced by the context in which the knowledge is presented. Learning is perceived as an activity in which the learner interacts with the physical world, with peers and teachers, and with the scientific community. In this view, science proficiency depends on the ability to know facts and integrate them into larger constructs and to use the tools, procedures, and reasoning processes of science for an increased understanding of the natural world. Rather than concentrate on facts in isolation, assessments will reflect the organization and structure of scientific knowledge and the nature of learning in science.
Because scientific knowledge is expanding faster than can be accommodated by any curriculum, teachers and assessment designers must make choices about what topics, concepts, and factual information to address. Consequently, the NAEP Science Assessment Framework concentrates on assessing students’ ability to relate basic facts and concepts as well as their ability to discuss and evaluate approaches to science-related problems. The Framework also stresses that an assessment of what students know and can do must employ techniques reflective of the nature of science.
Inferring Understanding From Student Responses
Test items present students with tasks that may require a range of responses, from recall of factual information to performance of scientific investigations to complex reasoning. Based on analysis of the responses, experts in the field make inferences about students’ understanding; that is, the knowledge and reasoning skills that are assumed to have produced the responses. The validity of these inferences is a central issue in assessment.
Responses to exercises designed to assess thinking or mental processing are generally more difficult to interpret than responses to items designed to assess factual knowledge. In practice, basing an assessment of the quality of mental processing on short responses is problematic. Often the decision about the mental processes applied is based only on the accuracy of the factual knowledge in the answer. When the answer is factually correct, the observer infers that the mental processes represent scientific reasoning (for example, understanding the information in the test question stem, retrieving scientific knowledge from memory, reasoning from the stem to the correct response, or eliminating incorrect responses). However, this is not necessarily a valid inference.
An incorrect answer may be the result of misinformation, not flawed reasoning; a correct answer is not necessarily the product of sound reasoning. Illogical thinking, a wrong assumption, or incorrect information can produce a seemingly correct answer. Furthermore, a correct answer may not require any higher order thinking at all; it simply may have been recalled. Only if the student response includes some indication of how the answer was obtained will those who score the assessment have information from which to choose among alternative interpretations.
Science entails observing objects and phenomena in the natural world and collecting and interpreting information about them. For this reason, pencil-and-paper tests have been criticized as too limited to assess what students know and can do in science. Over the past several years, groups have developed assessment exercises that engage students in performance tasks using scientific equipment and materials. Student responses are recorded by an observer or by the students themselves in written form. However, these exercises have limitations as well. Indeed, in the course of developing the new Framework, numerous examples of performance exercises were examined with respect to science concepts and reasoning, but many did not stand up to rigorous analysis. These exercises might have led science teachers and educators to draw faulty inferences.
The following is typical of a performance exercise that is counterproductive: an exercise requires a student to identify several unknown substances by means of indicators, but the student is given minutely detailed directions for performing each step in the identification process. Unfortunately, even if the answers are correct, the only inference to be drawn is that the student can follow written instructions. A test item formulated with such detailed, step-by-step directions reduces to zero the science understanding needed for problem solving.
Many of the so-called performance assessment tasks that were reviewed turned out to be standard laboratory exercises that, again, were reduced to "follow-the-instructions" problems. No inferences about a student’s knowledge of science or its tools and procedures can be drawn from such exercises. To test higher order thinking skills (a major goal of performance assessments), problems need to be placed in new contexts, applied to new situations, or have new elements introduced that preclude students’ recalling what they have done before (Resnick, 1987).
Class, culture, ethnicity, gender, language ability, and access to quality instruction may influence the manner in which science is learned and the manner in which science attitudes and knowledge are produced. Hence, individuals need opportunities to demonstrate knowledge or competencies in different contexts. Assessment techniques that show group differences are more likely to reveal problems with student learning and classroom instruction than with assessment per se. However, this does not eliminate the assessment community’s responsibility to the broader society. In addressing the issues of pluralism, multiple assessment methods may be more effective than any one method, no matter how well it is developed.
Currently, some state pilot testing efforts are providing new ideas about assessment exercise and task formats. These pilot activities are also aimed at assessing new types of information relevant to new curriculum guides emerging in the states. The experimental assessment work (much of it pioneered in the United Kingdom) uses new approaches, including performance tasks, open-ended tasks, and new types of multiple-choice questions that are thematic or conceptual or that ask students to explain their choices in short written answers.
Moreover, state assessments are experimenting with collecting information on other meaningful outcomes: sustained student work, proficiency in designing and conducting experiments, and fluency of ideas critical to the natural sciences and related fields. They are also experimenting with innovative reporting approaches. The performance measures are created through consensual judgment about what students should know and be able to do at given grade levels or developmental stages.
Criteria for Assessing Learning and Achievement in Science
The Framework for the NAEP Science Assessments has been developed according to the following broad guidelines:
Achievement levels describe how well students perform on the content and thinking levels required by the assessment. They evaluate the quality of the outcomes of students’ education in science at grades 4, 8, and 12 as measured by NAEP. Three achievement levels—Basic, Proficient, and Advanced—have been defined for each grade level assessed by NAGB:
Appendix A lists the final achievement level descriptions for students participating in the 2005 NAEP Science Assessment in grades 4, 8, and 12. The assessment was constructed to measure and report student performance according to the three levels of achievement, as required by NAEP policy.
|