Who scores the student responses?
The people who score the student responses are college graduates who possess at least a bachelor's degree. Whenever possible, educators are hired as scorers.
Are California teachers involved in the scoring process?
Yes. California teachers select the sample student responses used to train the readers.
How are the scorers trained to score student responses?
Each prospective scorer is required to participate in extensive computer-based training.
The training consists of the following:
- General information about the Electronic Performance Evaluation Network (ePEN™).
- Background information about the California Standardized Testing and Reporting (STAR) Program.
- Information on the STAR writing tasks.
- Explanations of STAR scoring rubrics and scoring principles.
- Sets of prescored annotated training papers. The training papers include anchor and practice papers. Anchor papers are solid samples of student writing for each score point (1-4). Practice papers are samples of student writing that demonstrate the “high” and “low” end of each score point. The training also includes Qualification Sets that prospective scorers must pass before scoring STAR writing tasks.
How does someone QUALIFY to score?
After completing the training, participants must complete the Qualification Sets (three sets of papers consisting of ten papers per set) before being eligible to score. To become a scorer, participants must score with exact accuracy on at least 70 percent of the papers, achieve agreement with the predetermined score on two of the sets or as an average across two of the three sets, and have no nonadjacent scores (more than one point apart from the predetermined score).
Scorers continue to qualify throughout the scoring process. Before each scoring session, each scorer will score a Calibration Set of three to four papers. The scores on these sets have been previously agreed upon by scoring directors, in conjunction with other personnel. The sets are given to scorers to ensure that the accuracy of their scoring does not drift. These sets “calibrate” the scorers. Often, a new Calibration Set is given before afternoon scoring. Calibration Sets can also be given to individual scorers who might be struggling.
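The qualification criteria described above can be read in more than one way, so the following Python sketch shows only one possible interpretation, not the contractor's actual procedure. It assumes that "exact accuracy" means the assigned score equals the predetermined score, that the 70 percent threshold must be met either on each of the two best sets or as an average across those two sets, and that a single nonadjacent score on any paper disqualifies the participant. The function names are hypothetical.

```python
from fractions import Fraction

# Threshold taken from the text: exact agreement on at least 70 percent of papers.
THRESHOLD = Fraction(70, 100)


def exact_rate(assigned, expected):
    """Fraction of papers whose assigned score equals the predetermined score."""
    matches = sum(a == e for a, e in zip(assigned, expected))
    return Fraction(matches, len(expected))


def has_nonadjacent(assigned, expected):
    """True if any assigned score is more than one point from the predetermined score."""
    return any(abs(a - e) > 1 for a, e in zip(assigned, expected))


def qualifies(sets):
    """sets: list of (assigned_scores, predetermined_scores), one pair per Qualification Set.

    Assumed reading of the rule: no nonadjacent scores anywhere, and the two
    best sets meet the threshold individually or on average.
    """
    if any(has_nonadjacent(a, e) for a, e in sets):
        return False
    rates = sorted((exact_rate(a, e) for a, e in sets), reverse=True)
    best_two = rates[:2]
    passed_two_sets = all(r >= THRESHOLD for r in best_two)
    passed_on_average = sum(best_two) / len(best_two) >= THRESHOLD
    # Under this reading the first condition implies the second; both are
    # kept so the code mirrors the wording of the rule.
    return passed_two_sets or passed_on_average
```

Exact fractions are used instead of floating-point numbers so that a result of exactly 70 percent is never rejected because of rounding.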
Who are scoring supervisors, and how are they selected?
Scoring supervisors monitor and mentor scorers during operational scoring. Scorers with a history of achieving the highest accuracy on the Qualification Sets and the strongest scoring consistency and validity statistics during project scoring are selected as scoring supervisors. Approximately ten scorers are assigned to one scoring supervisor; this ratio allows scoring supervisors to work closely with each scorer. The ePEN system also allows scoring supervisors to continuously monitor each response scored and the score point assigned to ensure the highest accuracy possible.
All scoring supervisors participate in a two-day training session that provides the same training that qualifies scorers. If a scoring supervisor does not achieve the accuracy required on the Qualification Sets, he or she will not be allowed to be a supervisor. In addition, all supervisors receive extensive training on how the ePEN system works, how to best manage scorers, and how to maintain accuracy as scoring continues.
Who are scoring directors, and how are they selected?
Scoring directors are responsible for overseeing the scoring of the grade level to which they are assigned. They provide leadership for the scoring supervisors, help manage the scorers, and are ultimately responsible for maintaining the highest accuracy possible during STAR scoring.
Scoring directors represent the best of the scoring supervisors. They typically have two to three years of experience as scoring supervisors. They have demonstrated a thorough understanding of STAR scoring and very strong leadership qualities.
How are the training papers chosen?
At the range-finding sessions, participants identify papers to be used to train scorers. Working with the scoring directors, content specialists from Pearson Educational Measurement, the STAR scoring contractor, review the student papers identified at the range finding and select sample papers to represent each of the score points. They choose papers that illustrate the criteria in the scoring guides and show the different ways students responded to the topic. The responses are then put into sets used to train and qualify scorers.
Are student responses scored on computers?
Yes. All student responses are scanned into the ePEN™ system. Scorers view assigned responses on a computer at one of Pearson Educational Measurement’s regional scoring centers. The screen does NOT display the student’s name or background information; the scorer sees only the student response. Scoring supervisors and scoring directors are on site to monitor the scoring.
How many times is each student response scored?
Beginning in 2006, each student response will be scored by one scorer. In addition, ten percent of the student responses will be scored by a second reader. When a second reader does read a student response, the second reader’s score is not counted toward the student’s overall score on the writing test. The second reader’s score is used only to monitor the accuracy and reliability of the scoring.
How many points can a student response earn, and how are the scores reported?
Each scorer will give a student response a score ranging from one to four, and that score will be doubled to produce the student’s overall writing test score of two, four, six, or eight. The writing test score will be reported as the student’s score on the Writing Applications reporting cluster. This Writing Applications score will be combined with the student’s score on the multiple-choice portion of the CST in English-language arts (ELA). This combined score will then be converted to a scale score that adjusts for differences in the difficulties of the ELA CSTs from year to year and allows for comparison of scores over time. Scale scores can range from 150 to 600. A student’s scale score determines the student’s performance level result.
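The doubling step described above can be sketched in a few lines of Python. The function name is hypothetical. The later conversion to a scale score of 150 to 600 is not modeled, because the text does not give the conversion used.

```python
def writing_test_score(scorer_score):
    """Double a scorer's 1-4 score to get the overall writing test score (2, 4, 6, or 8)."""
    if scorer_score not in (1, 2, 3, 4):
        raise ValueError("scorer_score must be 1, 2, 3, or 4")
    return 2 * scorer_score
```

For example, a response that receives a 3 from the scorer has an overall writing test score of 6.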
How is the accuracy of scoring maintained throughout the scoring process?
The accuracy of all scoring is monitored on a regular basis. First, in those instances in which a scorer and second reader both read a student response, consistency of scoring is calculated based on whether the two scores assigned are identical, adjacent, or nonadjacent. This consistency measure is called inter-rater reliability. Scoring supervisors and scoring directors constantly monitor agreement percentages. If a scorer’s rate of agreement begins to decline, the scorer is retrained by a scoring supervisor or scoring director and closely monitored thereafter. If the scorer’s performance does not improve, the scorer is released.
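As an illustration of the agreement categories described above, the following Python sketch classifies each pair of scores as identical, adjacent, or nonadjacent and reports the percentage in each category. The function names and the output layout are assumptions made for this example; the text does not say how ePEN computes or displays these figures.

```python
from collections import Counter

CATEGORIES = ("identical", "adjacent", "nonadjacent")


def classify(first_score, second_score):
    """Label a pair of scores by how far apart they are."""
    difference = abs(first_score - second_score)
    if difference == 0:
        return "identical"
    if difference == 1:
        return "adjacent"
    return "nonadjacent"


def agreement_percentages(score_pairs):
    """score_pairs: list of (scorer_score, second_reader_score) for double-read responses."""
    if not score_pairs:
        raise ValueError("at least one double-read response is required")
    counts = Counter(classify(a, b) for a, b in score_pairs)
    total = len(score_pairs)
    return {name: 100 * counts[name] / total for name in CATEGORIES}
```

Given the pairs (3, 3), (2, 3), (4, 2), and (1, 1), half of the responses are identical, one quarter are adjacent, and one quarter are nonadjacent.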
Second, one in every 40 papers read by each scorer has been previously scored by scoring directors and scoring supervisors. These previously scored papers are referred to as validity papers. The consistency of the scorer’s ratings with the scores on the validity papers is checked throughout the day to ensure that each scorer applies the criteria in the scoring guides accurately. The validity papers are introduced throughout the scoring process. If a scorer’s validity falls below required levels, the scorer is retrained by a scoring supervisor or scoring director. If a scorer continues to show poor validity, the scorer is released.
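A running validity check of the kind described above might look like the following Python sketch. The class name, its methods, and the `required_percent` parameter are hypothetical. The text does not state the required validity level, so the caller must supply it.

```python
class ValidityTracker:
    """Tracks how often one scorer matches the prescored validity papers."""

    def __init__(self):
        self.matches = 0
        self.total = 0

    def record(self, assigned_score, prescored_score):
        """Record the scorer's result on one validity paper."""
        self.total += 1
        if assigned_score == prescored_score:
            self.matches += 1

    def validity_percent(self):
        """Percentage of validity papers scored exactly as prescored so far."""
        if self.total == 0:
            raise ValueError("no validity papers recorded yet")
        return 100 * self.matches / self.total

    def needs_retraining(self, required_percent):
        """True if validity is below the required level (supplied by the caller)."""
        return self.validity_percent() < required_percent
```

Because the tracker is updated after each validity paper, it can be checked at any point during the day rather than only at the end of a session.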
Third, scoring supervisors “back read” a certain percentage of the student responses that have been scored by the scorers. The scorer and supervisor scores are then compared to check the scorer’s consistency and reliability and to ensure the scorer is maintaining scoring standards. In addition, ePEN allows scoring directors to view the back-reading completed by scoring supervisors to ensure that scoring supervisors are maintaining accuracy. Scoring directors will also back read scorers.
Fourth, to help prevent drifting, scorers are required to score a Calibration Set before each scoring session. If a scorer is deficient on any of the accuracy indices, he or she is immediately retrained or released from the scoring process.
What is done to ensure that writing tasks do not contain offensive material?
All writing tasks are reviewed to ensure that they are free of bias and controversial content and that they exhibit sensitivity to all students. Any tasks that do not meet these criteria are either modified and resubmitted for field testing or removed from the pool of usable tasks.