Data SGP – Sources of Error in SGP Estimates
The data sgp provides access to aggregated student growth percentile (SGP) information for students, classes, schools, and districts. An aggregated SGP summarizes the typical growth of a group of students in a subject area, such as ELA or math. It is derived from the group's current and prior assessment scores in the same content area. SGPs are computed from the historical growth trajectories of Star examinees, and those trajectories are also used to estimate the scores a group would need in order to reach proficiency in the subject area.
SGPs are a widely used measure of student achievement and progress. However, there are a number of potential sources of error in the SGP calculations that can affect the accuracy of their estimated value and impact how they are used.
One such source of error is the relationship between true SGPs and student background characteristics. Because teachers are not randomly assigned to classrooms, a teacher's true SGPs are naturally related to the background characteristics of the students in that classroom. This relationship is reflected in the correlations shown in Figure 2. The magnitude of these relationships is troubling, because it represents a potential source of bias in SGPs aggregated at the teacher level.
Fortunately, these relationships can be addressed with a value-added model that regresses the current assessment score on teacher fixed effects and individual-level student covariates. This approach removes most of the variance in the estimated teacher effect that is attributable to student background, which weakens the relationship between the teacher-level estimates and student backgrounds and yields more accurate estimated values.
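To make the adjustment concrete, here is a minimal sketch of a teacher fixed-effects regression in plain Python. It is an illustration under stated assumptions, not the SGP software's method: it uses a single student covariate (the prior score), the function name `teacher_fixed_effects` is hypothetical, and the records are synthetic and noise-free so the result can be checked by hand.

```python
from collections import defaultdict


def teacher_fixed_effects(records):
    """Fit current = effect[teacher] + slope * prior by least squares.

    records: iterable of (teacher, prior_score, current_score).
    Returns (slope, {teacher: estimated fixed effect}).
    """
    by_teacher = defaultdict(list)
    for teacher, prior, current in records:
        by_teacher[teacher].append((prior, current))

    # Per-teacher means of the covariate and of the outcome.
    means = {
        t: (sum(p for p, _ in rows) / len(rows),
            sum(c for _, c in rows) / len(rows))
        for t, rows in by_teacher.items()
    }

    # Slope from teacher-demeaned data (the "within" estimator), which
    # matches OLS with one indicator column per teacher.
    num = den = 0.0
    for t, rows in by_teacher.items():
        p_bar, c_bar = means[t]
        for p, c in rows:
            num += (p - p_bar) * (c - c_bar)
            den += (p - p_bar) ** 2
    if den == 0:
        raise ValueError("covariate does not vary within any teacher")
    slope = num / den

    # Each teacher's effect is what remains of the class mean once the
    # covariate's contribution has been removed.
    effects = {t: c_bar - slope * p_bar for t, (p_bar, c_bar) in means.items()}
    return slope, effects


# Synthetic records (assumption): current = effect + 0.5 * prior, no noise.
records = [
    ("A", 400, 210), ("A", 420, 220), ("A", 440, 230),  # true effect 10
    ("B", 500, 245), ("B", 520, 255), ("B", 560, 275),  # true effect -5
]
slope, effects = teacher_fixed_effects(records)
```

In this toy data teacher B's class has the higher raw mean score, yet the lower estimated effect once prior scores are taken into account, which is the kind of adjustment the paragraph above describes.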
Another source of error is variation in the number of students a teacher teaches and in how those students are distributed. Both factors introduce sampling variability into the SGPs estimated for a group of students, which can lead to over- or under-reporting of their achievement levels. Aggregating the SGP estimates at the school or district level, where groups are larger, reduces this sampling variability.
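The effect of group size can be sketched with the standard error of a group's mean SGP. This is a simplified illustration: it assumes student SGPs behave like independent draws, the helper name `mean_sgp_with_se` is hypothetical, and the SGP values are invented.

```python
import math
import statistics


def mean_sgp_with_se(sgps):
    """Return (mean SGP, standard error of that mean) for one group."""
    if len(sgps) < 2:
        raise ValueError("need at least two students to estimate spread")
    mean = statistics.fmean(sgps)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    se = statistics.stdev(sgps) / math.sqrt(len(sgps))
    return mean, se


# Invented values: one small classroom, and a school made of four
# classrooms with the same spread of student SGPs.
classroom = [30, 50, 50, 70]
school = classroom * 4

class_mean, class_se = mean_sgp_with_se(classroom)
school_mean, school_se = mean_sgp_with_se(school)
```

Both groups have the same mean SGP, but the school-level standard error is less than half the classroom-level one, so the school-level figure is the less noisy of the two.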
sgptData_LONG is an anonymized panel data set that contains 5 years of assessment records in LONG format for students in grades 4 through 8 and grade 10. The data set contains the following variables: VALID_CASE, CONTENT_AREA, YEAR, ID, SCALE_SCORE, GRADE, and ACHIEVEMENT_LEVEL. All of these variables are required for SGP analyses except GRADE and ACHIEVEMENT_LEVEL, which are needed only when running student growth projections. The sgptData_LONG data set also contains an additional variable indicating whether the record contains a valid prior test score. See the sgptData_LONG vignette for more detailed documentation on how to use this data set for SGP analyses.
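As a rough picture of what LONG format means here (one row per student, content area, and year), the sketch below builds two invented records and checks them against the variable names listed above. The helper `missing_variables` and all the record values are assumptions for illustration and are not part of the SGP software. The split between always-required and projection-only variables follows the description in this article.

```python
# Variable names as listed in the text above.
REQUIRED = ("VALID_CASE", "CONTENT_AREA", "YEAR", "ID", "SCALE_SCORE")
PROJECTION_ONLY = ("GRADE", "ACHIEVEMENT_LEVEL")


def missing_variables(records, projections=False):
    """List the expected variables that appear in none of the records."""
    needed = REQUIRED + (PROJECTION_ONLY if projections else ())
    present = set()
    for record in records:
        present.update(record.keys())
    return [name for name in needed if name not in present]


# Invented rows: the same student appears once per year (LONG format).
records = [
    {"VALID_CASE": "VALID_CASE", "CONTENT_AREA": "MATHEMATICS",
     "YEAR": "2022", "ID": "1001", "SCALE_SCORE": 512, "GRADE": "4"},
    {"VALID_CASE": "VALID_CASE", "CONTENT_AREA": "MATHEMATICS",
     "YEAR": "2023", "ID": "1001", "SCALE_SCORE": 547, "GRADE": "5"},
]

basic_gaps = missing_variables(records)
projection_gaps = missing_variables(records, projections=True)
```

With these rows the basic check finds nothing missing, while the projection check reports that ACHIEVEMENT_LEVEL is absent.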