The Data SGP Toolkit
The term “big data” is often used to refer to datasets that are too large for traditional analytical applications. By comparison, SGP datasets are relatively small; by a number of measures they are substantially smaller than, for example, the data generated by Facebook interactions. Nevertheless, assembling and managing this data is challenging. The Data SGP project offers a set of tools to address this challenge.
The Data SGP toolkit provides an efficient means of organizing longitudinal (time-dependent) student assessment data for student growth percentile (SGP) analyses and the associated growth plots. The toolkit supports two common formats for this data: WIDE and LONG. In WIDE data, each row represents a single student and time-dependent variables are stored in separate columns; in LONG data, time-dependent variables are spread across multiple rows per student. The toolkit also supports access to state-specific data in the embedded SGPstateData metadata.
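The difference between the two layouts can be illustrated with a small sketch. The Python code below is not part of the toolkit; the column names (ID, YEAR, SCALE_SCORE) and the sample values are assumptions chosen for illustration. It reshapes WIDE records, with one row per student, into LONG records, with one row per student per year.

```python
# Hypothetical WIDE records: one row (dict) per student, with one
# score column per year. Column names and values are illustrative.
wide_rows = [
    {"ID": "s1", "SCALE_SCORE_2022": 300, "SCALE_SCORE_2023": 370},
    {"ID": "s2", "SCALE_SCORE_2022": 410, "SCALE_SCORE_2023": 425},
]

PREFIX = "SCALE_SCORE_"


def wide_to_long(rows, prefix=PREFIX):
    """Return one record per student per year from WIDE-format rows."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            # Only the year-specific score columns are unpacked;
            # missing scores (None) produce no LONG record.
            if column.startswith(prefix) and value is not None:
                year = int(column[len(prefix):])
                long_rows.append(
                    {"ID": row["ID"], "YEAR": year, "SCALE_SCORE": value}
                )
    # Sort so that each student's records appear in time order.
    long_rows.sort(key=lambda r: (r["ID"], r["YEAR"]))
    return long_rows


long_rows = wide_to_long(wide_rows)
for record in long_rows:
    print(record)
```

Each student in the WIDE input contributes one LONG record per year with a recorded score, so the two students above yield four records.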
The toolkit provides a simple user interface for displaying the resulting growth plots. The UI allows users to select the student and teacher for whom they wish to view plots, as well as the prior and current year(s) for which the plots are being generated. A summary of the results is then displayed. The results can then be downloaded in a variety of formats, including HTML, PDF and Excel files.
A graphical display of the error variance is also available to assist in understanding the uncertainty associated with individual SGP estimates. The displayed variance is a function of the reliability level selected for the estimate and the number of observations used in its calculation.
This display illustrates that the variance in estimated SGPs increases with the length of time over which the estimates are made. This is a property of the maximum likelihood methodology used to produce the estimates. However, the display also indicates that the uncertainty in estimated SGPs is substantially lower when a value-added model is used to produce them.
Our analyses indicate that aggregated estimates of SGPs contain a substantial source of variance due to individual-level relationships between student characteristics and true SGPs. This source of variance is likely to make it difficult for teachers to interpret aggregated SGPs as a measure of teacher effectiveness. This is an additional reason for our preference for value-added models that regress current test scores on prior test scores, teacher fixed effects and student background variables.
SGP estimates are based on the difference between a student’s scale score on a statewide assessment in one year and the student’s scale score in the subsequent year. These differences are then compared to the state average in order to identify students who have made the greatest improvement. A typical example is a sixth grader who achieves a scale score of 370 this year on the Wisconsin Forward Exam after scoring 300 last year on the Badger Exam, an improvement of 70 points in English language arts.
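The arithmetic in this example can be sketched in a few lines of Python. This is a minimal illustration of the year-over-year comparison described above, not the toolkit's own code. The student identifiers and scores are made up, and the average gain of this small sample stands in for the state average, which is an assumption.

```python
from statistics import mean

# Hypothetical (prior-year, current-year) scale scores; values are made up.
scores = {
    "student_a": (300, 370),
    "student_b": (410, 425),
    "student_c": (350, 365),
}


def score_gains(score_pairs):
    """Return the current-minus-prior score difference for each student."""
    return {
        student: current - prior
        for student, (prior, current) in score_pairs.items()
    }


gains = score_gains(scores)

# Assumption: the mean gain of this sample stands in for the state average.
average_gain = mean(list(gains.values()))

# Students whose gain exceeds the average gain.
above_average = sorted(s for s, g in gains.items() if g > average_gain)

print(gains["student_a"])  # 370 - 300 = 70
print(above_average)
```

With these sample values, student_a gains 70 points and is the only student whose gain exceeds the sample average.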