Using data mining to predict K-12 students' performance on large-scale assessment items related to energy
J. Res. Sci. Teach. 45: 554-573, 2008
Xiufeng Liu, Miguel E. Ruiz
This article reports a study on using data mining to predict K-12 students' competence levels on test items related to energy. The data sources are the 1995 Third International Mathematics and Science Study (TIMSS), the 1999 TIMSS-Repeat, the 2003 Trends in International Mathematics and Science Study (TIMSS), and the National Assessment of Educational Progress (NAEP). The object of prediction is student population performance, that is, percentage correct. Two data mining algorithms, C4.5 and M5, are used to construct a decision tree and a linear function, respectively, to predict students' performance levels. A combination of factors related to the content, context, and cognitive demand of items and to students' grade levels is found to predict student population performance on test items. Cognitive demand makes the most significant contribution to the prediction. The decision tree and the linear function agree with each other in their predictions. We end the article by discussing the implications of the findings for future science content standard development and for the teaching of energy concepts.
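As a rough illustration of how a C4.5-style decision tree chooses which item factor to split on, the sketch below computes the gain ratio (the split criterion C4.5 is known for) over a tiny table of items. Everything in it is an assumption made for illustration: the item rows, the attribute names `cognitive_demand` and `grade`, and the `level` labels are hypothetical and are not taken from TIMSS, NAEP, or the article. It shows only the split-selection step, not pruning, continuous attributes, or the M5 linear function.

```python
from collections import Counter
from math import log2

# Hypothetical toy items, invented for illustration only.
# These rows are NOT data from TIMSS, NAEP, or the article.
ITEMS = [
    {"cognitive_demand": "recall", "grade": "4", "level": "high"},
    {"cognitive_demand": "recall", "grade": "8", "level": "high"},
    {"cognitive_demand": "analyze", "grade": "4", "level": "low"},
    {"cognitive_demand": "analyze", "grade": "8", "level": "low"},
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of categorical labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attribute, target="level"):
    """Information gain of splitting on `attribute`, divided by split information."""
    base = entropy([row[target] for row in rows])
    # Group the target labels by the value of the candidate attribute.
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    total = len(rows)
    # Weighted entropy that remains after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    # Split information penalizes attributes with many distinct values.
    split_info = entropy([row[attribute] for row in rows])
    if split_info == 0:
        return 0.0  # attribute has a single value; it cannot split the data
    return (base - remainder) / split_info


def best_split(rows, attributes, target="level"):
    """Return the attribute with the highest gain ratio."""
    return max(attributes, key=lambda a: gain_ratio(rows, a, target))


if __name__ == "__main__":
    for name in ("cognitive_demand", "grade"):
        print(name, gain_ratio(ITEMS, name))
    print("split on:", best_split(ITEMS, ["cognitive_demand", "grade"]))
```

In this invented table the labels were constructed so that `cognitive_demand` separates them perfectly while `grade` does not, so the sketch selects `cognitive_demand` as the root split. That outcome reflects how the toy rows were built, not a finding of the study.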