Abstract
In recent years, two communities have grown around a shared interest in how big data can be exploited to benefit education and the science of learning: Educational Data Mining (EDM) and Learning Analytics. This article discusses the relationship between these two communities and the key methods and approaches of educational data mining. It describes how these methods emerged in the early days of research in this area, which methods have seen particular interest in the EDM and learning analytics communities, and how this has changed as the field has matured and moved to making significant contributions to both educational research and practice.
Copyright information
© 2014 Springer Science+Business Media New York
About this chapter
Cite this chapter
Baker, R.S., Inventado, P.S. (2014). Educational Data Mining and Learning Analytics. In: Larusson, J., White, B. (eds) Learning Analytics. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-3305-7_4
DOI: https://doi.org/10.1007/978-1-4614-3305-7_4
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-3304-0
Online ISBN: 978-1-4614-3305-7
eBook Packages: Humanities, Social Sciences and Law, Education (R0)