A quantitative analysis of Educational Data through the Comparison between Hierarchical and Not-Hierarchical Clustering
Dipartimento di Fisica e Chimica, University of Palermo, Italy
Dipartimento di Matematica e Informatica, University of Palermo, Italy
Online publication date: 2017-07-12
Publication date: 2017-07-12
Corresponding author
Onofrio Rosario Battaglia   

Dipartimento di Fisica e Chimica, University of Palermo, viale delle Scienze edificio 18, 90100 Palermo, Italy
EURASIA J. Math., Sci Tech. Ed 2017;13(8):4491-4512
Many research papers have addressed the problem of separating a data set into subgroups by means of Cluster Analysis. However, the variables and parameters involved in Cluster Analysis have not always been clearly described and critically examined, especially in the field of Science Education. Moreover, the Science Education literature does not discuss a comparison between two different clustering methods. In this study, students' conceptions about modeling in physics are investigated using an open-ended questionnaire, and the answers are analyzed with both hierarchical and non-hierarchical clustering methods. The clustering results obtained with the two methods are compared and show good coherence. The results are interpreted and compared with those reported in the literature. Using the two clustering methods together yields more detailed and robust information about the modeling concept. From a pedagogical point of view, the study provides more detail about the relationships among different student conceptions of modeling in physics.
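As an illustration of the kind of comparison the abstract describes, the following minimal sketch clusters the same coded answers with one hierarchical and one non-hierarchical method and measures how far the two partitions agree. It is not the authors' procedure: the binary coding of answers, the Euclidean distance, average linkage, k-means as the non-hierarchical method, the farthest-point initialization, and the Rand index as agreement measure are all assumptions made for the example, and the function names are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.cluster.vq import kmeans2
from scipy.spatial.distance import pdist


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two partitions agree."""
    a = np.asarray(labels_a)
    b = np.asarray(labels_b)
    same_a = a[:, None] == a[None, :]   # pair grouped together in partition a?
    same_b = b[:, None] == b[None, :]   # pair grouped together in partition b?
    upper = np.triu_indices(len(a), k=1)  # count each unordered pair once
    return float(np.mean(same_a[upper] == same_b[upper]))


def farthest_point_init(data, n_clusters):
    """Deterministic starting centroids (an assumption of this sketch):
    begin with the first row, then repeatedly add the row farthest
    from all rows chosen so far."""
    chosen = [0]
    min_dist = np.linalg.norm(data - data[0], axis=1)
    while len(chosen) < n_clusters:
        nxt = int(np.argmax(min_dist))
        chosen.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(data - data[nxt], axis=1))
    return data[chosen]


def compare_clusterings(answers, n_clusters):
    """Cluster rows of `answers` (students x coded answer strategies)
    hierarchically and with k-means, and return both label vectors
    together with their Rand index."""
    data = np.asarray(answers, dtype=float)

    # Hierarchical: average linkage on pairwise Euclidean distances,
    # with the tree cut so that at most n_clusters groups remain.
    tree = linkage(pdist(data, metric="euclidean"), method="average")
    hier_labels = fcluster(tree, t=n_clusters, criterion="maxclust")

    # Non-hierarchical: k-means started from explicit centroids.
    init = farthest_point_init(data, n_clusters)
    _, km_labels = kmeans2(data, init, minit="matrix")

    return hier_labels, km_labels, rand_index(hier_labels, km_labels)
```

Calling `compare_clusterings(answers, 2)` on a students-by-strategies 0/1 matrix returns the two label vectors and a value between 0 and 1; a value of 1 means both methods put exactly the same pairs of students together, regardless of how the cluster labels are numbered.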