Enhanced Probabilistic K-Means Clustering for Student Course Classification
Keywords:
Logistic Regression, Decision Tree, Random Forest, Naïve Bayes, K-Nearest Neighbors (KNN), Support Vector Machine (SVM)
Abstract
Student course classification plays a crucial role in understanding academic patterns and guiding educational decisions. This paper presents a probability-based K-Means clustering approach that groups students into distinct academic clusters based on attributes such as CGPA, age, and branch transition history. The proposed methodology integrates probabilistic selection into standard K-Means, so that centroid assignments are adjusted dynamically according to probabilistic weight factors. A student dataset is first reduced in dimensionality using principal component analysis (PCA) and then clustered to identify distinct academic patterns. The results are visualized as student distributions in a two-dimensional PCA space, with cluster centers representing core academic categories, including original branches and students who transitioned to a different course. The experimental results indicate that this enhanced clustering technique improves classification accuracy compared to traditional clustering methods, and the approach gives institutions a structured framework for analyzing student trends and improving academic advising.
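The pipeline described in the abstract (PCA projection to two dimensions, followed by K-Means with a probabilistic component) can be illustrated with a minimal sketch. The abstract does not specify how the probabilistic selection or weight factors are defined, so this sketch assumes one common interpretation: centroids are seeded by sampling points with probability proportional to their squared distance from the centers already chosen (k-means++ style seeding), followed by standard Lloyd iterations. The synthetic data, the column choices, the function names `pca_2d` and `probabilistic_kmeans`, and the value `k=3` are all hypothetical and are not taken from the paper.

```python
import numpy as np

def pca_2d(X):
    """Standardize the features and project onto the top two principal components."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)  # assumes no constant column
    # Rows of Vt are the principal directions of the standardized data.
    _, _, Vt = np.linalg.svd(Xs, full_matrices=False)
    return Xs @ Vt[:2].T

def sq_dists(Z, centers):
    """Squared Euclidean distance from every point to every center."""
    return ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)

def probabilistic_kmeans(Z, k, n_iter=50, seed=0):
    """K-Means with probabilistic seeding (an assumed reading of the abstract)."""
    rng = np.random.default_rng(seed)
    # First center: uniform random point.
    centers = [Z[rng.integers(len(Z))]]
    # Remaining centers: sample with probability proportional to the
    # squared distance to the nearest center chosen so far.
    for _ in range(k - 1):
        d2 = sq_dists(Z, np.array(centers)).min(axis=1)
        centers.append(Z[rng.choice(len(Z), p=d2 / d2.sum())])
    centers = np.array(centers)
    # Standard Lloyd iterations: assign, then recompute centroids.
    for _ in range(n_iter):
        labels = sq_dists(Z, centers).argmin(axis=1)
        new_centers = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment against the final centers.
    labels = sq_dists(Z, centers).argmin(axis=1)
    return labels, centers

# Hypothetical synthetic student data: CGPA, age, branch-change flag.
rng = np.random.default_rng(42)
n = 150
cgpa = rng.normal(7.5, 1.0, n).clip(0, 10)
age = rng.integers(18, 24, n).astype(float)
changed = rng.integers(0, 2, n).astype(float)
X = np.column_stack([cgpa, age, changed])

Z = pca_2d(X)                                  # 2-D PCA space
labels, centers = probabilistic_kmeans(Z, k=3)
print(Z.shape, centers.shape, np.bincount(labels, minlength=3))
```

The returned `centers` correspond to the cluster centers plotted in the two-dimensional PCA space, and `labels` gives each student's cluster. Other readings of "probabilistic weight factors", such as soft (weighted) assignments, would change the update step but not the overall structure.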