Optimization and improvement of fake news detection using voting technique for societal benefit
Published in 2023 IEEE International Conference on Data Mining Workshops (ICDMW), 2023
The surge of false information and fake news on the Internet has made it increasingly difficult for fact-checkers to keep up. This exponential growth poses a serious threat, as fake news has been extensively exploited to manipulate public opinion and undermine trust in reliable sources. Previous studies have employed machine learning classifiers to address this issue, but existing work in text classification often overlooks contextual information, a gap that our proposed methodology seeks to fill. Our approach applies sophisticated text preprocessing techniques to capture subtle linguistic features, thereby enhancing the overall understanding of the text. We also draw on ensemble machine learning strategies: we adopt a voting system in which the class predicted most frequently by five distinct classifiers is chosen. This ensemble method helps address the inherent limitations of individual classifiers and improves the robustness of our results. In this study, we present a comparative analysis of five individual classifiers (Logistic Regression, Decision Trees, Naive Bayes, eXtreme Gradient Boosting, and Stochastic Gradient Descent) along with their combination using our ensemble voting technique. We conduct experiments on three real-world datasets of varying sizes and contexts. Our findings show that the voting technique improves performance over the individual classifiers in distinguishing between real and fake news, and they provide insight into its efficacy across diverse contexts.
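The voting rule described in the abstract (choose the class predicted most often by the five classifiers) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `majority_vote`, the label strings, and the toy predictions below are all assumptions made for illustration, and the sketch uses only the Python standard library rather than any particular machine learning toolkit.

```python
from collections import Counter


def majority_vote(predictions):
    """Hard voting: for each sample, return the label predicted most often.

    `predictions` is a list with one entry per classifier; each entry is a
    list of predicted labels, one per sample. (Hypothetical helper, not
    taken from the paper.)
    """
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("every classifier must predict every sample")

    voted = []
    # zip(*predictions) groups the votes cast for the same sample.
    for sample_votes in zip(*predictions):
        label, _count = Counter(sample_votes).most_common(1)[0]
        voted.append(label)
    return voted


# Toy, made-up predictions for four articles from the five classifier
# families named in the abstract. These are NOT results from the paper.
toy_predictions = {
    "logistic_regression": ["fake", "real", "fake", "real"],
    "decision_tree":       ["fake", "fake", "real", "real"],
    "naive_bayes":         ["real", "real", "fake", "real"],
    "xgboost":             ["fake", "real", "fake", "fake"],
    "sgd":                 ["fake", "fake", "fake", "real"],
}

ensemble_labels = majority_vote(list(toy_predictions.values()))
print(ensemble_labels)
```

With an odd number of voters and two classes, as here, a tie cannot occur, so every sample receives a well-defined majority label. In the toy data, each individual classifier disagrees with the majority on at most one article, which illustrates how voting can mask the isolated errors of a single model.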