A Novel Integration of Multiple Learning Methods for Detecting Misleading Information From Different Datasets During the Pandemic

dc.authorscopusid 57188924981
dc.authorscopusid 14055469000
dc.authorscopusid 57195543724
dc.authorwosid Yaganoglu, Mete/Afx-8940-2022
dc.contributor.author Irmak, Muhammed Coskun
dc.contributor.author Aydin, Tolga
dc.contributor.author Yaganoglu, Mete
dc.date.accessioned 2025-05-10T17:24:58Z
dc.date.available 2025-05-10T17:24:58Z
dc.date.issued 2025
dc.department T.C. Van Yüzüncü Yıl Üniversitesi en_US
dc.department-temp [Irmak, Muhammed Coskun] Van Yuzuncu Yil Univ, Dept Comp Engn, TR-65090 Van, Turkiye; [Aydin, Tolga; Yaganoglu, Mete] Ataturk Univ, Dept Comp Engn, TR-25030 Erzurum, Turkiye en_US
dc.description.abstract Coronavirus Disease 2019 (COVID-19) was a widely and intensely discussed topic on social media platforms during the pandemic because of uncertainty about the virus, especially as new variants emerged around the world. During the pandemic, people shared many posts about COVID-19 on their social media accounts without checking whether they were true. In this way, intentionally or unintentionally, they strongly influenced public opinion. The majority of these posts contained misleading information that negatively affected readers' cognitive and mental health, giving rise to a neologism associated with the pandemic: "infodemic." The present study therefore focuses on classifying fake news disseminated during the pandemic to mislead people. To this end, models were first trained independently on five different datasets using natural language processing and machine learning methods, and the results were compared. The datasets were then combined according to different scenarios to improve model performance. According to the results, the highest accuracy, 98.1%, was obtained with the Efficiently Learning an Encoder that Classifies Token Replacements Accurately (ELECTRA) model when the datasets were used independently. Similarly, the highest training accuracy, 94.12%, was obtained with the ELECTRA method, and the highest test accuracy, 91.71%, was obtained with the Random Forest method. In summary, the ELECTRA model, which is used less often than other pre-trained models, achieved the highest performance scores in all study-specific scenarios. en_US
dc.description.woscitationindex Science Citation Index Expanded
dc.identifier.doi 10.1016/j.engappai.2024.109944
dc.identifier.issn 0952-1976
dc.identifier.issn 1873-6769
dc.identifier.scopus 2-s2.0-85213211041
dc.identifier.scopusquality Q1
dc.identifier.uri https://doi.org/10.1016/j.engappai.2024.109944
dc.identifier.uri https://hdl.handle.net/20.500.14720/11229
dc.identifier.volume 142 en_US
dc.identifier.wos WOS:001402681100001
dc.identifier.wosquality Q1
dc.language.iso en en_US
dc.publisher Pergamon-Elsevier Science Ltd en_US
dc.relation.publicationcategory Article - International Refereed Journal - Institutional Faculty Member en_US
dc.rights info:eu-repo/semantics/closedAccess en_US
dc.subject Efficiently Learning An Encoder That Classifies Token Replacements Accurately en_US
dc.subject Coronavirus Disease 2019 Fake News en_US
dc.subject Natural Language Processing en_US
dc.subject Text Mining en_US
dc.title A Novel Integration of Multiple Learning Methods for Detecting Misleading Information From Different Datasets During the Pandemic en_US
dc.type Article en_US
dspace.entity.type Publication
