Classification of Text Documents Using Genetic Algorithm and K-Nearest Neighbors

Larıbı, Parısa

dc.contributor.advisor	Saraçoğlu, Rıdvan
dc.contributor.author	Larıbı, Parısa
dc.date.accessioned	2025-05-10T20:12:12Z
dc.date.available	2025-05-10T20:12:12Z
dc.date.issued	2018
dc.description.abstract	Metin Madenciliği büyük miktardaki metinsel verilerden, önceden bilinmeyen bilgilerin elde edilmesini amaçlayan veri madenciliğinin bir dalıdır. Sınıflandırma, kümeleme ve tahmin, Metin Madenciliğinin önemli bir parçasıdır. Başarılı bir Metin Madenciliği yine başarılı bir sınıflandırma işlemine bağlıdır. Sınıflandırma sisteminin başarısını ve verimini artırmak için genellikle boyut azaltma işlemi gerçekleştirilir. Bu çalışmada metin belgelerinin sınıflandırılmasında boyut azaltma işlemi gerçekleştirilmiştir. Bunun için iki yöntem kullanılmıştır. Bunlardan ilki özellik çıkarımı, diğeri ise özellik seçimidir. Özellik çıkarımı için Temel Bileşen Analizi yöntemi kullanılmıştır. Özellik seçiminden sonra seçilen özellikleri için katsayı ile ağırlıklandırma kullanılmıştır. Özellik seçimi aşaması için ve özellik çıkarımından sonra en iyi kat sayıların seçimi için Genetik Algoritma kullanılmıştır. Deneysel sonuçlara göre özellik seçimi sınıflandırma başarısını kısmen azaltmıştır. Özellik çıkarımı ve bu aşamadan sonra eklenen katsayı ağırlıklandırma işlemi sınıflandırma başarısını önemli ölçüde artırmıştır.
dc.description.abstract	Text mining is a branch of data mining that aims to extract previously unknown information from large quantities of textual data. Classification, clustering, and prediction are important parts of text mining, and successful text mining depends on a successful classification process. Dimension reduction is usually performed to improve the success and efficiency of a classification system. In this study, dimension reduction was applied to the classification of text documents using two methods: feature extraction and feature selection. Principal Component Analysis was used for feature extraction. After feature selection, the selected features were weighted with coefficients. A Genetic Algorithm was used for the feature selection phase and for selecting the best coefficients after feature extraction. According to the experimental results, feature selection partially reduced classification success, whereas feature extraction followed by coefficient weighting significantly increased it.	en_US
dc.identifier.uri	https://tez.yok.gov.tr/UlusalTezMerkezi/TezGoster?key=fS4sqEZr79C_n60Rk6MjFYXvcTltYIPiLMjQDJTwpcYj1xOkdoHK227nNUQPh_59
dc.identifier.uri	https://hdl.handle.net/20.500.14720/22935
dc.language.iso	tr
dc.subject	Bilgisayar Mühendisliği Bilimleri-Bilgisayar ve Kontrol
dc.subject	Genetik algoritmalar
dc.subject	Metin sınıflandırma
dc.subject	Temel bileşenler analizi
dc.subject	Computer Engineering and Computer Science and Control	en_US
dc.subject	Genetic algorithms	en_US
dc.subject	Text categorization	en_US
dc.subject	Principal components analysis	en_US
dc.title	Classification of Text Documents Using Genetic Algorithm and K-Nearest Neighbors	en_US
dc.title.alternative	Genetik Algoritma ve K-en Yakın Komşu Kullanarak Metin Belgelerinin Sınıflandırılması	en_US
dc.type	Master Thesis	en_US
dspace.entity.type	Publication
gdc.coar.type	text::thesis::master thesis
gdc.description.department	Graduate School of Natural and Applied Sciences / Department of Electrical and Electronics Engineering
gdc.description.endpage	70
gdc.identifier.yoktezid	520774

Collections

Master's Theses