Dominating Set-Based Extractive Text Summarization in Graphs
Date
2024
Abstract
This study proposes a new method for extractive document summarization. In extractive summarization, the summary is built from sentences selected verbatim from the source text, and selecting the sentences that carry the most information improves summary quality. The thesis uses the Karci Dominating Set Algorithm. A graph is constructed from an adjacency matrix based on the number of words shared between the sentences of the text to be summarized. The sentences represented by the nodes in the dominating set of this graph are removed from the source text, and a new graph is built from the remaining sentences. The summary is then obtained from the eigenvector centrality values of this new graph. The method was evaluated on the Document Understanding Conference datasets (DUC-2002 and DUC-2004). Performance was measured with ROUGE evaluation metrics and compared against other competitive methods. The experiments were repeated for summaries of 100, 200, and 400 words. The results demonstrate the contributions of the proposed model.
Keywords
Computer Engineering and Computer Science and Control, Dominance, Graphs, Summarizing
End Page
87