Generic text summarization for Turkish

dc.citation.epage: 1323 (en_US)
dc.citation.issueNumber: 8 (en_US)
dc.citation.spage: 1315 (en_US)
dc.citation.volumeNumber: 53 (en_US)
dc.contributor.author: Kutlu, M. (en_US)
dc.contributor.author: Cığır, C. (en_US)
dc.contributor.author: Cicekli, I. (en_US)
dc.date.accessioned: 2016-02-08T12:23:38Z
dc.date.available: 2016-02-08T12:23:38Z
dc.date.issued: 2010 (en_US)
dc.department: Department of Computer Engineering (en_US)
dc.description.abstract: In this paper, we propose a generic text summarization method that generates summaries of Turkish texts by ranking sentences according to their scores. Sentence scores are calculated using their surface-level features, and summaries are created by extracting the highest ranked sentences from the original documents. To extract sentences which form a summary with an extensive coverage of the main content of the text and less redundancy, we use features such as term frequency, key phrase (KP), centrality, title similarity and sentence position. The sentence rank is computed using a score function that uses its feature values and the weights of the features. The best feature weights are learned using machine-learning techniques with the help of human-constructed summaries. Performance evaluation is conducted by comparing summarization outputs with manual summaries of two newly created Turkish data sets. This paper presents one of the first Turkish summarization systems, and its results are promising. We introduce the usage of KP as a surface-level feature in text summarization, and we show the effectiveness of the centrality feature in text summarization. The effectiveness of the features in Turkish text summarization is also analyzed in detail. © The Author 2008. Published by Oxford University Press on behalf of The British Computer Society. All rights reserved. (en_US)
dc.identifier.doi: 10.1093/comjnl/bxp124 (en_US)
dc.identifier.issn: 0010-4620
dc.identifier.uri: http://hdl.handle.net/11693/28550
dc.language.iso: English (en_US)
dc.publisher: Oxford University Press (en_US)
dc.relation.isversionof: http://dx.doi.org/10.1093/comjnl/bxp124 (en_US)
dc.source.title: The Computer Journal (en_US)
dc.subject: Natural language processing (en_US)
dc.subject: Summary extraction (en_US)
dc.subject: Text summarization (en_US)
dc.subject: Data sets (en_US)
dc.subject: Feature weight (en_US)
dc.subject: Key-phrase (en_US)
dc.subject: Machine learning techniques (en_US)
dc.subject: Performance evaluation (en_US)
dc.subject: Score function (en_US)
dc.subject: Summarization systems (en_US)
dc.subject: Term frequency (en_US)
dc.title: Generic text summarization for Turkish (en_US)
dc.type: Article (en_US)
Files
Original bundle
Name: Generic text summarization for Turkish.pdf
Size: 169.64 KB
Format: Adobe Portable Document Format
Description: Full Printable Version
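
The abstract describes ranking sentences with a score function over feature values and feature weights, then extracting the highest-ranked sentences. The record does not give the form of that function, so the following is only a minimal sketch under stated assumptions: the score is taken to be a plain weighted sum, the three features shown (term frequency, title similarity, sentence position) use simplified stand-in definitions, the KP and centrality features are omitted, and the weights are hand-picked rather than learned from human-constructed summaries. All function names and the sample text are hypothetical.

```python
from collections import Counter

PUNCT = ".,;:!?\"'()"


def tokenize(text):
    # Naive whitespace tokenizer (assumption). A real Turkish system would
    # need proper case folding and morphological analysis.
    tokens = (w.strip(PUNCT).lower() for w in text.split())
    return [t for t in tokens if t]


def sentence_features(title, sentences):
    # Simplified stand-ins for three of the surface-level features
    # named in the abstract; these definitions are assumptions.
    doc_counts = Counter(t for s in sentences for t in tokenize(s))
    max_count = max(doc_counts.values(), default=1)
    title_tokens = set(tokenize(title))
    n = len(sentences)
    rows = []
    for i, sentence in enumerate(sentences):
        tokens = tokenize(sentence)
        unique = set(tokens)
        # Mean document frequency of the sentence's tokens, scaled to [0, 1].
        tf = (
            sum(doc_counts[t] for t in tokens) / (len(tokens) * max_count)
            if tokens
            else 0.0
        )
        # Jaccard overlap between the sentence and the title.
        union = unique | title_tokens
        title_sim = len(unique & title_tokens) / len(union) if union else 0.0
        # Earlier sentences get a higher position value.
        position = 1.0 - i / n
        rows.append(
            {
                "term_frequency": tf,
                "title_similarity": title_sim,
                "position": position,
            }
        )
    return rows


def score(features, weights):
    # Assumed linear combination of feature values and feature weights.
    return sum(weights[name] * value for name, value in features.items())


def summarize(title, sentences, weights, k):
    # Rank sentences by score, keep the top k, and return them
    # in their original document order.
    rows = sentence_features(title, sentences)
    scores = [score(row, weights) for row in rows]
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]


if __name__ == "__main__":
    title = "Text summarization"
    sentences = [
        "Text summarization selects important sentences.",
        "The weather was nice.",
        "Summarization of text uses sentence scores.",
    ]
    # Hand-picked illustrative weights, not learned values.
    weights = {"term_frequency": 0.3, "title_similarity": 0.5, "position": 0.2}
    for line in summarize(title, sentences, weights, k=2):
        print(line)
```

Returning the selected sentences in document order rather than score order is a design choice in this sketch that keeps the extract readable; the record itself does not say how the extracted sentences are ordered.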