dc.contributor.advisor | Akman, Varol | |
dc.contributor.author | Bayraktar, Murat | |
dc.date.accessioned | 2016-01-08T20:13:35Z | |
dc.date.available | 2016-01-08T20:13:35Z | |
dc.date.issued | 1996 | |
dc.identifier.uri | http://hdl.handle.net/11693/17809 | |
dc.description | Ankara : Department of Computer Engineering and Information Science and the Institute of Engineering and Science of Bilkent University, 1996. | en_US |
dc.description | Thesis (Master's) -- Bilkent University, 1996. | en_US |
dc.description | Includes bibliographical references (leaves 51-56). | en_US |
dc.description.abstract | Punctuation, an orthographical component of language, has usually been ignored
by most research in computational linguistics over the years. One reason
for this is the overall difficulty of the subject, and another is the absence of a
good theory. On the other hand, both ‘conventional’ and computational linguistics
have increased their attention to punctuation in recent years because it
has been realized that true understanding and processing of written language
will be almost impossible if punctuation marks are not taken into account.
Except for the lists of rules given in style manuals or usage books, we know little
about punctuation. These books give us information about how we should
punctuate, but they are generally silent about the actual punctuation practice.
This thesis contains the details of a computer-aided experiment to investigate
English punctuation practice, for the special case of the comma (the most significant
punctuation mark) in a parsed corpus. The experiment attempts to
classify the various uses of the comma according to the syntactic patterns in which
the comma occurs. The corpus (Penn Treebank) consists of syntactically annotated
sentences with no part-of-speech tag information about individual words, and
this ideally seems to be enough to classify ‘structural’ punctuation marks. | en_US |
dc.description.statementofresponsibility | Bayraktar, Murat | en_US |
dc.format.extent | ix, 92 leaves | en_US |
dc.language.iso | English | en_US |
dc.rights | info:eu-repo/semantics/openAccess | en_US |
dc.subject | Computational Linguistics | en_US |
dc.subject | Natural Language Processing | en_US |
dc.subject | Punctuation | en_US |
dc.subject | English | en_US |
dc.subject | Corpus-based Analysis | en_US |
dc.subject | Comma | en_US |
dc.subject.lcc | QA76.9.N38 B39 1996 | en_US |
dc.subject.lcsh | Natural language processing (Computer science). | en_US |
dc.subject.lcsh | Computational linguistics. | en_US |
dc.title | Computer-aided analysis of English punctuation on a parsed corpus: the special case of comma | en_US |
dc.type | Thesis | en_US |
dc.department | Department of Computer Engineering | en_US |
dc.publisher | Bilkent University | en_US |
dc.description.degree | M.S. | en_US |