Computer-aided analysis of English punctuation on a parsed corpus: the special case of comma

Bayraktar, Murat

buir.advisor	Akman, Varol
dc.contributor.author	Bayraktar, Murat
dc.date.accessioned	2016-01-08T20:13:35Z
dc.date.available	2016-01-08T20:13:35Z
dc.date.issued	1996
dc.description	Cataloged from PDF version of article.	en_US
dc.description	Includes bibliographical references (leaves 51-56).	en_US
dc.description.abstract	Punctuation, an orthographical component of language, has largely been ignored by research in computational linguistics over the years. One reason for this is the overall difficulty of the subject, and another is the absence of a good theory. On the other hand, both ‘conventional’ and computational linguistics have paid increasing attention to punctuation in recent years because it has been realized that true understanding and processing of written language will be almost impossible if punctuation marks are not taken into account. Except for the lists of rules given in style manuals or usage books, we know little about punctuation. These books tell us how we should punctuate, but they are generally silent about actual punctuation practice. This thesis contains the details of a computer-aided experiment to investigate English punctuation practice, for the special case of comma (the most significant punctuation mark) in a parsed corpus. The experiment attempts to classify the various uses of comma according to the syntax-patterns in which comma occurs. The corpus (Penn Treebank) consists of syntactically annotated sentences with no part-of-speech tag information about individual words, and this ideally seems to be enough to classify ‘structural’ punctuation marks.	en_US
dc.description.statementofresponsibility	Bayraktar, Murat	en_US
dc.format.extent	ix, 92 leaves	en_US
dc.identifier.uri	http://hdl.handle.net/11693/17809
dc.language.iso	English	en_US
dc.rights	info:eu-repo/semantics/openAccess	en_US
dc.subject	Computational Linguistics	en_US
dc.subject	Natural Language Processing	en_US
dc.subject	Punctuation	en_US
dc.subject	English	en_US
dc.subject	Corpus-based Analysis	en_US
dc.subject	Comma	en_US
dc.subject.lcc	QA76.9.N38 B39 1996	en_US
dc.subject.lcsh	Natural language processing (Computer science).	en_US
dc.subject.lcsh	Computational linguistics.	en_US
dc.title	Computer-aided analysis of English punctuation on a parsed corpus: the special case of comma	en_US
dc.type	Thesis	en_US
thesis.degree.discipline	Computer Engineering
thesis.degree.grantor	Bilkent University
thesis.degree.level	Master's
thesis.degree.name	MS (Master of Science)

Files

Original bundle

Name: B035250.pdf
Size: 2.18 MB
Format: Adobe Portable Document Format
Description: Full printable version

Collections

Graduate School of Engineering and Science