Characteristics of Web-based textual communications

buir.advisor: Aykanat, Cevdet
dc.contributor.author: Küçükyılmaz, Tayfun
dc.date.accessioned: 2016-01-08T18:19:40Z
dc.date.available: 2016-01-08T18:19:40Z
dc.date.issued: 2012
dc.description: Ankara : The Department of Computer Engineering and the Graduate School of Engineering and Science of Bilkent University 2012. [en_US]
dc.description: Thesis (Ph. D.) -- Bilkent University, 2012. [en_US]
dc.description: Includes bibliographical references. [en_US]
dc.description.abstract: In this thesis, we analyze different aspects of Web-based textual communications and argue that all such communications share some common properties. To provide practical evidence for this argument, we focus on two common properties and examine them on various types of Web-based textual communications data. These properties are: all Web-based communications contain features attributable to their author and receiver; and all Web-based communications exhibit similar heavy-tailed distributional properties. To support these claims, we present three practical, real-life research problems and exploit the proposed common properties of Web-based textual communications to find practical solutions to them. First, we provide a feature-based result caching framework for real-life search engines. To this end, we mine attributes from user queries in order to classify queries and estimate a quality metric for making admission and eviction decisions for the query result cache. Second, we analyze messages of an online chat server in order to predict user and message attributes. Our results show that several user- and message-based attributes can be predicted with significant accuracy using both chat message-based and writing-style-based features of the chat users. Third, we provide a parallel framework for in-memory construction of term-partitioned inverted indexes. In this work, in order to minimize the total communication time between processors, we provide a bucketing scheme that is based on term-based distributional properties of Web page contents. [en_US]
dc.description.provenance: Made available in DSpace on 2016-01-08T18:19:40Z (GMT). No. of bitstreams: 1 0006246.pdf: 1061769 bytes, checksum: 162c281b958bbddb4ba6f82b419c6237 (MD5) [en]
dc.description.statementofresponsibility: Küçükyılmaz, Tayfun [en_US]
dc.format.extent: xvii, 176 leaves [en_US]
dc.identifier.uri: http://hdl.handle.net/11693/15512
dc.language.iso: English [en_US]
dc.rights: info:eu-repo/semantics/openAccess [en_US]
dc.subject: Web search engine [en_US]
dc.subject: result caching [en_US]
dc.subject: cache [en_US]
dc.subject: chat mining [en_US]
dc.subject: data mining [en_US]
dc.subject: index inversion [en_US]
dc.subject: inverted index [en_US]
dc.subject: posting list [en_US]
dc.subject.lcc: TK5105.888 .K83 2012 [en_US]
dc.subject.lcsh: World Wide Web. [en_US]
dc.subject.lcsh: Web search engines. [en_US]
dc.subject.lcsh: Data mining. [en_US]
dc.subject.lcsh: Indexing. [en_US]
dc.title: Characteristics of Web-based textual communications [en_US]
dc.type: Thesis [en_US]
thesis.degree.discipline: Computer Engineering
thesis.degree.grantor: Bilkent University
thesis.degree.level: Doctoral
thesis.degree.name: Ph.D. (Doctor of Philosophy)

Files

Original bundle

Name: 0006246.pdf
Size: 1.01 MB
Format: Adobe Portable Document Format