Time domain based Web usage mining for Web site improvement

Date
2002
Advisor
Gürsoy, Attila
Publisher
Bilkent University
Language
English
Type
Thesis
Abstract

With the increased use of the Web, large volumes of click-stream data, embedded inside server logs, have become available for revealing user access patterns, especially on specific Web sites. Efficient presentation of Web content, conveyed through the link structure, is very important for efficient use of a site. Web usage mining can be used to improve Web site design by analyzing user access patterns to find deficiencies of the site. Although Web sites are intended to be designed for efficient use by typical users, the conceptual relations between pages and the categorization proposed by the Web site designer often do not meet the expectations of users. Misleading Web site design causes users to spend much more time reaching target pages, either by following redundant paths or by getting lost in cyberspace without finding the target. Furthermore, the needs and interests of users change over time, which requires restructuring of the Web site. Therefore, Web sites should be updated according to user expectations. For that reason, the most popular pages should be easily accessible, conceptually related pages should either be categorized close to each other or be linked, and misleading guidance that directs users to pages other than the target should be detected. However, merely finding frequent sequences is not sufficient for improving a Web site, because the discovered frequent patterns cover both patterns of interest, used for reaching popular sites, and redundant patterns that are followed before reaching the target page(s). Frequent backward references embed knowledge of both redundant and related pages, depending on the interest shown in these pages. In order to interpret backward and forward references in terms of interest, we incorporated the time domain by computing the viewing time of each visited page.
The relative viewing time of each page within a session is an important criterion of interest in that page. For that purpose, we proposed a Web usage mining framework that explores deficient points in the Web site design according to user expectations. Whether or not a user reached the target page indicates misleading guidance. In addition, jumping to related pages through long paths, in many cases with backtracks, shows that those pages should be linked. To capture such patterns, the page view time of each page is used. This framework produces redesign suggestions for Web site improvement. In the usage processing part of the framework, all user navigation sessions are analyzed, both forward and backward references are obtained by taking cached pages into account, and the viewing time of each page is computed. In the mining and interpretation part, frequent interesting and redundant patterns are explored and interpreted in order to make popular pages more visible, link related pages, report misleading categorization, and detect misleading guidance.
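
As an illustration of the usage processing idea described in the abstract, the following is a minimal sketch and not the thesis's actual implementation. It assumes a session is an ordered list of page requests with timestamps in seconds, estimates the viewing time of a page as the gap until the next request, and treats any revisit of a page already seen in the session as a backward reference. The names `Visit`, `relative_view_times`, and `classify_references` are hypothetical, and recovery of cached page views is not modeled here.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    page: str       # requested URL path
    seconds: float  # request timestamp in seconds (assumed format)


def relative_view_times(session):
    """Return (page, share) pairs for one session.

    The share is the fraction of the measurable session time spent on
    that page view. The last page has no following request, so its
    viewing time is unknown and its share is None.
    """
    if not session:
        return []
    durations = [b.seconds - a.seconds for a, b in zip(session, session[1:])]
    total = sum(durations)
    shares = [
        (visit.page, duration / total if total else 0.0)
        for visit, duration in zip(session, durations)
    ]
    shares.append((session[-1].page, None))
    return shares


def classify_references(session):
    """Label each request as 'forward' (first visit to the page in this
    session) or 'backward' (revisit of an already seen page)."""
    seen = set()
    labels = []
    for visit in session:
        labels.append((visit.page, "backward" if visit.page in seen else "forward"))
        seen.add(visit.page)
    return labels


session = [
    Visit("/home", 0),
    Visit("/products", 5),
    Visit("/home", 8),
    Visit("/contact", 20),
]
shares = relative_view_times(session)
labels = classify_references(session)
```

In this toy session the second request for `/home` is a backward reference. Combining such labels with the relative viewing times is one plausible way to separate pages a user was interested in from pages that were only passed through on the way to a target.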

Keywords
Web usage mining, Web site improvement, Sequence mining, Web mining