BUIR logo
Communities & Collections
All of BUIR
  • English
  • Türkçe
Log In
Please note that log in via username/password is only available to Repository staff.
Have you forgotten your password?
  1. Home
  2. Browse by Subject

Browsing by Subject "Data partitioning"

Filter results by typing the first few letters
Now showing 1 - 2 of 2
  • Results Per Page
  • Sort Options
  • Loading...
    Thumbnail Image
    ItemOpen Access
    Elastic scaling for data stream processing
    (IEEE Computer Society, 2014) Gedik, B.; Schneider S.; Hirzel M.; Wu, Kun-Lung
    This article addresses the profitability problem associated with auto-parallelization of general-purpose distributed data stream processing applications. Auto-parallelization involves locating regions in the application's data flow graph that can be replicated at run-time to apply data partitioning, in order to achieve scale. In order to make auto-parallelization effective in practice, the profitability question needs to be answered: How many parallel channels provide the best throughput? The answer to this question changes depending on the workload dynamics and resource availability at run-time. In this article, we propose an elastic auto-parallelization solution that can dynamically adjust the number of channels used to achieve high throughput without unnecessarily wasting resources. Most importantly, our solution can handle partitioned stateful operators via run-time state migration, which is fully transparent to the application developers. We provide an implementation and evaluation of the system on an industrial-strength data stream processing platform to validate our solution. © 1990-2012 IEEE.
  • Loading...
    Thumbnail Image
    ItemOpen Access
    Ensemble pruning for text categorization based on data partitioning
    (Springer, Berlin, Heidelberg, 2011) Toraman, Çağrı; Can, Fazlı
    Ensemble methods can improve the effectiveness in text categorization. Due to computation cost of ensemble approaches there is a need for pruning ensembles. In this work we study ensemble pruning based on data partitioning. We use a ranked-based pruning approach. For this purpose base classifiers are ranked and pruned according to their accuracies in a separate validation set. We employ four data partitioning methods with four machine learning categorization algorithms. We mainly aim to examine ensemble pruning in text categorization. We conduct experiments on two text collections: Reuters-21578 and BilCat-TRT. We show that we can prune 90% of ensemble members with almost no decrease in accuracy. We demonstrate that it is possible to increase accuracy of traditional ensembling with ensemble pruning. © 2011 Springer-Verlag Berlin Heidelberg.

About the University

  • Academics
  • Research
  • Library
  • Students
  • Stars
  • Moodle
  • WebMail

Using the Library

  • Collections overview
  • Borrow, renew, return
  • Connect from off campus
  • Interlibrary loan
  • Hours
  • Plan
  • Intranet (Staff Only)

Research Tools

  • EndNote
  • Grammarly
  • iThenticate
  • Mango Languages
  • Mendeley
  • Turnitin
  • Show more ..

Contact

  • Bilkent University
  • Main Campus Library
  • Phone: +90(312) 290-1298
  • Email: dspace@bilkent.edu.tr

Bilkent University Library © 2015-2025 BUIR

  • Privacy policy
  • Send Feedback