Browsing by Subject "Dynamic Programming"

Open Access
Development of a supervisory controller for energy management problems
(2011) Akgün, Emre
Multi-energy-source systems, such as hybrid electric vehicles in the automotive industry, have started to attract attention as a remedy for the greenhouse gas emission problem. Although their environmental performance is better than that of conventional technologies (for example, hybrid electric vehicles versus gasoline vehicles), their operational management can be challenging due to their increased complexity. One of these challenges is managing the energy flow among the multiple sources and sinks, which in this context is referred to as the energy management problem. In this thesis, a supervisory controller is developed to operate in a residential environment with multiple energy sources. First, dynamic optimization techniques are applied to the available mathematical models of the multi-energy sources to create a non-causal optimal controller. Then, a set of implementable rules is extracted by analyzing the optimal trajectories resulting from the dynamic optimization to create a causal supervisory controller. Several simulations are conducted with Matlab/Simulink to validate the developed controller. The supervisory controller not only achieves a daily cost reduction of 6-7.5% compared to the conventional energy infrastructure used in residential areas but also performs 2% better than heuristic control techniques available in the literature. Another simulation study is conducted with different demand cycles to verify the controller. Although its performance degrades as expected, it still performs 1% better than heuristic control strategies. In the final part of this thesis, the formulation used in the residential problem, which was originally adopted from an example in the automotive industry, is generalized so that it can be used in all types of energy management problems. Finally, as an example, a formulation for the energy management problem in mobile devices is created using the developed generic formulation.
Open Access
A dynamic pricing policy for perishables with stochastic demand
(2001) Yıldırım, Gonca
In this study, we consider the pricing of perishables in an inventory system where items have a fixed lifetime. Unit demands come from a Poisson process with a price-dependent rate. The instants at which an item is withdrawn from inventory due to demand constitute decision epochs for setting the sales price; the time elapsed between two such consecutive instants is called a period. The sales price at each decision epoch is taken to be a function of τ_i, the remaining lifetime when the inventory level drops to i, i = 1, ..., Q. The objective is to determine the optimal pricing policy (under the proposed class) and the optimal initial stocking level to maximize the discounted expected profit. A dynamic programming approach is used to solve the problem numerically. Using backward recursion, the optimal price paths are determined for the discounted expected profit for various combinations of remaining lifetimes. Our numerical studies indicate that a single-price policy results in significantly lower profits when compared with our formulation.
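The backward recursion mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis model: it assumes discrete time periods, at most one unit of demand per period, and a hypothetical `sale_prob(price)` function in place of the price-dependent Poisson rate. The names `solve_pricing`, `prices` and `discount` are likewise illustrative.

```python
def solve_pricing(lifetime, stock, prices, sale_prob, discount=0.99):
    """Backward recursion for a simplified perishable-pricing problem.

    Assumptions (not from the thesis): time is discrete, at most one unit
    sells per period, and sale_prob(p) is the chance of a sale at price p.
    """
    # value[t][q]: best expected discounted revenue with t periods of
    # lifetime left and q units in stock; zero once either runs out.
    value = [[0.0] * (stock + 1) for _ in range(lifetime + 1)]
    policy = [[None] * (stock + 1) for _ in range(lifetime + 1)]

    for t in range(1, lifetime + 1):        # from the expiry date backwards
        for q in range(1, stock + 1):
            best_val, best_price = float("-inf"), None
            for p in prices:
                prob = sale_prob(p)
                # sale: earn p and continue with one unit fewer;
                # no sale: continue with the same stock.
                val = (prob * (p + discount * value[t - 1][q - 1])
                       + (1 - prob) * discount * value[t - 1][q])
                if val > best_val:
                    best_val, best_price = val, p
            value[t][q] = best_val
            policy[t][q] = best_price
    return value, policy


value, policy = solve_pricing(
    lifetime=2, stock=1, prices=[2, 5, 8],
    sale_prob=lambda p: 1 - p / 10, discount=1.0,
)
```

Restricting `prices` to a single element gives the single-price baseline, so the two policies can be compared under the same assumptions.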
Open Access
Dynamic routing and wavelength assignment in wavelength-division multiplexed (WDM) optical networks using neuro-dynamic programming
(2001-07) Yeşildağ, Serkan
Open Access
A heuristic algorithm for an integrated routing and scheduling problem with stops en-route
(2009) Uzun, Emre
In this study, we examine an integrated routing and scheduling problem that arises in the context of transportation of hazardous materials. The purpose of the problem is to find a minimum-risk route between an origin and a destination point on a given network and to build a schedule on this route that determines where and how long a truck carrying hazardous materials should stop. The objective is to minimize the risk imposed on society while completing the path within a given time limit. The risk is defined as the expected population exposure in the event of an accident, which varies with the time of day. There are exact algorithms available in the literature that solve the problem. However, these algorithms are not capable of solving large-sized networks due to memory constraints. Our aim is to develop a heuristic procedure that can handle larger networks. We separate the problem into two components, routing and scheduling, and propose solution algorithms that communicate with each other while the heuristic runs. For the routing component, we define a neighborhood structure that can be used to generate several paths around a given path on a network. The search procedure takes an initial path and improves it by generating different paths in the defined neighborhood. For the scheduling component, we discuss mixed integer programming, dynamic programming and heuristic approaches. We run the proposed heuristic algorithm on several test networks and compare its performance with the optimal solutions. We also present the application of the heuristic procedure to a large-sized Turkey Road Network.
Open Access
Inquiring the main assumption of the assembly line balancing problem : solution procedures using and/or graph
(2005) Koç, Ali
In this thesis, we consider the assembly/disassembly line balancing (ADLB) problem. The studies in the literature consider assembly and disassembly problems separately and use the task precedence diagram (TPD) and the AND/OR Graph (AOG) in assembly and disassembly line balancing problems, respectively. In contrast to these studies, we use the AOG for both assembly and disassembly line balancing problems, considering these two problems as complements of each other. Hence, we call the complementary problem ADLB-AOG. We show theoretically that the AOG is a more general version of the TPD. We also develop integer programming (IP) and dynamic programming (DP) formulations to solve the ADLB-AOG problem. Our analysis indicates that the DP formulation performs much better than the IP formulation in terms of the problem sizes that can be optimally solved. We also develop a DP-based heuristic to solve large-size instances of the ADLB-AOG problem. Experiments with the procedures on some sample problems and an implementation of the heuristic on a sample problem are also given.
Open Access
Modelling capacity expansion planning for an optical disc manufacturing system
(1997) Gündüz, Erdem
Capacity expansion problems involve determining the optimum timing and sizing of capacity for facilities that have to meet a given demand function. There are various versions of the problem in the literature. In this thesis, a mathematical model for the expansion of a facility producing a single commodity is formulated. This formulation is then used to solve the capacity expansion problem for an optical disc manufacturing system that produces two types of products using two different capacity types. The effects of technological improvements and economies of scale are considered. The dynamic programming approach is used, and a forward recursion algorithm is devised and coded on a personal computer.
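As a rough illustration of a forward recursion for timing and sizing expansions, the sketch below treats a single commodity with one capacity type. It is a textbook-style simplification and not the thesis formulation: it assumes non-decreasing demand, no shortages, zero initial capacity, and a hypothetical cost of `fixed_cost + unit_cost * size` per expansion, where the fixed charge stands in for economies of scale.

```python
def plan_expansions(demand, fixed_cost, unit_cost, discount=1.0):
    """Forward recursion for a single-commodity capacity expansion sketch.

    demand[t] is the capacity required in period t (assumed non-decreasing).
    An expansion at the start of period i is sized to cover demand up to
    some later period, and its cost is discounted by discount ** i.
    """
    n = len(demand)
    # best[j]: minimum discounted cost of covering periods 0 .. j-1
    best = [0.0] + [float("inf")] * n
    prev = [0] * (n + 1)

    for j in range(1, n + 1):               # stages in increasing time order
        for i in range(j):                  # last expansion happens in period i
            installed = demand[i - 1] if i > 0 else 0
            size = demand[j - 1] - installed
            cost = fixed_cost + unit_cost * size if size > 0 else 0.0
            total = best[i] + discount ** i * cost
            if total < best[j]:
                best[j], prev[j] = total, i

    # Walk the predecessor links back to recover (period, size) pairs.
    plan, j = [], n
    while j > 0:
        i = prev[j]
        installed = demand[i - 1] if i > 0 else 0
        if demand[j - 1] > installed:
            plan.append((i, demand[j - 1] - installed))
        j = i
    plan.reverse()
    return best[n], plan


cost, plan = plan_expansions([10, 20, 30], fixed_cost=100, unit_cost=1)
```

With a large fixed charge the recursion prefers one big expansion up front; with no fixed charge and a discount factor below one it defers capacity to the period where it is needed.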
Open Access
Online learning in structured Markov decision processes
(2017-07) Akbarzadeh, Nima
This thesis proposes three new multi-armed bandit problems, in which the learner proceeds in a sequence of rounds where each round is a Markov Decision Process (MDP). The learner's goal is to maximize its cumulative reward without any a priori knowledge of the state transition probabilities. The first problem considers an MDP with sorted states, a continuation action that moves the learner to an adjacent state, and a terminal action that moves the learner to a terminal state (goal or dead-end state). In this problem, a round ends and the next round starts when a terminal state is reached, and the aim of the learner in each round is to reach the goal state. First, the structure of the optimal policy is derived. Then, the regret of the learner with respect to an oracle that takes optimal actions in each round is defined, and a learning algorithm that exploits the structure of the optimal policy is proposed. Finally, it is shown that the regret either increases logarithmically over rounds or becomes bounded. In the second problem, we investigate the personalization of a clinical treatment. This process is modeled as a goal-oriented MDP with dead-end states. Moreover, the state transition probabilities of the MDP depend on the context of the patients. An algorithm that uses the rule of optimism in the face of uncertainty is proposed to maximize the number of rounds in which the goal state is reached. In the third problem, we propose an online learning algorithm for optimal execution in the limit order book of a financial asset. Given a certain amount of shares to sell and an allocated time to complete the transaction, the proposed algorithm dynamically learns the optimal number of shares to sell at each time slot of the allocated time. We model this problem as an MDP and derive the form of the optimal policy.
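The rule of optimism in the face of uncertainty mentioned in this abstract can be illustrated with UCB1, a standard multi-armed bandit algorithm. This is a swapped-in textbook technique, not the algorithm proposed in the thesis, and it ignores the MDP structure and patient context. The sketch assumes rewards in [0, 1]; `ucb1` and `reward_fns` are hypothetical names.

```python
import math


def ucb1(reward_fns, rounds):
    """Run UCB1 for `rounds` pulls and return how often each arm was pulled.

    reward_fns[a]() returns the reward of arm a (assumed to lie in [0, 1]).
    Requires rounds >= len(reward_fns).
    """
    n_arms = len(reward_fns)
    counts = [0] * n_arms      # number of pulls per arm
    totals = [0.0] * n_arms    # summed reward per arm

    for t in range(1, rounds + 1):
        if t <= n_arms:
            arm = t - 1        # pull every arm once to initialise estimates
        else:
            # Optimistic index: empirical mean plus a confidence bonus that
            # shrinks as the arm is pulled more often.
            arm = max(
                range(n_arms),
                key=lambda a: totals[a] / counts[a]
                + math.sqrt(2 * math.log(t) / counts[a]),
            )
        reward = reward_fns[arm]()
        counts[arm] += 1
        totals[arm] += reward
    return counts


# Deterministic rewards keep the example reproducible.
counts = ucb1([lambda: 0.2, lambda: 0.8], rounds=1000)
```

The confidence bonus keeps rarely tried arms attractive, so the learner keeps exploring them until the evidence against them is strong enough.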
Open Access
Single machine scheduling problems: early-tardy penalties
(1993) Oguz, Ceyda
The primary concern of this dissertation is to analyze single machine total earliness and tardiness scheduling problems with different due dates and to develop both a dynamic programming formulation for their exact solution and heuristic algorithms for their approximate solution within acceptable limits. The analyses of previous works on single machine earliness and tardiness scheduling problems reveal that the research has mainly focused on a restricted problem type in which no idle time insertion is allowed in the schedule. This study deals with the general case where idle time insertion is allowed whenever necessary. Even though this problem is known to be NP-hard in the ordinary sense, there is still a need to develop an optimizing algorithm through a dynamic programming formulation. Development of such an algorithm is necessary for further identifying an approximation scheme for the problem, which is an untouched issue in earliness and tardiness scheduling theory. Furthermore, the developed dynamic programming formulation is extended to an incomplete dynamic programming, which forms the core of one of the proposed heuristic procedures. A second aspect of this study is to investigate two special structures for the different due dates, namely the Equal-Slack and Total-Work-Content rules, and to discuss the computational complexity of the problem with these special structures. Consequently, solution procedures that exploit the characteristics of the special due date structures are proposed. This research shows that the total earliness and tardiness scheduling problem with the Equal-Slack rule is NP-hard but is solvable in polynomial time in certain cases. Moreover, a very efficient heuristic algorithm is proposed for the problem with the other due date structure, and the results of this part lead to another heuristic algorithm for the general due date structure.
Finally, a lower bound procedure is presented that is motivated by the structure of the optimal solution of the problem. This lower bound is compared with another lower bound from the literature, and it is shown to perform well on randomly generated problems.
Open Access
Using reinforcement learning for dynamic link sharing problems under signaling constraints
(2003) Çelik, Nuri
In a static link sharing system, users are assigned a fixed bandwidth share of the link capacity irrespective of whether these users are active or not. On the other hand, dynamic link sharing refers to the process of dynamically allocating bandwidth to each active user based on the instantaneous utilization of the link. As an example, dynamic link sharing combined with the rate adaptation capability of multimedia applications provides a novel quality of service (QoS) framework for HFC and broadband wireless networks. Frequent adjustment of the allocated bandwidth in dynamic link sharing creates a scalability issue in the form of a significant amount of message distribution and processing power (i.e., signaling) in the shared link system. On the other hand, if the rate of applications is adjusted once for the highest loaded traffic conditions, a significant amount of bandwidth may be wasted depending on the actual traffic load. There is therefore a need for an optimal dynamic link sharing system that takes into account the tradeoff between signaling scalability and bandwidth efficiency. In this work, we introduce a Markov decision framework for the dynamic link sharing system in which the desired signaling rate is imposed as a constraint. A reinforcement learning methodology is adopted for the solution of this Markov decision problem, and the results demonstrate that the proposed method provides better bandwidth efficiency than other heuristics without violating the signaling rate requirement.