Adaptive ambulance redeployment via multi-armed bandits
Emergency Medical Services (EMS) provide the necessary resources when there is a need for immediate medical attention and play a significant role in saving lives in the case of a life-threatening event. Therefore, it is necessary to design an EMS system in which the arrival times to calls are as short as possible. This task includes the ambulance redeployment problem, which concerns methods of deploying ambulances to certain locations in order to minimize the arrival time and increase the coverage of the demand points. As opposed to many conventional redeployment methods where optimization is the primary concern, we propose a learning-based approach in which ambulances are redeployed without any a priori knowledge of the call distributions and the travel times, and these uncertainties are learned on the way. We cast the ambulance redeployment problem as a multi-armed bandit (MAB) problem, and propose various context-free and contextual MAB algorithms that learn to optimize redeployment locations via exploration and exploitation. We investigate the concept of risk aversion in ambulance redeployment and propose a risk-averse MAB algorithm. We construct a data-driven simulator that consists of a graph-based redeployment network and a Markov traffic model, and compare the performances of the algorithms on this simulator. Furthermore, we conduct more realistic simulations by modeling the city of Ankara, Turkey and running the algorithms in this new model. Our results show that, given the same conditions, the presented MAB algorithms perform favorably against a method based on dynamic redeployment and similarly to a static allocation method which knows the true dynamics of the simulation setup beforehand.
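To make the bandit formulation concrete, the following is a minimal sketch and not the algorithms proposed in the work. It uses the standard context-free UCB1 rule, treats each candidate redeployment location as an arm, and assumes a toy environment in which arrival times are exponentially distributed around a per-location mean. The function names, the mean arrival times, the 30-minute cap, and the reward scaling to [0, 1] are all illustrative assumptions.

```python
import math
import random


def ucb1_select(counts, sums, t):
    """Pick the arm with the highest UCB1 index; try each arm once first."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    return max(
        range(len(counts)),
        key=lambda a: sums[a] / counts[a]
        + math.sqrt(2.0 * math.log(t) / counts[a]),
    )


def run_ucb1(mean_minutes, rounds=5000, max_minutes=30.0, seed=0):
    """Simulate UCB1 over hypothetical redeployment locations.

    mean_minutes[i] is an assumed mean arrival time for location i.
    Returns how many times each location was chosen.
    """
    rng = random.Random(seed)
    counts = [0] * len(mean_minutes)
    sums = [0.0] * len(mean_minutes)
    for t in range(1, rounds + 1):
        arm = ucb1_select(counts, sums, t)
        # Assumption: arrival time is exponential with the location's mean.
        minutes = rng.expovariate(1.0 / mean_minutes[arm])
        # Shorter arrival time gives a higher reward, clipped to [0, 1].
        reward = 1.0 - min(minutes, max_minutes) / max_minutes
        counts[arm] += 1
        sums[arm] += reward
    return counts


if __name__ == "__main__":
    print(run_ucb1([20.0, 4.0, 12.0]))
```

In this toy setting the learner starts with no knowledge of the arrival times and gradually concentrates its choices on the location with the shortest mean arrival time, while still occasionally exploring the others.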
Multi-armed bandit problem
Contextual multi-armed bandit problem