Towards unifiying mobility datasets

buir.advisorUlusoy, Özgür
dc.contributor.authorBasık, Fuat
dc.departmentDepartment of Computer Engineeringen_US
dc.descriptionCataloged from PDF version of article.en_US
dc.descriptionThesis (Ph.D.): Bilkent University, Department of Computer Engineering, İhsan Doğramacı Bilkent University, 2019.en_US
dc.descriptionIncludes bibliographical references (leaves 97-106).en_US
dc.description.abstractWith the proliferation of smart phones integrated with positioning systems and the increasing penetration of Internet-of-Things (IoT) in our daily lives, mobility data has become widely available. A vast variety of mobile services and applications either have a location-based context or produce spatio-temporal records as a byproduct. These records contain information about both the entities that produce them, as well as the environment they were produced in. Availability of such data supports smart services in areas including healthcare, computational social sciences and location-based marketing. We postulate that the spatio-temporal usage records belonging to the same real-world entity can be matched across records from different location-enhanced services. This is a fundamental problem in many applications such as linking user identities for security, understanding privacy limitations of location based services, or producing a unified dataset from multiple sources for urban planning and traffic management. Such integrated datasets are also essential for service providers to optimise their services and improve business intelligence. As such, in this work, we explore scalable solutions to link entities across two mobility datasets, using only their spatio-temporal information to pave to road towards unifying mobility datasets. The first approach is rule-based linkage, based on the concept of k-l diversity | that we developed to capture both spatial and temporal aspects of the linkage. This model is realized by developing a scalable linking algorithm called ST-Link, which makes use of effective spatial and temporal filtering mechanisms that significantly reduce the search space for matching users. Furthermore, ST-Link utilizes sequential scan procedures to avoid random disk access and thus scales to large datasets. The second approach is similarity based linkage that proposes a mobility based representation and similarity computation for entities. An efficient matching process is then developed to identify the final linked pairs, with an automated mechanism to decide when to stop the linkage. We scale the process with a locality-sensitive hashing (LSH) based approach that significantly reduces candidate pairs for matching. To realize the effectiveness and efficiency of our techniques in practice, we introduce an algorithm called SLIM. We evaluated our work with respect to accuracy and performance using several datasets. Experiments show that both ST-Link and SLIM are effective in practice for performing spatio-temporal linkage and can scale to large datasets. Moreover, the LSH-based scalability brings two to four orders of magnitude speedup.en_US
dc.description.statementofresponsibilityby Fuat Basıken_US
dc.format.extentxiii, 106 leaves : charts (some color) ; 30 cm.en_US
dc.publisherBilkent Universityen_US
dc.subjectMobility dataen_US
dc.subjectData integrationen_US
dc.subjectSpatio-temporal linkageen_US
dc.titleTowards unifiying mobility datasetsen_US
dc.title.alternativeMobil veri kümelerini birleştirmeye doğruen_US
Original bundle
Now showing 1 - 1 of 1
Thumbnail Image
6.46 MB
Adobe Portable Document Format
Full printable version
License bundle
Now showing 1 - 1 of 1
No Thumbnail Available
1.71 KB
Item-specific license agreed upon to submission