dc.contributor.advisor | Ulusoy, Özgür | |
dc.contributor.author | Basık, Fuat | |
dc.date.accessioned | 2019-12-26T13:38:15Z | |
dc.date.available | 2019-12-26T13:38:15Z | |
dc.date.copyright | 2019-12 | |
dc.date.issued | 2019-12 | |
dc.date.submitted | 2019-12-25 | |
dc.identifier.uri | http://hdl.handle.net/11693/52765 | |
dc.description | Cataloged from PDF version of article. | en_US |
dc.description | Thesis (Ph.D.): Bilkent University, Department of Computer Engineering, İhsan Doğramacı Bilkent University, 2019. | en_US |
dc.description | Includes bibliographical references (leaves 97-106). | en_US |
dc.description.abstract | With the proliferation of smartphones integrated with positioning systems and
the increasing penetration of Internet-of-Things (IoT) in our daily lives, mobility
data has become widely available. A vast variety of mobile services and applications
either have a location-based context or produce spatio-temporal records as
a byproduct. These records contain information about both the entities that produce
them and the environment in which they were produced. Availability of such
data supports smart services in areas including healthcare, computational social
sciences and location-based marketing. We postulate that the spatio-temporal usage
records belonging to the same real-world entity can be matched across records
from different location-enhanced services. This is a fundamental problem in many
applications such as linking user identities for security, understanding privacy limitations
of location-based services, or producing a unified dataset from multiple
sources for urban planning and traffic management. Such integrated datasets are
also essential for service providers to optimise their services and improve business
intelligence. As such, in this work, we explore scalable solutions to link entities
across two mobility datasets, using only their spatio-temporal information, to pave
the way towards unifying mobility datasets. The first approach is rule-based linkage,
based on the concept of k-l diversity, which we developed to capture both
spatial and temporal aspects of the linkage. This model is realized by developing
a scalable linking algorithm called ST-Link, which makes use of effective spatial
and temporal filtering mechanisms that significantly reduce the search space
for matching users. Furthermore, ST-Link utilizes sequential scan procedures to
avoid random disk access and thus scales to large datasets. The second approach
is similarity-based linkage, which proposes a mobility-based representation and similarity
computation for entities. An efficient matching process is then developed
to identify the final linked pairs, with an automated mechanism to decide when
to stop the linkage. We scale the process with an approach based on locality-sensitive
hashing (LSH) that significantly reduces candidate pairs for matching. To realize
the effectiveness and efficiency of our techniques in practice, we introduce
an algorithm called SLIM. We evaluated our work with respect to accuracy and
performance using several datasets. Experiments show that both ST-Link and
SLIM are effective in practice for performing spatio-temporal linkage and can
scale to large datasets. Moreover, the LSH-based approach yields a speedup of two to four
orders of magnitude. | en_US |
dc.description.statementofresponsibility | by Fuat Basık | en_US |
dc.format.extent | xiii, 106 leaves : charts (some color) ; 30 cm. | en_US |
dc.language.iso | English | en_US |
dc.rights | info:eu-repo/semantics/openAccess | en_US |
dc.subject | Mobility data | en_US |
dc.subject | Data integration | en_US |
dc.subject | Spatio-temporal linkage | en_US |
dc.subject | Scalability | en_US |
dc.title | Towards unifying mobility datasets | en_US |
dc.title.alternative | Mobil veri kümelerini birleştirmeye doğru | en_US |
dc.type | Thesis | en_US |
dc.department | Department of Computer Engineering | en_US |
dc.publisher | Bilkent University | en_US |
dc.description.degree | Ph.D. | en_US |
dc.identifier.itemid | B123677 | |
dc.embargo.release | 2020-06-20 | |