Effective early termination techniques for text similarity join operator

dc.citation.epage: 801 (en_US)
dc.citation.spage: 791 (en_US)
dc.citation.volumeNumber: 3733 (en_US)
dc.contributor.author: Özalp, S. A. (en_US)
dc.contributor.author: Ulusoy, Özgür (en_US)
dc.coverage.spatial: Istanbul, Turkey (en_US)
dc.date.accessioned: 2016-02-08T11:51:23Z
dc.date.available: 2016-02-08T11:51:23Z (en_US)
dc.date.issued: 2005 (en_US)
dc.department: Department of Computer Engineering (en_US)
dc.description: Conference name: 20th International Symposium (en_US)
dc.description: Date of Conference: 26-28 October 2005 (en_US)
dc.description.abstract: The text similarity join operator joins two relations if their join attributes are textually similar to each other, and it has a variety of application domains, including integration and querying of data from heterogeneous resources, data cleansing, and data mining. Although the text similarity join operator is widely used, its processing is expensive due to the huge number of similarity computations performed. In this paper, we incorporate some shortcut evaluation techniques from the Information Retrieval domain, namely the Harman, quit, continue, and maximal similarity filter heuristics, into the previously proposed text similarity join algorithms to reduce the number of similarity computations needed during the join operation. We experimentally evaluate the original and the heuristic-based similarity join algorithms using real data obtained from the DBLP Bibliography database, and observe performance improvements with the continue and maximal similarity filter heuristics. © Springer-Verlag Berlin Heidelberg 2005. (en_US)
dc.description.provenance: Made available in DSpace on 2016-02-08T11:51:23Z (GMT). No. of bitstreams: 1. bilkent-research-paper.pdf: 70227 bytes, checksum: 26e812c6f5156f83f0e77b261a471b5a (MD5). Previous issue date: 2005 (en_US)
dc.identifier.doi: 10.1007/11569596_81 (en_US)
dc.identifier.doi: 10.1007/11569596 (en_US)
dc.identifier.isbn: 9783540294146 (en_US)
dc.identifier.issn: 0302-9743
dc.identifier.uri: http://hdl.handle.net/11693/27360 (en_US)
dc.language.iso: English (en_US)
dc.publisher: Springer, Berlin, Heidelberg (en_US)
dc.relation.isversionof: https://doi.org/10.1007/11569596_81 (en_US)
dc.relation.isversionof: https://doi.org/10.1007/11569596 (en_US)
dc.source.title: Computer and Information Sciences - ISCIS 2005 (en_US)
dc.subject: Bibliographic retrieval systems (en_US)
dc.subject: Computation theory (en_US)
dc.subject: Computer operating procedures (en_US)
dc.subject: Data mining (en_US)
dc.subject: Data reduction (en_US)
dc.subject: Information retrieval (en_US)
dc.subject: Integration (en_US)
dc.subject: Query languages (en_US)
dc.subject: Application domains (en_US)
dc.subject: Data querying (en_US)
dc.subject: Filter heuristics (en_US)
dc.subject: Text similarity (en_US)
dc.subject: Text processing (en_US)
dc.title: Effective early termination techniques for text similarity join operator (en_US)
dc.type: Conference Paper (en_US)

Files

Original bundle
Name: Effective early termination techniques for text similarity join operator.pdf
Size: 228.4 KB
Format: Adobe Portable Document Format
Description: Full printable version