Automatic multimedia cross-modal correlation discovery
dc.citation.epage | 658 | en_US |
dc.citation.spage | 653 | en_US |
dc.contributor.author | Pan, J.-Y. | en_US |
dc.contributor.author | Yang, H.-J. | en_US |
dc.contributor.author | Faloutsos, C. | en_US |
dc.contributor.author | Duygulu, Pınar | en_US |
dc.coverage.spatial | Seattle, WA, USA | |
dc.date.accessioned | 2016-02-08T11:53:08Z | |
dc.date.available | 2016-02-08T11:53:08Z | |
dc.date.issued | 2004-08 | en_US |
dc.department | Department of Computer Engineering | en_US |
dc.description | Date of Conference: 22-25 August 2004 | |
dc.description | Conference name: KDD '04: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining | |
dc.description.abstract | Given an image (or video clip, or audio song), how do we automatically assign keywords to it? The general problem is to find correlations across the media in a collection of multimedia objects like video clips, with colors, and/or motion, and/or audio, and/or text scripts. We propose a novel, graph-based approach, "MMG", to discover such cross-modal correlations. Our "MMG" method requires no tuning, no clustering, and no user-determined constants; it can be applied to any multimedia collection, as long as we have a similarity function for each medium; and it scales linearly with the database size. We report auto-captioning experiments on the "standard" Corel image database of 680 MB, where it outperforms domain-specific, fine-tuned methods by up to 10 percentage points in captioning accuracy (50% relative improvement). | en_US |
dc.description.provenance | Made available in DSpace on 2016-02-08T11:53:08Z (GMT). No. of bitstreams: 1 bilkent-research-paper.pdf: 70227 bytes, checksum: 26e812c6f5156f83f0e77b261a471b5a (MD5) Previous issue date: 2004 | en |
dc.identifier.doi | 10.1145/1014052.1014135 | en_US |
dc.identifier.uri | http://hdl.handle.net/11693/27429 | en_US |
dc.language.iso | English | en_US |
dc.publisher | ACM | en_US |
dc.relation.isversionof | https://doi.org/10.1145/1014052.1014135 | |
dc.source.title | KDD-2004 - Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining | en_US |
dc.subject | Automatic image captioning | en_US |
dc.subject | Cross-modal correlation | en_US |
dc.subject | Graph-based model | en_US |
dc.subject | Approximation theory | en_US |
dc.subject | Correlation methods | en_US |
dc.subject | Database systems | en_US |
dc.subject | Graph theory | en_US |
dc.subject | Image analysis | en_US |
dc.subject | Mathematical models | en_US |
dc.subject | Motion estimation | en_US |
dc.subject | Probability | en_US |
dc.subject | Problem solving | en_US |
dc.subject | Video motion | en_US |
dc.subject | Multimedia systems | en_US |
dc.title | Automatic multimedia cross-modal correlation discovery | en_US |
dc.type | Conference Paper | en_US |
Files
Original bundle
- Name: Automatic multimedia cross-modal correlation discovery.pdf
- Size: 135.44 KB
- Format: Adobe Portable Document Format
- Description: Full printable version