Automatic multimedia cross-modal correlation discovery

Pan, J.-Y.; Yang, H.-J.; Faloutsos, C.; Duygulu, Pınar

Files

Automatic multimedia cross-modal correlation discovery.pdf (135.44 KB)

Date

2004-08

Abstract

Given an image (or video clip, or audio song), how do we automatically assign keywords to it? The general problem is to find correlations across the media in a collection of multimedia objects like video clips, with colors, and/or motion, and/or audio, and/or text scripts. We propose a novel, graph-based approach, "MMG", to discover such cross-modal correlations. Our "MMG" method requires no tuning, no clustering, no user-determined constants; it can be applied to any multimedia collection, as long as we have a similarity function for each medium; and it scales linearly with the database size. We report auto-captioning experiments on the "standard" Corel image database of 680 MB, where it outperforms domain-specific, fine-tuned methods by up to 10 percentage points in captioning accuracy (50% relative improvement).
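
The abstract names a graph-based approach but does not describe the algorithm on this page. As a rough illustration of what graph-based auto-captioning can look like, the sketch below assumes a graph that links images to their regions and caption words, adds similarity links between regions, and ranks words by a random walk with restart started at the uncaptioned image. The function names, the parameters `k` and `restart`, and the toy feature vectors are all hypothetical and are not taken from this record.

```python
from collections import defaultdict


def build_graph(captioned, query_regions, k=1):
    """Build an undirected graph of image, region, and word nodes.

    captioned: {image_id: (list of region feature tuples, list of caption words)}
    query_regions: region feature tuples of the uncaptioned image
    k: number of similarity links per region (illustrative parameter)
    """
    adj = defaultdict(set)

    def link(a, b):
        adj[a].add(b)
        adj[b].add(a)

    regions = {}  # region node -> feature vector
    for img, (feats, words) in captioned.items():
        for i, f in enumerate(feats):
            regions[("region", img, i)] = f
            link(("image", img), ("region", img, i))
        for w in words:
            link(("image", img), ("word", w))
    for i, f in enumerate(query_regions):
        regions[("region", "query", i)] = f
        link(("image", "query"), ("region", "query", i))

    # Stand-in similarity function: squared Euclidean distance.
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Link each region to its k nearest regions from other images.
    for r, f in regions.items():
        others = [o for o in regions if o[1] != r[1]]
        for o in sorted(others, key=lambda o: dist(f, regions[o]))[:k]:
            link(r, o)
    return adj


def random_walk_with_restart(adj, start, restart=0.3, iters=100):
    """Approximate visiting probabilities of a walk that restarts at `start`."""
    score = {n: 0.0 for n in adj}
    score[start] = 1.0
    for _ in range(iters):
        nxt = {n: 0.0 for n in adj}
        for n, s in score.items():
            share = (1 - restart) * s / len(adj[n])
            for m in adj[n]:
                nxt[m] += share
        nxt[start] += restart  # probability mass that jumps back to the start
        score = nxt
    return score


def caption(captioned, query_regions, top=2):
    """Return the `top` word nodes with the highest walk score."""
    adj = build_graph(captioned, query_regions)
    score = random_walk_with_restart(adj, ("image", "query"))
    words = sorted((n for n in adj if n[0] == "word"),
                   key=lambda n: score[n], reverse=True)
    return [n[1] for n in words[:top]]


# Toy data: two captioned images with 2-D region features (made up).
captioned = {
    "a": ([(0.9, 0.1), (0.8, 0.2)], ["sky", "sea"]),
    "b": ([(0.1, 0.9), (0.2, 0.8)], ["grass", "tiger"]),
}
query_regions = [(0.85, 0.15)]
print(caption(captioned, query_regions))
```

In this toy setup the query region resembles the regions of image "a", so the walk reaches that image's caption words in fewer steps than the words of image "b". Only a per-medium similarity function (here `dist`) is specific to the data type, which is the property the abstract highlights.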

Source Title

KDD-2004 - Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Publisher

ACM

Keywords

Automatic image captioning, Cross-modal correlation, Graph-based model, Approximation theory, Correlation methods, Database systems, Graph theory, Image analysis, Mathematical models, Motion estimation, Probability, Problem solving, Video motion, Multimedia systems

Permalink

http://hdl.handle.net/11693/27429

Published Version (Please cite this version)

https://doi.org/10.1145/1014052.1014135

Collections

Scholarly Publications - Computer Engineering

Language

English

Type

Conference Paper
