Translating images to words for recognizing objects in large image and video collections

Duygulu, P.; Baştan, M.; Forsyth, D.

dc.citation.epage	276	en_US
dc.citation.spage	258	en_US
dc.citation.volumeNumber	4170	en_US
dc.contributor.author	Duygulu, P.	en_US
dc.contributor.author	Baştan, M.	en_US
dc.contributor.author	Forsyth, D.	en_US
dc.date.accessioned	2019-02-11T12:46:06Z
dc.date.available	2019-02-11T12:46:06Z
dc.date.issued	2006	en_US
dc.department	Department of Computer Engineering	en_US
dc.description.abstract	We present a new approach to the object recognition problem, motivated by the recent availability of large annotated image and video collections. This approach considers object recognition as the translation of visual elements to words, similar to the translation of text from one language to another. The visual elements represented in feature space are categorized into a finite set of blobs. The correspondences between the blobs and the words are learned, using a method adapted from Statistical Machine Translation. Once learned, these correspondences can be used to predict words corresponding to particular image regions (region naming), to predict words associated with entire images (autoannotation), or to associate the speech transcript text with the correct video frames (video alignment). We present our results on the Corel data set, which consists of annotated images, and on the TRECVID 2004 data set, which consists of video frames associated with speech transcript text and manual annotations.	en_US
dc.identifier.doi	10.1007/11957959_14	en_US
dc.identifier.issn	0302-9743
dc.identifier.uri	http://hdl.handle.net/11693/49255
dc.language.iso	English	en_US
dc.publisher	Springer	en_US
dc.relation.isversionof	https://doi.org/10.1007/11957959_14	en_US
dc.source.title	Lecture Notes in Computer Science	en_US
dc.subject	Machine translation	en_US
dc.subject	Automatic speech recognition	en_US
dc.subject	News video	en_US
dc.subject	Statistical machine translation	en_US
dc.subject	Correspondence problem	en_US
dc.title	Translating images to words for recognizing objects in large image and video collections	en_US
dc.type	Article	en_US

Collections

Department of Computer Engineering