Translating images to words for recognizing objects in large image and video collections

dc.citation.epage: 276 (en_US)
dc.citation.spage: 258 (en_US)
dc.citation.volumeNumber: 4170 (en_US)
dc.contributor.author: Duygulu, P. (en_US)
dc.contributor.author: Baştan, M. (en_US)
dc.contributor.author: Forsyth, D. (en_US)
dc.date.accessioned: 2019-02-11T12:46:06Z
dc.date.available: 2019-02-11T12:46:06Z
dc.date.issued: 2006 (en_US)
dc.department: Department of Computer Engineering (en_US)
dc.description.abstract: We present a new approach to the object recognition problem, motivated by the recent availability of large annotated image and video collections. This approach considers object recognition as the translation of visual elements to words, similar to the translation of text from one language to another. The visual elements represented in feature space are categorized into a finite set of blobs. The correspondences between the blobs and the words are learned, using a method adapted from Statistical Machine Translation. Once learned, these correspondences can be used to predict words corresponding to particular image regions (region naming), to predict words associated with the entire images (autoannotation), or to associate the speech transcript text with the correct video frames (video alignment). We present our results on the Corel data set which consists of annotated images and on the TRECVID 2004 data set which consists of video frames associated with speech transcript text and manual annotations. (en_US)
dc.description.provenance: Submitted by Betül Özen (ozen@bilkent.edu.tr) on 2019-02-11T12:46:06Z No. of bitstreams: 1 Translating_Images_to_Words_for_Recognizing.pdf: 851111 bytes, checksum: b4d5e1c86ad3cf438588ccfa2a783476 (MD5) (en)
dc.description.provenance: Made available in DSpace on 2019-02-11T12:46:06Z (GMT). No. of bitstreams: 1 Translating_Images_to_Words_for_Recognizing.pdf: 851111 bytes, checksum: b4d5e1c86ad3cf438588ccfa2a783476 (MD5) Previous issue date: 2006 (en)
dc.identifier.doi: 10.1007/11957959_14 (en_US)
dc.identifier.issn: 0302-9743
dc.identifier.uri: http://hdl.handle.net/11693/49255
dc.language.iso: English (en_US)
dc.publisher: Springer (en_US)
dc.relation.isversionof: https://doi.org/10.1007/11957959_14 (en_US)
dc.source.title: Lecture Notes in Computer Science (en_US)
dc.subject: Machine translation (en_US)
dc.subject: Automatic speech recognition (en_US)
dc.subject: News video (en_US)
dc.subject: Statistical machine translation (en_US)
dc.subject: Correspondence problem (en_US)
dc.title: Translating images to words for recognizing objects in large image and video collections (en_US)
dc.type: Article (en_US)

Files

Original bundle

Name: Translating_Images_to_Words_for_Recognizing.pdf
Size: 831.16 KB
Format: Adobe Portable Document Format
Description: Full printable version

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission