Combining textual and visual information for semantic labeling of images and videos

dc.citation.epage  225  en_US
dc.citation.spage  205  en_US
dc.contributor.author  Duygulu, Pınar  en_US
dc.contributor.author  Baştan, Muhammet  en_US
dc.contributor.author  Özkan, Derya  en_US
dc.contributor.editor  Cord, M.
dc.contributor.editor  Cunningham, P.
dc.date.accessioned  2019-04-22T10:15:18Z
dc.date.available  2019-04-22T10:15:18Z
dc.date.issued  2008  en_US
dc.department  Department of Computer Engineering  en_US
dc.description  Chapter 9  en_US
dc.description.abstract  Semantic labeling of large volumes of image and video archives is difficult, if not impossible, with traditional methods because of the huge amount of human effort required for manual labeling in a supervised setting. Recently, semi-supervised techniques that make use of annotated image and video collections have been proposed as an alternative to reduce this human effort. In this direction, different techniques, mostly adapted from the information retrieval literature, are applied to learn the unknown one-to-one associations between visual structures and semantic descriptions. Once these links are learned, the range of applications is wide, including better retrieval and automatic annotation of images and videos, labeling of image regions as a way of large-scale object recognition, and association of names with faces as a way of large-scale face recognition. In this chapter, after reviewing and discussing a variety of related studies, we present two methods in detail: the so-called "translation approach", which translates visual structures into semantic descriptors using the idea of statistical machine translation techniques, and a second approach, which finds the densest component of a graph corresponding to the largest group of similar visual structures associated with a semantic description.  en_US
dc.identifier.doi  10.1007/978-3-540-75171-7_9  en_US
dc.identifier.doi  10.1007/978-3-540-75171-7  en_US
dc.identifier.isbn  9783540751700
dc.identifier.issn  1611-2482
dc.identifier.uri  http://hdl.handle.net/11693/50869
dc.language.iso  English  en_US
dc.publisher  Springer, Berlin, Heidelberg  en_US
dc.relation.ispartof  Machine learning techniques for multimedia  en_US
dc.relation.ispartofseries  Cognitive Technologies;
dc.relation.isversionof  https://doi.org/10.1007/978-3-540-75171-7_9  en_US
dc.relation.isversionof  https://doi.org/10.1007/978-3-540-75171-7  en_US
dc.subject  Machine translation  en_US
dc.subject  Automatic speech recognition  en_US
dc.subject  Mean average precision  en_US
dc.subject  Image annotation  en_US
dc.subject  News video  en_US
dc.title  Combining textual and visual information for semantic labeling of images and videos  en_US
dc.type  Book Chapter  en_US
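
Note: The "translation approach" described in the abstract treats annotated images as a parallel corpus of visual tokens and caption words. The following is a minimal, illustrative Python sketch of how translation probabilities p(word | blob) could be estimated with an IBM Model 1-style EM procedure; the discrete "blob" tokens, the uniform initialisation, the function names, and the simple annotation rule are assumptions made here for illustration and not necessarily the chapter's exact formulation.

from collections import defaultdict

def train_translation_table(corpus, n_iters=20):
    """EM estimation of p(word | blob), in the spirit of IBM Model 1.
    `corpus` is a list of (blob_ids, annotation_words) pairs, one per
    annotated image; both elements are lists of discrete tokens."""
    words = {w for _, ws in corpus for w in ws}
    blobs = {b for bs, _ in corpus for b in bs}
    # Start from a uniform translation table.
    t = {(w, b): 1.0 / len(words) for w in words for b in blobs}
    for _ in range(n_iters):
        counts = defaultdict(float)   # expected word-blob co-occurrence counts
        totals = defaultdict(float)   # per-blob normalisers
        for blobs_i, words_i in corpus:
            for w in words_i:
                z = sum(t[(w, b)] for b in blobs_i)
                for b in blobs_i:
                    frac = t[(w, b)] / z
                    counts[(w, b)] += frac
                    totals[b] += frac
        # M-step: renormalise the expected counts per blob.
        t = {(w, b): c / totals[b] for (w, b), c in counts.items()}
    return t

def annotate(blob_ids, t, top_k=3):
    """Auto-annotate an image by the words with the highest summed
    translation probability over its blobs."""
    scores = defaultdict(float)
    for (w, b), p in t.items():
        if b in blob_ids:
            scores[w] += p
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

Once such a table of p(word | blob) is available, it supports the uses listed in the abstract, e.g. labeling individual regions with their most probable word or auto-annotating a whole image by summing over its blobs, as annotate() does above.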
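
Note: The second method in the abstract searches for the densest component of a similarity graph, i.e. the largest coherent group of visual structures (for example, faces) associated with a semantic description (for example, a name). As a rough illustration only, the sketch below uses Charikar's greedy peeling heuristic on an unweighted graph with density defined as |E| / |V|; the chapter's actual graph construction, similarity measure, density definition, and algorithm may differ, and the function name is hypothetical.

def densest_component(nodes, edges):
    """Greedy peeling heuristic for the densest subgraph (Charikar, 2000):
    repeatedly remove the node of minimum degree and keep the intermediate
    subgraph whose density |E| / |V| was highest.
    `edges` is a list of (u, v) pairs over `nodes`, without duplicates."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    remaining = set(nodes)
    n_edges = len(edges)
    best_density, best_set = 0.0, set(remaining)
    while remaining:
        density = n_edges / len(remaining)
        if density > best_density:
            best_density, best_set = density, set(remaining)
        # Peel off the minimum-degree node of the current subgraph.
        v = min(remaining, key=lambda x: len(adj[x]))
        n_edges -= len(adj[v])
        for u in adj[v]:
            adj[u].discard(v)
        adj[v].clear()
        remaining.remove(v)
    return best_set, best_density

Greedy peeling is attractive for this kind of grouping because it is a provable 2-approximation and runs in time roughly linear in the number of edges, which matters when the graph covers a large collection of faces or regions; whether the chapter relies on this particular algorithm is an assumption of this sketch.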

Files

Original bundle
Name: Combining textual and visual information for semantic labeling of images and videos.pdf
Size: 12.53 MB
Format: Adobe Portable Document Format