Combining textual and visual information for semantic labeling of images and videos

dc.citation.epage  225  en_US
dc.citation.spage  205  en_US
dc.contributor.author  Duygulu, Pınar  en_US
dc.contributor.author  Baştan, Muhammet  en_US
dc.contributor.author  Özkan, Derya  en_US
dc.contributor.editor  Cord, M.
dc.contributor.editor  Cunningham, P.
dc.date.accessioned  2019-04-22T10:15:18Z
dc.date.available  2019-04-22T10:15:18Z
dc.date.issued  2008  en_US
dc.department  Department of Computer Engineering  en_US
dc.description  Chapter 9  en_US
dc.description.abstract  Semantic labeling of large volumes of image and video archives is difficult, if not impossible, with traditional methods because of the huge amount of human effort required for manual labeling in a supervised setting. Recently, semi-supervised techniques that make use of annotated image and video collections have been proposed as an alternative to reduce this human effort. In this direction, different techniques, mostly adapted from the information retrieval literature, are applied to learn the unknown one-to-one associations between visual structures and semantic descriptions. Once these links are learned, the range of applications is wide, including better retrieval and automatic annotation of images and videos, labeling of image regions as a way of large-scale object recognition, and association of names with faces as a way of large-scale face recognition. In this chapter, after reviewing and discussing a variety of related studies, we present two methods in detail: the so-called "translation approach", which translates visual structures into semantic descriptors using the idea of statistical machine translation techniques, and a second approach, which finds the densest component of a graph corresponding to the largest group of similar visual structures associated with a semantic description.  en_US
dc.identifier.doi  10.1007/978-3-540-75171-7_9  en_US
dc.identifier.doi  10.1007/978-3-540-75171-7  en_US
dc.identifier.isbn  9783540751700
dc.identifier.issn  1611-2482
dc.identifier.uri  http://hdl.handle.net/11693/50869
dc.language.iso  English  en_US
dc.publisher  Springer, Berlin, Heidelberg  en_US
dc.relation.ispartof  Machine learning techniques for multimedia  en_US
dc.relation.ispartofseries  Cognitive Technologies;
dc.relation.isversionof  https://doi.org/10.1007/978-3-540-75171-7_9  en_US
dc.relation.isversionof  https://doi.org/10.1007/978-3-540-75171-7  en_US
dc.subject  Machine translation  en_US
dc.subject  Automatic speech recognition  en_US
dc.subject  Mean average precision  en_US
dc.subject  Image annotation  en_US
dc.subject  News video  en_US
dc.title  Combining textual and visual information for semantic labeling of images and videos  en_US
dc.type  Book Chapter  en_US
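
Note: The "translation approach" described in the abstract treats annotated images as a parallel corpus of visual tokens and caption words. The following is a minimal, illustrative Python sketch of how translation probabilities p(word | blob) could be estimated with an IBM Model 1-style EM procedure; the discrete "blob" tokens, the uniform initialisation, the function names, and the simple annotation rule are assumptions made here for illustration and not necessarily the chapter's exact formulation.

from collections import defaultdict

def train_translation_table(corpus, n_iters=20):
    """EM estimation of p(word | blob), in the spirit of IBM Model 1.
    `corpus` is a list of (blob_ids, annotation_words) pairs, one per
    annotated image; both elements are lists of discrete tokens."""
    words = {w for _, ws in corpus for w in ws}
    blobs = {b for bs, _ in corpus for b in bs}
    # Start from a uniform translation table.
    t = {(w, b): 1.0 / len(words) for w in words for b in blobs}
    for _ in range(n_iters):
        counts = defaultdict(float)   # expected word-blob co-occurrence counts
        totals = defaultdict(float)   # per-blob normalisers
        for blobs_i, words_i in corpus:
            for w in words_i:
                z = sum(t[(w, b)] for b in blobs_i)
                for b in blobs_i:
                    frac = t[(w, b)] / z
                    counts[(w, b)] += frac
                    totals[b] += frac
        # M-step: renormalise the expected counts per blob.
        t = {(w, b): c / totals[b] for (w, b), c in counts.items()}
    return t

def annotate(blob_ids, t, top_k=3):
    """Auto-annotate an image by the words with the highest summed
    translation probability over its blobs."""
    scores = defaultdict(float)
    for (w, b), p in t.items():
        if b in blob_ids:
            scores[w] += p
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

Once such a table of p(word | blob) is available, it supports the uses listed in the abstract, e.g. labeling individual regions with their most probable word or auto-annotating a whole image by summing over its blobs, as annotate() does above.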
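
Note: The second method in the abstract searches for the densest component of a similarity graph, i.e. the largest coherent group of visual structures (for example, faces) associated with a semantic description (for example, a name). As a rough illustration only, the sketch below uses Charikar's greedy peeling heuristic on an unweighted graph with density defined as |E| / |V|; the chapter's actual graph construction, similarity measure, density definition, and algorithm may differ, and the function name is hypothetical.

def densest_component(nodes, edges):
    """Greedy peeling heuristic for the densest subgraph (Charikar, 2000):
    repeatedly remove the node of minimum degree and keep the intermediate
    subgraph whose density |E| / |V| was highest.
    `edges` is a list of (u, v) pairs over `nodes`, without duplicates."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    remaining = set(nodes)
    n_edges = len(edges)
    best_density, best_set = 0.0, set(remaining)
    while remaining:
        density = n_edges / len(remaining)
        if density > best_density:
            best_density, best_set = density, set(remaining)
        # Peel off the minimum-degree node of the current subgraph.
        v = min(remaining, key=lambda x: len(adj[x]))
        n_edges -= len(adj[v])
        for u in adj[v]:
            adj[u].discard(v)
        adj[v].clear()
        remaining.remove(v)
    return best_set, best_density

Greedy peeling is attractive for this kind of grouping because it is a provable 2-approximation and runs in time roughly linear in the number of edges, which matters when the graph covers a large collection of faces or regions; whether the chapter relies on this particular algorithm is an assumption of this sketch.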

Files

Original bundle
Name: Combining textual and visual information for semantic labeling of images and videos.pdf
Size: 12.53 MB
Format: Adobe Portable Document Format