Browsing by Subject "Visual content"
Now showing 1 - 3 of 3
Item Open Access
Automatic tag expansion using visual similarity for photo sharing websites (Springer New York LLC, 2010) Sevil, S. G.; Kucuktunc, O.; Duygulu, P.; Can, F.
In this paper we present an automatic photo tag expansion method designed for photo sharing websites. The purpose of the method is to suggest tags that are relevant to the visual content of a given photo at upload time. Both textual and visual cues are used in the process of tag expansion. When a photo is to be uploaded, the system asks the user for a few initial tags. The initial tags are used to retrieve relevant photos together with their tags; these photos are assumed to be potentially related in content to the uploaded target photo. The tag sets of the relevant photos form the candidate tag list, and visual similarities between the target photo and the relevant photos are used to weight these candidate tags. The tags with the highest weights are suggested to the user. The method is applied to Flickr (http://www.flickr.com). Results show that including visual information in the photo-tagging process increases accuracy with respect to text-based methods. © 2009 Springer Science+Business Media, LLC.

Item Open Access
Ensemble of multiple instance classifiers for image re-ranking (Elsevier Ltd, 2014) Sener, F.; Ikizler-Cinbis, N.
Text-based image retrieval may perform poorly because of the irrelevant and/or incomplete text surrounding the images in web pages. In such situations, the visual content of the images can be leveraged to improve image ranking performance. In this paper, we look into this problem of image re-ranking and propose a system that automatically constructs multiple candidate "multi-instance bags (MI-bags)", which are likely to contain relevant images. These automatically constructed bags are then utilized by ensembles of Multiple Instance Learning (MIL) classifiers, and the images are re-ranked according to the final classification responses.
Our method is unsupervised in the sense that the only input to the system is the text query itself, without any user feedback or annotation. The experimental results demonstrate that constructing multiple-instance bags based on the retrieval order and utilizing ensembles of MIL classifiers greatly enhance retrieval performance, achieving results on par with or better than the state of the art. © 2014 Elsevier B.V.

Item Open Access
What's news, what's not? Associating news videos with words (Springer, 2004) Duygulu, P.; Hauptmann, A.
Text retrieval from broadcast news video is unsatisfactory because a transcript word frequently does not directly 'describe' the shot during which it was spoken. Extending the retrieved region to a window around the matching keyword provides better recall but low precision. We improve on text retrieval using the following approach: first, we segment the visual stream into coherent story-like units using a set of visual news story delimiters. After filtering out clearly irrelevant classes of shots, we are still left with an ambiguity about how words in the transcript relate to the visual content of the remaining shots in the story. Using a limited set of visual features at different semantic levels, ranging from color histograms to faces, cars, and outdoor scenes, an association matrix captures the correlation of these visual features with specific transcript words. This matrix is then refined using an EM approach. Preliminary results show that this approach has the potential to significantly improve retrieval performance from text queries. © Springer-Verlag 2004.
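The tag-expansion idea in the first abstract above (weighting candidate tags by the visual similarity of the retrieved photos that carry them) can be sketched roughly as follows. This is an illustrative reading of the abstract, not the paper's implementation; the function and parameter names (`suggest_tags`, `relevant_photos`, `similarity`) are hypothetical.

```python
from collections import defaultdict

def suggest_tags(target_features, relevant_photos, similarity, top_k=5):
    """Weight candidate tags by visual similarity (a sketch, not the paper's code).

    relevant_photos: (feature_vector, tag_list) pairs retrieved via the
    user's initial tags; similarity: a visual similarity function.
    """
    weights = defaultdict(float)
    for features, tags in relevant_photos:
        sim = similarity(target_features, features)
        for tag in tags:
            weights[tag] += sim  # tags from visually similar photos gain weight
    # suggest the highest-weighted candidate tags
    return sorted(weights, key=weights.get, reverse=True)[:top_k]
```

Any visual similarity measure (e.g. a distance between color or texture descriptors) can be plugged in; the key point from the abstract is that textual retrieval proposes candidates and visual similarity scores them.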
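The re-ranking pipeline in the second abstract (bags built from the initial retrieval order, then classification and re-ranking) might look schematically like the toy below. All names are hypothetical, and a simple nearest-centroid scorer stands in for the paper's actual ensemble of MIL classifiers so the example stays self-contained.

```python
import numpy as np

def make_bags(features, bag_size):
    # Partition images into bags following their initial retrieval order.
    return [features[i:i + bag_size] for i in range(0, len(features), bag_size)]

def rerank(features, bag_size=2):
    """Toy stand-in for MIL-ensemble re-ranking (illustrative only)."""
    bags = make_bags(features, bag_size)
    positive = np.vstack(bags[: len(bags) // 2])   # top-ranked bags: likely relevant
    negative = np.vstack(bags[len(bags) // 2:])    # tail bags: likely irrelevant
    pos_c, neg_c = positive.mean(axis=0), negative.mean(axis=0)
    # score = how much closer an image is to the positive centroid
    scores = (np.linalg.norm(features - neg_c, axis=1)
              - np.linalg.norm(features - pos_c, axis=1))
    return np.argsort(-scores)  # image indices, best first
```

The abstract's point survives the simplification: supervision comes for free from the retrieval order itself (no user feedback), and the final ranking is driven by classifier responses rather than the surrounding text.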
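The EM-refined association matrix in the third abstract can be illustrated with a small alignment-style EM loop in the spirit of IBM Model 1 (an assumption on our part; the paper's exact model may differ). `stories` pairs each story's transcript words with its visual feature tokens; the loop estimates p(visual token | word).

```python
from collections import defaultdict

def em_associations(stories, iterations=10):
    """Estimate p(visual_token | word) by EM over (words, visual_tokens) pairs.

    Illustrative sketch only; names and model are assumptions, not the paper's.
    """
    prob = defaultdict(lambda: 1.0)  # uniform start, normalized per token below
    for _ in range(iterations):
        counts = defaultdict(float)
        totals = defaultdict(float)
        # E-step: distribute each visual token's mass over the story's words
        for words, visual_tokens in stories:
            for v in visual_tokens:
                norm = sum(prob[(v, w)] for w in words)
                for w in words:
                    c = prob[(v, w)] / norm
                    counts[(v, w)] += c
                    totals[w] += c
        # M-step: re-estimate the association matrix from expected counts
        for (v, w), c in counts.items():
            prob[(v, w)] = c / totals[w]
    return prob
```

As in the abstract, co-occurrence alone leaves word/feature pairings ambiguous within a story; iterating EM lets unambiguous stories disambiguate the mixed ones.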