Multi-label multi-modal classification of movie scenes

Türköz, Irmak

Available

The embargo period has ended, and this item is now available.

Files

B161325.pdf (8.76 MB)

Date

2022-09

Authors

Türköz, Irmak

Advisor

Güvenir, H. Altay

Abstract

Promoting movies through their trailers provides valuable information that can help viewers and investors form expectations about a movie's future success. Recent research confirms that audiences prefer to watch movies through at-home streaming services rather than in theaters, which has resulted in movie trailers being viewed privately. Moreover, advertisements created for different interest groups can provide a drastically improved experience for users and advertisers alike. There have been a few attempts to automate the trailer generation process; however, AI-generated trailers were considered less attractive than editors' creations. Fortunately, recent advancements in deep learning and the greater availability of datasets can accelerate the automated trailer generation process. Every movie produced is labeled with a set of genres that it represents. Thus, it is possible to generate multiple trailers of the same movie for different genres to offer personalized advertisements to the audience. To the best of our knowledge, personalized advertising of movies via genre-specific trailers is the first attempt of its kind in automated trailer generation studies. For this task, we needed a tool that extracts scenes representative of a particular genre from a given movie. These scenes can then be concatenated to form a draft trailer for each genre, and the draft can be finalized through the creative post-production process. In this thesis, we developed a deep learning network that classifies scenes into a set of genres. To train this network, we compiled a dataset of scenes labeled with their representative genres. Our network accomplishes a multi-label classification task with hyper-parameters learned from experimental binary models. The learning process comprises the use of visual features, audio features, and their combination. The final result of the model is evaluated by comparing its classification performance with human perception.
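
As a rough illustration of what multi-label classification over combined visual and audio features can look like, the following minimal sketch scores each genre independently with a sigmoid, so a scene may receive several genres at once. It is not the network from the thesis: the genre list, the concatenation-based fusion, the single linear layer, the 0.5 threshold, and all names (`GENRES`, `predict_genres`, the toy weights) are assumptions made for this example only.

```python
import math

# Hypothetical genre set; the abstract does not list the genres used.
GENRES = ["action", "comedy", "horror"]

def sigmoid(z):
    """Map a real-valued score to the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_genres(visual, audio, weights, bias, threshold=0.5):
    """Return every genre whose independent sigmoid score reaches the threshold.

    Assumption: the two modalities are fused by simple concatenation and
    scored by one linear layer; a real model would be far deeper.
    """
    fused = list(visual) + list(audio)
    labels = []
    for genre, row, b in zip(GENRES, weights, bias):
        score = sigmoid(sum(w * x for w, x in zip(row, fused)) + b)
        if score >= threshold:
            labels.append(genre)
    return labels

# Toy scene: 2-dimensional visual and audio feature vectors (made-up values).
visual = [1.0, 0.0]
audio = [0.0, 1.0]
weights = [
    [2.0, 0.0, 0.0, 2.0],    # action
    [-2.0, 0.0, 0.0, -2.0],  # comedy
    [0.0, 0.0, 0.0, 3.0],    # horror
]
bias = [0.0, 0.0, 0.0]

labels = predict_genres(visual, audio, weights, bias)
print(labels)
```

Because each genre has its own sigmoid rather than sharing a softmax, the scores do not compete, which is what allows one scene to be assigned more than one genre.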

Keywords

Movie trailer labeling, Scene understanding, Deep learning, Computer vision

Degree Discipline

Computer Engineering

Degree Level

Master's

Degree Name

MS (Master of Science)

Permalink

http://hdl.handle.net/11693/110579

Collections

Graduate School of Engineering and Science

Language

English

Type

Thesis
