User feedback-based online learning for intent classification

buir.contributor.authorGönç, Kaan
buir.contributor.authorSağlam, Baturay
buir.contributor.authorDalmaz, Onat
buir.contributor.authorÇukur, Tolga
buir.contributor.authorKozat, Serdar
buir.contributor.authorDibeklioğlu, Hamdi
buir.contributor.orcidGönç, Kaan|0009-0009-4563-4369
buir.contributor.orcidSağlam, Baturay|0000-0002-8324-5980
buir.contributor.orcidDalmaz, Onat|0000-0001-7978-5311
buir.contributor.orcidÇukur, Tolga|0000-0002-2296-851X
buir.contributor.orcidKozat, Serdar|0000-0002-6488-3848
buir.contributor.orcidDibeklioğlu, Hamdi|0000-0003-0851-7808
dc.citation.epage621en_US
dc.citation.spage613
dc.contributor.authorGönç, Kaan
dc.contributor.authorSağlam, Baturay
dc.contributor.authorDalmaz, Onat
dc.contributor.authorÇukur, Tolga
dc.contributor.authorKozat, Serdar
dc.contributor.authorDibeklioğlu, Hamdi
dc.coverage.spatialParis, France
dc.date.accessioned2024-03-07T12:05:32Z
dc.date.available2024-03-07T12:05:32Z
dc.date.issued2023-10-09
dc.departmentDepartment of Computer Engineering
dc.departmentDepartment of Electrical and Electronics Engineering
dc.descriptionConference Name: ICMI '23: Proceedings of the 25th International Conference on Multimodal Interaction
dc.descriptionDate of Conference: 09–13 October 2023
dc.description.abstractIntent classification is a key task in natural language processing (NLP) that aims to infer the goal or intention behind a user’s query. Most existing intent classification methods rely on supervised deep models trained on large annotated datasets of text-intent pairs. However, obtaining such datasets is often expensive and impractical in real-world settings. Furthermore, supervised models may overfit or face distributional shifts when new intents, utterances, or data distributions emerge over time, requiring frequent retraining. Online learning methods based on user feedback can overcome this limitation, as they do not need access to intents while collecting data and adapting the model continuously. In this paper, we propose a novel multi-armed contextual bandit framework that leverages a text encoder based on a large language model (LLM) to extract the latent features of a given utterance and jointly learn multimodal representations of encoded text features and intents. Our framework consists of two stages: offline pretraining and online fine-tuning. In the offline stage, we train the policy on a small labeled dataset using a contextual bandit approach. In the online stage, we fine-tune the policy parameters using the REINFORCE algorithm with a user feedback-based objective, without relying on the true intents. We further introduce a sliding window strategy for simulating the retrieval of data samples during online training. This novel two-phase approach enables our method to efficiently adapt to dynamic user preferences and data distributions with improved performance. An extensive set of empirical studies indicates that our method significantly outperforms policies that omit either offline pretraining or online fine-tuning, while achieving performance competitive with a supervised benchmark trained on an order of magnitude larger labeled dataset.
dc.description.provenanceMade available in DSpace on 2024-03-07T12:05:32Z (GMT). No. of bitstreams: 1 User_feedback-based_online_learning_for_intent_classification.pdf: 1381833 bytes, checksum: 7c3c3dcb8b407a84655b01386b9e2d9b (MD5) Previous issue date: 2023-10-09en
dc.identifier.doi10.1145/3577190.3614137en_US
dc.identifier.isbn9798400700552en_US
dc.identifier.urihttps://hdl.handle.net/11693/114391en_US
dc.language.isoEnglishen_US
dc.publisherAssociation for Computing Machineryen_US
dc.relation.isversionofhttps://doi.org/10.1145/3577190.3614137
dc.source.titleACM International Conference Proceeding Series
dc.subjectOnline learning
dc.subjectContextual bandits
dc.subjectIntent classification
dc.subjectMultimodal learning
dc.titleUser feedback-based online learning for intent classification
dc.typeConference Paper
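
The online stage described in the abstract (REINFORCE updates driven by user feedback rather than by true intents) can be illustrated with a minimal sketch. This is not the authors' implementation. Everything named here is an assumption made for illustration: the `LinearSoftmaxPolicy` class, the one-hot vectors standing in for LLM-encoded text features, the +1/-1 feedback signal, and the learning rate. The offline pretraining stage and the sliding window strategy are omitted.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class LinearSoftmaxPolicy:
    """Hypothetical stand-in for the policy: one linear layer over fixed features."""

    def __init__(self, n_features, n_intents):
        self.w = [[0.0] * n_features for _ in range(n_intents)]

    def probs(self, x):
        return softmax([sum(wj * xj for wj, xj in zip(row, x)) for row in self.w])

    def predict(self, x):
        # Greedy choice, used only for evaluation.
        p = self.probs(x)
        return max(range(len(p)), key=p.__getitem__)

    def reinforce_step(self, x, rng, feedback, lr=0.5):
        # Sample an intent, observe scalar feedback, take one policy-gradient step.
        p = self.probs(x)
        action = rng.choices(range(len(p)), weights=p)[0]
        reward = feedback(action)
        # Gradient of log pi(action | x) w.r.t. row k is (1[k == action] - p[k]) * x.
        for k, row in enumerate(self.w):
            coeff = lr * reward * ((1.0 if k == action else 0.0) - p[k])
            for j, xj in enumerate(x):
                row[j] += coeff * xj
        return action, reward


def simulate(n_intents=3, steps=600, seed=0):
    """Toy stream: the learner only ever sees features and feedback, never labels."""
    rng = random.Random(seed)
    policy = LinearSoftmaxPolicy(n_intents, n_intents)
    for _ in range(steps):
        hidden = rng.randrange(n_intents)  # known only to the simulated user
        x = [1.0 if j == hidden else 0.0 for j in range(n_intents)]
        # Assumed feedback model: +1 if the user accepts the intent, -1 otherwise.
        policy.reinforce_step(x, rng, lambda a: 1.0 if a == hidden else -1.0)
    return policy
```

In this toy setting the hidden intent is consulted only inside the simulated feedback function, which mirrors the abstract's point that the online objective does not rely on true intents. A real system would replace the one-hot vectors with encoder outputs and the lambda with actual user responses.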

Files

Original bundle

Name:
User_feedback-based_online_learning_for_intent_classification.pdf
Size:
1.32 MB
Format:
Adobe Portable Document Format