User feedback-based online learning for intent classification

Gönç, Kaan; Sağlam, Baturay; Dalmaz, Onat; Çukur, Tolga; Kozat, Serdar; Dibeklioğlu, Hamdi

buir.contributor.author	Gönç, Kaan
buir.contributor.author	Sağlam, Baturay
buir.contributor.author	Dalmaz, Onat
buir.contributor.author	Çukur, Tolga
buir.contributor.author	Kozat, Serdar
buir.contributor.author	Dibeklioğlu, Hamdi
buir.contributor.orcid	Gönç, Kaan|0009-0009-4563-4369
buir.contributor.orcid	Sağlam, Baturay|0000-0002-8324-5980
buir.contributor.orcid	Dalmaz, Onat|0000-0001-7978-5311
buir.contributor.orcid	Çukur, Tolga|0000-0002-2296-851X
buir.contributor.orcid	Kozat, Serdar|0000-0002-6488-3848
buir.contributor.orcid	Dibeklioğlu, Hamdi|0000-0003-0851-7808
dc.citation.epage	621	en_US
dc.citation.spage	613
dc.contributor.author	Gönç, Kaan
dc.contributor.author	Sağlam, Baturay
dc.contributor.author	Dalmaz, Onat
dc.contributor.author	Çukur, Tolga
dc.contributor.author	Kozat, Serdar
dc.contributor.author	Dibeklioğlu, Hamdi
dc.coverage.spatial	Paris, France
dc.date.accessioned	2024-03-07T12:05:32Z
dc.date.available	2024-03-07T12:05:32Z
dc.date.issued	2023-10-09
dc.department	Department of Computer Engineering
dc.department	Department of Electrical and Electronics Engineering
dc.description	Conference Name: ICMI '23: Proceedings of the 25th International Conference on Multimodal Interaction
dc.description	Date of Conference: 09–13 October 2023
dc.description.abstract	Intent classification is a key task in natural language processing (NLP) that aims to infer the goal or intention behind a user’s query. Most existing intent classification methods rely on supervised deep models trained on large annotated datasets of text-intent pairs. However, obtaining such datasets is often expensive and impractical in real-world settings. Furthermore, supervised models may overfit or face distributional shifts when new intents, utterances, or data distributions emerge over time, requiring frequent retraining. Online learning methods based on user feedback can overcome this limitation, as they do not need access to intents while collecting data and adapting the model continuously. In this paper, we propose a novel multi-armed contextual bandit framework that leverages a text encoder based on a large language model (LLM) to extract the latent features of a given utterance and jointly learn multimodal representations of encoded text features and intents. Our framework consists of two stages: offline pretraining and online fine-tuning. In the offline stage, we train the policy on a small labeled dataset using a contextual bandit approach. In the online stage, we fine-tune the policy parameters using the REINFORCE algorithm with a user feedback-based objective, without relying on the true intents. We further introduce a sliding window strategy for simulating the retrieval of data samples during online training. This novel two-phase approach enables our method to efficiently adapt to dynamic user preferences and data distributions with improved performance. An extensive set of empirical studies indicates that our method significantly outperforms policies that omit either offline pretraining or online fine-tuning, while achieving performance competitive with a supervised benchmark trained on an order of magnitude larger labeled dataset.
dc.identifier.doi	10.1145/3577190.3614137	en_US
dc.identifier.isbn	9798400700552	en_US
dc.identifier.uri	https://hdl.handle.net/11693/114391	en_US
dc.language.iso	English	en_US
dc.publisher	Association for Computing Machinery	en_US
dc.relation.isversionof	https://doi.org/10.1145/3577190.3614137
dc.source.title	ACM International Conference Proceeding Series
dc.subject	Online learning
dc.subject	Contextual bandits
dc.subject	Intent classification
dc.subject	Multimodal learning
dc.title	User feedback-based online learning for intent classification
dc.type	Conference Paper
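
Illustrative sketch (not part of the record): the abstract describes fine-tuning a policy online with REINFORCE from user feedback rather than true intent labels. The toy below is only a rough illustration of that general idea and is not the authors' implementation. Everything in it is an assumption made for the example: the linear softmax policy (a stand-in for the LLM-based encoder), the one-hot "encoded utterances", the +1/-1 feedback signal, the learning rate, and the names `SoftmaxPolicy` and `simulate`. The offline pretraining stage and the sliding window strategy are omitted.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class SoftmaxPolicy:
    """Hypothetical linear softmax policy with one weight row per intent."""

    def __init__(self, n_features, n_intents, lr=0.5):
        self.w = [[0.0] * n_features for _ in range(n_intents)]
        self.lr = lr  # assumed step size, not taken from the paper

    def probs(self, x):
        """Return pi(intent | x) for every intent."""
        return softmax([sum(wi * xi for wi, xi in zip(row, x)) for row in self.w])

    def act(self, x, rng):
        """Sample an intent from the current policy."""
        p = self.probs(x)
        return rng.choices(range(len(p)), weights=p, k=1)[0]

    def reinforce_update(self, x, action, reward):
        """One REINFORCE step: w += lr * reward * grad log pi(action | x).

        For a linear softmax policy, the gradient with respect to row k
        is (1[k == action] - p_k) * x.
        """
        p = self.probs(x)
        for k, row in enumerate(self.w):
            coeff = self.lr * reward * ((1.0 if k == action else 0.0) - p[k])
            for j, xj in enumerate(x):
                row[j] += coeff * xj


def simulate(steps=2000, seed=0, n_intents=3):
    """Run the toy online loop with simulated user feedback."""
    rng = random.Random(seed)
    policy = SoftmaxPolicy(n_features=n_intents, n_intents=n_intents)
    for _ in range(steps):
        # Toy context: a one-hot vector whose hot index is the hidden intent.
        hidden_intent = rng.randrange(n_intents)
        x = [1.0 if j == hidden_intent else 0.0 for j in range(n_intents)]
        predicted = policy.act(x, rng)
        # Simulated feedback: the user accepts (+1) or rejects (-1) the
        # prediction; the policy never sees hidden_intent directly.
        reward = 1.0 if predicted == hidden_intent else -1.0
        policy.reinforce_update(x, predicted, reward)
    return policy


if __name__ == "__main__":
    trained = simulate()
    for i in range(3):
        context = [1.0 if j == i else 0.0 for j in range(3)]
        print(i, [round(p, 3) for p in trained.probs(context)])
```

In this toy setting the policy only ever observes whether its sampled prediction was accepted, which mirrors the feedback-only objective described in the abstract at a very small scale.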

Files

Original bundle

Name: User_feedback-based_online_learning_for_intent_classification.pdf
Size: 1.32 MB
Format: Adobe Portable Document Format

Collections

Scholarly Publications - Computer Engineering
Scholarly Publications - Electrical and Electronics Engineering