Improving the performance of quantized transformers with graph neural networks
Abstract
Transformers have become established models for natural language processing (NLP) tasks. Their representational capabilities improve with size, but training and hosting larger models is computationally demanding. The growing computational overhead increases the carbon footprint, raising concerns about the environmental impact of using these models. Parameter quantization promises to reduce the utilization costs of transformers; however, low-bit quantization generally leads to a notable loss in model performance. To reduce the performance degradation caused by quantization while retaining its benefits, we introduce BitTransGNN, a novel framework that improves quantized transformer performance by integrating quantized transformers with Graph Neural Networks (GNNs). Transformers excel at capturing local contextual semantics, while GNNs are adept at representing global structural relationships within data. BitTransGNN exploits the complementary nature of the two models to improve the representational capabilities of quantized transformers. After presenting our proposed architecture, we introduce variants of BitTransGNN that extend its utility to inductive settings by encapsulating the knowledge learned by BitTransGNN within a single quantized transformer model. Through an extensive set of experiments, we show that BitTransGNN substantially reduces the performance gap between quantized transformers and their full-precision counterparts while retaining the efficiency advantages of quantization. Transductive BitTransGNN variants outperform quantized transformer baselines by up to 21% while introducing minimal additional overhead. Inductive BitTransGNN variants improve quantized transformer performance by up to 19% with zero additional inference cost. To evaluate the cost-performance tradeoff, we examine the model performance and utilization costs of BitTransGNN and the baseline models. We further analyze BitTransGNN outputs to validate the premise that transformers and GNNs focus on highly distinct features, examine the significance of different BitTransGNN components, and discuss potential limitations. The results and findings presented in this thesis contribute to research on improving the efficiency of neural networks and offer a new perspective on reducing neural model costs without significantly sacrificing model performance.
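
For illustration only, the sketch below shows one way the core idea described in the abstract could look in code: a weight-binarized classification head stands in for a quantized transformer, a small GCN captures global document-graph structure, and the two sets of logits are interpolated with a mixing weight. This is not code from the thesis; the module names (BinarizedLinear, SimpleGCN, JointClassifier), the mixing parameter lam, and the exact combination scheme are assumptions made for this sketch, and the actual BitTransGNN architecture may differ.

```python
# Hypothetical sketch (not the thesis implementation): combining the predictions
# of a quantized (here: weight-binarized) text classifier with those of a GCN
# over a document graph, then interpolating the two with a mixing weight `lam`.
import torch
import torch.nn as nn
import torch.nn.functional as F


class BinarizedLinear(nn.Module):
    """Linear layer whose weights are binarized to {-1, +1} in the forward pass,
    trained with a straight-through estimator."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x):
        w_bin = torch.sign(self.weight)
        # Straight-through estimator: binarized weights forward, full-precision gradients backward.
        w = self.weight + (w_bin - self.weight).detach()
        return F.linear(x, w, self.bias)


class SimpleGCN(nn.Module):
    """Two-layer GCN operating on a dense, symmetrically normalized adjacency matrix."""

    def __init__(self, in_dim, hidden_dim, num_classes):
        super().__init__()
        self.lin1 = nn.Linear(in_dim, hidden_dim)
        self.lin2 = nn.Linear(hidden_dim, num_classes)

    def forward(self, x, adj_norm):
        h = F.relu(adj_norm @ self.lin1(x))
        return adj_norm @ self.lin2(h)


def normalize_adjacency(adj):
    """Standard GCN normalization: D^{-1/2} (A + I) D^{-1/2}."""
    adj = adj + torch.eye(adj.size(0))
    deg_inv_sqrt = adj.sum(dim=1).pow(-0.5)
    return deg_inv_sqrt.unsqueeze(1) * adj * deg_inv_sqrt.unsqueeze(0)


class JointClassifier(nn.Module):
    """Interpolates quantized-head logits (local view) with GCN logits (global view)."""

    def __init__(self, feat_dim, hidden_dim, num_classes, lam=0.7):
        super().__init__()
        self.quantized_head = BinarizedLinear(feat_dim, num_classes)
        self.gnn = SimpleGCN(feat_dim, hidden_dim, num_classes)
        self.lam = lam  # assumed mixing weight between the two views

    def forward(self, node_feats, adj_norm):
        logits_q = self.quantized_head(node_feats)   # local contextual view per document
        logits_g = self.gnn(node_feats, adj_norm)    # global structural view over the graph
        return self.lam * logits_g + (1.0 - self.lam) * logits_q


if __name__ == "__main__":
    num_docs, feat_dim, num_classes = 8, 32, 4
    feats = torch.randn(num_docs, feat_dim)          # stand-in for transformer document embeddings
    adj = (torch.rand(num_docs, num_docs) > 0.7).float()
    adj = ((adj + adj.t()) > 0).float()              # symmetrize the toy document graph
    model = JointClassifier(feat_dim, hidden_dim=16, num_classes=num_classes)
    logits = model(feats, normalize_adjacency(adj))
    print(logits.shape)  # torch.Size([8, 4])
```

In this toy setup, the interpolation weight lam trades off the graph-based and quantized-transformer views, and the straight-through estimator allows the binarized head to be trained with full-precision gradients; an inductive variant in the spirit of the abstract would then train a standalone quantized model to reproduce the combined outputs so that no graph is needed at inference time.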