Foundations of Computing and Decision Sciences

Title:

"Leveraging Unseen Features along with their PLM-based Representation to Handle Negative Covariate Shift Problem in Text Classification"

Authors:

Nesar Ahmad Wasi, Muhammad Abulaish

Pages:

409-430

DOI:

10.2478/fcds-2024-0020

Abstract:

This paper presents a novel approach to the problem of negative covariate shift that makes use of unseen features. Covariate shift occurs when the data observed during the testing phase of a machine learning model drifts from the data observed during training. It typically arises in the negative class because the topics discussed there evolve rapidly, a consequence of the characteristics of online social media. Such a shift means that the data is changing and now includes features that the model did not encounter during training. We refer to these as unseen features. To the best of our knowledge, we are the first to use unseen features to address the negative covariate shift problem. The proposed approach is compared with three baselines and one state-of-the-art method. Experimental results on a multi-domain sentiment dataset show that the proposed approach outperforms the baselines and the state-of-the-art method by a significant margin across various performance evaluation metrics.
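The definition of unseen features given in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not describe how unseen features are extracted or how their PLM-based representation is built, so the sketch assumes a simple word-level vocabulary and only identifies the tokens of a test document that never occurred in the training data. The function names (`tokenize`, `build_vocabulary`, `unseen_features`) and the example sentences are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase a document and split it into word tokens (assumed tokenization)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_vocabulary(documents):
    """Collect every token that occurs in the training documents."""
    vocab = set()
    for doc in documents:
        vocab.update(tokenize(doc))
    return vocab


def unseen_features(document, train_vocab):
    """Return the tokens of `document` that are absent from the training
    vocabulary, without duplicates, in order of first appearance."""
    already_reported = set()
    result = []
    for token in tokenize(document):
        if token not in train_vocab and token not in already_reported:
            already_reported.add(token)
            result.append(token)
    return result


# Hypothetical toy data, for illustration only.
train_docs = ["the battery life is great", "the screen is too dim"]
test_doc = "The new firmware drains the battery"

train_vocab = build_vocabulary(train_docs)
print(unseen_features(test_doc, train_vocab))  # ['new', 'firmware', 'drains']
```

In this toy example the test document introduces three tokens that the training data never contained. Under the abstract's definition these would count as unseen features; how the paper represents and uses them is left to the full text.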

Open access to full text at De Gruyter Online