Recent Advances in Computer Science and Communications

ISSN (Print): 2666-2558
ISSN (Online): 2666-2566

Research Article

A Comparative Analysis of Feature Selection Algorithms in Cross Domain Sentiment Classification

Author(s): Lipika Goel, Sonam Gupta*, Avdhesh Gupta, Neha Nandal, Siddhi Nath Rajan and Pradeep Gupta

Volume 17, Issue 3, 2024

Published on: 06 February, 2024

Article ID: e020224226683

Pages: 16

DOI: 10.2174/0126662558276889240125062857

Abstract

Background: Cross-domain sentiment classification (CDSC) is a well-researched field in sentiment analysis. The biggest challenge in CDSC arises from differences between domains and their features, which degrade model performance when source-domain features are used to predict sentiment in the target domain. To address this challenge, several feature selection methods can be employed to identify the most relevant features for training and testing in CDSC.

Methods: The primary objective of this study is to perform a comparative analysis of different feature selection methods on various CDSC tasks. Statistical test-based feature selection methods have been implemented with 18 classifiers for the CDSC task. The impact of these feature selection methods has been compared on Amazon product reviews, specifically those in the DVD, Electronics, Kitchen, and TV domains. A total of 12 × 18 experiments were conducted for each feature selection method by varying the source and target domain pairs from the Amazon product reviews dataset across the 18 classifiers. The performance evaluation measures are accuracy and F-score.
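
The experimental grid described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code. It assumes that the 12 domain pairs are the ordered (source, target) combinations of the four listed domains (4 × 3 = 12), which is consistent with the stated count but not spelled out in the abstract. The classifier names are hypothetical placeholders, since the abstract does not list the 18 classifiers, and the F-score is assumed here to be the binary F1 measure.

```python
from itertools import permutations

# Domains as listed in the abstract.
DOMAINS = ["DVD", "Electronics", "Kitchen", "TV"]

# Hypothetical placeholder names: the abstract does not name the 18 classifiers.
CLASSIFIERS = [f"classifier_{i}" for i in range(1, 19)]

# Assumption: ordered (source, target) pairs with source != target, 4 * 3 = 12.
domain_pairs = list(permutations(DOMAINS, 2))

# One run per (source, target, classifier) for a single feature selection method.
runs = [(src, tgt, clf) for src, tgt in domain_pairs for clf in CLASSIFIERS]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f_score(y_true, y_pred, positive=1):
    """Binary F1 (assumed variant): harmonic mean of precision and recall."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(len(domain_pairs), "domain pairs x", len(CLASSIFIERS), "classifiers =",
          len(runs), "runs per feature selection method")
```

Under these assumptions, each feature selection method is evaluated in 12 × 18 = 216 runs, and each run would report `accuracy` and `f_score` on the target-domain test set.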

Results: The experiments indicate that performance on the CDSC task depends on several factors, from the choice of domain to the choice of feature selection method. We conclude that Electronics is the best training dataset, as it yields more precise results when tested on any of the other domains selected for this study.

Conclusion: Cross-domain sentiment analysis is a dynamic and interdisciplinary field that offers valuable insights for understanding how sentiment varies across different domains.

Keywords: Sentiment classification, cross-domain, supervised learning, feature selection, TF-IDF, polarity.

© 2024 Bentham Science Publishers