Nicoletta Puddu
Multi-class text classification of news data
Maurizio Romano (first author); Maria Paola Priola (last author)
2024-01-01
Abstract
Several multi-class text classification (MCC) strategies, namely One-Vs-Rest (OVA), One-Vs-One (OVO), Best-of-Best (BOB), and Error-Correcting-Output-Codes (ECOC), are compared in terms of accuracy and computational efficiency. Each strategy is implemented using several classifiers: Naïve Bayes, Random Forest, Logistic Regression, Neural Networks, Linear Discriminant Analysis, Support Vector Machine, and the recently introduced Threshold-based Naïve Bayes (Tb-NB). We run a horse race on the 20News-Group dataset, well known in the literature for its complexity. Our results highlight the importance of pairing the right classifier with an optimal strategy, providing valuable insights for optimizing classifier performance in MCC tasks while considering both environmental implications and the need for accurate predictions.

| File | Access | Type | Size | Format |
|---|---|---|---|---|
| SDS2024.pdf | open access | publisher's version | 459.12 kB | Adobe PDF |
University of Cagliari