Exploring the Dataset Landscape for Automated Propaganda Detection: A Data-Centric Insight
Usai M.; Mura D. A.; Loddo A.; Sanguinetti M.; Zedda L.; Di Ruberto C.; Atzori M.
2025-01-01
Abstract
The increasing spread of propaganda in digital media has intensified research efforts toward the development of automated detection systems. Central to this task is the availability and quality of annotated datasets, which directly impact model performance, generalizability, and real-world applicability. In this paper, we present a data-centric insight into the current landscape of datasets used for automated propaganda detection. We analyze a representative set of publicly available corpora with respect to key factors such as annotation schemes, label granularity, domain coverage, linguistic diversity, and class balance. This work aims to guide researchers toward more robust, inclusive, and scalable approaches to propaganda detection by emphasizing the foundational role of data quality and structure.

| File | Description | Type | Size | Format |
|---|---|---|---|---|
| 2025_Exploring the Dataset Landscape for Automated Propaganda Detection_A Data-Centric Insight.pdf (open access) | Full article | Publisher's version | 204.86 kB | Adobe PDF |
University of Cagliari