Giuliano Mion

QUEEREOTYPES: A Multi-Source Italian Corpus of Stereotypes towards LGBTQIA+ Community Members

Manuela Sanguinetti
2024-01-01

Abstract

The paper describes a dataset composed of two sub-corpora drawn from two different Italian-language sources. The QUEEREOTYPES corpus includes social media texts regarding LGBTQIA+ individuals, behaviors, ideology and events. The texts were collected from Facebook and Twitter in 2018 and were annotated for the presence of stereotypes and for orthogonal dimensions (hate speech, aggressiveness, offensiveness, and irony in one sub-corpus, and stance in the other). The resource was developed by Natural Language Processing researchers together with activists from an Italian LGBTQIA+ not-for-profit organization. The dataset allows the NLP community to study stereotypes against marginalized groups and individuals and, ultimately, to develop proper tools and measures to reduce the online spread of such stereotypes. The robustness of the language resource was tested by means of 5-fold cross-validation experiments. Finally, text classification experiments were carried out with a fine-tuned version of AlBERTo (a BERT-based model pre-trained on Italian tweets) and mBERT, obtaining good results on the task of stereotype detection and suggesting that stereotypes towards different targets might share common traits.
2024
English
2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings
9782493814104
European Language Resources Association (ELRA)
N. Calzolari, M.-Y. Kan, V. Hoste, A. Lenci, S. Sakti, N. Xue
13429
13441
13
Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024
Scientific committee
2024
ita
international
scientific
Corpus; Italian; LGBTQIA+; Stereotypes
no
4 Contribution in Conference Proceedings (Proceeding)::4.1 Contribution in Conference Proceedings
Cignarella, Alessandra Teresa; Sanguinetti, Manuela; Frenda, Simona; Marra, Andrea; Bosco, Cristina; Basile, Valerio
273
6
4.1 Contribution in Conference Proceedings
open
info:eu-repo/semantics/conferencePaper
Files in this item:
File Size Format
2024.lrec-main.1176.pdf

open access

Type: publisher's version (VoR)
Size 433.45 kB
Format Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
