Rita Ambu
Is Wikipedia a latent gene ontology?
Dessì, Nicoletta; Atzori, Maurizio
2017-01-01
Abstract
Despite the significant contribution from specialized ontologies and text mining methods, the evaluation of the semantic similarity of genes remains difficult because of the complex functions in which genes are involved. A less exploited resource is Wikipedia, which stores more than 10,400 articles about human genes: each gene name identifies a Wikipedia page that summarizes the gene's properties in short sentences, with hyperlinks defining relationships to other genes in Wikipedia. This paper evaluates the extent to which Wikipedia can be trusted for assessing the similarity of a gene pair, measured as the distance between their Wikipedia pages. We present a set of experiments that use TagMe (a powerful tool for evaluating the distance between two Wikipedia pages based on their annotations) to calculate the semantic similarity of several sets of genes on Wikipedia. Results compare well with gold standards and with semantic similarity values computed on gene ontologies. The paper demonstrates the effectiveness of Wikipedia in recognizing functional groups of genes, the quality and wealth of its knowledge about genes, as well as the accuracy of TagMe.

| File | Type | Size | Format | Access |
|---|---|---|---|---|
| wetice17 - is wikipedia a latent gen ontology.pdf | Publisher's version | 460.84 kB | Adobe PDF | Archive managers only |
University of Cagliari