Is Similarity Computation using DBpedia different from Wikidata?

Academic Advisor: Dr. Abderrahmane Khiat
Discipline: Ontology Matching
Degree: Master of Science (M.Sc.)
Project:

Requirements:

Java; Semantic Web Technologies: SPARQL, OWL, RDF, Jena; Semantic Similarity Measures

Contents

Ontologies are adopted by applications in diverse domains, such as biomedical, geospatial, or legal data, to describe their content and express the semantics of information, and they play an important role in achieving semantic interoperability (Neches et al., 1991). However, in contexts such as Web information retrieval or social networks, semantic interoperability is hampered by the conceptual diversity of the vocabularies in use and by varying domain requirements (Euzenat et al., 2013).
A promising solution to this vocabulary heterogeneity is to apply an alignment (or matching) process that builds a bridge between ontologies or linked data sets and closes the semantic gap (Euzenat et al., 2013). Ontology alignment is defined as the process of identifying semantic correspondences between the entities (concepts, relations, and instances) of different ontologies. These correspondences link heterogeneous ontologies together and ensure their semantic interoperability. Identifying them automatically is not a trivial task because conceptual diversity (Bouquet et al., 2005) arises at various levels: terminological, structural, conceptual, etc.

For instance, consider the following mapping: the concept "Automobile" of ontology O1 is equivalent to the concept "Car" of ontology O2. A string-based algorithm cannot detect this mapping because the two labels share almost no characters. An alternative is to use a dictionary or the structure of the ontology, since both concepts have the same type "Vehicle". Using the structure is not a complete solution either: "Cat" and "Dog" both have the type "Animal", yet in the anatomy domain they are treated as two different concepts because they have different anatomical structures.
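To make the limitation of string-based matching concrete, the following is a minimal sketch that compares labels with a normalized Levenshtein similarity. The choice of Levenshtein distance, the class name StringSimilaritySketch, and the method names are assumptions made for illustration; the project does not prescribe a particular string measure.

```java
// Illustrative sketch only: class and method names are hypothetical.
final class StringSimilaritySketch {

    // Classic dynamic-programming edit distance; insertions, deletions,
    // and substitutions each cost 1.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Similarity in [0, 1]: 1 means identical labels (ignoring case),
    // values near 0 mean the labels have almost nothing in common.
    static double similarity(String a, String b) {
        String x = a.toLowerCase();
        String y = b.toLowerCase();
        int maxLen = Math.max(x.length(), y.length());
        if (maxLen == 0) {
            return 1.0;
        }
        return 1.0 - (double) levenshtein(x, y) / maxLen;
    }

    public static void main(String[] args) {
        System.out.println(similarity("Automobile", "Car"));
        System.out.println(similarity("Car", "car"));
    }
}
```

With this measure, "Automobile" and "Car" receive a score close to zero even though the concepts are equivalent, which is why terminological evidence alone is insufficient for this mapping.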

If the Master's student is up to the challenge, the resulting system should compete in the OAEI evaluation campaign (Ontology Alignment Evaluation Initiative, http://oaei.ontologymatching.org/). This contest is held every year, and competitors submit their systems so that the best system can be determined against different benchmarks (ontologies).

Objectives

  • Investigating different similarity measures that use DBpedia (such as the work of Piao et al., 2015) and Wikidata.

  • If such measures do not exist for one of these sources (Wikidata or DBpedia), the student should implement and apply the same or different similarity measures on both Wikidata and DBpedia.

  • Analyzing the differences between the applied measures.

  • The next step consists of applying these similarity measures to an instance matching benchmark (linked data) to identify the instances that describe the same real-world object.
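As a rough illustration of the last two objectives, the sketch below scores two resources by the overlap of their descriptions and declares a match when the score reaches a threshold. It is an assumption-laden stand-in, not the measure of Piao et al.: it uses a plain Jaccard coefficient, represents each resource as an in-memory set of "property=value" strings instead of querying DBpedia or Wikidata, and uses made-up toy descriptions. The class name ResourceOverlapSketch, the method names, and the threshold value are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch only: names, data, and threshold are hypothetical.
final class ResourceOverlapSketch {

    // Jaccard coefficient |A ∩ B| / |A ∪ B| over two description sets.
    // Two empty descriptions carry no evidence, so they score 0 here
    // (a design choice of this sketch).
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        int unionSize = a.size() + b.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }

    // Treats two instances as describing the same real-world object
    // when their overlap reaches the given threshold.
    static boolean sameObject(Set<String> a, Set<String> b, double threshold) {
        return jaccard(a, b) >= threshold;
    }

    public static void main(String[] args) {
        // Toy descriptions standing in for statements retrieved from
        // two different knowledge bases.
        Set<String> cityA = Set.of("type=City", "country=Germany", "label=Berlin");
        Set<String> cityB = Set.of("type=City", "country=Germany", "label=Berlin", "role=Capital");
        Set<String> cityC = Set.of("type=City", "country=France", "label=Paris");

        System.out.println(jaccard(cityA, cityB));         // 3 shared of 4 distinct: 0.75
        System.out.println(jaccard(cityA, cityC));         // 1 shared of 5 distinct: 0.2
        System.out.println(sameObject(cityA, cityB, 0.5)); // true
        System.out.println(sameObject(cityA, cityC, 0.5)); // false
    }
}
```

In an actual comparison, the description sets would be built separately from DBpedia and from Wikidata (for example via SPARQL queries), and the scores obtained from each source could then be compared on the same benchmark pairs.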

References

  • R. Neches, R. E. Fikes, T. Finin, T. Gruber, R. Patil, T. Senator, and W. R. Swartout. Enabling technology for knowledge sharing. AI Magazine, 12(3): 36, 1991.

  • G. Piao, S. showkat Ara, and J. G. Breslin. Computing the semantic similarity of resources in DBpedia for recommendation purposes. 2015.

  • P. Bouquet, M. Ehrig, J. Euzenat, E. Franconi, P. Hitzler, M. Krötzsch, L. Serafini, G. Stamou, Y. Sure, and S. Tessaris. Specification of a common framework for characterizing alignment. 2005.

  • J. Euzenat and P. Shvaiko. Ontology matching, volume 18. Springer, 2013.

Websites

http://oaei.ontologymatching.org/