Assessing Domain Coverage of an Ontology

Supervisor: Romana Pernisch (r.pernisch@vu.nl)

Background

When ontologies and knowledge graphs are engineered, they capture the domain at a specific moment in time. However, it is very difficult to assess whether an ontology covers its domain: the engineering process is mostly manual, so it is not feasible for the engineers to have read or seen all documents that could be considered part of the domain. OntoEval presents a framework to evaluate the domain coverage of an ontology, and in this project we want to apply it to our own ontologies and domains.

Applying OntoEval to new domains will also put its reproducibility to the test.

Description

In this project you will use an existing pipeline to investigate how well a specific domain ontology covers a corpus of text. You will specifically use the OntoEval pipeline, but you will have to adjust it to the domain that you are investigating. Two domains that could potentially be used are Clinical Trial Outcomes and Companion Planting, as we have ontologies readily available for both. This project involves pre-training or fine-tuning an existing (L)LM.

In the case of Clinical Trial Outcomes, the texts against which the evaluation is conducted are the clinical trials themselves, which do not contain large amounts of text. The technical challenge is therefore to adjust the pipeline to these special conditions of the domain.

In the case of Companion Planting, the texts against which the evaluation could be conducted are general literature on companion planting or websites. For this, a collection of documents would first need to be assembled before OntoEval can be applied.
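
To make the idea of checking an ontology against a corpus concrete, the following is a minimal sketch of a naive lexical baseline. It is not the OntoEval method: it only checks which ontology labels literally occur in the documents, which is one direction of the comparison and ignores synonyms and paraphrases. The function names, the labels, and the two example sentences are all hypothetical and chosen purely for illustration.

```python
import re


def normalize(text: str) -> str:
    """Lowercase the text and join its alphanumeric tokens with single spaces."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def label_coverage(labels, documents):
    """Return the fraction of labels found in any document, plus the missing labels."""
    # Pad with spaces so that only whole-token matches count.
    corpus = [f" {normalize(doc)} " for doc in documents]
    found, missing = [], []
    for label in labels:
        needle = f" {normalize(label)} "
        if any(needle in doc for doc in corpus):
            found.append(label)
        else:
            missing.append(label)
    total = len(found) + len(missing)
    fraction = len(found) / total if total else 0.0
    return fraction, sorted(missing)


# Hypothetical toy data (assumption): labels one might find in a
# companion planting ontology, and two short example texts.
labels = ["companion plant", "pest repellent", "nitrogen fixation", "trap crop"]
documents = [
    "Basil is a popular companion plant for tomatoes.",
    "Marigolds act as a pest repellent, and nasturtiums are a classic trap crop.",
]

coverage, missing = label_coverage(labels, documents)
print(f"coverage={coverage:.2f}, missing={missing}")
```

A baseline of this kind breaks down quickly on short or terse texts such as clinical trial records, which is one reason a pipeline built on a language model is of interest here.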

In the case of an MSc thesis, this project can be extended in different directions.

Literature

“OntoEval: an Automated Ontology Evaluation System” [link]