Ontology Embedding using the BERT Model

Supervisor: Jieying Chen (j.chen2@vu.nl)

Abstract

Ontologies have become an indispensable tool for the Semantic Web, knowledge graph generation, and intelligent systems. Traditionally, ontologies are expressed in symbolic form, which makes it difficult to capture latent semantics or to support tasks such as similarity computation. Recent advances in neural embeddings, particularly the Bidirectional Encoder Representations from Transformers (BERT) model, have shown significant potential for capturing contextual nuances in text. Embedding ontologies with BERT could bridge the gap between symbolic representations and continuous vector spaces, offering richer semantic interpretations and enabling a range of downstream applications.

Objectives

Analyze the current landscape of ontology embeddings, identifying the capabilities, strengths, and weaknesses of existing methods and the features of BERT that distinguish it as a text embedding model.
Design a framework to convert traditional ontology structures into a form suitable for embedding using BERT, ensuring preservation of hierarchical and relational information.
Adapt and fine-tune BERT on specific ontology datasets, emphasizing the capture of both explicit and implicit semantic information.
Develop an evaluation protocol that compares the BERT-based embeddings with traditional ontology representations in terms of semantic coherence, representation fidelity, and utility in downstream tasks.
Explore the utility of the BERT-embedded ontologies in tasks like semantic search, ontology alignment, and similarity computations, benchmarking against traditional methods.
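
As an illustration of the conversion and similarity objectives above, the following minimal sketch shows one possible way to serialize ontology axioms into sentences that a text encoder such as BERT could consume, together with the cosine measure commonly used to compare the resulting vectors. The toy axioms, the sentence templates, and the function names (verbalize, ancestors, class_context, cosine) are assumptions made for illustration only; they are not part of the project description, and the encoding step itself is deliberately left out.

```python
import math
from typing import Dict, List, Sequence, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)

# Hypothetical toy axioms; a real ontology would be loaded from an OWL file.
TOY_AXIOMS: List[Triple] = [
    ("Dog", "subClassOf", "Mammal"),
    ("Cat", "subClassOf", "Mammal"),
    ("Mammal", "subClassOf", "Animal"),
    ("Dog", "hasPart", "Tail"),
]

# Assumed sentence templates, one per relation.
TEMPLATES: Dict[str, str] = {
    "subClassOf": "{s} is a kind of {o}.",
    "hasPart": "{s} has part {o}.",
}


def verbalize(triple: Triple) -> str:
    """Turn one axiom into a sentence that a text encoder can read."""
    s, rel, o = triple
    # Fall back to the raw relation name when no template is defined.
    template = TEMPLATES.get(rel, "{s} " + rel + " {o}.")
    return template.format(s=s, o=o)


def ancestors(cls: str, axioms: Sequence[Triple]) -> List[str]:
    """Collect superclasses by following subClassOf edges, nearest first."""
    found: List[str] = []
    frontier = [cls]
    while frontier:
        current = frontier.pop(0)
        for s, rel, o in axioms:
            if s == current and rel == "subClassOf" and o not in found:
                found.append(o)
                frontier.append(o)
    return found


def class_context(cls: str, axioms: Sequence[Triple]) -> List[str]:
    """Sentences for a class: its own axioms plus those of its superclasses."""
    scope = [cls] + ancestors(cls, axioms)
    return [verbalize(t) for t in axioms if t[0] in scope]


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    for sentence in class_context("Dog", TOY_AXIOMS):
        print(sentence)
```

In such a setup, the sentences returned by class_context would be passed to a pre-trained or fine-tuned BERT encoder, and cosine would then compare the vectors obtained for two classes. Including inherited axioms in the context is one simple way to keep hierarchical information in the input text; whether it preserves enough structure is exactly what the evaluation protocol would need to test.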

References

Jiaoyan Chen, Pan Hu, Ernesto Jiménez-Ruiz, Ole Magnus Holter, Denvar Antonyrajah, Ian Horrocks: OWL2Vec*: embedding of OWL ontologies. Mach. Learn. 110(7): 1813-1845 (2021)
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv preprint arXiv:1810.04805 (2018)