Ethnic AI: Bias Detection and Mitigation in Large Language Models

Supervisor: Jieying Chen (j.chen2@vu.nl)

Abstract

The widespread use of Large Language Models (LLMs) in applications ranging from information retrieval to content creation has underscored the importance of their reliability and neutrality. While LLMs have proven to be valuable tools, their training on vast and varied datasets may inadvertently introduce or perpetuate biases present in the data. Ontologies, structured representations of knowledge with predefined relationships, offer a distinctive way to validate and rectify biased outputs by benchmarking generated answers against a standardized knowledge base.
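
As a rough illustration of this validation idea, the sketch below checks whether an assertion extracted from an LLM answer is present in a small reference ontology built with rdflib. The namespace, the example triples, the sample answer, and the toy extract_triple() step are illustrative assumptions, not part of the proposal.

# Minimal sketch: validate an LLM-generated assertion against a reference ontology.
# The ontology triples, namespace, and llm_answer string below are illustrative only.
from rdflib import Graph, Namespace, RDF

EX = Namespace("http://example.org/")

# Small in-memory graph standing in for the curated knowledge base.
kb = Graph()
kb.add((EX.MarieCurie, RDF.type, EX.Physicist))
kb.add((EX.MarieCurie, EX.fieldOfWork, EX.Radioactivity))

def extract_triple(llm_answer: str):
    """Toy extraction step: map an LLM answer onto a candidate (s, p, o) triple.
    A real framework would use relation extraction and entity linking here."""
    if "nurse" in llm_answer.lower():
        return (EX.MarieCurie, RDF.type, EX.Nurse)
    return (EX.MarieCurie, RDF.type, EX.Physicist)

def is_supported(triple, graph: Graph) -> bool:
    """An answer is considered validated if its triple is asserted in the reference graph."""
    return triple in graph

llm_answer = "Marie Curie was a nurse."  # hypothetical biased/incorrect output
candidate = extract_triple(llm_answer)
print("validated" if is_supported(candidate, kb) else "flagged as potential bias/error")

In the envisaged framework, the toy extraction step would be replaced by proper relation extraction, and the in-memory graph by the curated ontology described in the objectives below.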

Objectives

  1. Investigate existing methodologies for bias detection in LLMs, their limitations, and the potential use of ontologies as validators.
  2. Design and curate a comprehensive ontology capturing unbiased representations of various knowledge domains, emphasizing those particularly prone to bias.
  3. Develop a framework that utilizes the curated ontology to compare and contrast the outputs generated by LLMs, highlighting potential deviations that indicate bias.
  4. Design metrics to measure the degree and nature of bias in LLM outputs, providing a standardized way to assess and compare biases across different models (a minimal sketch of one such metric follows this list).
  5. Propose and implement strategies to rectify identified biases in LLMs, leveraging the ontology as a guide for correct and neutral answers.
  6. Test the developed bias mitigation techniques using real-world scenarios and diverse datasets to assess their efficacy and robustness.
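
As a hypothetical instance of the metric called for in objective 4, the sketch below computes a deviation rate: the fraction of LLM answers that conflict with the ontology, reported overall and per demographic group so that gaps between groups become visible. The deviation_rate helper, the group labels, and the example results are assumptions made purely for illustration.

# Minimal sketch of a bias metric, assuming each evaluated prompt is annotated with a
# demographic group and validated against the ontology (e.g. via is_supported() above).
from collections import defaultdict

def deviation_rate(records):
    """records: iterable of (group, validated) pairs, where `validated` is True when
    the LLM answer agreed with the ontology. Returns the overall deviation rate and a
    per-group breakdown."""
    totals, deviations = defaultdict(int), defaultdict(int)
    for group, validated in records:
        totals[group] += 1
        if not validated:
            deviations[group] += 1
    per_group = {g: deviations[g] / totals[g] for g in totals}
    overall = sum(deviations.values()) / sum(totals.values())
    return overall, per_group

# Hypothetical evaluation results for two demographic groups.
results = [("group_a", True), ("group_a", False), ("group_b", False), ("group_b", False)]
overall, per_group = deviation_rate(results)
print(overall, per_group)  # 0.75 {'group_a': 0.5, 'group_b': 1.0}

A large gap between per-group deviation rates would indicate that the model's errors are unevenly distributed across groups, which is one way to characterize the nature, not just the degree, of the bias.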
