Shopping Data Schema Induction (MSc)

Supervisor: Benno Kruit

Keywords: Ontology Induction, Rule Learning, Attribute Extraction, Clustering


Online shopping can be fun, but finding exactly what you want can be hard. To support product search, many e-commerce websites offer not only text queries but also filters on specific product attributes such as color, size, material, or features, an approach known as faceted search. However, for large product catalogues that combine many vendors, it is infeasible to standardize the attributes that vendors use to describe their products. To support effective faceted search, it is therefore necessary to integrate the vendor-provided attributes into a coherent data model that can be used for filtering.
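To make the integration problem concrete, here is a minimal sketch of a facet filter over unstandardized attributes. The product records, the `CANONICAL` mapping, and the function names are all invented for illustration; in the project itself such a mapping would be induced from data rather than written by hand.

```python
# Hypothetical product records from different vendors; all values are made up.
products = [
    {"title": "Rain jacket", "attrs": {"colour": "Blue", "size": "M"}},
    {"title": "Wind jacket", "attrs": {"Color": "blue", "Size": "L"}},
    {"title": "Fleece", "attrs": {"color": "red", "size": "M"}},
]

# Assumed hand-written mapping from vendor attribute names to canonical ones.
CANONICAL = {"colour": "color", "color": "color", "size": "size"}


def raw_filter(items, name, value):
    """Filter on the vendor-provided attribute names and values as they are."""
    return [p["title"] for p in items if p["attrs"].get(name) == value]


def normalize(attrs):
    """Map attribute names to canonical names and lowercase the values."""
    out = {}
    for name, value in attrs.items():
        key = CANONICAL.get(name.strip().lower())
        if key is not None:
            out[key] = value.strip().lower()
    return out


def facet_filter(items, **facets):
    """Keep the products whose normalized attributes match every facet."""
    return [
        p["title"]
        for p in items
        if all(normalize(p["attrs"]).get(k) == v for k, v in facets.items())
    ]


print(raw_filter(products, "color", "blue"))   # misses both blue jackets
print(facet_filter(products, color="blue"))    # finds both of them
```

Filtering on the raw attribute names misses both blue jackets because the vendors spell the attribute and its value differently, while the filter over the integrated model finds them.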


This project aims to examine approaches for inducing a coherent ontology for product descriptions: one that describes a set of product types and attributes, how they are related, and what kinds of values the attributes may take. The starting point is an analysis of the Multimodal Attribute Extraction dataset and of various attribute clustering approaches. The project may then incorporate association rule learning to find useful structures for faceted search.
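As a rough illustration of these two ingredients, the sketch below groups attribute names whose observed value sets overlap, then scores a single candidate rule of the form "product type implies attribute group". The records, the Jaccard threshold, and all function names are assumptions made for this example. They do not reflect the format of the Multimodal Attribute Extraction dataset, and a real study would likely use stronger clustering methods and a full rule miner.

```python
from itertools import combinations

# Hypothetical records: (product type, {vendor attribute: value}).
records = [
    ("shoe", {"colour": "black", "shoe size": "42"}),
    ("shoe", {"color": "black", "size": "42"}),
    ("shoe", {"color": "white", "size": "43"}),
    ("shoe", {"colour": "white", "shoe size": "43"}),
    ("laptop", {"color": "black", "ram": "16 GB"}),
    ("laptop", {"colour": "silver", "memory": "16 GB"}),
    ("laptop", {"color": "silver", "ram": "8 GB"}),
    ("laptop", {"colour": "black", "memory": "8 GB"}),
]


def value_sets(recs):
    """Collect the set of observed values for every attribute name."""
    values = {}
    for _, attrs in recs:
        for name, value in attrs.items():
            values.setdefault(name, set()).add(value)
    return values


def jaccard(a, b):
    """Jaccard similarity of two non-empty sets."""
    return len(a & b) / len(a | b)


def cluster_attributes(recs, threshold=0.5):
    """Merge attribute names whose value sets overlap (single-link grouping)."""
    values = value_sets(recs)
    parent = {name: name for name in values}

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for a, b in combinations(sorted(values), 2):
        if jaccard(values[a], values[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in values:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(sorted(c) for c in clusters.values())


def rule_confidence(recs, product_type, cluster):
    """Confidence of the rule: type == product_type => some attribute in cluster."""
    of_type = [attrs for t, attrs in recs if t == product_type]
    if not of_type:
        return 0.0
    hits = sum(1 for attrs in of_type if any(n in cluster for n in attrs))
    return hits / len(of_type)


clusters = cluster_attributes(records)
print(clusters)
print(rule_confidence(records, "shoe", {"shoe size", "size"}))
print(rule_confidence(records, "laptop", {"shoe size", "size"}))
```

In this toy data the size group is always present for shoes and never for laptops, which is the kind of structure that could tell a faceted search interface which filters to show for which product type.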