This publication is the result of a multi-year collaboration with our colleagues in Vienna, Mete Sertkan and Julia Neidhardt. It addresses the problem of establishing a ground truth for the data models that content-based recommender algorithms rely on in destination recommendation. The article was published open access in Frontiers in Big Data.
Characterizing items for content-based recommender systems is challenging in complex domains such as travel and tourism. In the case of destination recommendation, no feature set can be readily used as a similarity ground truth, which makes it hard to evaluate the quality of destination characterization approaches. Furthermore, the process should scale well for many items, be cost-efficient, and, most importantly, correct. To evaluate which data sources are most suitable, we investigate 18 characterization methods that fall into the following categories: venue data, textual data, and factual data. We make these data models comparable using rank agreement metrics and reveal which data sources capture similar underlying concepts. To support choosing more suitable data models, we capture the desired concept using an expert survey and evaluate our characterization methods toward it. We find that the textual models to characterize cities perform best overall, with data models based on factual and venue data being less competitive. However, we show that data models with explicit features can be optimized by learning weights for their features.
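To make the idea of comparing data models through rank agreement more concrete, here is a minimal sketch. It assumes that each data model maps a city to a feature vector, that cities are ranked by cosine similarity to a base city, and that agreement is measured with Kendall's tau. These choices, the function names, and the toy vectors in the usage note are illustrative assumptions and not necessarily the metrics or data used in the article.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity between two equally long, non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(model, base):
    """Rank all other cities by decreasing similarity to `base`.

    `model` is a dict mapping city name -> feature vector (hypothetical layout).
    """
    others = [city for city in model if city != base]
    return sorted(
        others,
        key=lambda city: cosine_similarity(model[base], model[city]),
        reverse=True,
    )


def kendall_tau(rank_a, rank_b):
    """Kendall's tau between two rankings of the same items (no ties).

    Returns 1.0 for identical rankings and -1.0 for exactly reversed ones.
    """
    pos_a = {item: i for i, item in enumerate(rank_a)}
    pos_b = {item: i for i, item in enumerate(rank_b)}
    concordant = discordant = 0
    for x, y in combinations(rank_a, 2):
        # A pair is concordant if both rankings order x and y the same way.
        sign = (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(rank_a) * (len(rank_a) - 1) // 2
    return (concordant - discordant) / n_pairs
```

With two toy models such as `venue = {"A": [1, 0], "B": [1, 0.1], "C": [0, 1]}` and a second model of the same shape, calling `kendall_tau(rank_by_similarity(venue, "A"), rank_by_similarity(other, "A"))` gives a value between -1 and 1 that indicates how similarly the two models order the remaining cities relative to the base city. The same comparison could be run against a ranking derived from expert judgments.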
Linus W. Dietz, Mete Sertkan, Saadi Myftija, Sameera Thimbiri Palage, Julia Neidhardt, and Wolfgang Wörndl. “A Comparative Study of Data-driven Models for Travel Destination Characterization.” In: Frontiers in Big Data 5 (Apr. 2022). ISSN: 2624-909X. DOI: 10.3389/fdata.2022.829939