Result-driven Interactive Visual Support of Parameter Selection for Dimensionality Reduction
People without a technical background or knowledge of machine learning (ML), referred to here as non-technical experts, have become a major target group of ML applications.
Nonetheless, ML systems still rarely support them in using these applications in an informed way. This thesis investigates a new visual interface as one approach to supporting such informed use. The prototypical interface builds on an existing ML application from the research project IKON, in which dimensionality reduction (DR) is applied. DR is a widely used tool for interpreting high-dimensional ML results. Because reducing dimensionality loses information, DR can produce artifacts, and it can map the same high-dimensional result to 2D representations that differ substantially from one another. Which criteria matter depends on the task the DR visualization is used for. In the IKON use case, the similarity of research projects is based on the embedding of their project abstracts into a multidimensional space.
The thesis is divided into two main tasks: the development of an interface on the one hand, and the development of sorting measures for DR visualizations on the other. The goal of sorting the visualized results is to make it possible to compare and evaluate them against each other. For this purpose, I chose two metrics that were deemed most suitable for these tasks. The first is a parameter intrinsic to the t-SNE algorithm. The second is obtained through a secondary, higher-level dimensionality reduction of the result space, again with t-SNE. I embedded the sorting of results into the interface prototype as a grid-like small-multiples visualization, which I developed in a user-centered design process.
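The two sorting measures can be illustrated with a minimal sketch. It is not the implementation used in the thesis and rests on several assumptions: the abstract does not name the intrinsic t-SNE parameter, so perplexity is used here purely for illustration; each 2D result is described by its normalized pairwise distances so that rotated or mirrored layouts compare as equal; and the secondary t-SNE reduces the result space to one dimension so that it yields a sort order. The function names are hypothetical, and scikit-learn, SciPy, and NumPy are assumed to be available.

```python
import numpy as np
from scipy.spatial.distance import pdist
from sklearn.manifold import TSNE


def run_tsne_grid(X, perplexities, seed=0):
    """Compute one 2D t-SNE result per perplexity value (assumed parameter)."""
    results = []
    for p in perplexities:
        tsne = TSNE(n_components=2, perplexity=p, init="random", random_state=seed)
        results.append({"perplexity": p, "embedding": tsne.fit_transform(X)})
    return results


def describe(embedding):
    """Describe a 2D result by its pairwise distances, scaled to [0, 1].

    Pairwise distances do not change under rotation, reflection, or
    translation of the layout, which t-SNE does not fix.
    """
    d = pdist(embedding)
    return d / d.max()


def sort_by_parameter(results):
    """Measure 1 (assumed): order results by their perplexity."""
    return np.argsort([r["perplexity"] for r in results]).tolist()


def sort_by_meta_embedding(results, seed=0):
    """Measure 2 (sketch): order results by a 1D t-SNE of the result space.

    Needs at least four results, because the perplexity of the secondary
    t-SNE (2.0 here, an arbitrary choice) must stay below their number.
    """
    features = np.stack([describe(r["embedding"]) for r in results])
    meta = TSNE(n_components=1, perplexity=2.0, init="random",
                method="exact", random_state=seed)
    positions = meta.fit_transform(features).ravel()
    return np.argsort(positions).tolist()


if __name__ == "__main__":
    # Synthetic stand-in for high-dimensional document embeddings.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(60, 8))
    results = run_tsne_grid(X, perplexities=[20, 5, 15, 10])
    print("order by perplexity:    ", sort_by_parameter(results))
    print("order by meta-embedding:", sort_by_meta_embedding(results))
```

Either returned index list could determine the order of the cells in a small-multiples grid. In the second ordering, neighboring cells hold results that the secondary t-SNE placed close together, but with so few results the positions are only a rough guide.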