Leon Sixt:

RenderGAN: Generating realistic labeled data - with an application on decoding bee tags


Computer vision aims to reconstruct high-level information, such as the pose of an object, from raw image data. Deep convolutional neural networks (DCNNs) perform remarkably well on many computer vision tasks. Because they are typically trained in a supervised setting and have a large parameter space, they require large sets of labeled data. The cost of annotating data manually can make DCNNs infeasible. I present a novel framework called RenderGAN that can generate large amounts of realistic labeled images by combining a 3D model with the Generative Adversarial Network (GAN) framework. In my approach, the distribution of image deformations (e.g., lighting, background, and detail) is learned from unlabeled image data so that the generated images look realistic. I evaluate the RenderGAN framework in the context of the BeesBook project, where it is used to generate labeled data of honeybee tags. A DCNN trained on this generated dataset shows remarkable performance on real data.
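
The data flow described in the abstract (render an idealized image from known labels, then apply label-preserving deformations) can be sketched roughly as follows. This is only an illustrative sketch, not the thesis implementation: the stripe renderer stands in for the 3D model, and the lighting, background, and detail parameters are drawn at random, whereas in a GAN setup a generator network trained against unlabeled real images would predict them. All function names, parameter ranges, and the 8-bit stripe layout are hypothetical.

```python
import numpy as np

def render_tag(bits, size=32):
    """Hypothetical stand-in for the 3D model: draw label bits as vertical stripes.

    Assumes size is divisible by len(bits).
    """
    bits = np.asarray(bits, dtype=np.float32)
    cols = np.repeat(bits, size // len(bits))  # one stripe per bit
    return np.tile(cols, (size, 1))            # shape (size, size), values in {0, 1}

def augment(ideal, rng):
    """Apply label-preserving deformations to an idealized image.

    Assumption: the parameters are sampled at random here; a trained
    generator would predict them so that the outputs look realistic.
    """
    gain = rng.uniform(0.6, 1.0)                       # lighting: global contrast
    background = rng.uniform(0.0, 0.2)                 # background: brightness offset
    detail = rng.normal(0.0, 0.02, size=ideal.shape)   # detail: fine-grained noise
    return np.clip(gain * ideal + background + detail, 0.0, 1.0)

def generate_labeled_batch(n, n_bits=8, size=32, seed=0):
    """Sample labels first, then render and deform, so every image has a known label."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=(n, n_bits))
    images = np.stack([augment(render_tag(b, size), rng) for b in labels])
    return images.astype(np.float32), labels
```

Because the labels are sampled before rendering and the deformations never alter the stripe layout, each generated image comes with its ground-truth label at no annotation cost, which is the property a supervised DCNN needs from such a dataset.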

Bachelor of Science (B.Sc.)