Pixel-wise Semantic Segmentation for Low-power Devices
Academic research centers have recently shown growing interest in autonomous vehicles, particularly in pixel-wise semantic segmentation algorithms. The focus has been on convolutional neural networks' ability to predict road-scene objects accurately and efficiently, as well as on the growth of pixel-wise annotated datasets, which enable more accurate and better-generalizing networks.
The aim of this study is to investigate how modifying existing efficient convolutional architectures can reduce computational cost and make them practically useful on low-power devices. A CNN for pixel-wise semantic segmentation of road images was therefore developed, based on existing efficient architectures, namely SegNet, SegNet-basic, and ENet, and targeting low-power devices such as the Drive PX and Jetson TX series. All modified networks were trained on mixed datasets of real and synthetic urban-scene images to improve generalization and accuracy. The trained networks were evaluated both quantitatively, on the Cityscapes validation set of 500 annotated images, and qualitatively, on a collection of personally taken photos from varied sources. The most accurate and best-generalizing models were selected for further training, which resulted in a model suitable for pixel-wise labeling of images from different cameras, including the fisheye camera mounted on the autonomous car of the Free University of Berlin.